From cjfields at uiuc.edu Sun Sep 2 19:54:54 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:54:54 -0500 Subject: [Bioperl-l] (no subject) Message-ID: Posted this to biosql-l already but felt it needed posting here as well. Sorry if you get this twice. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 19:57:59 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:57:59 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db Message-ID: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Apologies if you get this more than once; the first post appeared to get sent w/o a proper subject line. Posted this to biosql-l already but felt it needed posting here as well. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 21:40:48 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 20:40:48 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db In-Reply-To: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> References: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Message-ID: <25CFD36D-D921-4F5F-BADF-D858A2FE76D4@uiuc.edu> Okay, we can the previous posts! Odd, but I started from scratch and can't reproduce the issue; there may have been some cross-talk with different bioperl installations on my laptop. Anyway, everything passes now w/o recursion so I'll mark the bug as invalid. chris On Sep 2, 2007, at 6:57 PM, Chris Fields wrote: > Apologies if you get this more than once; the first post appeared to > get sent w/o a proper subject line. Posted this to biosql-l already > but felt it needed posting here as well. > > I noticed some critical recursion issues with bioperl-db when working > in Bio::Ontology changes. This was using bioperl-live (post-feature/ > annotation fixes). Bug report is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2355 > > It seems to be Bio:Taxon related; this is from 03swiss.t: > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:681 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:692 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > ... > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:587 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:253 > STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > PrimarySeqAdaptor.pm:229 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > SeqAdaptor.pm:217 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:214 > STACK Bio::DB::Persistent::PersistentObject::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK toplevel t/04swiss.t:36 > --------------------------------------------------- > > Also, seeing this with 13remove.t and 15.cluster.t, both of which > appear to infinitely recurse: > > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 587, line 1. > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 630, line 1. > > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bernd.web at gmail.com Mon Sep 3 08:43:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 3 Sep 2007 14:43:26 +0200 Subject: [Bioperl-l] Fh::flush warning Message-ID: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Hi, Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: (in cleanup) Undefined subroutine Fh::flush, at /lib/perl/Bio/Root/IO.pm line 541. This occurs in a rather large script and have not been able to isolate a small example where I also get this warning. Does someone know more about this warning and why it is thrown? Regards, Bernd From cjfields at uiuc.edu Mon Sep 3 10:41:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 09:41:49 -0500 Subject: [Bioperl-l] Fh::flush warning In-Reply-To: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> References: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Message-ID: <98A9D081-2570-4D4E-A8F8-D03282D41E0C@uiuc.edu> Could you give a bit more info (bioperl version, OS, etc)? I'm guessing a recent version as the error coincides with a call to flush() in Root::IO (which is probably called indirectly via DESTROY) and that you're probably using a tied filehandle somewhere for output, e.g. Bio::AlignIO::newFh() or Bio::AlignIO::fh(), so knowing the input/output formats could help. chris On Sep 3, 2007, at 7:43 AM, Bernd Web wrote: > Hi, > > Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: > (in cleanup) Undefined subroutine Fh::flush, at > /lib/perl/Bio/Root/IO.pm line 541. > > This occurs in a rather large script and have not been able to isolate > a small example where I also get this warning. Does someone know more > about this warning and why it is thrown? > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Mon Sep 3 22:11:09 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 10:11:09 +0800 (CST) Subject: [Bioperl-l] question about Bio::DB::GFF Message-ID: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Hi, I tried to load the gff3 file with load_gff.pl and extrac some information with Bio::DB::GFF. Althougth this code work properly under windows xp, the $seg got nothing when i run it under Linux. Here is my code and the gff3 file, #################################################################### #!/usr/local/bin/perl -w use strict; use Bio::SeqIO; use Bio::DB::GFF; my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', -dsn => 'dbi:mysql:test', -aggregator => ['coding'], -user => "lixr", -pass => "123456" ); my $seg = $in_gff->segment'BGIOSIBCE000001.1'); print $seg->abs_start."\n"; ################################################################## ##gff-version 3 ##sequence-region Chr01 1 43037 Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 ################################################################# I would appreaciate if any one can give me some clues/link to accomplish this. thanks in advance , Xianran Li --------------------------------- ???????????????????????????????????????????? From cjfields at uiuc.edu Tue Sep 4 00:04:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 23:04:29 -0500 Subject: [Bioperl-l] question about Bio::DB::GFF In-Reply-To: <361239.6752.qm@web15309.mail.cnb.yahoo.com> References: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Message-ID: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ?????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Tue Sep 4 00:58:48 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 12:58:48 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20question=20about=20Bi?= =?gb2312?q?o::DB::GFF?= In-Reply-To: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Message-ID: <866169.66154.qm@web15309.mail.cnb.yahoo.com> Hi, everybody, It looks like for the different perl version(5.8.8 of windows and 5.8.5 for linux). And I fixed this problem by adding ";Name=XXXX" after each line with "mRNA" ############################################################################## Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1;Name=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 ############################################################################## This time my code works properly. Xianran Chris Fields ?????? Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ???????????????????????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign --------------------------------- ???????????????????????????????? From jay at jays.net Tue Sep 4 10:31:36 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 4 Sep 2007 09:31:36 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> Message-ID: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > Probably a bit of a long shot but does anyone have code for > displaying protein or CDS multiple sequence alignments with the exon > boundaries of each gene in the alignment? > > Something in the bioperl world without funky external dependencies. > I think > it would be an awesome addition to the howtos. > > Currently, the Bio::Graphics howto has cdna to genome mapping > scripts or > blast output scripts, but > I couldn't find code for dealing with multiple sequence alignments. I'm currently under the (potentially uninformed) impression that Bio::Graphics and related tools only work with a single coordinate system. I've never seen a multiple sequence alignment example. ( I Google'd for "gbrowse alignment" and hit this: http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi Click the second Example link and you'll see exons mapped out. But zooming all the way in with all the tracks turned on it looks like the AZM tracks are just the coding regions. I don't see any multiple sequence alignment... ) I doubt that helped. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 4 11:28:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 4 Sep 2007 10:28:01 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> Message-ID: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> Probably a bit of a long shot but does anyone have code for >> displaying protein or CDS multiple sequence alignments with the exon >> boundaries of each gene in the alignment? >> >> Something in the bioperl world without funky external dependencies. >> I think >> it would be an awesome addition to the howtos. >> >> Currently, the Bio::Graphics howto has cdna to genome mapping >> scripts or >> blast output scripts, but >> I couldn't find code for dealing with multiple sequence alignments. > > I'm currently under the (potentially uninformed) impression that > Bio::Graphics and related tools only work with a single coordinate > system. I've never seen a multiple sequence alignment example. > > ( > I Google'd for "gbrowse alignment" and hit this: > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > Click the second Example link and you'll see exons mapped out. > > But zooming all the way in with all the tracks turned on it looks > like the AZM tracks are just the coding regions. I don't see any > multiple sequence alignment... > ) > > I doubt that helped. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html chris From avilella at gmail.com Wed Sep 5 05:42:37 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 11:42:37 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> Message-ID: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> A couple of examples: http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 treefam has exon boundary and PFAM domain mappings http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 here the tree is shown as well, but the idea would be to plot the alignment So it's more "show me the multiple CDS/protein alignment" rather than "show my aligned CDS/proteins wrt my reference genome" I think it would be quite neat to have this as a bioperl howto, Comments? Albert. On 9/4/07, Chris Fields wrote: > > > On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> Probably a bit of a long shot but does anyone have code for > >> displaying protein or CDS multiple sequence alignments with the exon > >> boundaries of each gene in the alignment? > >> > >> Something in the bioperl world without funky external dependencies. > >> I think > >> it would be an awesome addition to the howtos. > >> > >> Currently, the Bio::Graphics howto has cdna to genome mapping > >> scripts or > >> blast output scripts, but > >> I couldn't find code for dealing with multiple sequence alignments. > > > > I'm currently under the (potentially uninformed) impression that > > Bio::Graphics and related tools only work with a single coordinate > > system. I've never seen a multiple sequence alignment example. > > > > ( > > I Google'd for "gbrowse alignment" and hit this: > > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > > > Click the second Example link and you'll see exons mapped out. > > > > But zooming all the way in with all the tracks turned on it looks > > like the AZM tracks are just the coding regions. I don't see any > > multiple sequence alignment... > > ) > > > > I doubt that helped. :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/tutorial.html > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From alexl at users.sourceforge.net Wed Sep 5 06:08:14 2007 From: alexl at users.sourceforge.net (Alex Lancaster) Date: Wed, 05 Sep 2007 03:08:14 -0700 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> (Hilmar Lapp's message of "Sat\, 18 Aug 2007 12\:13\:28 -0400") References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: >>>>> "HL" == Hilmar Lapp writes: HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > I imagine the intent of the bioperl >> contributors is that it should be under the same terms as Perl, >> whatever that happens to be (which just happens to be GPL or >> Artistic, which is fine). HL> I fully agree. >> A clarification to that effect would be useful. HL> Agreed, too. Would you mind changing that language on the wiki, HL> since you seem to have a fairly good grasp on the issue? OK, I've updated the wiki in two places: http://www.bioperl.org/wiki/Licensing_BioPerl http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F It would also be nice if the LICENSE and Build.PL files in CVS (so it finds its way into the next release) were also updated to reflect the dual-licensed status, currently they only mention the Artistic license: http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE?rev=HEAD&content-type=text/vnd.viewcvs-markup http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL?rev=HEAD&content-type=text/vnd.viewcvs-markup For Build.PL this is easy: (e.g., license => 'artistic', should be license => 'GPL or Artistic',) Possible solutions for the LICENSE file include: 1) The GPL could be added to LICENSE file at the end (with a note at the top to indicate that GPL is also included); 2) LICENSE could be moved to LICENSE.Artistic and another file "LICENSE.GPL" added with the GPL (version 2+) conditions, and the contents of LICENSE would include a note about each license. I don't have access to the bioperl CVS repository, so I can't make the changes myself). This would also apply to the Build.PL (and LICENSE files if they are present) in bioperl-run and other modules. Thanks, Alex From cjfields at uiuc.edu Wed Sep 5 08:25:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:25:21 -0500 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: On Sep 5, 2007, at 5:08 AM, Alex Lancaster wrote: ... > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/ > FAQ#What_are_the_license_terms_for_BioPerl.3F > > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) > > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); > > 2) LICENSE could be moved to LICENSE.Artistic and another file > "LICENSE.GPL" added with the GPL (version 2+) conditions, and the > contents of LICENSE would include a note about each license. > > I don't have access to the bioperl CVS repository, so I can't make the > changes myself). This would also apply to the Build.PL (and LICENSE > files if they are present) in bioperl-run and other modules. > > Thanks, > Alex Looks like Sendu has done that. There have been recent troubling developments re: Artistic License: http://use.perl.org/article.pl?sid=07/08/26/1541205&from=rss but the case hasn't been conclusively decided yet. chris From bix at sendu.me.uk Wed Sep 5 08:18:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 05 Sep 2007 13:18:35 +0100 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: <46DE9E9B.80107@sendu.me.uk> Alex Lancaster wrote: >>>>>> "HL" == Hilmar Lapp writes: > > HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > >>> I imagine the intent of the bioperl >>> contributors is that it should be under the same terms as Perl, >>> whatever that happens to be (which just happens to be GPL or >>> Artistic, which is fine). > > HL> I fully agree. > >>> A clarification to that effect would be useful. > > HL> Agreed, too. Would you mind changing that language on the wiki, > HL> since you seem to have a fairly good grasp on the issue? > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F Thank you very much for that Alex. > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: [snip] > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) As per the 'license' section of http://search.cpan.org/~kwilliams/Module-Build-0.2808/lib/Module/Build/API.pod, I've changed it to 'perl', which means Artistic or GPL. > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); I took this approach, using your language for the explanation at the top, and including GPL 3.0 at the bottom. I've made these changes for core (live), run, db and network. Thanks again for your help and advice. From cjfields at uiuc.edu Wed Sep 5 08:53:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:53:25 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> Message-ID: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> You mean something like this? http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics chris On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > A couple of examples: > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > treefam has exon boundary and PFAM domain mappings > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > here the tree is shown as well, but the idea would be to plot the > alignment > > So it's more "show me the multiple CDS/protein alignment" rather > than "show > my aligned CDS/proteins wrt my reference genome" > > I think it would be quite neat to have this as a bioperl howto, > > Comments? > > Albert. > > On 9/4/07, Chris Fields wrote: >> >> >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >> >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>> Probably a bit of a long shot but does anyone have code for >>>> displaying protein or CDS multiple sequence alignments with the >>>> exon >>>> boundaries of each gene in the alignment? >>>> >>>> Something in the bioperl world without funky external dependencies. >>>> I think >>>> it would be an awesome addition to the howtos. >>>> >>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>> scripts or >>>> blast output scripts, but >>>> I couldn't find code for dealing with multiple sequence alignments. >>> >>> I'm currently under the (potentially uninformed) impression that >>> Bio::Graphics and related tools only work with a single coordinate >>> system. I've never seen a multiple sequence alignment example. >>> >>> ( >>> I Google'd for "gbrowse alignment" and hit this: >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>> >>> Click the second Example link and you'll see exons mapped out. >>> >>> But zooming all the way in with all the tracks turned on it looks >>> like the AZM tracks are just the coding regions. I don't see any >>> multiple sequence alignment... >>> ) >>> >>> I doubt that helped. :) >>> >>> Jay Hannah >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >> >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html >> >> chris >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Wed Sep 5 09:31:24 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 15:31:24 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Awesome!! Thanks Chris! On 9/5/07, Chris Fields wrote: > > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > > A couple of examples: > > > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > > > treefam has exon boundary and PFAM domain mappings > > > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > > > here the tree is shown as well, but the idea would be to plot the > > alignment > > > > So it's more "show me the multiple CDS/protein alignment" rather > > than "show > > my aligned CDS/proteins wrt my reference genome" > > > > I think it would be quite neat to have this as a bioperl howto, > > > > Comments? > > > > Albert. > > > > On 9/4/07, Chris Fields wrote: > >> > >> > >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > >> > >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >>>> Probably a bit of a long shot but does anyone have code for > >>>> displaying protein or CDS multiple sequence alignments with the > >>>> exon > >>>> boundaries of each gene in the alignment? > >>>> > >>>> Something in the bioperl world without funky external dependencies. > >>>> I think > >>>> it would be an awesome addition to the howtos. > >>>> > >>>> Currently, the Bio::Graphics howto has cdna to genome mapping > >>>> scripts or > >>>> blast output scripts, but > >>>> I couldn't find code for dealing with multiple sequence alignments. > >>> > >>> I'm currently under the (potentially uninformed) impression that > >>> Bio::Graphics and related tools only work with a single coordinate > >>> system. I've never seen a multiple sequence alignment example. > >>> > >>> ( > >>> I Google'd for "gbrowse alignment" and hit this: > >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > >>> > >>> Click the second Example link and you'll see exons mapped out. > >>> > >>> But zooming all the way in with all the tracks turned on it looks > >>> like the AZM tracks are just the coding regions. I don't see any > >>> multiple sequence alignment... > >>> ) > >>> > >>> I doubt that helped. :) > >>> > >>> Jay Hannah > >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > >> > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > >> > >> chris > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From cjfields at uiuc.edu Wed Sep 5 10:17:51 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:17:51 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: <31E25B64-2043-4460-ADC8-9684D01C2468@uiuc.edu> It would be nice to place the labels to the left of the segments. I believe there is a way to do this, but can't remember; if I can find it I'll revise the script. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 5 10:22:44 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 05 Sep 2007 15:22:44 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <46DEBBB4.1030200@sheffield.ac.uk> Chris Fields wrote: > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > Nice! On a similar (well, related to Bio::Graphics) topic, I've written a script that uses markers that have been mapped from a model organism to linkage groups in related species in order to estimate the location of "unknown" markers in those linkage groups. I'm using the Bio::Map::* modules for much of this work and then I use Bio::Graphics to display the linkage groups of the non-model organism with the putative position of the "unknown" markers. However, I've had to do a bit of fudging to get Bio::Graphics to draw this data. The problems I encountered are described below. I also have an open bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2343 1) Linkage maps are measured in cM - which can and are likely to be non-integer values. Bio::Graphics needs integer values, so I simply scaled all my cM measurements prior to drawing by *1000. However, the ruler now doesn't represent the "true scale" - can this be adjusted? 2) Some markers map to 0cM. However, Bio::Graphics requires positions >0. To get round this I simply incremented these positions by 1 (post-scaling), so they display almost in the correct place. Is it possible/likely/wise to support positions starting at zero and float positions? Would such support simply be to internalise what I have already done outside Bio::Graphics into the Bio::Graphics modules and have it display the correctly scaled ruler? Thoughts comments welcome. Cheers, Nath From cjfields at uiuc.edu Wed Sep 5 10:52:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:52:00 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: Updated the page on the web site with the new script. Figured it out; if you pass the parameter -label_position 'left' it will display the label to the left. However it displays them right next to the segment (ala GBrowse). I added a hack to Bio::Graphics::Glyph::generic in CVS which allows 'alignment_left' as an option, displaying it aligned to the far left of the panel; there is probably a way to use a callback here as well. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Sep 5 12:47:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 11:47:46 -0500 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <46DEBBB4.1030200@sheffield.ac.uk> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > ... > On a similar (well, related to Bio::Graphics) topic, I've written a > script that uses markers that have been mapped from a model > organism to > linkage groups in related species in order to estimate the location of > "unknown" markers in those linkage groups. > > I'm using the Bio::Map::* modules for much of this work and then I use > Bio::Graphics to display the linkage groups of the non-model organism > with the putative position of the "unknown" markers. However, I've had > to do a bit of fudging to get Bio::Graphics to draw this data. The > problems I encountered are described below. I also have an open bug: > http://bugzilla.open-bio.org/show_bug.cgi?id=2343 > > 1) Linkage maps are measured in cM - which can and are likely to be > non-integer values. Bio::Graphics needs integer values, so I simply > scaled all my cM measurements prior to drawing by *1000. However, the > ruler now doesn't represent the "true scale" - can this be adjusted? > > 2) Some markers map to 0cM. However, Bio::Graphics requires positions >> 0. To get round this I simply incremented these positions by 1 > (post-scaling), so they display almost in the correct place. > > Is it possible/likely/wise to support positions starting at zero and > float positions? Would such support simply be to internalise what I > have > already done outside Bio::Graphics into the Bio::Graphics modules and > have it display the correctly scaled ruler? > > Thoughts comments welcome. > > Cheers, > Nath There is this section in the GBrowse configure doc, which to me suggests there is a way to do what you want in Bioperl; you may have to delve into the Bio::Graphics or GBrowse code to work it out, though. I think the GBrowse mail list archives also have more on this. chris ..... F. DISPLAYING GENETIC AND RH MAPS GBrowse can be tweaked to make it more suitable for displaying genetic and radiation hybrid maps. The main issue is that the Bio::DB::GFF database expects coordinates to be positive integers, not fractions, but genetic and RH maps use floating point numbers. Working around this is a bit of an ugly hack. Before loading your data you must multiply all your coordinates by a constant power of 10 in order to convert them into integers. For example, if a genetic map uses Morgan units ranging from 0 to 1.80, you would multiple by 100 to create a map in ranging from 0 to 180. Create a GFF file containing the markers in modified coordinates and load it as usual. Now you must tell GBrowse to reverse these changes. Enter the following options into the [GENERAL] section of the configuration file: units = M unit_divider = 100 These two options tell GBrowse to use "M" (Morgan) units, and to divide all coordinates by 100. GBrowse will automatically display the scale using the most appropriate units, so the displayed map will typically be drawn using cM units. From bernd.web at gmail.com Wed Sep 5 13:44:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Wed, 5 Sep 2007 19:44:26 +0200 Subject: [Bioperl-l] SearchIO ResultWriter Message-ID: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Hi, For SearchIO there are ResultWriters to write text, html and BSML (BSMLResultWriter). However, is there also a BLAST xml writer, which writes the original blast xml files. This may have come up before. If there is not, is there interest in having this? Regards, Bernd From sac at bioperl.org Wed Sep 5 16:37:37 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 5 Sep 2007 13:37:37 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Message-ID: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Looks like there is no such functionality in the current repository. If you have implemented such a beast and are willing to contribute it, go for it (or coordinate with a developer if you lack CVS write access). Steve On 9/5/07, Bernd Web wrote: > > Hi, > > For SearchIO there are ResultWriters to write text, html and BSML > (BSMLResultWriter). However, is there also a BLAST xml writer, which > writes the original blast xml files. This may have come up before. If > there is not, is there interest in having this? > > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Wed Sep 5 17:18:17 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 05 Sep 2007 22:18:17 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: <46DF1D19.9010707@sheffield.ac.uk> Chris Fields wrote: > On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > >> ... >> On a similar (well, related to Bio::Graphics) topic, I've written a >> script that uses markers that have been mapped from a model organism to >> linkage groups in related species in order to estimate the location of >> "unknown" markers in those linkage groups. >> >> I'm using the Bio::Map::* modules for much of this work and then I use >> Bio::Graphics to display the linkage groups of the non-model organism >> with the putative position of the "unknown" markers. However, I've had >> to do a bit of fudging to get Bio::Graphics to draw this data. The >> problems I encountered are described below. I also have an open bug: >> http://bugzilla.open-bio.org/show_bug.cgi?id=2343 >> >> 1) Linkage maps are measured in cM - which can and are likely to be >> non-integer values. Bio::Graphics needs integer values, so I simply >> scaled all my cM measurements prior to drawing by *1000. However, the >> ruler now doesn't represent the "true scale" - can this be adjusted? >> >> 2) Some markers map to 0cM. However, Bio::Graphics requires positions >>> 0. To get round this I simply incremented these positions by 1 >> (post-scaling), so they display almost in the correct place. >> >> Is it possible/likely/wise to support positions starting at zero and >> float positions? Would such support simply be to internalise what I have >> already done outside Bio::Graphics into the Bio::Graphics modules and >> have it display the correctly scaled ruler? >> >> Thoughts comments welcome. >> >> Cheers, >> Nath > > There is this section in the GBrowse configure doc, which to me > suggests there is a way to do what you want in Bioperl; you may have > to delve into the Bio::Graphics or GBrowse code to work it out, > though. I think the GBrowse mail list archives also have more on this. > > chris > > ..... > > F. DISPLAYING GENETIC AND RH MAPS > > GBrowse can be tweaked to make it more suitable for displaying genetic > and radiation hybrid maps. > > The main issue is that the Bio::DB::GFF database expects coordinates > to be positive integers, not fractions, but genetic and RH maps use > floating point numbers. Working around this is a bit of an ugly hack. > Before loading your data you must multiply all your coordinates by a > constant power of 10 in order to convert them into integers. For > example, if a genetic map uses Morgan units ranging from 0 to 1.80, > you would multiple by 100 to create a map in ranging from 0 to 180. > > Create a GFF file containing the markers in modified coordinates and > load it as usual. Now you must tell GBrowse to reverse these changes. > Enter the following options into the [GENERAL] section of the > configuration file: > > units = M > unit_divider = 100 > > These two options tell GBrowse to use "M" (Morgan) units, and to > divide all coordinates by 100. GBrowse will automatically display the > scale using the most appropriate units, so the displayed map will > typically be drawn using cM units. > Thanks for for the pointer Chris! >From what you've said, it appears they might have done a similar hack to me - which is always nice to know! It seems then to me, that it may be worth making the Bio::Graphic::* modules slightly more generic and applicable for these situations. It's late, so does anyone have suggestions before I start digging through Bio::Graphic::* modules in the morning? Maybe you guys across the water have something to say by the time I wake up in the morning!? Thanks Nath From jason at bioperl.org Wed Sep 5 17:33:44 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 5 Sep 2007 14:33:44 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Message-ID: I think most ppl aren't that enamored with the NCBI XML Blast format but I guess it is standard if the NCBI puts it out... It should be a pretty easy writer to make at any rate just follow along with what was done for BSMLWriter. -jason On Sep 5, 2007, at 1:37 PM, Steve Chervitz wrote: > Looks like there is no such functionality in the current > repository. If you > have implemented such a beast and are willing to contribute it, go > for it > (or coordinate with a developer if you lack CVS write access). > > Steve > > On 9/5/07, Bernd Web wrote: >> >> Hi, >> >> For SearchIO there are ResultWriters to write text, html and BSML >> (BSMLResultWriter). However, is there also a BLAST xml writer, which >> writes the original blast xml files. This may have come up before. If >> there is not, is there interest in having this? >> >> >> Regards, >> Bernd >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org From jay at jays.net Thu Sep 6 15:50:53 2007 From: jay at jays.net (Jay Hannah) Date: Thu, 6 Sep 2007 15:50:53 -0400 (EDT) Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: On Wed, 5 Sep 2007, Chris Fields wrote: > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics Wow. That's slick. :) Is it possible to zoom in far enough to see the individual bases and gaps?? On Tue, 4 Sep 2007, Chris Fields wrote: > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/tutorial.html Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, this image might be what Albert is looking for: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/figures/segmented_features2.gif He'd need to map his exon boundaries from whatever format he has into a GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to munch on. On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > "Something in the bioperl world without funky external dependencies" There are still things the long arm of BioPerl justice hasn't reached? :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Thu Sep 6 19:39:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Sep 2007 18:39:07 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > On Wed, 5 Sep 2007, Chris Fields wrote: >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > Wow. That's slick. :) Is it possible to zoom in far enough to > see the > individual bases and gaps?? I'm not sure; you can do something like that with GBrowse with some features so there is probably a way to put something together which could do that. > On Tue, 4 Sep 2007, Chris Fields wrote: >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > this image might be what Albert is looking for: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/figures/segmented_features2.gif > > He'd need to map his exon boundaries from whatever format he has > into a > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > munch on. I use segmented SeqFeatures in my example. The HOWTO also uses a variation ('graded_segments'): http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output The subseqfeatures are colored by score. Feasibly one could hack this so that the exons/introns have a different 'score', thus displaying different colors. > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> "Something in the bioperl world without funky external dependencies" > > There are still things the long arm of BioPerl justice hasn't > reached? :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah chris From cain.cshl at gmail.com Thu Sep 6 23:20:04 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 06 Sep 2007 23:20:04 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <1189135204.2560.52.camel@localhost.localdomain> On Thu, 2007-09-06 at 18:39 -0500, Chris Fields wrote: > On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > > > > On Wed, 5 Sep 2007, Chris Fields wrote: > >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > > > Wow. That's slick. :) Is it possible to zoom in far enough to > > see the > > individual bases and gaps?? > > I'm not sure; you can do something like that with GBrowse with some > features so there is probably a way to put something together which > could do that. Yeah, if it were me, I would install GBrowse, hack my data into GFF and use gbrowse_img to generate pictures. It would probably be easier than starting from scratch. > > > On Tue, 4 Sep 2007, Chris Fields wrote: > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > > > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > > this image might be what Albert is looking for: > > > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > > Browser/docs/tutorial/figures/segmented_features2.gif > > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> "Something in the bioperl world without funky external dependencies" > > > > There are still things the long arm of BioPerl justice hasn't > > reached? :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070906/6b2b9ea2/attachment.bin From avilella at gmail.com Fri Sep 7 05:20:01 2007 From: avilella at gmail.com (Albert Vilella) Date: Fri, 7 Sep 2007 10:20:01 +0100 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. The exon boundary could be a vertical line or a triangular tick or something. I don't know if there is a consensus on this kind of cartoons. Does anybody know how exon boundaries are displayed in different browsers/apps? From yangmeng at genomics.org.cn Fri Sep 7 03:57:14 2007 From: yangmeng at genomics.org.cn (=?ISO-8859-1?Q?=D1=EE=C3=CD=A3=A8=D6=D0=D0=C4=CA=B5=D1=E9=CA=D2?= ) Date: Fri, 7 Sep 2007 15:57:14 +0800 Subject: [Bioperl-l] a question Message-ID: <200709071557.AA78971054@genomics.org.cn> I am a student from China.During my learing the bioperl,I encounter a problem as follows: I run the program, use Bio::Perl; $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); But It returns lots of mistake informatiom, ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: WebDBSeqI Request Error: 501 Protocol scheme '' is not supported Content-Type: text/plain Client-Date: Fri, 07 Sep 2007 07:26:06 GMT Client-Warning: Internal response 501 Protocol scheme '' is not supported STACK: Error::throw STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:685 STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:4 91 STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm :275 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:14 5 STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 ----------------------------------------------------------- I don't know the reason of the problem.I have installed the addition perl modules such as bioperl-db,bioperl-network,bioperlgui and almost all "BioPerl Dependencies modules".My network is also OK. It's an annoying promleb to me. I have consulted many experts but didn't got a reply. Could you vacuate in your mass business to give me a help? Thank you! Best regards! YangMeng ________________________________________________________________ Sent via the WebMail system at genomics.org.cn From cjfields at uiuc.edu Fri Sep 7 10:09:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 7 Sep 2007 09:09:18 -0500 Subject: [Bioperl-l] a question In-Reply-To: <200709071557.AA78971054@genomics.org.cn> References: <200709071557.AA78971054@genomics.org.cn> Message-ID: <7F176E39-18A6-4BF9-9247-863D6F3C167D@uiuc.edu> On Sep 7, 2007, at 2:57 AM, ???????????????? wrote: > I am a student from China.During my learing the bioperl,I encounter > a problem as follows: > > I run the program, > > use Bio::Perl; > $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > write_sequence(">roa1.fasta",'fasta',$seq_object); > > But It returns lots of mistake informatiom, First, always preface problems of this sort with the version of BioPerl you are using (there are quite a few versions still being used). > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: WebDBSeqI Request Error: > 501 Protocol scheme '' is not supported > Content-Type: text/plain > Client-Date: Fri, 07 Sep 2007 07:26:06 GMT > Client-Warning: Internal response > 501 Protocol scheme '' is not supported > STACK: Error::throw > STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 > STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:685 > STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:4 > 91 > STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm > :275 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:14 > 5 > STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 > STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 > ----------------------------------------------------------- This works for me using bioperl from CVS. There were a few remote DbFetch server changes if I recall correctly, so updating from CVS may be your best option. > I don't know the reason of the problem.I have installed the > addition perl modules such as bioperl-db,bioperl-network,bioperlgui > and almost all "BioPerl Dependencies modules".My network is also > OK. It's an annoying promleb to me. > I have consulted many experts but didn't got a reply. Could you > vacuate in your mass business to give me a help? > > Thank you! > > Best regards! > > YangMeng I think my 'vacuating' is a private matter, let alone doing so in my mass business... http://www.thefreedictionary.com/Vacuate ;> chris From cjfields at uiuc.edu Mon Sep 10 18:04:14 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 17:04:14 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> Message-ID: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: >>> He'd need to map his exon boundaries from whatever format he has >>> into a >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to >>> munch on. >> >> I use segmented SeqFeatures in my example. The HOWTO also uses a >> variation ('graded_segments'): >> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output >> >> The subseqfeatures are colored by score. Feasibly one could hack >> this so that the exons/introns have a different 'score', thus >> displaying different colors. > > > The exon boundary could be a vertical line or a triangular tick or > something. I don't know if there is a consensus on this kind of > cartoons. > Does anybody know how exon boundaries are displayed in different > browsers/apps? Don't know. BTW, apparently there is something being cooked up as an alignment browser (among other things) for GBrowse: https://www.nescent.org/wg_phyloinformatics/ PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse Acc. to Lincoln (from his last GBrowse post) there will be a testable version within a few weeks or so. You could always ask more questions about it on the GBrowse list. chris From lstein at cshl.edu Mon Sep 10 18:09:41 2007 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 10 Sep 2007 18:09:41 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> Message-ID: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> You can view a simple multiple alignment now. Go to www.wormbase.org, turn on some of the EST tracks and then zoom down to base pair level. In bio::graphics, use the "segments" glyph and turn on the -draw_target option. The features must have DNA attached to them. What's coming soon is support for MAF format, which provides genome-level alignments. Lincoln On 9/10/07, Chris Fields wrote: > > On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: > > >>> He'd need to map his exon boundaries from whatever format he has > >>> into a > >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > >>> munch on. > >> > >> I use segmented SeqFeatures in my example. The HOWTO also uses a > >> variation ('graded_segments'): > >> > >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > >> > >> The subseqfeatures are colored by score. Feasibly one could hack > >> this so that the exons/introns have a different 'score', thus > >> displaying different colors. > > > > > > The exon boundary could be a vertical line or a triangular tick or > > something. I don't know if there is a consensus on this kind of > > cartoons. > > Does anybody know how exon boundaries are displayed in different > > browsers/apps? > > Don't know. BTW, apparently there is something being cooked up as an > alignment browser (among other things) for GBrowse: > > https://www.nescent.org/wg_phyloinformatics/ > PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse > > Acc. to Lincoln (from his last GBrowse post) there will be a testable > version within a few weeks or so. You could always ask more > questions about it on the GBrowse list. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Mon Sep 10 23:00:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 22:00:29 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <885E5E3B-E2F7-4279-8EE3-FC21AF535D7E@uiuc.edu> Doesn't that work only for SeqFeature::SimilarityPair and HSP-like (paired) alignments, or am I mistaken? chris On Sep 10, 2007, at 5:09 PM, Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to > www.wormbase.org, turn > on some of the EST tracks and then zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the - > draw_target > option. The features must have DNA attached to them. > > What's coming soon is support for MAF format, which provides genome- > level > alignments. > > Lincoln From christoph.theunert at web.de Tue Sep 11 06:37:49 2007 From: christoph.theunert at web.de (Christoph Theunert) Date: Tue, 11 Sep 2007 03:37:49 -0700 (PDT) Subject: [Bioperl-l] release of own projects Message-ID: <12611951.post@talk.nabble.com> Hi, I am a bioinformatics student from germany and I need your help Working with perl and bioperl is pretty new to me - currently I am working on a Bioperl project, and I don't know how to release my project when i am finished with it. I want to pack my modules so that other users can download it and install it on their machines. Do I use the command h2xs as to create cpan modules ( makefiles ...) or what is the best way to solve my problem ? thanks for help Christoph -- View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From spiros at lokku.com Tue Sep 11 06:57:14 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 11 Sep 2007 11:57:14 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: Hey, Yes, IMHO the best way would be to create CPANesque modules that people are able to download and install. The installation is pretty straightforward, covers prerequisites and more advanced features if needed and as an approach it is widely supported. Also, it gives you the ability to create and integrate tests seamlessly :) Check out these URL's on how to do it: http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlnewmod.pod http://www.perlmonks.org/?node_id=158999 http://www.perlmonks.org/?node_id=431702 Btw, more friendly and automated tools exist besides h2xs. Be sure to have a look at: http://search.cpan.org/perldoc?ExtUtils::ModuleMaker http://search.cpan.org/perldoc?Module::Starter Hope this helps, Spiros ps. i suggest since its your research work you are going to be handing out to read a bit on the various software licenses which exist and which you prefer to license your code under. On 9/11/07, Christoph Theunert wrote: > > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? > > thanks for help > > Christoph > -- > View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Tue Sep 11 07:12:41 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Sep 2007 12:12:41 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: <46E67829.8060303@sendu.me.uk> Christoph Theunert wrote: > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? You can do it however you like. You can just stick the modules in a folder, .tar.gz it and offer that to people. You can use h2xs to automate certain things. You can use Module::Build. To make your work available via cpan, see http://www.cpan.org/modules/04pause.html If your modules are of general bioinformatic utility you might even consider making them a part of bioperl itself. From jay at jays.net Tue Sep 11 17:15:17 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 11 Sep 2007 16:15:17 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <46E70565.5040503@jays.net> Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to www.wormbase.org > , turn on some of the EST tracks and then > zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the > -draw_target option. The features must have DNA attached to them. Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. > What's coming soon is support for MAF format, which provides > genome-level alignments. I'm looking forward to trying to wrap my head around that. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 11 18:40:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 11 Sep 2007 17:40:55 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <46E70565.5040503@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> <46E70565.5040503@jays.net> Message-ID: On Sep 11, 2007, at 4:15 PM, Jay Hannah wrote: > Lincoln Stein wrote: >> You can view a simple multiple alignment now. Go to www.wormbase.org >> , turn on some of the EST tracks and then >> zoom down to base pair level. >> >> In bio::graphics, use the "segments" glyph and turn on the >> -draw_target option. The features must have DNA attached to them. > > Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. There is a section detailing how this is done in the GBrowse tutorial (though it uses older GFF): http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html >> What's coming soon is support for MAF format, which provides >> genome-level alignments. > > I'm looking forward to trying to wrap my head around that. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah It's easily parsible, which is nice! chris From stephan.roessner at gsf.de Wed Sep 12 04:44:10 2007 From: stephan.roessner at gsf.de (Stephan Roessner) Date: Wed, 12 Sep 2007 10:44:10 +0200 Subject: [Bioperl-l] bug in Bio::SearchIO? Message-ID: <200709121044.11741.stephan.roessner@gsf.de> Hi, I am parsing a BlastN output with Bio::SearchIO and getting an error for some of the hits when retrieving the start and/or the end position with $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for hits which are are of equal length (~ > 0.9) to the query sequences. SearchIO is retrieving the right results, but throws an exemption, in this case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - 760 ..... It seems to me valid range is parsed incorrectly, isn't it? Is this a bug? Does anybody have a similar problem? see code, error, and blastn output below. thanks, Stephan Stephan Roessner MIPS/IBI Inst. for Bioinformatics GSF Research Center for Environment and Health Ingolst?dter Landstr. 1 85764 Neuherberg; Germany phone: +49 (0)89 3187 3583 fax: ? ? ? +49 (0)89 3187 3585 email: stephan.roessner at gsf.de Here is the piece of code I am using: my $blast_report = new Bio::SearchIO ('-format'=>'blast', '-file' => $source); while( my $result=$blast_report->next_result) { while( my $hit= $result->next_hit()) { print "Name: ".$hit->name."\n"; print "S: ".$hit->start('sbjct')."\n"; print "E: ".$hit->end('sbjct')."\n"; print "L: ".$hit->length()."\n"; } } Here's the message: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:359 STACK: Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/HSP/HSPI.pm:691 STACK: Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 STACK: Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:206 STACK: Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/Hit/GenericHit.pm:935 STACK: main::parse /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:82 STACK: /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:51 ----------------------------------------------------------- S: 635 E: 790 L: 2052 This is the BLASTN output I am parsing:: >LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 21623485-21621434 BestGuessTranscript Length = 2052 Score = 95.6 bits (48), Expect = 1e-17 Identities = 106/124 (85%), Gaps = 1/124 (0%) Strand = Plus / Plus Query: 3191 tattaagcataattaatgtatcattagcacatgtagg-ttactgtagcatttaaggctaa 3249 |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| |||||| Sbjct: 635 tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 Query: 3250 tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| ||||| Sbjct: 695 tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 Query: 3310 gttt 3313 |||| Sbjct: 755 gttt 758 Score = 48.1 bits (24), Expect = 0.002 Identities = 57/68 (83%) Strand = Plus / Minus Query: 2253 aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || ||| | Sbjct: 760 aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 Query: 2313 ccatgatt 2320 |||||||| Sbjct: 700 ccatgatt 693 Score = 44.1 bits (22), Expect = 0.038 Identities = 76/94 (80%) Strand = Plus / Minus Query: 1539 atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 ||||||| || |||||||||| | ||| ||||||||||||||| ||||| |||| | Sbjct: 790 atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 ||||| |||| ||||||||||| |||| |||| Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 Score = 44.1 bits (22), Expect = 0.038 Identities = 73/90 (81%) Strand = Plus / Plus Query: 2026 actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 ||||| |||| | ||||||||| ||||| |||| || ||||| ||| ||||||||||| Sbjct: 701 actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 ||| | ||||||||||||| ||||||| Sbjct: 761 cattttatttatatttaatgctccatgcat 790 From cjfields at uiuc.edu Wed Sep 12 10:57:22 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 12 Sep 2007 09:57:22 -0500 Subject: [Bioperl-l] bug in Bio::SearchIO? In-Reply-To: <200709121044.11741.stephan.roessner@gsf.de> References: <200709121044.11741.stephan.roessner@gsf.de> Message-ID: <74CE1BB2-FCEB-43C3-B783-09706C7F55D8@uiuc.edu> Try updating to bioperl from CVS. I believe this issue was fixed but I don't believe it made the 1.5.2 release. chris On Sep 12, 2007, at 3:44 AM, Stephan Roessner wrote: > Hi, > > I am parsing a BlastN output with Bio::SearchIO and getting an > error for some > of the hits when retrieving the start and/or the end position with > $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for > hits which > are are of equal length (~ > 0.9) to the query sequences. > > SearchIO is retrieving the right results, but throws an exemption, > in this > case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - > 760 ..... > > It seems to me valid range is parsed incorrectly, isn't it? Is this > a bug? > > Does anybody have a similar problem? > > see code, error, and blastn output below. > > thanks, > Stephan > > > Stephan Roessner > MIPS/IBI Inst. for Bioinformatics > GSF Research Center for Environment and Health > Ingolst?dter Landstr. 1 > 85764 Neuherberg; Germany > phone: +49 (0)89 3187 3583 > fax: +49 (0)89 3187 3585 > email: stephan.roessner at gsf.de > > > Here is the piece of code I am using: > > my $blast_report = new Bio::SearchIO ('-format'=>'blast', > '-file' => $source); > > while( my $result=$blast_report->next_result) { > while( my $hit= $result->next_hit()) { > print "Name: ".$hit->name."\n"; > print "S: ".$hit->start('sbjct')."\n"; > print "E: ".$hit->end('sbjct')."\n"; > print "L: ".$hit->length()."\n"; > } > } > > > Here's the message: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/ > Root.pm:359 > STACK: > Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/ > Bio/Search/HSP/HSPI.pm:691 > STACK: > Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/ > vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 > STACK: > Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/SearchUtils.pm:206 > STACK: > Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/Hit/GenericHit.pm:935 > STACK: > main::parse /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:82 > STACK: /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:51 > ----------------------------------------------------------- > > S: 635 > E: 790 > L: 2052 > > This is the BLASTN output I am parsing:: > >> LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 > 21623485-21621434 BestGuessTranscript > Length = 2052 > > Score = 95.6 bits (48), Expect = 1e-17 > Identities = 106/124 (85%), Gaps = 1/124 (0%) > Strand = Plus / Plus > > > Query: 3191 tattaagcataattaatgtatcattagcacatgtagg- > ttactgtagcatttaaggctaa 3249 > |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| > |||||| > Sbjct: 635 > tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 > > > Query: 3250 > tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 > |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| > ||||| > Sbjct: 695 > tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 > > > Query: 3310 gttt 3313 > |||| > Sbjct: 755 gttt 758 > > > > Score = 48.1 bits (24), Expect = 0.002 > Identities = 57/68 (83%) > Strand = Plus / Minus > > > Query: 2253 > aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 > ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || > ||| | > Sbjct: 760 > aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 > > > Query: 2313 ccatgatt 2320 > |||||||| > Sbjct: 700 ccatgatt 693 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 76/94 (80%) > Strand = Plus / Minus > > > Query: 1539 > atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 > ||||||| || |||||||||| | ||| ||||||||||||||| ||||| > |||| | > Sbjct: 790 > atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 > > > Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 > ||||| |||| ||||||||||| |||| |||| > Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 73/90 (81%) > Strand = Plus / Plus > > > Query: 2026 > actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 > ||||| |||| | ||||||||| ||||| |||| || ||||| ||| > ||||||||||| > Sbjct: 701 > actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 > > > Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 > ||| | ||||||||||||| ||||||| > Sbjct: 761 cattttatttatatttaatgctccatgcat 790 > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 12 12:34:26 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 12 Sep 2007 17:34:26 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: <46E81512.3090503@sheffield.ac.uk> Is it possible to set the bg colour of glyphs and the panel background to be transparent? If so, which output formats support transparency? Cheers Nath From Kevin.M.Brown at asu.edu Wed Sep 12 14:15:10 2007 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 12 Sep 2007 11:15:10 -0700 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E81512.3090503@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> Message-ID: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> > Is it possible to set the bg colour of glyphs and the panel > background to be transparent? If so, which output formats > support transparency? Not sure if you can, but SVG, PNG, Gif all support a transparent background. From bioperl-list at superfrink.net Thu Sep 13 01:15:39 2007 From: bioperl-list at superfrink.net (bioperl-list at superfrink.net) Date: Wed, 12 Sep 2007 23:15:39 -0600 (MDT) Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: > Is it possible to set the bg colour of glyphs and the panel background > to be transparent? If so, which output formats support transparency? I had a look at the code and I don't believe it is possible. You could produce a PNG file and knowing the red/green/blue values of the background colour run the following script to make an image with the bg colour transparent. For example: ./make-transparent.pl 252 253 252 2004-11-22.png will produce: 2004-11-22.png.new.png with the RGB colour of (252, 253, 252) replaced with transparency. Regards, Chad #!/usr/bin/perl -w # # file: make-transparent.pl # purpose: make a single colour in a PNG file transparent # author: chad c d clark # $Id$ use strict; use GD; # -- subroutines ------------------------------------------------------- sub usage_message(); # -- main() ------------------------------------------------------------ if(scalar @ARGV < 4) { print usage_message(); exit 1; } # get the colour and make sure it is valid my @RGB = splice @ARGV, 0, 3; for my $i (@RGB) { if ( ($i !~ /^[\d]+$/) or (255 < $i) ) { print "Invalid colour '$i'.\n"; print usage_message(); exit 1; } } print "RGB: (@RGB)\n"; # process each file FILE: while (my $filename = shift @ARGV) { # read the file my $image = GD::Image->new($filename); unless(defined $image) { warn "Unable to read image from file. Skipping '$filename'.\n"; next FILE; } # find the colour index my $index = $image->colorExact(@RGB); if(-1 == $index) { warn "Colour not found in file. Skipping '$filename'.\n"; next FILE; } # make the colour index transparent if(-1 == $image->transparent($index)) { warn "Unable to make colour transparent. Skipping '$filename'.\n"; next FILE; } # write the updated image file my $new_filename = $filename . ".new.png"; # my $new_filename = $filename; # use to over-write existing file open FH, ">" . $new_filename or die "can't open $new_filename"; print FH $image->png; close FH; print "Found file '$filename'.\tCreated '$new_filename'.\n"; } exit 0; # -- subroutines ------------------------------------------------------- sub usage_message() { return qq/ Usage: $0 RED GREEN BLUE FILELIST Where: RED - red value in decimal (0 to 255) GREEN - green value in decimal (0 to 255) BLUE - blue value in decimal (0 to 255) FILELIST - list of files to convert Examples: $0 255 255 255 2004-11-22.png $0 252 253 252 2004-11-22.png second.png $0 1 1 200 2004-11-22.png second.png third.png Description: For each file "foo.png" a new file "foo.png.new.png" will be created (and over-written if it existed). The new file will be the same as the original but the colour specified by the RED, GREEN, and BLUE value will be removed and replaced by transparent pixels. /; } From n.haigh at sheffield.ac.uk Thu Sep 13 06:07:46 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 11:07:46 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> Message-ID: <46E90BF2.5010607@sheffield.ac.uk> Kevin Brown wrote: >> Is it possible to set the bg colour of glyphs and the panel >> background to be transparent? If so, which output formats >> support transparency? >> > > Not sure if you can, but SVG, PNG, Gif all support a transparent > background. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Looking at the GD module documentation: http://search.cpan.org/~lds/GD-2.30/GD.pm It appears that you can set a colour as being transparent - so I think it should be possible to get Bio::Graphics to do this = may require some code to be written. Any one got ideas? Cheers, Nath From n.haigh at sheffield.ac.uk Thu Sep 13 07:59:20 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 12:59:20 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E90BF2.5010607@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> Message-ID: <46E92618.7050208@sheffield.ac.uk> Nathan Haigh wrote: > Kevin Brown wrote: > >>> Is it possible to set the bg colour of glyphs and the panel >>> background to be transparent? If so, which output formats >>> support transparency? >>> >>> >> Not sure if you can, but SVG, PNG, Gif all support a transparent >> background. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Looking at the GD module documentation: > http://search.cpan.org/~lds/GD-2.30/GD.pm > > It appears that you can set a colour as being transparent - so I think > it should be possible to get Bio::Graphics to do this = may require some > code to be written. Any one got ideas? > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > I took a look and made a simple change to Bio::Graphics::Panel Please see the following bug for a patch and explanation: http://bugzilla.open-bio.org/show_bug.cgi?id=2365 I'd appreciate any comments, especially regarding the method name! If there aren't any complaints I'll commit it later today. Nath From n.haigh at sheffield.ac.uk Thu Sep 13 08:26:57 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 13:26:57 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution Message-ID: <46E92C91.5020307@sheffield.ac.uk> I want to be able to print my Bio::Graphics image on a poster with good resolution. What can I do to ensure I don't get blocky graphics/text. Altering the width/height of the panel simple increases the size of the canvas on which to draw the image, but the text appears the same size and thus relatively smaller to the rest of the image. So I don't think this would work for printing on a poster. Cheers, Nath From cjfields at uiuc.edu Thu Sep 13 08:46:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 13 Sep 2007 07:46:02 -0500 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <69321A85-8715-43C0-BCB0-CEE8F42D7235@uiuc.edu> Print to SVG instead of PNG (should be resolution-independent); I use Illustrator to fine-tune it but there are several other programs which can do the same. You'll need to install GD::SVG for it to work. The alignment example I posted previously about (http:// www.bioperl.org/wiki/HOWTO_Discussion:Graphics) shows essentially what you need to do: my $panel = Bio::Graphics::Panel->new( -image_class => 'SVG', # and whatever else ); # later... print $panel->svg; chris On Sep 13, 2007, at 7:26 AM, Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with > good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of > the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jonathancrabtree at gmail.com Thu Sep 13 09:09:56 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:09:56 -0400 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E92618.7050208@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> <46E92618.7050208@sheffield.ac.uk> Message-ID: <8e5b8bf80709130609x4be19cf6y60f2440a1ac5d332@mail.gmail.com> Hi Nathan- One problem with your proposed solution is that it won't necessarily work when GD::SVG is being used instead of GD (i.e., via the image_class method of Bio::Graphics::Panel). SVG doesn't handle transparency in the same way as GD. At least when you're compositing multiple SVG images/documents, transparency is the default; if you superimpose one SVG image on another ( e.g., by merging the two into a single SVG document) then the bottom image will be visible through any area of the top image that has not been drawn on. When I'm working in SVG with Bio::Graphics I get a "transparent" background by simply not setting the bgcolor; this ensures that Bio::Graphics::Panel will refrain from drawing a filled background rectangle underneath the drawing area. What I don't know is how to ensure that the background is transparent when you're working with the various methods of embedding SVG in web pages ( i.e., transparent with respect to whatever is _underneath_ the SVG-rendered content); this is probably a slightly different issue that's more a question of what the browser/plugin supports. I'm not sure what to suggest as an alternative, but at the very least this probably warrants a YMMV comment in the documentation for the new method, or perhaps it could even throw a runtime error if called when the $gd object is of type GD::SVG. A final option would be to say that this (setting a transparent background) is something that should get handled outside of Bio::Graphics::Panel; I don't think there's any technical reason why the calling code couldn't be responsible for this. I don't think we can modify your new method to unset the bgcolor when working with GD::SVG, because that might affect the image in other ways. I do it in my code but I'm not sure it's 100% safe, since I think GD::SVG might actually _use_ the bgcolor in some situations (e.g., drawing dashed lines) and I haven't checked the code thoroughly to make sure that there are no unintended consequences. Jonathan p.s. I see that Chris has beaten me to the punch in mentioning SVG as a fix to your blocky font problems. All the more reason to think about how this feature will work in that context! On 9/13/07, Nathan Haigh wrote: > > Nathan Haigh wrote: > > Kevin Brown wrote: > > > >>> Is it possible to set the bg colour of glyphs and the panel > >>> background to be transparent? If so, which output formats > >>> support transparency? > >>> > >>> > >> Not sure if you can, but SVG, PNG, Gif all support a transparent > >> background. > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> > > > > Looking at the GD module documentation: > > http://search.cpan.org/~lds/GD-2.30/GD.pm > > > > > It appears that you can set a colour as being transparent - so I think > > it should be possible to get Bio::Graphics to do this = may require some > > code to be written. Any one got ideas? > > > > Cheers, > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > I took a look and made a simple change to Bio::Graphics::Panel > > Please see the following bug for a patch and explanation: > http://bugzilla.open-bio.org/show_bug.cgi?id=2365 > > I'd appreciate any comments, especially regarding the method name! If > there aren't any complaints I'll commit it later today. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Thu Sep 13 09:03:46 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 13 Sep 2007 14:03:46 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <46E93532.6030505@sendu.me.uk> Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. Output in SVG, which is a vector format == no blockiness. From jonathancrabtree at gmail.com Thu Sep 13 09:20:43 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:20:43 -0400 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <8e5b8bf80709130620r4a24fe8fi5171539f50735bf3@mail.gmail.com> Nathan- As Chris said, you'll want to use GD::SVG instead of GD. However, you're still going to have the issue that you raised that the fonts will be proportionally small with respect to your figure (particularly if you're printing a large region at poster size.) From what I remember GD only gives you a few font sizes to choose from, so even at the largest size you may still have problems. I've worked around this in the past by using scripts to post-process the resulting SVG. I do a global search and replace to increase the font sizes (and, in many cases, to adjust the y-offset of the text accordingly.) You may also need to tweak the amount of vertical whitespace in the image (e.g., between adjacent rows of features) to give yourself space to increase the font size. The same caveat applies to the horizontal dimension, since with a larger font you may have collisions between labels (assuming that the features in your figure are labeled.) To fix this you need to trick Bio::Graphics into thinking the feature labels are longer than they actually are. I forget whether I did this by padding the labels with extra whitespace or actually modifying the code that computes the feature bounding boxes, but something along those lines should work. Essentially you have to trick Bio::Graphics into leaving extra whitespace so that everything looks OK when you bump up the font sizes. Unfortunately I don't have a generic script that does this; after generating a couple of posters this way I switched to direct SVG generation to avoid the constraints imposed by going through GD. Jonathan On 9/13/07, Nathan Haigh wrote: > > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From arareko at campus.iztacala.unam.mx Thu Sep 13 10:59:31 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 13 Sep 2007 09:59:31 -0500 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <46E95053.8090300@campus.iztacala.unam.mx> Try saving the output of your Bio::Graphics image as SVG (with the desired proportions between text & graphics), then, at the moment of printing, set the desired output size (from the SVG file) and everything should be scaled accordingly. That can probably work. Cheers, Mauricio. Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From jay at jays.net Fri Sep 14 10:27:38 2007 From: jay at jays.net (Jay Hannah) Date: Fri, 14 Sep 2007 09:27:38 -0500 Subject: [Bioperl-l] [patch] getGenBank.pl Message-ID: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/ examples/db/getGenBank.pl Using this: my $seqio = $gb->get_Stream_by_batch([ qw( 124430577 )]); Throws this warning: $ ./fetch.pl > 124430577.gbk get_Stream_by_batch() is deprecated; use get_Stream_by_id() instead STACK Bio::DB::NCBIHelper::__ANON__ /usr/lib/perl5/Bio/DB/ NCBIHelper.pm:261 STACK toplevel ./fetch.pl:17 Can someone with commit access please change getGenBank.pl? 24,25c24,25 < # if you want to get a bunch of sequences use the batch method < my $seqio = $gb->get_Stream_by_batch([ qw(J00522 AF303112 2981014)]); --- > # feel free to pull multiple sequences > my $seqio = $gb->get_Stream_by_id([ qw(J00522 AF303112 2981014)]); The tweaked version works fine for me and the warning goes away. Thanks, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Fri Sep 14 12:36:56 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 14 Sep 2007 11:36:56 -0500 Subject: [Bioperl-l] [patch] getGenBank.pl In-Reply-To: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> References: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> Message-ID: <5255407E-F19E-45B6-9F66-3DC1AB25C0AC@uiuc.edu> Done. Thanks for the heads up! chris On Sep 14, 2007, at 9:27 AM, Jay Hannah wrote: > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/ > examples/db/getGenBank.pl > > Using this: > my $seqio = $gb->get_Stream_by_batch([ qw( 124430577 )]); > > Throws this warning: > $ ./fetch.pl > 124430577.gbk > get_Stream_by_batch() is deprecated; use get_Stream_by_id() instead > STACK Bio::DB::NCBIHelper::__ANON__ /usr/lib/perl5/Bio/DB/ > NCBIHelper.pm:261 > STACK toplevel ./fetch.pl:17 > > > Can someone with commit access please change getGenBank.pl? > > 24,25c24,25 > < # if you want to get a bunch of sequences use the batch method > < my $seqio = $gb->get_Stream_by_batch([ qw(J00522 AF303112 > 2981014)]); > --- >> # feel free to pull multiple sequences >> my $seqio = $gb->get_Stream_by_id([ qw(J00522 AF303112 2981014)]); > > > The tweaked version works fine for me and the warning goes away. > > Thanks, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Mon Sep 17 16:15:39 2007 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Mon, 17 Sep 2007 15:15:39 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility Message-ID: Sometime ago, in the ensembl-dev mailing list, mdr wrote: > > I'm always running into bugs in bioperl that have been fixed in more recent > > versions than the version 1-2-3 that the installation document specifies. > > > > Just wondering, are there plans to try to move to 1-5 any time soon, or is that > > not possible for some reason? Or, by any chance, is Ensembl actually compatible > > with 1-5 and it's just a documentation issue? > > "Ewan Birney" replied > Ensembl doesn't make heavy use of Bioperl anymore - most of the critical things > we re-wrote, mainly due to speed/memory issues. I think the short answer is that > it _probably_ works with 1.5, but we don't have a strong desire to move up > as certainly there are no problems with the 1.2.3 release we are using. FWIW, I have just discovered that the round of bioperl changes in service of http://www.bioperl.org/wiki/Feature_Annotation_rollback introduce (additional?) incompatibilities between current bioperl and the Ensembl Core API. The changes bring me to obtain and use Bioperl version 1.2.3 for use in conjunction with Ensemble API application (as is recommended by Ensembl). Until now, the ways I have used the Ensembl API appear not to have been effected by changes in Bioperl; I have successfully used it in conjunction with the bioperl's leading edge. Of course there may be other incompatibilities that I have just not noticed yet. Evidence of the new incompatibility is present in this back trace, which bridges between code in current bioperl-live and current ensembl/modules/Bio: -------------------- EXCEPTION -------------------- MSG: Operator overloading of AnnotationI is deprecated STACK Bio::Annotation::DBLink::__ANON__ /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/DBEntryAdaptor.pm:77 8 Obtaining version 1.2.3 fixes the issue for me. This is just a warning to others.... Your milage may vary.... -- Malcolm Cook Stowers Institute for Medical Research - Kansas City, Missouri From cjfields at uiuc.edu Mon Sep 17 17:52:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 17 Sep 2007 16:52:55 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility In-Reply-To: References: Message-ID: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> Malcolm, I have removed the Bio::Annotation overloading exceptions from bioperl-live; they're just more trouble than they're worth right now. Could you try it out and see if that suffices, and drop us a note if it doesn't or if you run into other odd issues? I'll be busy until the end of the month but I'll do the best I can to help out. The rollbacks were fairly simple and essentially reversed, corrected, or simplified many changes made prior to the 1.5 release (most of which were undocumented and not completely implemented). They pass all current tests and should make BioPerl classes (particularly Annotations and SeqFeatures) behave more like 1.4. Beyond the now- removed exceptions it should be fine unless it is in an area of already-known incompatibility between BioPerl and Ensembl, some of which you've already outlined. chris On Sep 17, 2007, at 3:15 PM, Cook, Malcolm wrote: > ... > FWIW, I have just discovered that the round of bioperl changes in > service of http://www.bioperl.org/wiki/Feature_Annotation_rollback > introduce (additional?) incompatibilities between current bioperl and > the Ensembl Core API. The changes bring me to obtain and use Bioperl > version 1.2.3 for use in conjunction with Ensemble API application (as > is recommended by Ensembl). > > Until now, the ways I have used the Ensembl API appear not to have > been effected by changes in Bioperl; I have successfully used it > in conjunction with the bioperl's leading edge. Of course there > may be > other incompatibilities that I have just not noticed yet. > > Evidence of the new incompatibility is present in this back trace, > which bridges between code in current bioperl-live and current > ensembl/modules/Bio: > > -------------------- EXCEPTION -------------------- > MSG: Operator overloading of AnnotationI is deprecated > STACK Bio::Annotation::DBLink::__ANON__ > /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 > STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type > /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/ > DBEntryAdaptor.pm:77 > 8 > > > Obtaining version 1.2.3 fixes the issue for me. > > This is just a warning to others.... > > Your milage may vary.... > > -- > > Malcolm Cook > Stowers Institute for Medical Research - Kansas City, Missouri From MEC at stowers-institute.org Mon Sep 17 18:14:41 2007 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Mon, 17 Sep 2007 17:14:41 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility In-Reply-To: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> References: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> Message-ID: Chris, Removing those exceptions makes my application work with current bioperl-live again. Hooray! Thanks. But! I have been warned! Regards, Malcolm Cook Stowers Institute for Medical Research - Kansas City, Missouri > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > Sent: Monday, September 17, 2007 4:53 PM > To: Cook, Malcolm > Cc: bioperl list; ensembl-dev at ebi.ac.uk > Subject: Re: [Bioperl-l] Bioperl -- why so old? ... or ... > Feature/Annotation rollback breaks Bioperl/Ensembl compatibility > > Malcolm, > > I have removed the Bio::Annotation overloading exceptions > from bioperl-live; they're just more trouble than they're > worth right now. Could you try it out and see if that > suffices, and drop us a note if it doesn't or if you run into > other odd issues? I'll be busy until the end of the month > but I'll do the best I can to help out. > > The rollbacks were fairly simple and essentially reversed, > corrected, or simplified many changes made prior to the 1.5 > release (most of which were undocumented and not completely > implemented). They pass all current tests and should make > BioPerl classes (particularly Annotations and SeqFeatures) > behave more like 1.4. Beyond the now- removed exceptions it > should be fine unless it is in an area of already-known > incompatibility between BioPerl and Ensembl, some of which > you've already outlined. > > chris > > On Sep 17, 2007, at 3:15 PM, Cook, Malcolm wrote: > > > ... > > FWIW, I have just discovered that the round of bioperl changes in > > service of http://www.bioperl.org/wiki/Feature_Annotation_rollback > > introduce (additional?) incompatibilities between current > bioperl and > > the Ensembl Core API. The changes bring me to obtain and > use Bioperl > > version 1.2.3 for use in conjunction with Ensemble API > application (as > > is recommended by Ensembl). > > > > Until now, the ways I have used the Ensembl API appear not to have > > been effected by changes in Bioperl; I have successfully used it in > > conjunction with the bioperl's leading edge. Of course > there may be > > other incompatibilities that I have just not noticed yet. > > > > Evidence of the new incompatibility is present in this back trace, > > which bridges between code in current bioperl-live and current > > ensembl/modules/Bio: > > > > -------------------- EXCEPTION -------------------- > > MSG: Operator overloading of AnnotationI is deprecated STACK > > Bio::Annotation::DBLink::__ANON__ > > /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 > > STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type > > /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/ > > DBEntryAdaptor.pm:77 > > 8 > > > > > > Obtaining version 1.2.3 fixes the issue for me. > > > > This is just a warning to others.... > > > > Your milage may vary.... > > > > -- > > > > Malcolm Cook > > Stowers Institute for Medical Research - Kansas City, Missouri > > > From neetisomaiya at gmail.com Tue Sep 18 06:30:34 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 16:00:34 +0530 Subject: [Bioperl-l] A perl regex query Message-ID: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Hi, This isnt really a bioperl query. But does anyone know how I can substitute all special characters (+ some other things) in a string with nothing in perl? I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. -- -Neeti Even my blood says, B positive From spiros at lokku.com Tue Sep 18 06:57:18 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 11:57:18 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: Heya, seperate the items you want to remove by a pipe and add the g regex flag. For example: spiros$ echo Cyclic-2,3-bisphospho-D-glycerate | perl -ne ' $_ =~ s at -D-|Cyclic\-@@g ; print $_ ;' 2,3-bisphosphoglycerate IMHO this is ugly. Best to make an array of all the elements you want to remove and then iterate through the array, calling the regex each time with a different element. This way it will be much more easy to read, debug and maintain. For example my $ra_bad_terms = [ '-D-', 'Cyclic-' ] ; foreach (@$ra_bad_terms) { $string =~ s@$_@@g ; } etc. Dont forget escaping and \Q \E if needed. Spiros On 9/18/07, neeti somaiya wrote: > Hi, > > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > -- > -Neeti > Even my blood says, B positive > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Tue Sep 18 07:44:14 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 18 Sep 2007 12:44:14 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <46EFBA0E.4030104@sheffield.ac.uk> An even better way is to use the array as Spiros suggested, but you should then be able to use that in the regex like this: my @ra_bad_terms = ( '-D-', 'Cyclic-' ); $string =~ s/@ra_bad_terms//g; Again you might need escaping with \Q and \E - can't remember off hand. You might also want to look here: http://www.perl.com/pub/a/2002/06/04/apo5.html?page=15 Cheers Nath Spiros Denaxas wrote: > Heya, seperate the items you want to remove by a pipe and add the g > regex flag. For example: > > spiros$ echo Cyclic-2,3-bisphospho-D-glycerate | perl -ne ' $_ =~ > s at -D-|Cyclic\-@@g ; print $_ ;' > 2,3-bisphosphoglycerate > > IMHO this is ugly. Best to make an array of all the elements you want > to remove and then iterate through the array, calling the regex each > time with a different element. This way it will be much more easy to > read, debug and maintain. > > For example > > my $ra_bad_terms = [ '-D-', 'Cyclic-' ] ; > > foreach (@$ra_bad_terms) { > $string =~ s@$_@@g ; > } > > etc. > > Dont forget escaping and \Q \E if needed. > > Spiros > > > > On 9/18/07, neeti somaiya wrote: > >> Hi, >> >> This isnt really a bioperl query. >> But does anyone know how I can substitute all special characters (+ some >> other things) in a string with nothing in perl? >> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want >> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. >> >> -- >> -Neeti >> Even my blood says, B positive >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From neetisomaiya at gmail.com Tue Sep 18 08:13:42 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 17:43:42 +0530 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFBE8A.6080402@cam.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> Message-ID: <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Thanks. It might work, but not always, because the string could be somthing like Cyclic-2,3-Bisphospho-D-Glycerate. Here I will first convert the full thing to a lower case and would then try to get what I want. Nothing seems to work, when I try to substitute -D- with nothing, "D" and "-" when occuring separately also get substituted with nothing. On 9/18/07, Roy Chaudhuri wrote: > > > This isnt really a bioperl query. > > But does anyone know how I can substitute all special characters (+ some > > other things) in a string with nothing in perl? > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I > want > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > > > A more general approach that might work is to keep lower case words (I > don't know if that will be true for all your cases): > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > print join '', /\b[a-z]+\b/g; > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > -- -Neeti Even my blood says, B positive From ak at ebi.ac.uk Tue Sep 18 08:20:32 2007 From: ak at ebi.ac.uk (Andreas Kahari) Date: Tue, 18 Sep 2007 13:20:32 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <20070918122032.GV14066@ebi.ac.uk> On Tue, Sep 18, 2007 at 04:00:34PM +0530, neeti somaiya wrote: > Hi, > > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. This is in additions to the suggestions you've already had. If you always want to concatenate the 3rd and 5th part of the string, as delimited by dashes, then you could do this: my $string = 'Cyclic-2,3-bisphospho-D-glycerate'; my $newstring = join( '', ( split( /-/, $string ) )[ 2, 4 ] ); Cheers, Andreas -- Andreas K?h?ri :: Ensembl Software Developer European Bioinformatics Institute (EMBL-EBI) -------------------------------------------- From spiros at lokku.com Tue Sep 18 08:23:49 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 13:23:49 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: Its not impossibe, you just have to use \b to denote the word boundaries :) echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' this-is-a_teststring-D It only gets rid of -D- , all other occurrences of D and - remain intact. Spiros On 9/18/07, neeti somaiya wrote: > Thanks. > It might work, but not always, because the string could be somthing like > Cyclic-2,3-Bisphospho-D-Glycerate. > Here I will first convert the full thing to a lower case and would then try > to get what I want. > > Nothing seems to work, when I try to substitute -D- with nothing, "D" and > "-" when occuring separately also get substituted with nothing. > > On 9/18/07, Roy Chaudhuri wrote: > > > > > This isnt really a bioperl query. > > > But does anyone know how I can substitute all special characters (+ some > > > other things) in a string with nothing in perl? > > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I > > want > > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > > > > > > A more general approach that might work is to keep lower case words (I > > don't know if that will be true for all your cases): > > > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > > print join '', /\b[a-z]+\b/g; > > > > Roy. > > -- > > Dr. Roy Chaudhuri > > Department of Veterinary Medicine > > University of Cambridge, U.K. > > > > > > -- > -Neeti > Even my blood says, B positive > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From rrc22 at cam.ac.uk Tue Sep 18 08:03:22 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 18 Sep 2007 13:03:22 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <46EFBE8A.6080402@cam.ac.uk> > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > A more general approach that might work is to keep lower case words (I don't know if that will be true for all your cases): $_='Cyclic-2,3-bisphospho-D-glycerate'; print join '', /\b[a-z]+\b/g; Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. From neetisomaiya at gmail.com Tue Sep 18 08:47:18 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 18:17:18 +0530 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> My actual problem is a bit more complicated. It is not just one string, nut lakhs of them, they are actually names of chemical compounds. THe problem is there are 2 different data sources, I need to match the compond names between them, but the problem is though the compound may be the same in the two, they use different naming formats for them. eg 1 : Glucose DB1 : D-glucose DB2 : alpha-D-Glucose eg2 : 2,3-bisphosphoglycerate DB1 : Cyclic-2,3-bisphospho-D-Glycerate DB2 : 2,3 bisphoshpglycerate And there are some simple examples, there are even more complicated ones, with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. I just want to see if the basic compond is the same, i.e. the first one will be glucose and second one will be 2,3-biphosphoglycerate (can't take just bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something else). Anyone has any suggestions how to tackle this? Thanks. On 9/18/07, Spiros Denaxas wrote: > > Its not impossibe, you just have to use \b to denote the word boundaries > :) > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > this-is-a_teststring-D > > It only gets rid of -D- , all other occurrences of D and - remain intact. > > Spiros > > > On 9/18/07, neeti somaiya wrote: > > Thanks. > > It might work, but not always, because the string could be somthing like > > Cyclic-2,3-Bisphospho-D-Glycerate. > > Here I will first convert the full thing to a lower case and would then > try > > to get what I want. > > > > Nothing seems to work, when I try to substitute -D- with nothing, "D" > and > > "-" when occuring separately also get substituted with nothing. > > > > On 9/18/07, Roy Chaudhuri wrote: > > > > > > > This isnt really a bioperl query. > > > > But does anyone know how I can substitute all special characters (+ > some > > > > other things) in a string with nothing in perl? > > > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and > I > > > want > > > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- > etc. > > > > > > > > > > A more general approach that might work is to keep lower case words (I > > > don't know if that will be true for all your cases): > > > > > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > > > print join '', /\b[a-z]+\b/g; > > > > > > Roy. > > > -- > > > Dr. Roy Chaudhuri > > > Department of Veterinary Medicine > > > University of Cambridge, U.K. > > > > > > > > > > > -- > > -Neeti > > Even my blood says, B positive > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- -Neeti Even my blood says, B positive From spiros at lokku.com Tue Sep 18 08:56:44 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 13:56:44 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFCA4E.5090605@sendu.me.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <46EFCA4E.5090605@sendu.me.uk> Message-ID: On 9/18/07, Sendu Bala wrote: > Spiros Denaxas wrote: > > Its not impossibe, you just have to use \b to denote the word boundaries :) > > > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > > > this-is-a_teststring-D > > > > It only gets rid of -D- , all other occurrences of D and - remain intact. > > I'm confused. The simpler: > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/-D-//g ; print ;' > > gives the same answer. You'd have to something very strange for a regex > on -D- to match D or - alone. > Its the same thing. He was just mixing up character classes in the regex. Spiros From bix at sendu.me.uk Tue Sep 18 08:53:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 18 Sep 2007 13:53:34 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: <46EFCA4E.5090605@sendu.me.uk> Spiros Denaxas wrote: > Its not impossibe, you just have to use \b to denote the word boundaries :) > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > this-is-a_teststring-D > > It only gets rid of -D- , all other occurrences of D and - remain intact. I'm confused. The simpler: echo 'this-is-a_test-D-string-D' | perl -ne ' s/-D-//g ; print ;' gives the same answer. You'd have to something very strange for a regex on -D- to match D or - alone. From rrc22 at cam.ac.uk Tue Sep 18 09:26:47 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 18 Sep 2007 14:26:47 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFD217.1030103@cam.ac.uk> > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may > be the same in the two, they use different naming formats for them. Unless you can define in simple and precise terms exactly which parts of the string you need then there is no way that you will be able to code a solution in Perl. Maybe you could look for a database that contains the synonyms for each molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), which is available to download as flat files. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. From js5 at sanger.ac.uk Tue Sep 18 08:58:36 2007 From: js5 at sanger.ac.uk (James Smith) Date: Tue, 18 Sep 2007 13:58:36 +0100 (BST) Subject: [Bioperl-l] A perl regex query In-Reply-To: <20070918122032.GV14066@ebi.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> Message-ID: Neeti, This isn't really a bioperl query - but I will try and explain a simple solution... warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); sub simplify { local $_ = "-$_[0]-"; ## Quick hack add -'s at start and end! as always match "-string-" s/-( Cyclic | # The prefix "cyclic" \d+ | # a single number between two "-"s \d+,\d+| # number,number between two "-"s \w # a single letter between two "-"s )(?=-)//ixg; ## case-insensitive, commented, multiple matches! ## 0-width +ve lookahead assertion - so can match ## multiple consecutive -x- constructions in same regexp! s/-//g; ## remove remaining "-"s from string... } Not sure what other test strings you may want - but most should be able to fit in the () brackets in the first regexp of simplify James On Tue, 18 Sep 2007, Andreas Kahari wrote: > On Tue, Sep 18, 2007 at 04:00:34PM +0530, neeti somaiya wrote: >> Hi, >> >> This isnt really a bioperl query. >> But does anyone know how I can substitute all special characters (+ some >> other things) in a string with nothing in perl? >> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want >> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > This is in additions to the suggestions you've already had. > > If you always want to concatenate the 3rd and 5th part of the string, as > delimited by dashes, then you could do this: > > my $string = 'Cyclic-2,3-bisphospho-D-glycerate'; > my $newstring = join( '', ( split( /-/, $string ) )[ 2, 4 ] ); > > > Cheers, > Andreas > > -- > Andreas K?h?ri :: Ensembl Software Developer > European Bioinformatics Institute (EMBL-EBI) > -------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From stefan.kirov at bms.com Tue Sep 18 09:05:16 2007 From: stefan.kirov at bms.com (Stefan Kirov) Date: Tue, 18 Sep 2007 09:05:16 -0400 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFCD0C.3010306@bms.com> neeti somaiya wrote: > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may be > the same in the two, they use different naming formats for them. > > eg 1 : Glucose > DB1 : D-glucose > DB2 : alpha-D-Glucose > > eg2 : 2,3-bisphosphoglycerate > DB1 : Cyclic-2,3-bisphospho-D-Glycerate > DB2 : 2,3 bisphoshpglycerate > It seems to me you are trying to match 2 collections of chemical compounds. If you need to do this reliably you need to use canonical smiles (perhaps there are other solutions but I am not aware of them). There are many resources for that, including open-source: http://openbabel.sourceforge.net/wiki/Main_Page It is not really bioperl's cup of tea, this is much more a chemi-informatics problem. I am not sure if there is a need for bioperl to be extended this way- any thoughts on that? Hope this helps, regards Stefan > And there are some simple examples, there are even more complicated ones, > with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. > > I just want to see if the basic compond is the same, i.e. the first one will > be glucose and second one will be 2,3-biphosphoglycerate (can't take just > bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something > else). > > Anyone has any suggestions how to tackle this? > > Thanks. > > On 9/18/07, Spiros Denaxas wrote: > >> Its not impossibe, you just have to use \b to denote the word boundaries >> :) >> >> echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' >> >> this-is-a_teststring-D >> >> It only gets rid of -D- , all other occurrences of D and - remain intact. >> >> Spiros >> >> >> On 9/18/07, neeti somaiya wrote: >> >>> Thanks. >>> It might work, but not always, because the string could be somthing like >>> Cyclic-2,3-Bisphospho-D-Glycerate. >>> Here I will first convert the full thing to a lower case and would then >>> >> try >> >>> to get what I want. >>> >>> Nothing seems to work, when I try to substitute -D- with nothing, "D" >>> >> and >> >>> "-" when occuring separately also get substituted with nothing. >>> >>> On 9/18/07, Roy Chaudhuri wrote: >>> >>>>> This isnt really a bioperl query. >>>>> But does anyone know how I can substitute all special characters (+ >>>>> >> some >> >>>>> other things) in a string with nothing in perl? >>>>> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and >>>>> >> I >> >>>> want >>>> >>>>> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- >>>>> >> etc. >> >>>> A more general approach that might work is to keep lower case words (I >>>> don't know if that will be true for all your cases): >>>> >>>> $_='Cyclic-2,3-bisphospho-D-glycerate'; >>>> print join '', /\b[a-z]+\b/g; >>>> >>>> Roy. >>>> -- >>>> Dr. Roy Chaudhuri >>>> Department of Veterinary Medicine >>>> University of Cambridge, U.K. >>>> >>>> >>> >>> -- >>> -Neeti >>> Even my blood says, B positive >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> > > > > From stephane.teletchea at jouy.inra.fr Tue Sep 18 09:48:05 2007 From: stephane.teletchea at jouy.inra.fr (=?ISO-8859-1?Q?St=E9phane_T=E9letch=E9a?=) Date: Tue, 18 Sep 2007 15:48:05 +0200 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFD715.4060308@jouy.inra.fr> neeti somaiya a ?crit : > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may be > the same in the two, they use different naming formats for them. > > eg 1 : Glucose > DB1 : D-glucose > DB2 : alpha-D-Glucose > > eg2 : 2,3-bisphosphoglycerate > DB1 : Cyclic-2,3-bisphospho-D-Glycerate > DB2 : 2,3 bisphoshpglycerate > > And there are some simple examples, there are even more complicated ones, > with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. > > I just want to see if the basic compond is the same, i.e. the first one will > be glucose and second one will be 2,3-biphosphoglycerate (can't take just > bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something > else). > > Anyone has any suggestions how to tackle this? > I would use a two step approach : 1 - filter the entries, use a convention, for instance translata all '+' into their 'plus' literal equivalent, change spaces by '_', change all '-' for '_' also, etc 2 - try matching the result, if the match does not work, try to match some characters (for instance, try to remove all non alphabetical characters and see if the resulting produces a match). That's theory, now, you have some time for errors and trials, but i think there is not essay, one shot solution, neither a bioperl facility for handling (bio)chemical compounds. Cheers, St?phane -- St?phane T?letch?a, PhD. http://www.steletch.org Unit? Math?matique Informatique et G?nome http://migale.jouy.inra.fr/mig INRA, Domaine de Vilvert T?l : (33) 134 652 891 78352 Jouy-en-Josas cedex, France Fax : (33) 134 652 901 From puetz at mpipsykl.mpg.de Tue Sep 18 10:12:47 2007 From: puetz at mpipsykl.mpg.de (Benno Puetz) Date: Tue, 18 Sep 2007 16:12:47 +0200 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> Message-ID: <46EFDCDF.6030309@mpipsykl.mpg.de> James Smith wrote: > > Neeti, > > This isn't really a bioperl query - but I will try and explain a simple > solution... > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > sub simplify { > local $_ = "-$_[0]-"; > ## Quick hack add -'s at start and end! as always match > "-string-" > s/-( > Cyclic | # The prefix "cyclic" > \d+ | # a single number between two "-"s > \d+,\d+| # number,number between two "-"s > \w # a single letter between two "-"s > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > ## 0-width +ve lookahead assertion - so can match > ## multiple consecutive -x- constructions in same regexp! > s/-//g; > ## remove remaining "-"s from string... > } > > Not sure what other test strings you may want - but most should be > able to > fit in the () brackets in the first regexp of simplify > > James Along the same line # some test for most of the removals below my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; my @ra_bad_terms = ( '-?(D|R|S)-', '-?([aA]lpha|[bB]eta|[gG]amma)-', '-?([cC]is|[tT]rans)-', '-?[cC]yclic-', # '-?\d+(,\d+)+-', # uncomment to remove numbers, too '(?//g; print lc($string),"\n"; -- Benno P?tz Statistische Genetik Max-Planck-Institut f. Psychiatrie Tel.: +49-89-30622-222 Kraepelinstr. 10 Fax : +49-89-30622-601 80804 M?nchen, Germany From spiros at lokku.com Tue Sep 18 10:41:20 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 15:41:20 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFDCDF.6030309@mpipsykl.mpg.de> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: On 9/18/07, Benno Puetz wrote: > James Smith wrote: > > > > Neeti, > > > > This isn't really a bioperl query - but I will try and explain a simple > > solution... > > > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > > > sub simplify { > > local $_ = "-$_[0]-"; > > ## Quick hack add -'s at start and end! as always match > > "-string-" > > s/-( > > Cyclic | # The prefix "cyclic" > > \d+ | # a single number between two "-"s > > \d+,\d+| # number,number between two "-"s > > \w # a single letter between two "-"s > > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > > ## 0-width +ve lookahead assertion - so can match > > ## multiple consecutive -x- constructions in same regexp! > > s/-//g; > > ## remove remaining "-"s from string... > > } > > > > Not sure what other test strings you may want - but most should be > > able to > > fit in the () brackets in the first regexp of simplify > > > > James > Along the same line > > # some test for most of the removals below > my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; > my @ra_bad_terms = ( '-?(D|R|S)-', > '-?([aA]lpha|[bB]eta|[gG]amma)-', > '-?([cC]is|[tT]rans)-', > '-?[cC]yclic-', > # '-?\d+(,\d+)+-', # uncomment to remove numbers, too > '(? print "$string\n"; > foreach ( @ra_bad_terms ){ > > eval { $string =~ s/$_//g; }; > print "$_:$string\n"; # for feedback only > } > #$string =~ s/<@ra_bad_terms>//g; > > print lc($string),"\n"; > > > -- > Benno P?tz My humble opinion would be to avoid using regular expressions to do your task and try and locate a more valid and centralized information repository to use, be it a database of synonyms or some other indexing code. This will add the required domain knowledge in your solution. Using regular expressions will almost certainly lead to problems and bugs which will be very hard to resolve. Should you decide to go forward and treat everything simply as strings and compare them, I feel this is more of an NLP problem. Spiros From cjfields at uiuc.edu Tue Sep 18 11:24:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 10:24:52 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFD217.1030103@cam.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFD217.1030103@cam.ac.uk> Message-ID: <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> On Sep 18, 2007, at 8:26 AM, Roy Chaudhuri wrote: >> My actual problem is a bit more complicated. >> It is not just one string, nut lakhs of them, they are actually >> names of >> chemical compounds. >> >> THe problem is there are 2 different data sources, I need to match >> the >> compond names between them, but the problem is though the compound >> may >> be the same in the two, they use different naming formats for them. > > Unless you can define in simple and precise terms exactly which > parts of > the string you need then there is no way that you will be able to > code a > solution in Perl. > > Maybe you could look for a database that contains the synonyms for > each > molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), > which > is available to download as flat files. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. D'oh! Roy beat me to it; that's what I was going to suggest. I agree; don't trust simple word munging to always get you the correct answer in this case, it's just too complicated to try and catch every case. ChEBI is a good choice; Stefan's suggestion of OpenBabel is also a good one. I would also try not to reinvent the wheel; there may be some modules available via CPAN which do what you need, such as these: http://search.cpan.org/search?query=chem&mode=module or this: http://search.cpan.org/~ghutchis/Chemistry-OpenBabel-1.2.0/ chris From shameer at ncbs.res.in Tue Sep 18 10:57:55 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Tue, 18 Sep 2007 20:27:55 +0530 (IST) Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFDCDF.6030309@mpipsykl.mpg.de> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: <53713.192.168.1.1.1190127475.squirrel@mail.ncbs.res.in> I used this module for my simple chemoinformatics tasks, http://www.perlmol.org/ - PerlMol - Perl Modules for Molecular Chemistry Please explore, you may find something useful. -- > James Smith wrote: >> >> Neeti, >> >> This isn't really a bioperl query - but I will try and explain a simple >> solution... >> >> warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); >> >> sub simplify { >> local $_ = "-$_[0]-"; >> ## Quick hack add -'s at start and end! as always match >> "-string-" >> s/-( >> Cyclic | # The prefix "cyclic" >> \d+ | # a single number between two "-"s >> \d+,\d+| # number,number between two "-"s >> \w # a single letter between two "-"s >> )(?=-)//ixg; ## case-insensitive, commented, multiple matches! >> ## 0-width +ve lookahead assertion - so can match >> ## multiple consecutive -x- constructions in same regexp! >> s/-//g; >> ## remove remaining "-"s from string... >> } >> >> Not sure what other test strings you may want - but most should be >> able to >> fit in the () brackets in the first regexp of simplify >> >> James > Along the same line > > # some test for most of the removals below > my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; > my @ra_bad_terms = ( '-?(D|R|S)-', > '-?([aA]lpha|[bB]eta|[gG]amma)-', > '-?([cC]is|[tT]rans)-', > '-?[cC]yclic-', > # '-?\d+(,\d+)+-', # uncomment to remove numbers, > too > '(? print "$string\n"; > foreach ( @ra_bad_terms ){ > > eval { $string =~ s/$_//g; }; > print "$_:$string\n"; # for feedback only > } > #$string =~ s/<@ra_bad_terms>//g; > > print lc($string),"\n"; > > > -- > Benno P?tz > Statistische Genetik > Max-Planck-Institut f. Psychiatrie Tel.: +49-89-30622-222 > Kraepelinstr. 10 Fax : +49-89-30622-601 > 80804 M?nchen, Germany > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From js5 at sanger.ac.uk Tue Sep 18 11:37:57 2007 From: js5 at sanger.ac.uk (James Smith) Date: Tue, 18 Sep 2007 16:37:57 +0100 (BST) Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: On Tue, 18 Sep 2007, Spiros Denaxas wrote: > On 9/18/07, Benno Puetz wrote: > James Smith wrote: > > > > Neeti, > > > > This isn't really a bioperl query - but I will try and explain a simple > > solution... > > > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > > > sub simplify { > > local $_ = "-$_[0]-"; > > ## Quick hack add -'s at start and end! as always match > > "-string-" > > s/-( > > Cyclic | # The prefix "cyclic" > > \d+ | # a single number between two "-"s > > \d+,\d+| # number,number between two "-"s > > \w # a single letter between two "-"s > > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > > ## 0-width +ve lookahead assertion - so can match > > ## multiple consecutive -x- constructions in same regexp! > > s/-//g; > > ## remove remaining "-"s from string... > > } > > > > Not sure what other test strings you may want - but most should be > > able to > > fit in the () brackets in the first regexp of simplify > > > > James > Along the same line > But the point is you don't need to loop over things.... Updated regexp... sub simplify { local $_ = "-$_[0]-"; # Add '-' at start and end! s{-( [cC]yclic | # The prefix "cyclic" [aA]lpha | [bB]eta | [gG]amma | # Alpha/beta/gamma [tT]rans | [cC]is | # Trans/cis [DRS] | # Single letter "D","R" or "S" # \d+(,\d+)* | # list of 1 or more "," separated nos )(?=-)}{}xg; # No. list currently commented out! s/-//g; # remove all "-" s/([^\d,])([\d,])/\1-\2/g; # re-introduce "-" between number/ s/([\d,])([^\d,])/\1-\2/g; # comma and letters s/--/-/g; # remove duplicate "-" signs.. return $_; } -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From cjfields at uiuc.edu Tue Sep 18 11:38:04 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 10:38:04 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFCD0C.3010306@bms.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFCD0C.3010306@bms.com> Message-ID: <68EC5D58-9D84-4692-BD99-F53FC75FD0E7@uiuc.edu> On Sep 18, 2007, at 8:05 AM, Stefan Kirov wrote: > neeti somaiya wrote: >> ... > It seems to me you are trying to match 2 collections of chemical > compounds. If you need to do this reliably you need to use canonical > smiles (perhaps there are other solutions but I am not aware of them). > There are many resources for that, including open-source: > http://openbabel.sourceforge.net/wiki/Main_Page > It is not really bioperl's cup of tea, this is much more a > chemi-informatics problem. I am not sure if there is a need for > bioperl > to be extended this way- any thoughts on that? > Hope this helps, regards > Stefan I would vote nyet myself unless I was convinced that this would be beneficial to bioperl core. Right now I'm not yet there, primarily b/ c of the already available OpenBabel (with available CPAN interface) and other resources, not to mention there are too many areas in bioperl which need more focus (tests, documentation, etc). However, if we do want to incorporate chemi-informatics at some point it could be something which is not integrated into the core architecture and can be installed separately (like network, ext, db, etc). chris From bioperl-list at superfrink.net Tue Sep 18 12:16:48 2007 From: bioperl-list at superfrink.net (bioperl-list at superfrink.net) Date: Tue, 18 Sep 2007 10:16:48 -0600 (MDT) Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFBA0E.4030104@sheffield.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: On Tue, 18 Sep 2007, Nathan Haigh wrote: > An even better way is to use the array as Spiros suggested, but you > should then be able to use that in the regex like this: > > my @ra_bad_terms = ( '-D-', 'Cyclic-' ); > $string =~ s/@ra_bad_terms//g; I didn't know one could do that. I couldn't get it to work so I asked around. In case anyone else read it and thought about using that code it might only work in Perl 6. Regards, Chad From bix at sendu.me.uk Tue Sep 18 13:21:06 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 18 Sep 2007 18:21:06 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: <46F00902.9070401@sendu.me.uk> bioperl-list at superfrink.net wrote: > On Tue, 18 Sep 2007, Nathan Haigh wrote: > >> An even better way is to use the array as Spiros suggested, but you >> should then be able to use that in the regex like this: >> >> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >> $string =~ s/@ra_bad_terms//g; > > I didn't know one could do that. I couldn't get it to work so I asked > around. In case anyone else read it and thought about using that code it > might only work in Perl 6. I assumed it was a typo. You can get it to work by adding $" = '|'; before the regex; From cjfields at uiuc.edu Tue Sep 18 13:52:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 12:52:00 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: On Sep 18, 2007, at 11:16 AM, bioperl-list at superfrink.net wrote: > On Tue, 18 Sep 2007, Nathan Haigh wrote: > >> An even better way is to use the array as Spiros suggested, but you >> should then be able to use that in the regex like this: >> >> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >> $string =~ s/@ra_bad_terms//g; > > I didn't know one could do that. I couldn't get it to work so I asked > around. In case anyone else read it and thought about using that > code it > might only work in Perl 6. > > Regards, > Chad I think the problem is what s/@terms//g means. To most it means group substitutions, which you can get by using s/(?:a|b|c|d)//g; to others it means stepwise 's/$old//g for $old (@terms)'. To go from an array of terms to an optimized group regex, use Regexp::List (CPAN to the rescue!): http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/ List.pm chris From cjfields at uiuc.edu Tue Sep 18 14:23:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 13:23:37 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46F00902.9070401@sendu.me.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> Message-ID: <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> On Sep 18, 2007, at 12:21 PM, Sendu Bala wrote: > bioperl-list at superfrink.net wrote: >> On Tue, 18 Sep 2007, Nathan Haigh wrote: >> >>> An even better way is to use the array as Spiros suggested, but you >>> should then be able to use that in the regex like this: >>> >>> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >>> $string =~ s/@ra_bad_terms//g; >> >> I didn't know one could do that. I couldn't get it to work so I >> asked >> around. In case anyone else read it and thought about using that >> code it >> might only work in Perl 6. > > I assumed it was a typo. You can get it to work by adding > > $" = '|'; > > before the regex; Ah, didn't know that one. Nice, though shouldn't it be localized? The (supposed) advantage of Regexp::List is the regex is optimized for speed; I haven't tried it out myself, so YMMV. chris From stefan.kirov at bms.com Tue Sep 18 14:54:49 2007 From: stefan.kirov at bms.com (Stefan Kirov) Date: Tue, 18 Sep 2007 14:54:49 -0400 Subject: [Bioperl-l] A perl regex query In-Reply-To: <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFD217.1030103@cam.ac.uk> <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> Message-ID: <46F01EF9.80003@bms.com> Actually, smiles can be tricky too- you can easily generate non-canonical keys, where InChi is unique (as I understand it at least). It is promoted by IUAPC: http://www.iupac.org/inchi/ and can be generated by OpenBabel. My take is that if you need to map between small molecules InChi might be the best way.. Stefan Chris Fields wrote: > On Sep 18, 2007, at 8:26 AM, Roy Chaudhuri wrote: > > >>> My actual problem is a bit more complicated. >>> It is not just one string, nut lakhs of them, they are actually >>> names of >>> chemical compounds. >>> >>> THe problem is there are 2 different data sources, I need to match >>> the >>> compond names between them, but the problem is though the compound >>> may >>> be the same in the two, they use different naming formats for them. >>> >> Unless you can define in simple and precise terms exactly which >> parts of >> the string you need then there is no way that you will be able to >> code a >> solution in Perl. >> >> Maybe you could look for a database that contains the synonyms for >> each >> molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), >> which >> is available to download as flat files. >> >> Roy. >> -- >> Dr. Roy Chaudhuri >> Department of Veterinary Medicine >> University of Cambridge, U.K. >> > > D'oh! Roy beat me to it; that's what I was going to suggest. I > agree; don't trust simple word munging to always get you the correct > answer in this case, it's just too complicated to try and catch every > case. > > ChEBI is a good choice; Stefan's suggestion of OpenBabel is also a > good one. I would also try not to reinvent the wheel; there may be > some modules available via CPAN which do what you need, such as these: > > http://search.cpan.org/search?query=chem&mode=module > > or this: > > http://search.cpan.org/~ghutchis/Chemistry-OpenBabel-1.2.0/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From jason at bioperl.org Tue Sep 18 20:04:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 18 Sep 2007 17:04:05 -0700 Subject: [Bioperl-l] bioperl + GFF3 audit Message-ID: Something to throw out there for discussion with GFF3 gurus. Maybe we can have a little STATE-OF-GFF3 and compliance at the GMOD workshop after Genome Informatics in Nov? I propose after we get the next stable release out we consider doing a systematic code audit to insure that we can really generate proper GFF3 compliant data from all of our parsers. This would include both good ID/Parent as well as . I'd be happy to also think about making sure we can generate proper GTF/GFF2.5 - whether this means we have a translator that works on these objects or we have to code this into the parser software that creating the sequence features, not sure. The whole Bio::Tools mishmash is a little unsettling when trying to generate standardized output. I'm not really clear if Bio::FeatureIO actually tries to do this properly, but 'gene_id'/'transcript_id' for GTF and ID/Parent 3-level Features for gene->transcript->exon/CDS doesn't really come out properly and I end up writing workarounds on the downstream data. One aspect that is biting is the flat versus multi-level features (genes -> transcripts -> exons) and how we handle them. I think this ought to get fleshed out better so we can really support . A lot of the Bio::Tools parsers are generally pretty laissez fair here about things and we have a variety of non-standard and non-compliant aspects. For example, I am playing with tRNA parsing and I assume that proper GFF3 here is three levels of : gene -> tRNA -> exon with those being the primary_tag names that correspond to the Sequence Ontology. I have modified the code locally to report generic features but which have sub-features that must be extracted. In addition the ID/Parent fields are explicitly filled in and I wonder if we want to do a better job insuring these are meaningfully entered? So if there are interested people out there we can try and hammer out a todo list on the wiki and see if we're generating proper GFF3 in the first place and trying to make sure all the features that get fed out to Bio::FeatureIO or Bio::Tools::GFF can get properly transformed into GFF3 and GTF output. Comments/Volunteers? -jason -- Jason Stajich jason at bioperl.org From cjfields at uiuc.edu Tue Sep 18 23:37:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 22:37:30 -0500 Subject: [Bioperl-l] bioperl + GFF3 audit In-Reply-To: References: Message-ID: On Sep 18, 2007, at 7:04 PM, Jason Stajich wrote: > Something to throw out there for discussion with GFF3 gurus. Maybe > we can have a little STATE-OF-GFF3 and compliance at the GMOD > workshop after Genome Informatics in Nov? > > I propose after we get the next stable release out we consider doing > a systematic code audit to insure that we can really generate proper > GFF3 compliant data from all of our parsers. This would include both > good ID/Parent as well as . I'd be happy to also think about making > sure we can generate proper GTF/GFF2.5 - whether this means we have a > translator that works on these objects or we have to code this into > the parser software that creating the sequence features, not sure. > The whole Bio::Tools mishmash is a little unsettling when trying to > generate standardized output. I'm not really clear if Bio::FeatureIO > actually tries to do this properly, but 'gene_id'/'transcript_id' for > GTF and ID/Parent 3-level Features for gene->transcript->exon/CDS > doesn't really come out properly and I end up writing workarounds on > the downstream data. This suggests we should try to get a stable out fairly quickly and work on the next dev straight away. I'm okay with that, though it would be nice to finish up a few loose ends first, the svn move foremost. The Feature/Annotation stuff has been pretty much rolled back so maybe a stable release can be done fairly quickly. My main concern was that any rollback would break FeatureIO or SF::Annotated, but so far FeatureIO and SF::Annotated both pass tests. However, I think both also need better documentation and possibly more/better test coverage. > One aspect that is biting is the flat versus multi-level features > (genes -> transcripts -> exons) and how we handle them. I think this > ought to get fleshed out better so we can really support . A lot of > the Bio::Tools parsers are generally pretty laissez fair here about > things and we have a variety of non-standard and non-compliant > aspects. Agreed. > For example, I am playing with tRNA parsing and I assume that proper > GFF3 here is three levels of : > gene -> tRNA -> exon > with those being the primary_tag names that correspond to the > Sequence Ontology. > > I have modified the code locally to report generic features but which > have sub-features that must be extracted. In addition the ID/Parent > fields are explicitly filled in and I wonder if we want to do a > better job insuring these are meaningfully entered? Would a factory approach work here? For instance, have a Factory which generates the SeqFeature type you want on the fly if passed appropriate parameters and location, say flattened vs unflattened, strictly typed vs lightweight, etc. For that matter, maybe we could reimplement FTHelper in SeqIO to do the same... > So if there are interested people out there we can try and hammer out > a todo list on the wiki and see if we're generating proper GFF3 in > the first place and trying to make sure all the features that get fed > out to Bio::FeatureIO or Bio::Tools::GFF can get properly transformed > into GFF3 and GTF output. > > Comments/Volunteers? > > -jason > > -- > Jason Stajich > jason at bioperl.org I'll be busy 'til mid-Oct but I'll chip in. I'll keep tabs on the wiki. chris From harryzs1981 at yahoo.com.cn Fri Sep 21 10:40:55 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Fri, 21 Sep 2007 22:40:55 +0800 (CST) Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form Message-ID: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Dear all: I have got a set of DAN sequences in FASTA form as following: >gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................................. >gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 /gi=1113120 /ug=Bt.5 /len=405 TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC................................... >gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA................................... I would like to know how to extract CDS sequences from them? Or a Perl program? Thank you for your reply and help. Sincerely yours, Best wishes, Harry --------------------------------- @yahoo.cn ?????????????????????????? From bix at sendu.me.uk Fri Sep 21 11:28:30 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 21 Sep 2007 16:28:30 +0100 Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form In-Reply-To: <815154.22486.qm@web15901.mail.cnb.yahoo.com> References: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Message-ID: <46F3E31E.2010606@sendu.me.uk> sheng zhao wrote: > >gnl|UG|Bt#S37443275 [snip] /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGC [snip] > I would like to know how to extract CDS sequences from them? Or a Perl program? Where did you get the fasta sequences from? It would be easiest to go to the source that originally generated them and get it to give you the CDS coordinates as well. Failing that you can get them from the NCBI database using the gb or gi ids. Someone else will be along to give you the Bioperl code to do that, I'm sure :) From harryzs1981 at yahoo.com.cn Fri Sep 21 12:30:45 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Sat, 22 Sep 2007 00:30:45 +0800 (CST) Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form Message-ID: <352861.19969.qm@web15909.mail.cnb.yahoo.com> Dear sir: Thank you for your relp. I got these sequences from NCBI(ftp.ncbi.nlm.nih.gov/repository/UniGene/Bos_taurus/Bt.seq.uniq.gz). Would you mind to tell me how to get gb or gi ids from this form? Thank you again. Best wishes. Harry Sendu Bala ?????? sheng zhao wrote: > >gnl|UG|Bt#S37443275 [snip] /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGC [snip] > I would like to know how to extract CDS sequences from them? Or a Perl program? Where did you get the fasta sequences from? It would be easiest to go to the source that originally generated them and get it to give you the CDS coordinates as well. Failing that you can get them from the NCBI database using the gb or gi ids. Someone else will be along to give you the Bioperl code to do that, I'm sure :) --------------------------------- ???????????????????? From harryzs1981 at yahoo.com.cn Fri Sep 21 12:32:21 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Sat, 22 Sep 2007 00:32:21 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20Help=20for=20extracti?= =?gb2312?q?ng=20CDS=20sequences=20from=20FASTA=20form?= In-Reply-To: Message-ID: <548016.71302.qm@web15914.mail.cnb.yahoo.com> Dear Brian O.: Thank you for your reple. I just want to get CDS sequences (DNA sequence) from them according to the CDS information in the title for each sequence. For example, for the first sequence, the information is "complete cds /cds=p(18,452)" . Thank you again! Best wishes! Harry Brian Osborne ?????? Harry, Do you mean find an ORF starting at the first initiation codon and translate that? Or some other approach? Take a look at the translate() method: http://www.bioperl.org/wiki/Bptutorial.pl#Translating Brian O. On 9/21/07 10:40 AM, "sheng zhao" wrote: > Dear all: > I have got a set of DAN sequences in FASTA form as following: > >> gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 >> IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 >> /ug=Bt.3 /len=572 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................ >> gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 >> /gi=1113120 /ug=Bt.5 /len=405 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................. >> gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library >> Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 > > AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA.................. > ................. > > I would like to know how to extract CDS sequences from them? Or a Perl > program? > > Thank you for your reply and help. > > Sincerely yours, > Best wishes, > Harry > > > > > > --------------------------------- > @yahoo.cn ?????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l --------------------------------- ???????????????????? From bosborne11 at verizon.net Fri Sep 21 11:53:46 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 21 Sep 2007 11:53:46 -0400 Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form In-Reply-To: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Message-ID: Harry, Do you mean find an ORF starting at the first initiation codon and translate that? Or some other approach? Take a look at the translate() method: http://www.bioperl.org/wiki/Bptutorial.pl#Translating Brian O. On 9/21/07 10:40 AM, "sheng zhao" wrote: > Dear all: > I have got a set of DAN sequences in FASTA form as following: > >> gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 >> IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 >> /ug=Bt.3 /len=572 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................ >> gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 >> /gi=1113120 /ug=Bt.5 /len=405 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................. >> gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library >> Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 > > AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA.................. > ................. > > I would like to know how to extract CDS sequences from them? Or a Perl > program? > > Thank you for your reply and help. > > Sincerely yours, > Best wishes, > Harry > > > > > > --------------------------------- > @yahoo.cn ?????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sat Sep 22 01:02:28 2007 From: florent.angly at gmail.com (Florent Angly) Date: Fri, 21 Sep 2007 22:02:28 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> Message-ID: <46F4A1E4.7020002@gmail.com> Hi, I need to quantify how good some overlaps in contigs are. I have extracted the alignment of the overlapping region and only need to score it. I noticed the Bio::Tools::dpAlign has a scoring function. Is it the right tool for the right tool? Is there anything else? Thank you, Florent From florent.angly at gmail.com Sat Sep 22 21:41:40 2007 From: florent.angly at gmail.com (Florent Angly) Date: Sat, 22 Sep 2007 18:41:40 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F4A1E4.7020002@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> Message-ID: <46F5C454.3050005@gmail.com> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it when installing its dependency, bioperl-ext v1.4 or v1.5.1. However all the tests passed when installing the CVS version. So finally, here I am trying to score my alignments. For alignments of 2 small sequences, it works, but as soon as the sequences get bigger than a few dozen nucleotides, it crashes: Segmentation fault (core dumped) I did not find any help in the documentation... Can I fix this? Is this a bug? Thanks for your help, Florent Florent Angly wrote: > Hi, > I need to quantify how good some overlaps in contigs are. I have > extracted the alignment of the overlapping region and only need to > score it. I noticed the Bio::Tools::dpAlign has a scoring function. > Is it the right tool for the right tool? Is there anything else? > Thank you, > Florent > From bioperl-list at superfrink.net Sun Sep 23 12:21:44 2007 From: bioperl-list at superfrink.net (Chad Clark) Date: Sun, 23 Sep 2007 10:21:44 -0600 (MDT) Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F5C454.3050005@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> Message-ID: On Sat, 22 Sep 2007, Florent Angly wrote: > So finally, here I am trying to score my alignments. For alignments of 2 > small sequences, it works, but as soon as the sequences get bigger than > a few dozen nucleotides, it crashes: > Segmentation fault (core dumped) In my experience if a program runs on small sets of data and segfaults on larger sets it is likely running out of stack space. You can try changing the allowed stack size with "ulimit" before running your program and see if it works with more data. [chad at water ~]$ ulimit -a | grep -i stack stack size (kbytes, -s) 10240 [chad at water ~]$ ulimit -s 40960 [chad at water ~]$ ulimit -a | grep -i stack stack size (kbytes, -s) 40960 I don't know the code / algorithm in question but it might require significantly more stack space as the data set grows in size so this change might not help enough. Good luck, Chad From bix at sendu.me.uk Mon Sep 24 05:35:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 24 Sep 2007 10:35:39 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? Message-ID: <46F784EB.9050507@sendu.me.uk> Hi, I'm finding that when writing GFF files the version header line gets printed out twice. This is because: sub _initialize { # [snip] if ($arg{-file} =~ /^>.*/ ) { $self->_print("##gff-version " . $self->version() . "\n"); } else { my $directive; while(($directive = $self->_readline()) && ( $directive =~ /^##/ || $directive =~ /^>/)){ $self->_handle_directive($directive); } $self->_pushback($directive); } if ($arg{-file} =~ /^>.*/ ) { $self->_print("##gff-version " . $self->version() . "\n"); } # [snip] } Does it make sense for if ($arg{-file} =~ /^>.*/ ) to appear twice like that? If not, which one should be removed? The independent one, or the if/else one? Cheers, Sendu. From awitney at sgul.ac.uk Mon Sep 24 08:40:07 2007 From: awitney at sgul.ac.uk (Adam Witney) Date: Mon, 24 Sep 2007 13:40:07 +0100 Subject: [Bioperl-l] Minor docs discrepancy? In-Reply-To: Message-ID: Hi, I was just going through the BlastHSP.pm POD and lines 23-24 say "For Bio::SearchIO BLAST parsing usage examples, see the "examples/search-blast" directory of the Bioperl distribution." However in the distribution (1.4 and 1.5.2_102) it looks to be "examples/searchio" Is this in need of updating, or am I looking in the wrong place? Thanks Adam From cjfields at uiuc.edu Mon Sep 24 09:09:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:09:46 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F784EB.9050507@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> Message-ID: <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> It looks like the first is a cut-and-paste revision of the second, so I would say the second independent if block is redundant. Should we be printing output in _initialize()? I would think any output would be handled in a write_* method of some sort and not in a common method used for initializing both input and output stream data. What happens here if you use '-fh' and want output redirected to STDOUT? chris On Sep 24, 2007, at 4:35 AM, Sendu Bala wrote: > Hi, > > I'm finding that when writing GFF files the version header line gets > printed out twice. This is because: > > sub _initialize { > # [snip] > > if ($arg{-file} =~ /^>.*/ ) { > $self->_print("##gff-version " . $self->version() . "\n"); > } > else { > my $directive; > while(($directive = $self->_readline()) && ( $directive =~ / > ^##/ || > $directive =~ /^>/)){ > $self->_handle_directive($directive); > } > $self->_pushback($directive); > } > > if ($arg{-file} =~ /^>.*/ ) { > $self->_print("##gff-version " . $self->version() . "\n"); > } > > # [snip] > } > > Does it make sense for if ($arg{-file} =~ /^>.*/ ) to appear twice > like > that? If not, which one should be removed? The independent one, or the > if/else one? > > > Cheers, > Sendu. From cjfields at uiuc.edu Mon Sep 24 09:13:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:13:53 -0500 Subject: [Bioperl-l] Minor docs discrepancy? In-Reply-To: References: Message-ID: I have updated that in CVS. Thanks for pointing that out! chris On Sep 24, 2007, at 7:40 AM, Adam Witney wrote: > > Hi, > > I was just going through the BlastHSP.pm POD and lines 23-24 say > > "For Bio::SearchIO BLAST parsing usage examples, see the > "examples/search-blast" directory of the Bioperl distribution." > > However in the distribution (1.4 and 1.5.2_102) it looks to be > "examples/searchio" > > Is this in need of updating, or am I looking in the wrong place? > > Thanks > > Adam From bix at sendu.me.uk Mon Sep 24 09:20:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 24 Sep 2007 14:20:33 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> Message-ID: <46F7B9A1.9080206@sendu.me.uk> Chris Fields wrote: > It looks like the first is a cut-and-paste revision of the second, so I > would say the second independent if block is redundant. I agree. I'll make that change. > Should we be printing output in _initialize()? I would think any output > would be handled in a write_* method of some sort and not in a common > method used for initializing both input and output stream data. What > happens here if you use '-fh' and want output redirected to STDOUT? I think the problem is that the method is write_feature(), which can be called many times for a single output file, but the version should only be printed once at the very start of the file. I suppose it just needs better capturing of when we're intending to write... Hmmm... didn't I fix a method related to that?... Yes, yes I did: Bio::Root::IO->mode ;) Any objections to me replacing the if clause with one using that method? From cjfields at uiuc.edu Mon Sep 24 09:35:22 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:35:22 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F7B9A1.9080206@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> Message-ID: On Sep 24, 2007, at 8:20 AM, Sendu Bala wrote: > Chris Fields wrote: >> It looks like the first is a cut-and-paste revision of the second, >> so I would say the second independent if block is redundant. > > I agree. I'll make that change. > > >> Should we be printing output in _initialize()? I would think any >> output would be handled in a write_* method of some sort and not >> in a common method used for initializing both input and output >> stream data. What happens here if you use '-fh' and want output >> redirected to STDOUT? > > I think the problem is that the method is write_feature(), which > can be called many times for a single output file, but the version > should only be printed once at the very start of the file. > > I suppose it just needs better capturing of when we're intending to > write... Hmmm... didn't I fix a method related to that?... > > Yes, yes I did: > Bio::Root::IO->mode > ;) > > Any objections to me replacing the if clause with one using that > method? I think that'll work fine. The other option would be call a print_gff_header() function within write_feature() with the intent to print the header only once, using a flag or similar: if (!$self->header_printed) { $self->print_gff_header; $self->header_printed(1); } chris From hlapp at gmx.net Mon Sep 24 13:41:34 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 24 Sep 2007 13:41:34 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> Message-ID: <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> I'd lean toward this or a similar approach too. Writing stuff out in the constructor doesn't feel like the best design. -hilmar On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: > > On Sep 24, 2007, at 8:20 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> It looks like the first is a cut-and-paste revision of the second, >>> so I would say the second independent if block is redundant. >> >> I agree. I'll make that change. >> >> >>> Should we be printing output in _initialize()? I would think any >>> output would be handled in a write_* method of some sort and not >>> in a common method used for initializing both input and output >>> stream data. What happens here if you use '-fh' and want output >>> redirected to STDOUT? >> >> I think the problem is that the method is write_feature(), which >> can be called many times for a single output file, but the version >> should only be printed once at the very start of the file. >> >> I suppose it just needs better capturing of when we're intending to >> write... Hmmm... didn't I fix a method related to that?... >> >> Yes, yes I did: >> Bio::Root::IO->mode >> ;) >> >> Any objections to me replacing the if clause with one using that >> method? > > I think that'll work fine. The other option would be call a > print_gff_header() function within write_feature() with the intent to > print the header only once, using a flag or similar: > > if (!$self->header_printed) { > $self->print_gff_header; > $self->header_printed(1); > } > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Sep 24 14:11:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 13:11:47 -0500 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F5C454.3050005@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> Message-ID: <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> As Chad mentioned it could be a stack issue, but it might be worth filing a bug on. I will note that bioperl-ext has seen very little use in the last few years, so don't expect it to be fixed unless you can contact the ext module author. chris On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: > Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it > when installing its dependency, bioperl-ext v1.4 or v1.5.1. However > all > the tests passed when installing the CVS version. > So finally, here I am trying to score my alignments. For alignments > of 2 > small sequences, it works, but as soon as the sequences get bigger > than > a few dozen nucleotides, it crashes: > Segmentation fault (core dumped) > > I did not find any help in the documentation... > > Can I fix this? Is this a bug? > > Thanks for your help, > > Florent > > Florent Angly wrote: >> Hi, >> I need to quantify how good some overlaps in contigs are. I have >> extracted the alignment of the overlapping region and only need to >> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >> Is it the right tool for the right tool? Is there anything else? >> Thank you, >> Florent >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florent.angly at gmail.com Mon Sep 24 15:07:35 2007 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 24 Sep 2007 12:07:35 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> Message-ID: <46F80AF7.4090109@gmail.com> I see... Thanks for the replies Chad and Chris. Then I have two more questions! 1/ Do you know how to get a core dump that could help debug my segmentation fault? I have produced dumps of binary C programs before with gdb. I have used the Perl debugger for Perl scripts. But how to deal with C functions called by Perl? 2/ Is there an easier method to calculate an alignment score in BioPerl than using Bio::Tools::dpAlign? I didn't seem to locate something else, but who knows... I have workarounds for quantifying the quality of the overlap, so calculating a score is not critical for me (though I believe this would be the most accurate/adapted method). Florent Chris Fields wrote: > As Chad mentioned it could be a stack issue, but it might be worth > filing a bug on. I will note that bioperl-ext has seen very little > use in the last few years, so don't expect it to be fixed unless you > can contact the ext module author. > > chris > > On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: > >> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it >> when installing its dependency, bioperl-ext v1.4 or v1.5.1. However all >> the tests passed when installing the CVS version. >> So finally, here I am trying to score my alignments. For alignments of 2 >> small sequences, it works, but as soon as the sequences get bigger than >> a few dozen nucleotides, it crashes: >> Segmentation fault (core dumped) >> >> I did not find any help in the documentation... >> >> Can I fix this? Is this a bug? >> >> Thanks for your help, >> >> Florent >> >> Florent Angly wrote: >>> Hi, >>> I need to quantify how good some overlaps in contigs are. I have >>> extracted the alignment of the overlapping region and only need to >>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>> Is it the right tool for the right tool? Is there anything else? >>> Thank you, >>> Florent >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From cjfields at uiuc.edu Mon Sep 24 15:26:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 14:26:46 -0500 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F80AF7.4090109@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> <46F80AF7.4090109@gmail.com> Message-ID: <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> I suppose if you can find a way to export the contig data into a Bio::SimpleAlign you look at the methods in Bio::Align::DNAStatistics. SimpleAlign also has some builtin methods like average_percentage_identity, percentage_identity, etc, which may be worth a look. chris On Sep 24, 2007, at 2:07 PM, Florent Angly wrote: > I see... Thanks for the replies Chad and Chris. Then I have two more > questions! > 1/ Do you know how to get a core dump that could help debug my > segmentation fault? I have produced dumps of binary C programs before > with gdb. I have used the Perl debugger for Perl scripts. But how to > deal with C functions called by Perl? > 2/ Is there an easier method to calculate an alignment score in > BioPerl > than using Bio::Tools::dpAlign? I didn't seem to locate something > else, > but who knows... > I have workarounds for quantifying the quality of the overlap, so > calculating a score is not critical for me (though I believe this > would > be the most accurate/adapted method). > Florent > > Chris Fields wrote: >> As Chad mentioned it could be a stack issue, but it might be worth >> filing a bug on. I will note that bioperl-ext has seen very little >> use in the last few years, so don't expect it to be fixed unless you >> can contact the ext module author. >> >> chris >> >> On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: >> >>> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck >>> running it >>> when installing its dependency, bioperl-ext v1.4 or v1.5.1. >>> However all >>> the tests passed when installing the CVS version. >>> So finally, here I am trying to score my alignments. For >>> alignments of 2 >>> small sequences, it works, but as soon as the sequences get >>> bigger than >>> a few dozen nucleotides, it crashes: >>> Segmentation fault (core dumped) >>> >>> I did not find any help in the documentation... >>> >>> Can I fix this? Is this a bug? >>> >>> Thanks for your help, >>> >>> Florent >>> >>> Florent Angly wrote: >>>> Hi, >>>> I need to quantify how good some overlaps in contigs are. I have >>>> extracted the alignment of the overlapping region and only need to >>>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>>> Is it the right tool for the right tool? Is there anything else? >>>> Thank you, >>>> Florent >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florent.angly at gmail.com Mon Sep 24 15:46:36 2007 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 24 Sep 2007 12:46:36 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> <46F80AF7.4090109@gmail.com> <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> Message-ID: <46F8141C.8010507@gmail.com> Yes, right! That's the methods I use as my workarounds. =) Thanks for suggesting. Florent Chris Fields wrote: > I suppose if you can find a way to export the contig data into a > Bio::SimpleAlign you look at the methods in > Bio::Align::DNAStatistics. SimpleAlign also has some builtin methods > like average_percentage_identity, percentage_identity, etc, which may > be worth a look. > > chris > > On Sep 24, 2007, at 2:07 PM, Florent Angly wrote: > >> I see... Thanks for the replies Chad and Chris. Then I have two more >> questions! >> 1/ Do you know how to get a core dump that could help debug my >> segmentation fault? I have produced dumps of binary C programs before >> with gdb. I have used the Perl debugger for Perl scripts. But how to >> deal with C functions called by Perl? >> 2/ Is there an easier method to calculate an alignment score in BioPerl >> than using Bio::Tools::dpAlign? I didn't seem to locate something else, >> but who knows... >> I have workarounds for quantifying the quality of the overlap, so >> calculating a score is not critical for me (though I believe this would >> be the most accurate/adapted method). >> Florent >> >> Chris Fields wrote: >>> As Chad mentioned it could be a stack issue, but it might be worth >>> filing a bug on. I will note that bioperl-ext has seen very little >>> use in the last few years, so don't expect it to be fixed unless you >>> can contact the ext module author. >>> >>> chris >>> >>> On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: >>> >>>> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it >>>> when installing its dependency, bioperl-ext v1.4 or v1.5.1. However >>>> all >>>> the tests passed when installing the CVS version. >>>> So finally, here I am trying to score my alignments. For alignments >>>> of 2 >>>> small sequences, it works, but as soon as the sequences get bigger >>>> than >>>> a few dozen nucleotides, it crashes: >>>> Segmentation fault (core dumped) >>>> >>>> I did not find any help in the documentation... >>>> >>>> Can I fix this? Is this a bug? >>>> >>>> Thanks for your help, >>>> >>>> Florent >>>> >>>> Florent Angly wrote: >>>>> Hi, >>>>> I need to quantify how good some overlaps in contigs are. I have >>>>> extracted the alignment of the overlapping region and only need to >>>>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>>>> Is it the right tool for the right tool? Is there anything else? >>>>> Thank you, >>>>> Florent >>>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From bix at sendu.me.uk Tue Sep 25 06:00:20 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 25 Sep 2007 11:00:20 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> Message-ID: <46F8DC34.6020908@sendu.me.uk> Hilmar Lapp wrote: > On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >> I think that'll work fine. The other option would be call a >> print_gff_header() function within write_feature() with the intent to >> print the header only once, using a flag or similar: >> >> if (!$self->header_printed) { >> $self->print_gff_header; >> $self->header_printed(1); >> } > > I'd lean toward this or a similar approach too. Writing stuff out in the > constructor doesn't feel like the best design. I'd argue that the alternative is just inefficient with no compensating benefit. You have something that must only be done once, and a method (_initialize) that is only called once. The constructor is used to set up the file, getting it into a state ready to add features. This involves opening it for writing with the correct filename and setting the desired GFF version. Why wouldn't it also output what ever else was necessary it initialize the file? Also, what do we expect should happen when we use Bioperl to create a GFF file and don't write any features to it? Should it be an empty file, or should it contain whatever GFF information the user had managed to supply (the version)? From cjfields at uiuc.edu Tue Sep 25 10:14:04 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 25 Sep 2007 09:14:04 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F8DC34.6020908@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> Message-ID: <22FB7AE5-2E1C-450C-A48C-6014CC5EB786@uiuc.edu> On Sep 25, 2007, at 5:00 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>> I think that'll work fine. The other option would be call a >>> print_gff_header() function within write_feature() with the >>> intent to >>> print the header only once, using a flag or similar: >>> >>> if (!$self->header_printed) { >>> $self->print_gff_header; >>> $self->header_printed(1); >>> } >> >> I'd lean toward this or a similar approach too. Writing stuff out >> in the >> constructor doesn't feel like the best design. > > I'd argue that the alternative is just inefficient with no > compensating > benefit. You have something that must only be done once, and a method > (_initialize) that is only called once. The constructor is used to set > up the file, getting it into a state ready to add features. This > involves opening it for writing with the correct filename and setting > the desired GFF version. Why wouldn't it also output what ever else > was > necessary it initialize the file? It's great to have someone picking this up, so anything that works is fine by me, to tell the truth. I'll state my piece, though, and stand out of the way. In my opinion there are a couple of compensating benefits. One is long-term maintenance, primarily being all calls to generate output are contained within the write_features method and are thus easier to find and more maintainable. The logic goes, if there were a bug I would expect to find output in a write_* method or a method called from within write_* method, not in the constructor or something called from the constructor, like _initialize(). I've always been told the constructor in any OO language is typically limited to setting state data and behavior, not generating new data (i.e. generating output). Related to that, the other benefit is expected behavior when calling a method. I don't know of cases in other IO classes in Bioperl which generate output when a new() instance is created; output is expected specifically when calling a particular write_* method. Therefore I wouldn't expect any output be generated until write_feature() were called (or calling similarly named methods where output would be expected, like a print_* method). Saying all that, I'm probably not the best one to bang the 'best practices' drum right now as I haven't had time to finish up several modules I've been working on! Speaking of (going back to work...) > Also, what do we expect should happen when we use Bioperl to create a > GFF file and don't write any features to it? Should it be an empty > file, > or should it contain whatever GFF information the user had managed to > supply (the version)? As mentioned above, I would expect no output generated at all unless explicitly calling write_features(), just like any of the other IOs; the header info would be generated then. BioPerl has traditionally been for whatever works though, which is fine by me. chris From forrest_zhang at 163.com Thu Sep 27 03:41:44 2007 From: forrest_zhang at 163.com (Forrest) Date: Thu, 27 Sep 2007 15:41:44 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error Message-ID: <000501c800d9$dc9c8e90$95d5abb0$@com> Hi, all I install the biosql, and bioperl-db. I want to import swissport data. But the programe show some error as below: ============================================================================ =============================================== >perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat Loading /home/forrest/uniprot/uniprot_sprot.dat ... Could not store Q6DAH5: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium | Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | Proteobacteria | Bacteria') STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 STACK: Bio::Species::classification /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:552 STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: load_seqdatabase.pl:620 ----------------------------------------------------------- at load_seqdatabase.pl line 633 ============================================================================ =============================================== How can I solve it, please help me, Thank you. Thanks Forrest zhang From bix at sendu.me.uk Thu Sep 27 04:38:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 27 Sep 2007 09:38:00 +0100 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000501c800d9$dc9c8e90$95d5abb0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> Message-ID: <46FB6BE8.9050203@sendu.me.uk> Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import swissport data. > But the programe show some error as below: > ============================================================================ > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') From: OS Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum). OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; OC Enterobacteriaceae; Pectobacterium. I'm guessing some oddity in the Swissprot parser where in one place it truncates the OS to the first '.', and in another it doesn't? Can someone confirm this with CVS versions of bioperl-live and -db in case Chris already fixed it in recent parser changes? From cjfields at uiuc.edu Thu Sep 27 08:47:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 07:47:00 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <46FB6BE8.9050203@sendu.me.uk> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <46FB6BE8.9050203@sendu.me.uk> Message-ID: On Sep 27, 2007, at 3:38 AM, Sendu Bala wrote: > Forrest wrote: >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> ======= >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') > > From: > OS Erwinia carotovora subsp. atroseptica (Pectobacterium > atrosepticum). > OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; > OC Enterobacteriaceae; Pectobacterium. > > I'm guessing some oddity in the Swissprot parser where in one place it > truncates the OS to the first '.', and in another it doesn't? > > Can someone confirm this with CVS versions of bioperl-live and -db in > case Chris already fixed it in recent parser changes? It looks suspiciously like he isn't using bioperl-live code (I changed the exception to a warning a while back, and I think the '.' truncation was fixed). This is still an outstanding issue with bioperl-db which hasn't been fully fixed yet, though; we may have to move the priority up on this one. chris From bix at sendu.me.uk Thu Sep 27 09:47:06 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 27 Sep 2007 14:47:06 +0100 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question Message-ID: <46FBB45A.10505@sendu.me.uk> I want to create a Bio::SeqFeature::Annotated object where the 'type' is 'conserved_region'. I got the idea that 'conserved_region' might be ok from here: http://song.sourceforge.net/SOterm_tables.html#SO:0000330 However, this doesn't work since: ------------- EXCEPTION ------------- MSG: couldn't find a SOFA term matching type 'conserved_region'. STACK Bio::SeqFeature::Annotated::type /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/Annotated.pm:371 [snip] I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA terms from: http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition I don't know much about this area. Can someone offer a little guidance as to what the significance of these two different files is, why they don't contain the same terms, and why I can't use 'conserved_region'? What's the closest alternative term? From cain.cshl at gmail.com Thu Sep 27 10:20:50 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 27 Sep 2007 10:20:50 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <46FBB45A.10505@sendu.me.uk> References: <46FBB45A.10505@sendu.me.uk> Message-ID: <1190902850.12078.26.camel@localhost.localdomain> Hi Sendu, I believe that BSFA uses SOFA but the growing consensus is that SOFA should be pitched and all of SO should be used where SOFA was being used. I also suspect that BioPerl is using a very old version of SOFA, since at the time BSFA was written, BioPerl couldn't parse OBO files (can it now?), so it was using the very old file format (whose name I can't even remember now) and that file hasn't been updated in a long time (which is why it isn't finding conserved_region). If BioPerl can parse OBO files, we should switch BSFA to validate against http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo Scott On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: > I want to create a Bio::SeqFeature::Annotated object where the 'type' is > 'conserved_region'. > > I got the idea that 'conserved_region' might be ok from here: > http://song.sourceforge.net/SOterm_tables.html#SO:0000330 > > However, this doesn't work since: > > ------------- EXCEPTION ------------- > MSG: couldn't find a SOFA term matching type 'conserved_region'. > STACK Bio::SeqFeature::Annotated::type > /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/Annotated.pm:371 > [snip] > > > I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA > terms from: > http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition > > > I don't know much about this area. Can someone offer a little guidance > as to what the significance of these two different files is, why they > don't contain the same terms, and why I can't use 'conserved_region'? > > What's the closest alternative term? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cjfields at uiuc.edu Thu Sep 27 11:25:13 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 10:25:13 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <1190902850.12078.26.camel@localhost.localdomain> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> Message-ID: <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: > Hi Sendu, > > I believe that BSFA uses SOFA but the growing consensus is that SOFA > should be pitched and all of SO should be used where SOFA was being > used. I also suspect that BioPerl is using a very old version of > SOFA, > since at the time BSFA was written, BioPerl couldn't parse OBO files > (can it now?), so it was using the very old file format (whose name I > can't even remember now) and that file hasn't been updated in a long > time (which is why it isn't finding conserved_region). > > If BioPerl can parse OBO files, we should switch BSFA to validate > against > > http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo > > Scott I agree, this would definitely be for the best. BioPerl can parse obo; not sure how often it's used or what the tests are like, but switching to SO should give it a good workout and might wring out any issues. chris > On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: >> I want to create a Bio::SeqFeature::Annotated object where the >> 'type' is >> 'conserved_region'. >> >> I got the idea that 'conserved_region' might be ok from here: >> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 >> >> However, this doesn't work since: >> >> ------------- EXCEPTION ------------- >> MSG: couldn't find a SOFA term matching type 'conserved_region'. >> STACK Bio::SeqFeature::Annotated::type >> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ >> Annotated.pm:371 >> [snip] >> >> >> I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA >> terms from: >> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition >> >> >> I don't know much about this area. Can someone offer a little >> guidance >> as to what the significance of these two different files is, why they >> don't contain the same terms, and why I can't use 'conserved_region'? >> >> What's the closest alternative term? >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- > ---------------------------------------------------------------------- > -- > Scott Cain, Ph. D. > cain at cshl.edu > GMOD Coordinator (http://www.gmod.org/) > 216-392-3087 > Cold Spring Harbor Laboratory > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Research Associate Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cain.cshl at gmail.com Thu Sep 27 11:34:57 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 27 Sep 2007 11:34:57 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> Message-ID: <1190907297.12078.32.camel@localhost.localdomain> OK--while I would normal volunteer to do this, I don't think I am going to have time until after the Genome Informatics and GMOD meetings in November :-/ If it is still not done then, somebody poke me and remind me that I said that. Scott On Thu, 2007-09-27 at 10:25 -0500, Chris Fields wrote: > On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: > > > Hi Sendu, > > > > I believe that BSFA uses SOFA but the growing consensus is that SOFA > > should be pitched and all of SO should be used where SOFA was being > > used. I also suspect that BioPerl is using a very old version of > > SOFA, > > since at the time BSFA was written, BioPerl couldn't parse OBO files > > (can it now?), so it was using the very old file format (whose name I > > can't even remember now) and that file hasn't been updated in a long > > time (which is why it isn't finding conserved_region). > > > > If BioPerl can parse OBO files, we should switch BSFA to validate > > against > > > > http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo > > > > Scott > > I agree, this would definitely be for the best. BioPerl can parse > obo; not sure how often it's used or what the tests are like, but > switching to SO should give it a good workout and might wring out any > issues. > > chris > > > On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: > >> I want to create a Bio::SeqFeature::Annotated object where the > >> 'type' is > >> 'conserved_region'. > >> > >> I got the idea that 'conserved_region' might be ok from here: > >> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 > >> > >> However, this doesn't work since: > >> > >> ------------- EXCEPTION ------------- > >> MSG: couldn't find a SOFA term matching type 'conserved_region'. > >> STACK Bio::SeqFeature::Annotated::type > >> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ > >> Annotated.pm:371 > >> [snip] > >> > >> > >> I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA > >> terms from: > >> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition > >> > >> > >> I don't know much about this area. Can someone offer a little > >> guidance > >> as to what the significance of these two different files is, why they > >> don't contain the same terms, and why I can't use 'conserved_region'? > >> > >> What's the closest alternative term? > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > > ---------------------------------------------------------------------- > > -- > > Scott Cain, Ph. D. > > cain at cshl.edu > > GMOD Coordinator (http://www.gmod.org/) > > 216-392-3087 > > Cold Spring Harbor Laboratory > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Research Associate > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain.cshl at gmail.com GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cjfields at uiuc.edu Thu Sep 27 11:43:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 10:43:06 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <1190907297.12078.32.camel@localhost.localdomain> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> <1190907297.12078.32.camel@localhost.localdomain> Message-ID: Actually, I just added 'Sequence Ontology OBO' to Bio::Ontology::DocumentRegistry and switched BSFA over to use that in bioperl-live. So far it still passes tests checking SO using obo. Sendu, does that work or crash-and-burn? chris On Sep 27, 2007, at 10:34 AM, Scott Cain wrote: > OK--while I would normal volunteer to do this, I don't think I am > going > to have time until after the Genome Informatics and GMOD meetings in > November :-/ If it is still not done then, somebody poke me and > remind > me that I said that. > > Scott > > > On Thu, 2007-09-27 at 10:25 -0500, Chris Fields wrote: >> On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: >> >>> Hi Sendu, >>> >>> I believe that BSFA uses SOFA but the growing consensus is that SOFA >>> should be pitched and all of SO should be used where SOFA was being >>> used. I also suspect that BioPerl is using a very old version of >>> SOFA, >>> since at the time BSFA was written, BioPerl couldn't parse OBO files >>> (can it now?), so it was using the very old file format (whose >>> name I >>> can't even remember now) and that file hasn't been updated in a long >>> time (which is why it isn't finding conserved_region). >>> >>> If BioPerl can parse OBO files, we should switch BSFA to validate >>> against >>> >>> http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo >>> >>> Scott >> >> I agree, this would definitely be for the best. BioPerl can parse >> obo; not sure how often it's used or what the tests are like, but >> switching to SO should give it a good workout and might wring out any >> issues. >> >> chris >> >>> On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: >>>> I want to create a Bio::SeqFeature::Annotated object where the >>>> 'type' is >>>> 'conserved_region'. >>>> >>>> I got the idea that 'conserved_region' might be ok from here: >>>> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 >>>> >>>> However, this doesn't work since: >>>> >>>> ------------- EXCEPTION ------------- >>>> MSG: couldn't find a SOFA term matching type 'conserved_region'. >>>> STACK Bio::SeqFeature::Annotated::type >>>> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ >>>> Annotated.pm:371 >>>> [snip] >>>> >>>> >>>> I'm guessing Bio::Ontology::OntologyStore is getting its allowed >>>> SOFA >>>> terms from: >>>> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition >>>> >>>> >>>> I don't know much about this area. Can someone offer a little >>>> guidance >>>> as to what the significance of these two different files is, why >>>> they >>>> don't contain the same terms, and why I can't use >>>> 'conserved_region'? >>>> >>>> What's the closest alternative term? >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> -- >>> -------------------------------------------------------------------- >>> -- >>> -- >>> Scott Cain, Ph. D. >>> cain at cshl.edu >>> GMOD Coordinator (http://www.gmod.org/) >>> 216-392-3087 >>> Cold Spring Harbor Laboratory >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Research Associate >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> > -- > ---------------------------------------------------------------------- > -- > Scott Cain, Ph. D. > cain.cshl at gmail.com > GMOD Coordinator (http://www.gmod.org/) > 216-392-3087 > Cold Spring Harbor Laboratory > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Sep 27 18:17:16 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 27 Sep 2007 18:17:16 -0400 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000501c800d9$dc9c8e90$95d5abb0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> Message-ID: <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From wcnelson at usc.edu Thu Sep 27 15:20:47 2007 From: wcnelson at usc.edu (William C. Nelson) Date: Thu, 27 Sep 2007 15:20:47 -0400 Subject: [Bioperl-l] cpan install Message-ID: <46FC028F.3050000@usc.edu> Hello, I tried to install v1.5.2 using cpan. My urllist looks like: cpan[2]> o conf urllist urllist 0 [ftp://cpan.cs.utah.edu/pub/CPAN/] 1 [ftp://cpan.mirrors.tds.net/pub/CPAN] 2 [ftp://ftp.open-bio.org/pub/bioperl/DIST/] Type 'o conf' to view all configuration items And when I look for bioperl, I see: cpan[1]> d /bioperl/ CPAN: Storable loaded ok (v2.16) Going to read /root/.cpan/Metadata Database was generated on Thu, 27 Sep 2007 18:36:44 GMT Distribution BIRNEY/bioperl-1.2.1.tar.gz Distribution BIRNEY/bioperl-1.2.2.tar.gz Distribution BIRNEY/bioperl-1.2.3.tar.gz Distribution BIRNEY/bioperl-1.2.tar.gz Distribution BIRNEY/bioperl-1.4.tar.gz Distribution BIRNEY/bioperl-db-0.1.tar.gz Distribution BIRNEY/bioperl-ext-1.4.tar.gz Distribution BIRNEY/bioperl-gui-0.7.tar.gz Distribution BIRNEY/bioperl-run-1.2.2.tar.gz Distribution BIRNEY/bioperl-run-1.4.tar.gz Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz 12 items found No v 1.5.2. This may be because it can't see the distribution at ftp://ftp.open-bio.org/pub/bioperl/DIST/. When I try to reload the index, I get messages saying cpan can't find the files ftp://ftp.open-bio.org/pub/bioperl/DIST/authors/01mailrc.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/02packages.details.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/03modlist.data.gz. Am I doing something wrong? Does the FTP site need to be updated? Thanks, Bill -- ----------------------------------------------------- William C. Nelson, PhD Research Asst Professor Wrigely Institute for Environmental Studies University of Southern California LAS/MEB 310-510-4097 wcnelson at usc.edu From wgallin at ualberta.ca Thu Sep 27 22:51:12 2007 From: wgallin at ualberta.ca (Warren Gallin) Date: Thu, 27 Sep 2007 20:51:12 -0600 Subject: [Bioperl-l] A couple Eutilities questions Message-ID: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> I've just started using Bio::DB::Eutilities and I have encountered two things that seem like problems. I am using the latest (retrieved Wednesday September 26, 2007) CVS version, running in an Apple Xserver. Problem 1: When I execute the following code: #Create new EUTILS object for retrieving sets of entries, given an array of accession numbers my $gpeptfactory = Bio::DB::EUtilities -> new( -eutil => 'efetch', -db => 'protein', -rettype =>'genbank', -id => \@pro_acc) ; my $file = 'temp_hold.gb'; $gpeptfactory -> get_Response(-file => $file); my $retr_seq = Bio::SeqIO->new( -file => $file, -format => 'genbank'); I get the following warning, consistently: Use of uninitialized value in concatenation (.) or string at /Library/ Perl/5.8.1/Bio/DB/GenericWebAgent.pm line 92. Also, about half the time I get a crash with the following error message: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Response Error Bad Gateway STACK: Error::throw STACK: Bio::Root::Root::throw /Library/Perl/5.8.1/Bio/Root/Root.pm:357 STACK: Bio::DB::GenericWebAgent::get_Response /Library/Perl/5.8.1/Bio/ DB/GenericWebAgent.pm:184 STACK: gb_update_v4.pl:118 ----------------------------------------------------------- The other half of the time the script runs fine through to the end. I have no idea whether the crash is related to the warning or not. I looked at the line where the warning is generated, and it appears to be the "new" method for the GenericWebAgent.pm . I can't see how the call to Eutilities is can be passing an undefined value through to this method. Problem #2: When the code runs, I retrieve an incorrect record. I am retrieving using accessions, and accession I51532 retrieves two records. One is the record I am after, an ion channel protein, the other comes from a patent application; the problem is that, although the accession number for the unwanted record is AAB76204, the LOCUS entry in the record is I51532. So, is it possible that the efetch function is collecting on the basis of LOCUS, not ACCESSION? I realize that the two are almost always the same, but not apparently in this case. Any advice and/or explanation is appreciated. Warren Gallin From forrest_zhang at 163.com Thu Sep 27 22:50:53 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 10:50:53 +0800 Subject: [Bioperl-l] cpan install In-Reply-To: <46FC028F.3050000@usc.edu> References: <46FC028F.3050000@usc.edu> Message-ID: <000001c8017a$6384bae0$2a8e30a0$@com> Try cpan>install S/SE/SENDU/bioperl-1.5.2_102.tar.gz other question you should browse http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of William C. Nelson Sent: Friday, September 28, 2007 3:21 AM To: bioperl-l at bioperl.org Subject: [Bioperl-l] cpan install Hello, I tried to install v1.5.2 using cpan. My urllist looks like: cpan[2]> o conf urllist urllist 0 [ftp://cpan.cs.utah.edu/pub/CPAN/] 1 [ftp://cpan.mirrors.tds.net/pub/CPAN] 2 [ftp://ftp.open-bio.org/pub/bioperl/DIST/] Type 'o conf' to view all configuration items And when I look for bioperl, I see: cpan[1]> d /bioperl/ CPAN: Storable loaded ok (v2.16) Going to read /root/.cpan/Metadata Database was generated on Thu, 27 Sep 2007 18:36:44 GMT Distribution BIRNEY/bioperl-1.2.1.tar.gz Distribution BIRNEY/bioperl-1.2.2.tar.gz Distribution BIRNEY/bioperl-1.2.3.tar.gz Distribution BIRNEY/bioperl-1.2.tar.gz Distribution BIRNEY/bioperl-1.4.tar.gz Distribution BIRNEY/bioperl-db-0.1.tar.gz Distribution BIRNEY/bioperl-ext-1.4.tar.gz Distribution BIRNEY/bioperl-gui-0.7.tar.gz Distribution BIRNEY/bioperl-run-1.2.2.tar.gz Distribution BIRNEY/bioperl-run-1.4.tar.gz Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz 12 items found No v 1.5.2. This may be because it can't see the distribution at ftp://ftp.open-bio.org/pub/bioperl/DIST/. When I try to reload the index, I get messages saying cpan can't find the files ftp://ftp.open-bio.org/pub/bioperl/DIST/authors/01mailrc.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/02packages.details.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/03modlist.data.gz. Am I doing something wrong? Does the FTP site need to be updated? Thanks, Bill -- ----------------------------------------------------- William C. Nelson, PhD Research Asst Professor Wrigely Institute for Environmental Studies University of Southern California LAS/MEB 310-510-4097 wcnelson at usc.edu _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Thu Sep 27 23:15:03 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 11:15:03 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> Message-ID: <000101c8017d$c4643360$4d2c9a20$@com> Hilmar, I have already pre-loaded the NCBI taxonomy using load_ncbi_taxonomy.pl yet. The error message show: --------------------- WARNING --------------------- MSG: The supplied lineage does not start near 'Phaseolus aureus' (I was supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I | rosids | core eudicotyledons | eudicotyledons | Magnoliophyta | Euphyllophyta | Embryophyta | Streptophytina | Viridiplantae | Eukaryota') --------------------------------------------------- Could not store Q40784: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: create: object (Bio::Species) failed to insert or to be found by unique key STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: /usr/bin/bp_load_seqdatabase.pl:633 ----------------------------------------------------------- Sigh~~~~~~ Forrest Zhang -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp Sent: Friday, September 28, 2007 6:17 AM To: Forrest Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Thu Sep 27 23:33:21 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 11:33:21 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000101c8017d$c4643360$4d2c9a20$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> Message-ID: <000201c80180$54762650$fd6272f0$@com> I reinstall the bioperl-db, I found some error. t/01dbadaptor.....ok t/02species.......FAILED tests 66-95 Failed 30/65 tests, 53.85% okay t/03simpleseq.....ok t/04swiss.........ok t/05seqfeature....ok t/06comment.......ok t/07dblink........ok t/08genbank.......ok t/09fuzzy2........ok t/10ensembl.......ok t/11locuslink.....ok t/12ontology......ok t/13remove........ok t/14query.........ok t/15cluster.......ok 9/160 --------------------- WARNING --------------------- MSG: failed to store one or more child objects for an instance of class Bio::Cluster::UniGene (PK=320) --------------------------------------------------- t/15cluster.......ok t/16obda..........ok Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 30 66-95 Failed 1/16 test scripts. -30/1423 subtests failed. Files=16, Tests=1423, 35 wallclock secs (16.67 cusr + 0.63 csys = 17.30 CPU) Failed 1/16 test programs. -30/1423 subtests failed. make: *** [test] Error 255 -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Forrest Zhang Sent: Friday, September 28, 2007 11:15 AM To: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Hilmar, I have already pre-loaded the NCBI taxonomy using load_ncbi_taxonomy.pl yet. The error message show: --------------------- WARNING --------------------- MSG: The supplied lineage does not start near 'Phaseolus aureus' (I was supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I | rosids | core eudicotyledons | eudicotyledons | Magnoliophyta | Euphyllophyta | Embryophyta | Streptophytina | Viridiplantae | Eukaryota') --------------------------------------------------- Could not store Q40784: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: create: object (Bio::Species) failed to insert or to be found by unique key STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: /usr/bin/bp_load_seqdatabase.pl:633 ----------------------------------------------------------- Sigh~~~~~~ Forrest Zhang -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp Sent: Friday, September 28, 2007 6:17 AM To: Forrest Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Fri Sep 28 00:58:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 23:58:27 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000201c80180$54762650$fd6272f0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <000201c80180$54762650$fd6272f0$@com> Message-ID: <9535284F-2DC5-4361-81A2-0B739A7E89E4@uiuc.edu> Bio::Species will have problems if you test on a database with taxonomy loaded (it's mentioned in the install docs I think). The UniGene warning has always popped up and isn't anything to worry about. chris On Sep 27, 2007, at 10:33 PM, Forrest Zhang wrote: > I reinstall the bioperl-db, I found some error. > > t/01dbadaptor.....ok > > t/02species.......FAILED tests 66-95 > > Failed 30/65 tests, 53.85% okay > t/03simpleseq.....ok > > t/04swiss.........ok > > t/05seqfeature....ok > > t/06comment.......ok > > t/07dblink........ok > > t/08genbank.......ok > > t/09fuzzy2........ok > > t/10ensembl.......ok > > t/11locuslink.....ok > > t/12ontology......ok > > t/13remove........ok > > t/14query.........ok > > t/15cluster.......ok 9/160 > > --------------------- WARNING --------------------- > MSG: failed to store one or more child objects for an instance of > class > Bio::Cluster::UniGene (PK=320) > --------------------------------------------------- > t/15cluster.......ok > > t/16obda..........ok > > Failed Test Stat Wstat Total Fail List of Failed > ---------------------------------------------------------------------- > ------ > --- > t/02species.t 65 30 66-95 > Failed 1/16 test scripts. -30/1423 subtests failed. > Files=16, Tests=1423, 35 wallclock secs (16.67 cusr + 0.63 csys = > 17.30 > CPU) > Failed 1/16 test programs. -30/1423 subtests failed. > make: *** [test] Error 255 > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Forrest > Zhang > Sent: Friday, September 28, 2007 11:15 AM > To: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Sep 28 00:57:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 23:57:55 -0500 Subject: [Bioperl-l] A couple Eutilities questions In-Reply-To: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> References: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> Message-ID: On Sep 27, 2007, at 9:51 PM, Warren Gallin wrote: > I've just started using Bio::DB::Eutilities and I have encountered > two things that seem like problems. > > I am using the latest (retrieved Wednesday September 26, 2007) CVS > version, running in an Apple Xserver. > > Problem 1: When I execute the following code: > > > #Create new EUTILS object for retrieving sets of entries, given an > array of accession numbers > my $gpeptfactory = Bio::DB::EUtilities -> new( -eutil => 'efetch', > -db => 'protein', > -rettype =>'genbank', > -id => \@pro_acc) ; > my $file = 'temp_hold.gb'; > > $gpeptfactory -> get_Response(-file => $file); > > my $retr_seq = Bio::SeqIO->new( -file => $file, > -format => 'genbank'); > > > I get the following warning, consistently: > > Use of uninitialized value in concatenation (.) or string at /Library/ > Perl/5.8.1/Bio/DB/GenericWebAgent.pm line 92. The above works for me w/o problems. The error itself doesn't make much sense; the line is: $self->ua(LWP::UserAgent->new(env_proxy => 1, agent => ref($self).':'.$self->VERSION)); so either $self isn't a ref (which it appears to be) or there is no version (which is odd but may be a perl bug). What happens if you hard-code the version number to something simple? Also, I noticed you're using perl 5.8.1; which version of Mac OS X are you using? I remember something was off about that perl version but I can't remember what it was... > Also, about half the time I get a crash with the following error > message: > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Response Error > Bad Gateway > STACK: Error::throw > STACK: Bio::Root::Root::throw /Library/Perl/5.8.1/Bio/Root/Root.pm:357 > STACK: Bio::DB::GenericWebAgent::get_Response /Library/Perl/5.8.1/Bio/ > DB/GenericWebAgent.pm:184 > STACK: gb_update_v4.pl:118 > ----------------------------------------------------------- I have seen it sometimes pop up when the NCBI server is under heavy server load. It may also be related to your local ISP or setup; see here: http://www.checkupdown.com/status/E502.html Supposedly this may pop up with mod_perl but I haven't seen/heard anything myself related to this. > The other half of the time the script runs fine through to the end. > I have no idea whether the crash is related to the warning or not. I > looked at the line where the warning is generated, and it appears to > be the "new" method for the GenericWebAgent.pm . I can't see how > the call to Eutilities is can be passing an undefined value through > to this method. EUtilities is-a GenericWebAgent; the new() constructors are chained using SUPER::new(). Also, you can call VERSION from any variable so it could be a problem there if VERSION is undef, though again I can't think why this would fail. Regardless, the 'Use of undefined' warning is not a fatal error. > Problem #2: > > When the code runs, I retrieve an incorrect record. I am retrieving > using accessions, and accession I51532 retrieves two records. One is > the record I am after, an ion channel protein, the other comes from a > patent application; the problem is that, although the accession > number for the unwanted record is AAB76204, the LOCUS entry in the > record is I51532. > > So, is it possible that the efetch function is collecting on the > basis of LOCUS, not ACCESSION? I realize that the two are almost > always the same, but not apparently in this case. > > Any advice and/or explanation is appreciated. > > Warren Gallin The only means NCBI guarantees to retrieve a unique record every time is by using the primary ID, which for sequence records is the GI. The accession works most of the time, and efetch accepts accs in the place of GI (it's the only eutil that does). However, every once in a while you get stung and retrieve multiple seqs. BTW, I entered your sequence into Entrez and it popped up as discontinued (which could be part of the problem); the current acc is Q91781. chris From cjfields at uiuc.edu Fri Sep 28 01:00:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 00:00:18 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000101c8017d$c4643360$4d2c9a20$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> Message-ID: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From forrest_zhang at 163.com Fri Sep 28 01:34:21 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 13:34:21 +0800 Subject: [Bioperl-l] FW: load_seqdatabase.pl error Message-ID: <000101c80191$3b300e70$b1902b50$@com> Oh, my God! I am tring reinstall bioperl-live using CVS, so many error shown below. biocc bioperl-live # perl Build.PL Checking whether your kit is complete... Looks good Checking prerequisites... Looks good Checking features: BioDBGFF.................enabled BioDBSeqFeature_mysql....enabled Network..................enabled BioDBSeqFeature_BDB......enabled Install [a]ll Bioperl scripts, [n]one, or choose groups [i]nteractively? [a] - will install all scripts Do you want to run the BioDBGFF live database tests? y/n [n] y Which database should I use for testing the mysql driver? [test] On which host is database 'test' running (hostname, ip address or host:port) [localhost] User name for connecting to database 'test'? [undef] root Password for connecting to database 'test'? [undef] - will run the BioDBGFF tests with database driver 'mysql' and these settings: Database test Host localhost DSN dbi:mysql:database=test User root Password undef Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n] y - will run internet-requiring tests Deleting Build Removed previous script 'Build' Creating new 'Build' script for 'bioperl' version '1.0050021' biocc bioperl-live # ./Build test Copying Bio/Align/Utilities.pm -> blib/lib/Bio/Align/Utilities.pm Copying Bio/Search/HSP/ModelHSP.pm -> blib/lib/Bio/Search/HSP/ModelHSP.pm Copying Bio/Ontology/DocumentRegistry.pm -> blib/lib/Bio/Ontology/DocumentRegistry.pm Copying Bio/SeqFeature/Annotated.pm -> blib/lib/Bio/SeqFeature/Annotated.pm Copying Bio/SimpleAlign.pm -> blib/lib/Bio/SimpleAlign.pm Copying Bio/AlignIO/stockholm.pm -> blib/lib/Bio/AlignIO/stockholm.pm Copying scripts/utilities/bp_sreformat.PLS -> blib/script/bp_sreformat.PLS Deleting blib/script/bp_sreformat.PLS.bak blib/script/bp_sreformat.PLS -> blib/script/bp_sreformat.pl Copying scripts/graphics/contig_draw.PLS -> blib/script/contig_draw.PLS Deleting blib/script/contig_draw.PLS.bak blib/script/contig_draw.PLS -> blib/script/bp_contig_draw.pl Copying scripts/Bio-DB-GFF/meta_gff.PLS -> blib/script/meta_gff.PLS Deleting blib/script/meta_gff.PLS.bak blib/script/meta_gff.PLS -> blib/script/bp_meta_gff.pl Copying scripts/tree/tree2pag.PLS -> blib/script/tree2pag.PLS Deleting blib/script/tree2pag.PLS.bak blib/script/tree2pag.PLS -> blib/script/bp_tree2pag.pl Copying scripts/Bio-SeqFeature-Store/bp_seqfeature_gff3.PLS -> blib/script/bp_seqfeature_gff3.PLS blib/script/bp_seqfeature_gff3.PLS -> blib/script/bp_seqfeature_gff3.pl Copying scripts/popgen/heterogeneity_test.PLS -> blib/script/heterogeneity_test.PLS Deleting blib/script/heterogeneity_test.PLS.bak blib/script/heterogeneity_test.PLS -> blib/script/bp_heterogeneity_test.pl Copying scripts/DB/flanks.PLS -> blib/script/flanks.PLS Deleting blib/script/flanks.PLS.bak blib/script/flanks.PLS -> blib/script/bp_flanks.pl Copying scripts/graphics/feature_draw.PLS -> blib/script/feature_draw.PLS Deleting blib/script/feature_draw.PLS.bak blib/script/feature_draw.PLS -> blib/script/bp_feature_draw.pl Copying scripts/DB/biogetseq.PLS -> blib/script/biogetseq.PLS Deleting blib/script/biogetseq.PLS.bak blib/script/biogetseq.PLS -> blib/script/bp_biogetseq.pl Copying scripts/Bio-SeqFeature-Store/bp_seqfeature_load.PLS -> blib/script/bp_seqfeature_load.PLS Deleting blib/script/bp_seqfeature_load.PLS.bak blib/script/bp_seqfeature_load.PLS -> blib/script/bp_seqfeature_load.pl Copying scripts/searchio/fastam9_to_table.PLS -> blib/script/fastam9_to_table.PLS Deleting blib/script/fastam9_to_table.PLS.bak blib/script/fastam9_to_table.PLS -> blib/script/bp_fastam9_to_table.pl Copying scripts/utilities/seq_length.PLS -> blib/script/seq_length.PLS Deleting blib/script/seq_length.PLS.bak blib/script/seq_length.PLS -> blib/script/bp_seq_length.pl Copying scripts/Bio-DB-GFF/genbank2gff.PLS -> blib/script/genbank2gff.PLS Deleting blib/script/genbank2gff.PLS.bak blib/script/genbank2gff.PLS -> blib/script/bp_genbank2gff.pl Copying scripts/taxa/taxid4species.PLS -> blib/script/taxid4species.PLS Deleting blib/script/taxid4species.PLS.bak blib/script/taxid4species.PLS -> blib/script/bp_taxid4species.pl Copying scripts/biographics/bp_glyphs1-demo.PLS -> blib/script/bp_glyphs1-demo.PLS Deleting blib/script/bp_glyphs1-demo.PLS.bak blib/script/bp_glyphs1-demo.PLS -> blib/script/bp_glyphs1-demo.pl Copying scripts/tree/blast2tree.PLS -> blib/script/blast2tree.PLS Deleting blib/script/blast2tree.PLS.bak blib/script/blast2tree.PLS -> blib/script/bp_blast2tree.pl Copying scripts/graphics/frend.PLS -> blib/script/frend.PLS Deleting blib/script/frend.PLS.bak blib/script/frend.PLS -> blib/script/bp_frend.pl Copying scripts/taxa/query_entrez_taxa.PLS -> blib/script/query_entrez_taxa.PLS Deleting blib/script/query_entrez_taxa.PLS.bak blib/script/query_entrez_taxa.PLS -> blib/script/bp_query_entrez_taxa.pl Copying scripts/biographics/bp_glyphs2-demo.PLS -> blib/script/bp_glyphs2-demo.PLS Deleting blib/script/bp_glyphs2-demo.PLS.bak blib/script/bp_glyphs2-demo.PLS -> blib/script/bp_glyphs2-demo.pl Copying scripts/taxa/taxonomy2tree.PLS -> blib/script/taxonomy2tree.PLS Deleting blib/script/taxonomy2tree.PLS.bak blib/script/taxonomy2tree.PLS -> blib/script/bp_taxonomy2tree.pl Copying scripts/utilities/search2alnblocks.PLS -> blib/script/search2alnblocks.PLS Deleting blib/script/search2alnblocks.PLS.bak blib/script/search2alnblocks.PLS -> blib/script/bp_search2alnblocks.pl Copying scripts/utilities/mask_by_search.PLS -> blib/script/mask_by_search.PLS Deleting blib/script/mask_by_search.PLS.bak blib/script/mask_by_search.PLS -> blib/script/bp_mask_by_search.pl Copying scripts/seqstats/gccalc.PLS -> blib/script/gccalc.PLS Deleting blib/script/gccalc.PLS.bak blib/script/gccalc.PLS -> blib/script/bp_gccalc.pl Copying scripts/popgen/composite_LD.PLS -> blib/script/composite_LD.PLS Deleting blib/script/composite_LD.PLS.bak blib/script/composite_LD.PLS -> blib/script/bp_composite_LD.pl Copying scripts/seqstats/aacomp.PLS -> blib/script/aacomp.PLS Deleting blib/script/aacomp.PLS.bak blib/script/aacomp.PLS -> blib/script/bp_aacomp.pl Copying scripts/Bio-DB-GFF/process_wormbase.PLS -> blib/script/process_wormbase.PLS Deleting blib/script/process_wormbase.PLS.bak blib/script/process_wormbase.PLS -> blib/script/bp_process_wormbase.pl Copying scripts/taxa/local_taxonomydb_query.PLS -> blib/script/local_taxonomydb_query.PLS Deleting blib/script/local_taxonomydb_query.PLS.bak blib/script/local_taxonomydb_query.PLS -> blib/script/bp_local_taxonomydb_query.pl Copying scripts/biblio/biblio.PLS -> blib/script/biblio.PLS Deleting blib/script/biblio.PLS.bak blib/script/biblio.PLS -> blib/script/bp_biblio.pl Copying scripts/biographics/bp_embl2picture.PLS -> blib/script/bp_embl2picture.PLS Deleting blib/script/bp_embl2picture.PLS.bak blib/script/bp_embl2picture.PLS -> blib/script/bp_embl2picture.pl Copying scripts/Bio-DB-GFF/genbank2gff3.PLS -> blib/script/genbank2gff3.PLS Deleting blib/script/genbank2gff3.PLS.bak blib/script/genbank2gff3.PLS -> blib/script/bp_genbank2gff3.pl Copying scripts/utilities/search2BSML.PLS -> blib/script/search2BSML.PLS Deleting blib/script/search2BSML.PLS.bak blib/script/search2BSML.PLS -> blib/script/bp_search2BSML.pl Copying scripts/seq/seqconvert.PLS -> blib/script/seqconvert.PLS Deleting blib/script/seqconvert.PLS.bak blib/script/seqconvert.PLS -> blib/script/bp_seqconvert.pl Copying scripts/searchio/parse_hmmsearch.PLS -> blib/script/parse_hmmsearch.PLS Deleting blib/script/parse_hmmsearch.PLS.bak blib/script/parse_hmmsearch.PLS -> blib/script/bp_parse_hmmsearch.pl Copying scripts/index/bp_seqret.PLS -> blib/script/bp_seqret.PLS Deleting blib/script/bp_seqret.PLS.bak blib/script/bp_seqret.PLS -> blib/script/bp_seqret.pl Copying scripts/searchio/filter_search.PLS -> blib/script/filter_search.PLS Deleting blib/script/filter_search.PLS.bak blib/script/filter_search.PLS -> blib/script/bp_filter_search.pl Copying scripts/tree/nexus2nh.PLS -> blib/script/nexus2nh.PLS Deleting blib/script/nexus2nh.PLS.bak blib/script/nexus2nh.PLS -> blib/script/bp_nexus2nh.pl Copying scripts/Bio-DB-GFF/generate_histogram.PLS -> blib/script/generate_histogram.PLS Deleting blib/script/generate_histogram.PLS.bak blib/script/generate_histogram.PLS -> blib/script/bp_generate_histogram.pl Copying scripts/seq/split_seq.PLS -> blib/script/split_seq.PLS Deleting blib/script/split_seq.PLS.bak blib/script/split_seq.PLS -> blib/script/bp_split_seq.pl Copying scripts/Bio-DB-GFF/load_gff.PLS -> blib/script/load_gff.PLS Deleting blib/script/load_gff.PLS.bak blib/script/load_gff.PLS -> blib/script/bp_load_gff.pl Copying scripts/index/bp_fetch.PLS -> blib/script/bp_fetch.PLS Deleting blib/script/bp_fetch.PLS.bak blib/script/bp_fetch.PLS -> blib/script/bp_fetch.pl Copying scripts/utilities/mutate.PLS -> blib/script/mutate.PLS Deleting blib/script/mutate.PLS.bak blib/script/mutate.PLS -> blib/script/bp_mutate.pl Copying scripts/Bio-DB-GFF/process_sgd.PLS -> blib/script/process_sgd.PLS Deleting blib/script/process_sgd.PLS.bak blib/script/process_sgd.PLS -> blib/script/bp_process_sgd.pl Copying scripts/index/bp_index.PLS -> blib/script/bp_index.PLS Deleting blib/script/bp_index.PLS.bak blib/script/bp_index.PLS -> blib/script/bp_index.pl Copying scripts/utilities/dbsplit.PLS -> blib/script/dbsplit.PLS Deleting blib/script/dbsplit.PLS.bak blib/script/dbsplit.PLS -> blib/script/bp_dbsplit.pl Copying scripts/seqstats/oligo_count.PLS -> blib/script/oligo_count.PLS Deleting blib/script/oligo_count.PLS.bak blib/script/oligo_count.PLS -> blib/script/bp_oligo_count.pl Copying scripts/searchio/hmmer_to_table.PLS -> blib/script/hmmer_to_table.PLS Deleting blib/script/hmmer_to_table.PLS.bak blib/script/hmmer_to_table.PLS -> blib/script/bp_hmmer_to_table.pl Copying scripts/Bio-DB-GFF/process_gadfly.PLS -> blib/script/process_gadfly.PLS Deleting blib/script/process_gadfly.PLS.bak blib/script/process_gadfly.PLS -> blib/script/bp_process_gadfly.pl Copying scripts/DB/biofetch_genbank_proxy.PLS -> blib/script/biofetch_genbank_proxy.PLS Deleting blib/script/biofetch_genbank_proxy.PLS.bak blib/script/biofetch_genbank_proxy.PLS -> blib/script/bp_biofetch_genbank_proxy.pl Copying scripts/seq/extract_feature_seq.PLS -> blib/script/extract_feature_seq.PLS Deleting blib/script/extract_feature_seq.PLS.bak blib/script/extract_feature_seq.PLS -> blib/script/bp_extract_feature_seq.pl Copying scripts/Bio-DB-GFF/bulk_load_gff.PLS -> blib/script/bulk_load_gff.PLS Deleting blib/script/bulk_load_gff.PLS.bak blib/script/bulk_load_gff.PLS -> blib/script/bp_bulk_load_gff.pl Copying scripts/utilities/search2gff.PLS -> blib/script/search2gff.PLS Deleting blib/script/search2gff.PLS.bak blib/script/search2gff.PLS -> blib/script/bp_search2gff.pl Copying scripts/seq/make_mrna_protein.PLS -> blib/script/make_mrna_protein.PLS Deleting blib/script/make_mrna_protein.PLS.bak blib/script/make_mrna_protein.PLS -> blib/script/bp_make_mrna_protein.pl Copying scripts/seq/unflatten_seq.PLS -> blib/script/unflatten_seq.PLS Deleting blib/script/unflatten_seq.PLS.bak blib/script/unflatten_seq.PLS -> blib/script/bp_unflatten_seq.pl Copying scripts/utilities/search2tribe.PLS -> blib/script/search2tribe.PLS Deleting blib/script/search2tribe.PLS.bak blib/script/search2tribe.PLS -> blib/script/bp_search2tribe.pl Copying scripts/DB/bioflat_index.PLS -> blib/script/bioflat_index.PLS Deleting blib/script/bioflat_index.PLS.bak blib/script/bioflat_index.PLS -> blib/script/bp_bioflat_index.pl Copying scripts/utilities/pairwise_kaks.PLS -> blib/script/pairwise_kaks.PLS Deleting blib/script/pairwise_kaks.PLS.bak blib/script/pairwise_kaks.PLS -> blib/script/bp_pairwise_kaks.pl Copying scripts/Bio-DB-GFF/fast_load_gff.PLS -> blib/script/fast_load_gff.PLS Deleting blib/script/fast_load_gff.PLS.bak blib/script/fast_load_gff.PLS -> blib/script/bp_fast_load_gff.pl Copying scripts/seqstats/chaos_plot.PLS -> blib/script/chaos_plot.PLS Deleting blib/script/chaos_plot.PLS.bak blib/script/chaos_plot.PLS -> blib/script/bp_chaos_plot.pl Copying scripts/utilities/bp_mrtrans.PLS -> blib/script/bp_mrtrans.PLS Deleting blib/script/bp_mrtrans.PLS.bak blib/script/bp_mrtrans.PLS -> blib/script/bp_mrtrans.pl Copying scripts/utilities/bp_nrdb.PLS -> blib/script/bp_nrdb.PLS Deleting blib/script/bp_nrdb.PLS.bak blib/script/bp_nrdb.PLS -> blib/script/bp_nrdb.pl Copying scripts/taxa/classify_hits_kingdom.PLS -> blib/script/classify_hits_kingdom.PLS Deleting blib/script/classify_hits_kingdom.PLS.bak blib/script/classify_hits_kingdom.PLS -> blib/script/bp_classify_hits_kingdom.pl Copying scripts/utilities/remote_blast.PLS -> blib/script/remote_blast.PLS Deleting blib/script/remote_blast.PLS.bak blib/script/remote_blast.PLS -> blib/script/bp_remote_blast.pl Copying scripts/searchio/search2table.PLS -> blib/script/search2table.PLS Deleting blib/script/search2table.PLS.bak blib/script/search2table.PLS -> blib/script/bp_search2table.pl Copying scripts/seq/translate_seq.PLS -> blib/script/translate_seq.PLS Deleting blib/script/translate_seq.PLS.bak blib/script/translate_seq.PLS -> blib/script/bp_translate_seq.pl Copying scripts/graphics/search_overview.PLS -> blib/script/search_overview.PLS Deleting blib/script/search_overview.PLS.bak blib/script/search_overview.PLS -> blib/script/bp_search_overview.pl t/AAChange...................ok t/AAReverseMutate............ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok 1/112 # Failed (TODO) test 'The object isa Bio::Annotation::Comment' # at t/Annotation.t line 214. # The object isn't a 'Bio::Annotation::Comment' it's a 'Bio::Annotation::OntologyTerm' t/Annotation.................ok 1/112 unexpectedly succeeded TODO PASSED test 96 t/AnnotationAdaptor..........ok t/Assembly...................ok 1/35 # Failed (TODO) test 'get_nof_singlets' # at t/Assembly.t line 44. # got: '0' # expected: '1' # Failed (TODO) test 'get_seq_ids' # at t/Assembly.t line 48. # got: '0' # expected: '2' # Failed (TODO) test at t/Assembly.t line 53. # '0' # ne # '0' # Failed test at t/Assembly.t line 145. # got: '_main_contig_feature:106' # expected: '_aligned_coord:sdsu|SDSU_RFPERU_006_E04.x01.phd.1' t/Assembly...................NOK 31/35# Looks like you failed 1 test of 35. t/Assembly...................dubious Test returned status 1 (wstat 256, 0x100) DIED. FAILED test 31 Failed 1/35 tests, 97.14% okay t/Biblio.....................ok t/BiblioReferences...........ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BioDBGFF...................ok 3/277 skipped: various reasons t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok t/Chain......................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/CytoMap....................ok t/DB.........................ok 104/116Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. t/DB.........................ok 107/116Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. t/DB.........................ok t/DBCUTG.....................ok t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/ESEfinder..................ok t/EUtilities.................skipped all skipped: Must set BIOPERLDEBUG=1 for network tests t/EncodedSeq.................ok t/Exception..................ok t/Exonerate..................ok 4/45 skipped: various reasons t/FeatureIO..................ok t/FootPrinter................ok t/GDB........................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GbrowseGFF.................ok t/Gel........................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 1/53 # Failed (TODO) test at t/Genewise.t line 79. # got: 'Scaffold_2042.1' # expected: 'SINFRUP00000067802' # Failed (TODO) test at t/Genewise.t line 80. # got: 'SINFRUP00000067802' # expected: 'Scaffold_2042.1' t/Genewise...................NOK 37/53 # Failed test at t/Genewise.t line 82. # got: '' # expected: '2054.68' t/Genewise...................NOK 41/53 # Failed test at t/Genewise.t line 88. # got: '' # expected: '2054.68' t/Genewise...................NOK 45/53 # Failed test at t/Genewise.t line 93. # got: '' # expected: '2054.68' # Looks like you failed 3 tests of 53. t/Genewise...................dubious Test returned status 3 (wstat 768, 0x300) DIED. FAILED tests 37, 41, 45 Failed 3/53 tests, 94.34% okay t/Genomewise.................ok t/Genpred....................ok 1/157Argument "<1" isn't numeric in numeric gt (>) at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/Tools/Glimmer.pm line 519, line 2. t/Genpred....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/HNN........................ok t/Handler....................ok 288/545 # Failed (TODO) test at t/Handler.t line 696. t/Handler....................ok t/HtSNP......................ok t/IUPAC......................ok t/Index......................ok t/InstanceSite...............ok t/InterProParser.............ok t/LargeLocatableSeq..........ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/MK.........................ok 4/46 skipped: various reasons t/Map........................ok t/MapIO......................ok t/Matrix.....................ok t/MeSH.......................ok t/Measure....................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok t/Node.......................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/OddCodes...................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/Pictogram..................ok t/PodSyntax..................skipped all skipped: Test::Pod 1.00 required for testing POD t/PopGen.....................ok 1/99 # Failed (TODO) test at t/PopGen.t line 242. t/PopGen.....................ok 2/99 unexpectedly succeeded TODO PASSED tests 97-98 t/PopGenSims.................ok t/PrimarySeq.................ok t/Primer.....................ok t/Promoterwise...............ok t/ProtDist...................ok t/ProtMatrix.................ok t/ProtPsm....................ok 10/14 skipped: various reasons t/Pseudowise.................ok t/QRNA.......................ok t/RNAChange..................ok t/RNA_SearchIO...............ok 2/496 # Failed (TODO) test 'HSP meta' # at t/RNA_SearchIO.t line 798. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 800. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 802. # undef # ne # undef # Failed (TODO) test 'HSP meta' # at t/RNA_SearchIO.t line 848. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 850. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 852. # undef # ne # undef t/RNA_SearchIO...............ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/RefSeq.....................ok t/Registry...................ok 1/14 --------------------- WARNING --------------------- MSG: Couldn't call new_from_registry() on [Bio::DB::Flat] ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: you must specify an indexing scheme STACK: Error::throw STACK: Bio::Root::Root::throw /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/Root/Root.pm:357 STACK: Bio::DB::Flat::new /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Flat.pm:160 STACK: Bio::DB::Flat::new_from_registry /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Flat.pm:252 STACK: Bio::DB::Registry::_load_registry /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Registry.pm:164 STACK: Bio::DB::Registry::new /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Registry.pm:95 STACK: t/Registry.t:51 ----------------------------------------------------------- --------------------------------------------------- t/Registry...................ok 6/14 skipped: various reasons t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionIO..............ok 1/15 # Failed (TODO) test at t/RestrictionIO.t line 31. t/RestrictionIO..............ok t/Root-Utilities.............ok 1/50 --------------------- WARNING --------------------- MSG: Not owner of file t/data/test.txt. Compressing to temp file /tmp/MBKEv1uzJB.tmp.bioperl.gz. --------------------------------------------------- t/Root-Utilities.............ok t/RootI......................ok t/RootIO.....................ok t/RootStorable...............ok t/SNP........................ok t/Scansite...................ok 5/14 skipped: various reasons t/SearchDist.................skipped all skipped: The optional module Bio::Ext::Align (or dependencies thereof) was not installed t/SearchIO...................ok 529/1449 # Failed (TODO) test at t/SearchIO.t line 989. # '0.852' # > # '0.9' # Failed (TODO) test at t/SearchIO.t line 990. # '1.599' # <= # '1' t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqEvolution...............ok t/SeqFeatAnnotated...........ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/SeqStats...................ok t/SeqUtils...................ok t/SeqVersion.................ok t/SeqWords...................ok t/SequenceFamily.............ok t/Sigcleave..................ok t/Signalp....................ok t/Signalp2...................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/SiteMatrix.................ok t/Sopma......................ok t/Species....................ok t/Spidey.....................ok t/StandAloneBlast............ok 11/45 skipped: various reasons t/StructIO...................ok t/Structure..................ok t/Symbol.....................ok t/TagHaplotype...............ok t/TandemRepeatsFinder........ok t/TaxonTree..................skipped all skipped: All tests are being skipped, probably because the module(s) being tested here are now deprecated t/Taxonomy...................ok t/Tempfile...................ok t/Term.......................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/WrapperBase................ok t/abi........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/ace........................ok t/alignUtilities.............ok t/asciitree..................ok t/blast_pull.................ok 1/287 # Failed (TODO) test at t/blast_pull.t line 258. # got: '0.946' # expected: '0.943' t/blast_pull.................ok t/bsml_sax...................ok t/chaosxml...................ok t/cigarstring................ok t/consed.....................ok t/ctf........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/dblink.....................ok t/ePCR.......................ok t/embl.......................ok t/entrezgene.................ok 542/1422Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. t/entrezgene.................ok 966/1422Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. t/entrezgene.................ok t/est2genome.................ok t/exp........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/fasta......................ok t/flat.......................ok t/game.......................ok t/gcg........................ok t/genbank....................ok t/hmmer......................ok t/hmmer_pull.................ok t/interpro...................ok t/kegg.......................ok t/largefasta.................ok t/largepseq..................ok t/lasergene..................ok t/lucy.......................ok t/masta......................ok t/metafasta..................ok t/multiple_fasta.............ok t/obo_parser.................ok t/pICalculator...............ok t/phd........................ok t/pir........................ok t/pln........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/primaryqual................ok t/primedseq..................ok t/primer3....................ok t/protgraph..................ok 1/70 # Failed (TODO) test at t/protgraph.t line 55. # Failed (TODO) test at t/protgraph.t line 56. # got: '13' # expected: '14' t/protgraph..................ok 49/70 # Failed (TODO) test at t/protgraph.t line 248. # got: 'Helicobacter pylori' # expected: 'Helicobacter pylori 26695' t/protgraph..................ok t/psm........................ok t/qual.......................ok t/raw........................ok t/rnamotif...................ok t/scf........................ok t/seq_quality................ok t/seqfeaturePrimer...........ok t/seqread_fail...............ok t/sequencetrace..............ok t/seqwithquality.............ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/splicedseq.................ok t/swiss......................ok 1/239 # Failed (TODO) test at t/swiss.t line 47. t/swiss......................ok t/tRNAscanSE.................ok t/tab........................ok t/table......................ok t/targetp....................ok t/tigrxml....................ok t/tinyseq....................ok t/trim.......................ok t/ztr........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/Assembly.t 1 256 35 1 31 t/Genewise.t 3 768 53 3 37 41 45 (3 subtests UNEXPECTEDLY SUCCEEDED), 9 tests and 43 subtests skipped. Failed 2/248 test scripts. 4/15415 subtests failed. Files=248, Tests=15415, 973 wallclock secs (123.75 cusr + 8.58 csys = 132.33 CPU) Failed 2/248 test programs. 4/15415 subtests failed. -----Original Message----- From: Chris Fields [mailto:cjfields at uiuc.edu] Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Sep 28 02:46:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 28 Sep 2007 07:46:48 +0100 Subject: [Bioperl-l] FW: load_seqdatabase.pl error In-Reply-To: <000101c80191$3b300e70$b1902b50$@com> References: <000101c80191$3b300e70$b1902b50$@com> Message-ID: <46FCA358.9070202@sendu.me.uk> Forrest Zhang wrote: > Oh, my God! I am tring reinstall bioperl-live using CVS, so many error > shown below. [snip] > t/Assembly.t 1 256 35 1 31 > t/Genewise.t 3 768 53 3 37 41 45 You failed 4 tests, and this is CVS. Don't worry about the failures if they're in tests of modules you're not using. Do you use the Assembly or Genewise modules? From bix at sendu.me.uk Fri Sep 28 02:40:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 28 Sep 2007 07:40:04 +0100 Subject: [Bioperl-l] cpan install In-Reply-To: <000001c8017a$6384bae0$2a8e30a0$@com> References: <46FC028F.3050000@usc.edu> <000001c8017a$6384bae0$2a8e30a0$@com> Message-ID: <46FCA1C4.7030101@sendu.me.uk> Forrest Zhang wrote: > Try > cpan>install S/SE/SENDU/bioperl-1.5.2_102.tar.gz > > other question you should browse > http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix Yes, the explanation for 1.5.2 not showing up in d /bioperl/ being 'only stable versions appear in that list'. From forrest_zhang at 163.com Fri Sep 28 07:26:56 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 19:26:56 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <000301c801c2$7deb6080$79c22180$@com> Yes, it is happened using CVS. -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Fri Sep 28 07:28:41 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 19:28:41 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <000401c801c2$ba0bae30$2e230a90$@com> Message using CVS. biocc bioperl-db # ./Build test Copying scripts/biosql/terms/importrelation.pl -> blib/script/importrelation.pl blib/script/importrelation.pl -> blib/script/bp_importrelation.pl Copying scripts/biosql/merge-unique-ann.pl -> blib/script/merge-unique-ann.pl blib/script/merge-unique-ann.pl -> blib/script/bp_merge-unique-ann.pl Copying scripts/biosql/update-on-new-date.pl -> blib/script/update-on-new-date.pl blib/script/update-on-new-date.pl -> blib/script/bp_update-on-new-date.pl Copying scripts/biosql/terms/add-term-annot.pl -> blib/script/add-term-annot.pl Deleting blib/script/add-term-annot.pl.bak blib/script/add-term-annot.pl -> blib/script/bp_add-term-annot.pl Copying scripts/corba/caching_corba_server.pl -> blib/script/caching_corba_server.pl Deleting blib/script/caching_corba_server.pl.bak blib/script/caching_corba_server.pl -> blib/script/bp_caching_corba_server.pl Copying scripts/biosql/load_ontology.pl -> blib/script/load_ontology.pl Deleting blib/script/load_ontology.pl.bak blib/script/load_ontology.pl -> blib/script/bp_load_ontology.pl Copying scripts/biosql/load_seqdatabase.pl -> blib/script/load_seqdatabase.pl Deleting blib/script/load_seqdatabase.pl.bak blib/script/load_seqdatabase.pl -> blib/script/bp_load_seqdatabase.pl Copying scripts/biosql/terms/interpro2go.pl -> blib/script/interpro2go.pl blib/script/interpro2go.pl -> blib/script/bp_interpro2go.pl Copying scripts/biosql/clean_ontology.pl -> blib/script/clean_ontology.pl blib/script/clean_ontology.pl -> blib/script/bp_clean_ontology.pl Copying scripts/corba/test_bioenv.pl -> blib/script/test_bioenv.pl Deleting blib/script/test_bioenv.pl.bak blib/script/test_bioenv.pl -> blib/script/bp_test_bioenv.pl Copying scripts/biosql/update-on-new-version.pl -> blib/script/update-on-new-version.pl blib/script/update-on-new-version.pl -> blib/script/bp_update-on-new-version.pl Copying scripts/biosql/bioentry2flat.pl -> blib/script/bioentry2flat.pl Deleting blib/script/bioentry2flat.pl.bak blib/script/bioentry2flat.pl -> blib/script/bp_bioentry2flat.pl Copying scripts/corba/bioenv_server.pl -> blib/script/bioenv_server.pl Deleting blib/script/bioenv_server.pl.bak blib/script/bioenv_server.pl -> blib/script/bp_bioenv_server.pl Copying scripts/biosql/load_interpro.pl -> blib/script/load_interpro.pl blib/script/load_interpro.pl -> blib/script/bp_load_interpro.pl Copying scripts/biosql/cgi-bin/getentry.pl -> blib/script/getentry.pl Deleting blib/script/getentry.pl.bak blib/script/getentry.pl -> blib/script/bp_getentry.pl Copying scripts/biosql/del-assocs-sql.pl -> blib/script/del-assocs-sql.pl blib/script/del-assocs-sql.pl -> blib/script/bp_del-assocs-sql.pl Copying scripts/biosql/freshen-annot.pl -> blib/script/freshen-annot.pl blib/script/freshen-annot.pl -> blib/script/bp_freshen-annot.pl t/01dbadaptor.....ok t/02species.......FAILED tests 66-95 Failed 30/65 tests, 53.85% okay t/03simpleseq.....ok t/04swiss.........ok t/05seqfeature....ok t/06comment.......ok t/07dblink........ok t/08genbank.......ok t/09fuzzy2........ok t/10ensembl.......ok t/11locuslink.....ok t/12ontology......ok t/13remove........ok t/14query.........ok t/15cluster.......ok 9/160 --------------------- WARNING --------------------- MSG: failed to store one or more child objects for an instance of class Bio::Cluster::UniGene (PK=366) --------------------------------------------------- t/15cluster.......ok t/16obda..........ok Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 30 66-95 Failed 1/16 test scripts. -30/1423 subtests failed. Files=16, Tests=1423, 36 wallclock secs (16.64 cusr + 0.65 csys = 17.29 CPU) Failed 1/16 test programs. -30/1423 subtests failed. -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Fri Sep 28 11:36:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 11:36:39 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F8DC34.6020908@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> Message-ID: <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> You do have a point here. From a design perspective, it feels odd if instantiating an object can fail with an I/O exception. But in reality that's how it's done all the time, from Bio::SeqIO to java.io.*, if I'm not mistaken. I also agree that asking a conditional that we know will be false every time except once violates the sense of elegance. So upon second consideration, I think I agree with you. And a GFF3 file with zero features in it should still be a valid GFF3 file, i.e., have the mandatory headers. Does that make sense? -hilmar On Sep 25, 2007, at 6:00 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>> I think that'll work fine. The other option would be call a >>> print_gff_header() function within write_feature() with the >>> intent to >>> print the header only once, using a flag or similar: >>> >>> if (!$self->header_printed) { >>> $self->print_gff_header; >>> $self->header_printed(1); >>> } > > >> I'd lean toward this or a similar approach too. Writing stuff out >> in the constructor doesn't feel like the best design. > > I'd argue that the alternative is just inefficient with no > compensating benefit. You have something that must only be done > once, and a method (_initialize) that is only called once. The > constructor is used to set up the file, getting it into a state > ready to add features. This involves opening it for writing with > the correct filename and setting the desired GFF version. Why > wouldn't it also output what ever else was necessary it initialize > the file? > > Also, what do we expect should happen when we use Bioperl to create > a GFF file and don't write any features to it? Should it be an > empty file, or should it contain whatever GFF information the user > had managed to supply (the version)? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Fri Sep 28 11:53:33 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 11:53:33 -0400 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Chris let me know if you get stumped. I'm surprised that the special ranks ('eurosids I' etc) show up in the lineage (has NCBI started to assign ranks to them? I thought I filter them out. Needs to be looked into too.), but at any rate I don't understand why they aren't being accepted. Also, maybe we need a more verbose output here - Forrest, can you run this with adding a --printerror argument. (I'm embarrassed to find that this doesn't seem to be documented. Sigh.) -hilmar On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: > If this is occurring using bioperl from CVS then I'll try taking a > look at it. > > chris > > On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > >> Hilmar, >> I have already pre-loaded the NCBI taxonomy using >> load_ncbi_taxonomy.pl yet. The error message show: >> >> --------------------- WARNING --------------------- >> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >> was >> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >> | rosids >> | core eudicotyledons | eudicotyledons | Magnoliophyta | >> Euphyllophyta | >> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >> --------------------------------------------------- >> Could not store Q40784: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: create: object (Bio::Species) failed to insert or to be found >> by unique >> key >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:206 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >> ----------------------------------------------------------- >> Sigh~~~~~~ >> >> Forrest Zhang >> >> >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >> Lapp >> Sent: Friday, September 28, 2007 6:17 AM >> To: Forrest >> Cc: bioperl-l at bioperl.org >> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >> >> Forrest, >> >> have you preloaded the NCBI taxonomy as suggested in the BioSQL >> installation guidelines? SwissProt format has NCBI taxon IDs, and the >> code will try to use it to look up species and their lineage, rather >> than inserting the lineage from whatever BioPerl parses out of the >> sequence record. >> >> -hilmar >> >> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >> >>> Hi, all >>> I install the biosql, and bioperl-db. I want to import >>> swissport data. >>> But the programe show some error as below: >>> ==================================================================== >>> = >>> = >>> ====== >>> =============================================== >>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>> mysql >>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>> Could not store Q6DAH5: >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>> subsp. >>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>> Pectobacterium | >>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>> Proteobacteria | Bacteria') >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>> STACK: Bio::Species::classification >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:552 >>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:1305 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:973 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:852 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:182 >>> STACK: Bio::DB::Persistent::PersistentObject::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:244 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:169 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:251 >>> STACK: Bio::DB::Persistent::PersistentObject::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:271 >>> STACK: load_seqdatabase.pl:620 >>> ----------------------------------------------------------- >>> >>> at load_seqdatabase.pl line 633 >>> ==================================================================== >>> = >>> = >>> ====== >>> =============================================== >>> >>> How can I solve it, please help me, Thank you. >>> >>> Thanks >>> Forrest zhang >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 12:04:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 11:04:08 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> Message-ID: <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> On Sep 28, 2007, at 10:36 AM, Hilmar Lapp wrote: > You do have a point here. From a design perspective, it feels odd > if instantiating an object can fail with an I/O exception. But in > reality that's how it's done all the time, from Bio::SeqIO to > java.io.*, if I'm not mistaken. I also agree that asking a > conditional that we know will be false every time except once > violates the sense of elegance. I agree with the lack of elegance from a design perspective on both counts, but when have we ever been worried about that? ;> In general I don't think SeqIO classes generates actual output (like the GFF header information) in the constructor, they just initialize IO and other state data. It makes sense to fail in this case if an error pops up. Regardless, one could argue ad infinitum that either proposed fix has its benefits/deficiencies, however both will work, so I'm happy with either. chris > So upon second consideration, I think I agree with you. And a GFF3 > file with zero features in it should still be a valid GFF3 file, > i.e., have the mandatory headers. > > Does that make sense? > > -hilmar > > On Sep 25, 2007, at 6:00 AM, Sendu Bala wrote: > >> Hilmar Lapp wrote: >>> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>>> I think that'll work fine. The other option would be call a >>>> print_gff_header() function within write_feature() with the >>>> intent to >>>> print the header only once, using a flag or similar: >>>> >>>> if (!$self->header_printed) { >>>> $self->print_gff_header; >>>> $self->header_printed(1); >>>> } >> > >>> I'd lean toward this or a similar approach too. Writing stuff out >>> in the constructor doesn't feel like the best design. >> >> I'd argue that the alternative is just inefficient with no >> compensating benefit. You have something that must only be done >> once, and a method (_initialize) that is only called once. The >> constructor is used to set up the file, getting it into a state >> ready to add features. This involves opening it for writing with >> the correct filename and setting the desired GFF version. Why >> wouldn't it also output what ever else was necessary it initialize >> the file? >> >> Also, what do we expect should happen when we use Bioperl to >> create a GFF file and don't write any features to it? Should it be >> an empty file, or should it contain whatever GFF information the >> user had managed to supply (the version)? > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Sep 28 12:10:59 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 11:10:59 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Message-ID: I'm actually getting some odd recursion issues again; not sure what's causing it, but a reinstall of both bioperl and bioperl-db fixed it last time. It may be related to the rollback, just not sure yet. I'll try tracking it down if it persists (bad pun). t/04swiss....ok 3/52 --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- t/04swiss....ok All tests successful. Files=1, Tests=52, 2 wallclock secs ( 1.33 cusr + 0.18 csys = 1.51 CPU) The specific error under verbose running is: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:680 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:691 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:691 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:658 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:252 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/ core/bioperl-db/blib/lib/Bio/DB/BioSQL/SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:213 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 STACK toplevel t/04swiss.t:37 --------------------------------------------------- chris On Sep 28, 2007, at 10:53 AM, Hilmar Lapp wrote: > Chris let me know if you get stumped. I'm surprised that the special > ranks ('eurosids I' etc) show up in the lineage (has NCBI started to > assign ranks to them? I thought I filter them out. Needs to be looked > into too.), but at any rate I don't understand why they aren't being > accepted. > > Also, maybe we need a more verbose output here - Forrest, can you run > this with adding a --printerror argument. (I'm embarrassed to find > that this doesn't seem to be documented. Sigh.) > > -hilmar > > On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: > >> If this is occurring using bioperl from CVS then I'll try taking a >> look at it. >> >> chris >> >> On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: >> >>> Hilmar, >>> I have already pre-loaded the NCBI taxonomy using >>> load_ncbi_taxonomy.pl yet. The error message show: >>> >>> --------------------- WARNING --------------------- >>> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >>> was >>> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >>> | rosids >>> | core eudicotyledons | eudicotyledons | Magnoliophyta | >>> Euphyllophyta | >>> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >>> --------------------------------------------------- >>> Could not store Q40784: >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: create: object (Bio::Species) failed to insert or to be found >>> by unique >>> key >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:206 >>> STACK: Bio::DB::Persistent::PersistentObject::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:244 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:169 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:251 >>> STACK: Bio::DB::Persistent::PersistentObject::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:271 >>> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >>> ----------------------------------------------------------- >>> Sigh~~~~~~ >>> >>> Forrest Zhang >>> >>> >>> -----Original Message----- >>> From: bioperl-l-bounces at lists.open-bio.org >>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >>> Lapp >>> Sent: Friday, September 28, 2007 6:17 AM >>> To: Forrest >>> Cc: bioperl-l at bioperl.org >>> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >>> >>> Forrest, >>> >>> have you preloaded the NCBI taxonomy as suggested in the BioSQL >>> installation guidelines? SwissProt format has NCBI taxon IDs, and >>> the >>> code will try to use it to look up species and their lineage, rather >>> than inserting the lineage from whatever BioPerl parses out of the >>> sequence record. >>> >>> -hilmar >>> >>> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >>> >>>> Hi, all >>>> I install the biosql, and bioperl-db. I want to import >>>> swissport data. >>>> But the programe show some error as below: >>>> =================================================================== >>>> = >>>> = >>>> = >>>> ====== >>>> =============================================== >>>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>>> mysql >>>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>>> Could not store Q6DAH5: >>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>>> subsp. >>>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>>> Pectobacterium | >>>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>>> Proteobacteria | Bacteria') >>>> STACK: Error::throw >>>> STACK: Bio::Root::Root::throw >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>>> STACK: Bio::Species::classification >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:552 >>>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:1305 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:973 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:852 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:182 >>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:244 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:169 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:251 >>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:271 >>>> STACK: load_seqdatabase.pl:620 >>>> ----------------------------------------------------------- >>>> >>>> at load_seqdatabase.pl line 633 >>>> =================================================================== >>>> = >>>> = >>>> = >>>> ====== >>>> =============================================== >>>> >>>> How can I solve it, please help me, Thank you. >>>> >>>> Thanks >>>> Forrest zhang >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Sep 28 17:09:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 17:09:28 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> Message-ID: <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> On Sep 28, 2007, at 12:04 PM, Chris Fields wrote: > In general I don't think SeqIO classes generates actual output > (like the GFF header information) in the constructor There's probably two reasons they don't (if really all of them don't): i) unless you explicitly test (how?) whether the file has been opened for writing, you actually don't know in the SeqIO constructor whether someone's going to write to the file or read from it. ii) off hand, I don't know of a sequence file format that would require a particular header being written just once. Though thinking about this, I start asking myself whether i) wouldn't also apply to FeatureIO (are we not reading gff too in this class?), and I'm wondering that there must be a header (or at least an enclosing tag) for SeqIO XML formats - so how is that dealt with there? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 17:34:13 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 16:34:13 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> Message-ID: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: > > On Sep 28, 2007, at 12:04 PM, Chris Fields wrote: > >> In general I don't think SeqIO classes generates actual output >> (like the GFF header information) in the constructor > > There's probably two reasons they don't (if really all of them > don't): i) unless you explicitly test (how?) whether the file has > been opened for writing, you actually don't know in the SeqIO > constructor whether someone's going to write to the file or read from > it. ii) off hand, I don't know of a sequence file format that would > require a particular header being written just once. > > Though thinking about this, I start asking myself whether i) wouldn't > also apply to FeatureIO (are we not reading gff too in this class?), > and I'm wondering that there must be a header (or at least an > enclosing tag) for SeqIO XML formats - so how is that dealt with > there? > > -hilmar Re: (i) and FeatureIO: I believe most FeatureIO classes read/write to/ from specific feature files (bed, gff, ptt, interpro, etc), which is one reason I thought everything I(nput) should go into next_feature (), everything O(utput) into write_feature(). The section writing the gff header info in _initialize() checks the file specifically for '>' prior to output; I think Sendu planned on changing that to use mode() instead. Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML output hasn't been tested very well. If I could go to the Nov. GMOD meeting to help hammer out some of the GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling on my own dime. Maybe I'll see what I can come up with and stay at the no-tell motel... chris From cjfields at uiuc.edu Fri Sep 28 18:03:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 17:03:06 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Message-ID: Okay, fixed the recursion (extra copy of a BasePersistentAdaptor module I was working which tripped it, so nothing in CVS). Forrest, I get all tests passing. I used a database without taxonomy loaded with bioperl-db and bioperl from cvs and it worked w/o problems. I'll try working with your sequence when I have time this weekend. chris On Sep 28, 2007, at 11:10 AM, Chris Fields wrote: > I'm actually getting some odd recursion issues again; not sure what's > causing it, but a reinstall of both bioperl and bioperl-db fixed it > last time. It may be related to the rollback, just not sure yet. > > I'll try tracking it down if it persists (bad pun). > > t/04swiss....ok 3/52 > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > t/04swiss....ok > All tests successful. > Files=1, Tests=52, 2 wallclock secs ( 1.33 cusr + 0.18 csys = 1.51 > CPU) > > The specific error under verbose running is: > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:680 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:691 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:691 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:658 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:252 > STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > PrimarySeqAdaptor.pm:229 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/ > core/bioperl-db/blib/lib/Bio/DB/BioSQL/SeqAdaptor.pm:217 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:213 > STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 > STACK toplevel t/04swiss.t:37 > --------------------------------------------------- > > > chris > > On Sep 28, 2007, at 10:53 AM, Hilmar Lapp wrote: > >> Chris let me know if you get stumped. I'm surprised that the special >> ranks ('eurosids I' etc) show up in the lineage (has NCBI started to >> assign ranks to them? I thought I filter them out. Needs to be looked >> into too.), but at any rate I don't understand why they aren't being >> accepted. >> >> Also, maybe we need a more verbose output here - Forrest, can you run >> this with adding a --printerror argument. (I'm embarrassed to find >> that this doesn't seem to be documented. Sigh.) >> >> -hilmar >> >> On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: >> >>> If this is occurring using bioperl from CVS then I'll try taking a >>> look at it. >>> >>> chris >>> >>> On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: >>> >>>> Hilmar, >>>> I have already pre-loaded the NCBI taxonomy using >>>> load_ncbi_taxonomy.pl yet. The error message show: >>>> >>>> --------------------- WARNING --------------------- >>>> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >>>> was >>>> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >>>> | rosids >>>> | core eudicotyledons | eudicotyledons | Magnoliophyta | >>>> Euphyllophyta | >>>> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >>>> --------------------------------------------------- >>>> Could not store Q40784: >>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>> MSG: create: object (Bio::Species) failed to insert or to be found >>>> by unique >>>> key >>>> STACK: Error::throw >>>> STACK: Bio::Root::Root::throw >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:206 >>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:244 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:169 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:251 >>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:271 >>>> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >>>> ----------------------------------------------------------- >>>> Sigh~~~~~~ >>>> >>>> Forrest Zhang >>>> >>>> >>>> -----Original Message----- >>>> From: bioperl-l-bounces at lists.open-bio.org >>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >>>> Lapp >>>> Sent: Friday, September 28, 2007 6:17 AM >>>> To: Forrest >>>> Cc: bioperl-l at bioperl.org >>>> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >>>> >>>> Forrest, >>>> >>>> have you preloaded the NCBI taxonomy as suggested in the BioSQL >>>> installation guidelines? SwissProt format has NCBI taxon IDs, and >>>> the >>>> code will try to use it to look up species and their lineage, >>>> rather >>>> than inserting the lineage from whatever BioPerl parses out of the >>>> sequence record. >>>> >>>> -hilmar >>>> >>>> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >>>> >>>>> Hi, all >>>>> I install the biosql, and bioperl-db. I want to import >>>>> swissport data. >>>>> But the programe show some error as below: >>>>> ================================================================== >>>>> = >>>>> = >>>>> = >>>>> = >>>>> ====== >>>>> =============================================== >>>>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>>>> mysql >>>>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>>>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>>>> Could not store Q6DAH5: >>>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>>>> subsp. >>>>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>>>> Pectobacterium | >>>>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>>>> Proteobacteria | Bacteria') >>>>> STACK: Error::throw >>>>> STACK: Bio::Root::Root::throw >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>>>> STACK: Bio::Species::classification >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>>>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:552 >>>>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:1305 >>>>> STACK: >>>>> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:973 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:852 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:182 >>>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:244 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:169 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:251 >>>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:271 >>>>> STACK: load_seqdatabase.pl:620 >>>>> ----------------------------------------------------------- >>>>> >>>>> at load_seqdatabase.pl line 633 >>>>> ================================================================== >>>>> = >>>>> = >>>>> = >>>>> = >>>>> ====== >>>>> =============================================== >>>>> >>>>> How can I solve it, please help me, Thank you. >>>>> >>>>> Thanks >>>>> Forrest zhang >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> -- >>>> =========================================================== >>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>>> =========================================================== >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Sep 28 18:20:37 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 18:20:37 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> On Sep 28, 2007, at 5:34 PM, Chris Fields wrote: > The section writing the gff header info in _initialize() checks the > file specifically for '>' prior to output; I think Sendu planned on > changing that to use mode() instead. What if we pass in a file handle? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 19:04:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 18:04:21 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> Message-ID: On Sep 28, 2007, at 5:20 PM, Hilmar Lapp wrote: > > On Sep 28, 2007, at 5:34 PM, Chris Fields wrote: > >> The section writing the gff header info in _initialize() checks the >> file specifically for '>' prior to output; I think Sendu planned on >> changing that to use mode() instead. > > What if we pass in a file handle? > > -hilmar The old way def. wouldn't work with filehandles. Not sure if checking Root::IO::mode() would work as expected in this case, but it's certainly worth a try. chris From jay at jays.net Fri Sep 28 18:50:29 2007 From: jay at jays.net (Jay Hannah) Date: Fri, 28 Sep 2007 17:50:29 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: On Sep 28, 2007, at 4:34 PM, Chris Fields wrote: > On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: >> and I'm wondering that there must be a header (or at least an >> enclosing tag) for SeqIO XML formats - so how is that dealt with >> there? > > Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML > output hasn't been tested very well. For good or ill I stole what I found in other Bio/SeqIO/* classes when I started writing my Bio::SeqIO::solrxml. It is not ready, but you can poke it with a stick here if you like: http://vc.jays.net/viewvc.cgi/SolrGene/solrxml.pm? revision=26&root=CLAB&view=markup In _initialize() it sends XML header goo into $self->_print(), and it uses DESTROY to drop an XML closure tag into $self->_print(). Constructing it $out_solr = Bio::SeqIO->new(-file => ">seq.solr.xml", -format => 'solrxml'); without writing any sequences to it created a seq.solr.xml file with the XML header and footer and nothing in the middle. If this is not The Right Way I'm happy to change it to do whatever. :) > If I could go to the Nov. GMOD meeting to help hammer out some of the > GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling > on my own dime. Maybe I'll see what I can come up with and stay at > the no-tell motel... I've decided to spend many of my own dimes and make the trek. I hope to meet many BioPerl'ers. I'll buy you dinner if you show up. :) Take care, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Sat Sep 29 14:28:16 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 29 Sep 2007 13:28:16 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: On Sep 28, 2007, at 5:50 PM, Jay Hannah wrote: > On Sep 28, 2007, at 4:34 PM, Chris Fields wrote: >> On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: >>> and I'm wondering that there must be a header (or at least an >>> enclosing tag) for SeqIO XML formats - so how is that dealt with >>> there? >> >> Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML >> output hasn't been tested very well. > > For good or ill I stole what I found in other Bio/SeqIO/* classes > when I started writing my Bio::SeqIO::solrxml. It is not ready, but > you can poke it with a stick here if you like: > > http://vc.jays.net/viewvc.cgi/SolrGene/solrxml.pm? > revision=26&root=CLAB&view=markup > > In _initialize() it sends XML header goo into $self->_print(), and it > uses DESTROY to drop an XML closure tag into $self->_print(). > > Constructing it > > $out_solr = Bio::SeqIO->new(-file => ">seq.solr.xml", > -format => 'solrxml'); > > without writing any sequences to it created a seq.solr.xml file with > the XML header and footer and nothing in the middle. > > If this is not The Right Way I'm happy to change it to do > whatever. :) If you do it this way you should probably run a check on the mode() the object state is in ('r'=read, 'w'=write, '?'=unknown), and only _print() on write mode. Might also be a good idea to implement a next_seq with an exception ('Module is write only'). >> If I could go to the Nov. GMOD meeting to help hammer out some of the >> GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling >> on my own dime. Maybe I'll see what I can come up with and stay at >> the no-tell motel... > > I've decided to spend many of my own dimes and make the trek. I hope > to meet many BioPerl'ers. I'll buy you dinner if you show up. :) > > Take care, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah I may see what I can scrape up myself but it doesn't look good (lab's closing down soon, so money's pretty tight). If I knew about the meeting a while in advance I would probably have made it. Oh well! chris From cjfields at uiuc.edu Sun Sep 30 16:39:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 30 Sep 2007 15:39:23 -0500 Subject: [Bioperl-l] DB::SeqFeature::Store error Message-ID: I'm getting the following error on my local MySQL (v 5.0.41) with bp_seqfeature_load: -------------------- EXCEPTION -------------------- MSG: The used table type doesn't support FULLTEXT indexes STACK Bio::DB::SeqFeature::Store::DBI::mysql::_init_database /Library/ Perl/5.8.6/Bio/DB/SeqFeature/Store/DBI/mysql.pm:414 STACK Bio::DB::SeqFeature::Store::init_database /Library/Perl/5.8.6/ Bio/DB/SeqFeature/Store.pm:382 STACK Bio::DB::SeqFeature::Store::DBI::mysql::init /Library/Perl/ 5.8.6/Bio/DB/SeqFeature/Store/DBI/mysql.pm:218 STACK Bio::DB::SeqFeature::Store::new /Library/Perl/5.8.6/Bio/DB/ SeqFeature/Store.pm:345 STACK toplevel /usr/local/bin/bp_seqfeature_load.pl:57 ------------------------------------------- The default setting for storage is InnoDB; switching to MyISAM fixes the issue. Should we specify TYPE = MyISAM with the various CREATE TABLE queries in Bio::DB::SeqFeature::Store::DBI::mysql to be on the safe side? chris From alan at tll.org.sg Sun Sep 30 21:53:07 2007 From: alan at tll.org.sg (alan) Date: Mon, 1 Oct 2007 09:53:07 +0800 Subject: [Bioperl-l] exonerate References: <034FB11C-B4E9-4E4E-B213-D4AC6A397B1B@tll.org.sg> Message-ID: <29C4D729-6715-4C19-9872-3B1AF90EAFA3@tll.org.sg> Hi, >> I am calling exonerate.pm within my script while attempting to >> align cDNA to multiple genomic fragments. After processing about >> 120+ genomic fragments my code crashes with the following error: >> >> ** ERROR **: Could not open [/tmp/tlInatbOED] : Too many open files >> aborting... >> MSG: Exonerate call (/usr/local/bin/exonerate /tmp/8X9jQuHUGF / >> tmp/tlInatbOED > /tmp/EolF5qCNLZ/cIf0HfIRf5) crashed: 34304 >> STACK Bio::Tools::Run::Alignment::Exonerate::_run /nfs1/alan/ >> cvs_src/bioperl-run/Bio/Tools/Run/Alignment/Exonerate.pm:214 >> STACK Bio::Tools::Run::Alignment::Exonerate::run /nfs1/alan/ >> cvs_src/bioperl-run/Bio/Tools/Run/Alignment/Exonerate.pm:174 >> >> The code in Exonerate.pm closes the tmpfile at the end of the >> routine yet I get the error message about "too many open files". >> Any suggestions on how I should be closing these files? >> >> >> Extract from my code that runs exonerate is listed below. >> >> foreach my $f(@files) { >> next unless (-f "$dir/$f"); >> my $q_in = Bio::SeqIO->new(-file=>$query, -format=>"Fasta"); >> my $query_obj = $q_in->next_seq(); >> my $target_in = Bio::SeqIO->new(-file=>"$dir/$f", - >> format=>"Fasta"); >> my $target_obj = $target_in->next_seq(); >> my $run = Bio::Tools::Run::Alignment::Exonerate->new(); >> my $exonerate_io = $run->run($query_obj, $target_obj); >> >> [code for parsing the data.......] >> >> $exonerate_io->close; #tried this line out of desperation but it >> did not help :-) >> } >> >> thanks >> alan >> >> >> >> Alan Christoffels >> Computational Biology Group >> Temasek LifeSciences Laboratory >> 1 Research Link >> National University of Singapore >> Singapore >> 117604 >> Tel: +65 68744945 >> Fax: +65 68727007 >> Lab webpage: http://www.tll.org.sg/alan.asp >> >> > From ewijaya at gmail.com Sun Sep 30 10:10:25 2007 From: ewijaya at gmail.com (Edward Wijaya) Date: Sun, 30 Sep 2007 22:10:25 +0800 Subject: [Bioperl-l] Bio::Graphics - Howto draw graded segments overlap with line track Message-ID: <3521d3670709300710l4a41c47es1c72cc5a450a3736@mail.gmail.com> Hi, I want to draw a binding sites hits on sequence of various length. What I have now is a graded segments only. Is there a way to draw segments overlapping with the line ? (see attached figure). -- Edward -------------- next part -------------- A non-text attachment was scrubbed... Name: hits.PNG Type: image/png Size: 15613 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070930/2402791e/attachment-0001.png From cjfields at uiuc.edu Sun Sep 2 19:54:54 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:54:54 -0500 Subject: [Bioperl-l] (no subject) Message-ID: Posted this to biosql-l already but felt it needed posting here as well. Sorry if you get this twice. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 19:57:59 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:57:59 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db Message-ID: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Apologies if you get this more than once; the first post appeared to get sent w/o a proper subject line. Posted this to biosql-l already but felt it needed posting here as well. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 21:40:48 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 20:40:48 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db In-Reply-To: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> References: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Message-ID: <25CFD36D-D921-4F5F-BADF-D858A2FE76D4@uiuc.edu> Okay, we can the previous posts! Odd, but I started from scratch and can't reproduce the issue; there may have been some cross-talk with different bioperl installations on my laptop. Anyway, everything passes now w/o recursion so I'll mark the bug as invalid. chris On Sep 2, 2007, at 6:57 PM, Chris Fields wrote: > Apologies if you get this more than once; the first post appeared to > get sent w/o a proper subject line. Posted this to biosql-l already > but felt it needed posting here as well. > > I noticed some critical recursion issues with bioperl-db when working > in Bio::Ontology changes. This was using bioperl-live (post-feature/ > annotation fixes). Bug report is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2355 > > It seems to be Bio:Taxon related; this is from 03swiss.t: > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:681 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:692 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > ... > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:587 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:253 > STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > PrimarySeqAdaptor.pm:229 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > SeqAdaptor.pm:217 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:214 > STACK Bio::DB::Persistent::PersistentObject::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK toplevel t/04swiss.t:36 > --------------------------------------------------- > > Also, seeing this with 13remove.t and 15.cluster.t, both of which > appear to infinitely recurse: > > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 587, line 1. > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 630, line 1. > > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bernd.web at gmail.com Mon Sep 3 08:43:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 3 Sep 2007 14:43:26 +0200 Subject: [Bioperl-l] Fh::flush warning Message-ID: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Hi, Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: (in cleanup) Undefined subroutine Fh::flush, at /lib/perl/Bio/Root/IO.pm line 541. This occurs in a rather large script and have not been able to isolate a small example where I also get this warning. Does someone know more about this warning and why it is thrown? Regards, Bernd From cjfields at uiuc.edu Mon Sep 3 10:41:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 09:41:49 -0500 Subject: [Bioperl-l] Fh::flush warning In-Reply-To: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> References: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Message-ID: <98A9D081-2570-4D4E-A8F8-D03282D41E0C@uiuc.edu> Could you give a bit more info (bioperl version, OS, etc)? I'm guessing a recent version as the error coincides with a call to flush() in Root::IO (which is probably called indirectly via DESTROY) and that you're probably using a tied filehandle somewhere for output, e.g. Bio::AlignIO::newFh() or Bio::AlignIO::fh(), so knowing the input/output formats could help. chris On Sep 3, 2007, at 7:43 AM, Bernd Web wrote: > Hi, > > Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: > (in cleanup) Undefined subroutine Fh::flush, at > /lib/perl/Bio/Root/IO.pm line 541. > > This occurs in a rather large script and have not been able to isolate > a small example where I also get this warning. Does someone know more > about this warning and why it is thrown? > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Mon Sep 3 22:11:09 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 10:11:09 +0800 (CST) Subject: [Bioperl-l] question about Bio::DB::GFF Message-ID: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Hi, I tried to load the gff3 file with load_gff.pl and extrac some information with Bio::DB::GFF. Althougth this code work properly under windows xp, the $seg got nothing when i run it under Linux. Here is my code and the gff3 file, #################################################################### #!/usr/local/bin/perl -w use strict; use Bio::SeqIO; use Bio::DB::GFF; my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', -dsn => 'dbi:mysql:test', -aggregator => ['coding'], -user => "lixr", -pass => "123456" ); my $seg = $in_gff->segment'BGIOSIBCE000001.1'); print $seg->abs_start."\n"; ################################################################## ##gff-version 3 ##sequence-region Chr01 1 43037 Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 ################################################################# I would appreaciate if any one can give me some clues/link to accomplish this. thanks in advance , Xianran Li --------------------------------- ???????????????????????????????????????????? From cjfields at uiuc.edu Tue Sep 4 00:04:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 23:04:29 -0500 Subject: [Bioperl-l] question about Bio::DB::GFF In-Reply-To: <361239.6752.qm@web15309.mail.cnb.yahoo.com> References: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Message-ID: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ?????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Tue Sep 4 00:58:48 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 12:58:48 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20question=20about=20Bi?= =?gb2312?q?o::DB::GFF?= In-Reply-To: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Message-ID: <866169.66154.qm@web15309.mail.cnb.yahoo.com> Hi, everybody, It looks like for the different perl version(5.8.8 of windows and 5.8.5 for linux). And I fixed this problem by adding ";Name=XXXX" after each line with "mRNA" ############################################################################## Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1;Name=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 ############################################################################## This time my code works properly. Xianran Chris Fields ?????? Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ???????????????????????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign --------------------------------- ???????????????????????????????? From jay at jays.net Tue Sep 4 10:31:36 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 4 Sep 2007 09:31:36 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> Message-ID: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > Probably a bit of a long shot but does anyone have code for > displaying protein or CDS multiple sequence alignments with the exon > boundaries of each gene in the alignment? > > Something in the bioperl world without funky external dependencies. > I think > it would be an awesome addition to the howtos. > > Currently, the Bio::Graphics howto has cdna to genome mapping > scripts or > blast output scripts, but > I couldn't find code for dealing with multiple sequence alignments. I'm currently under the (potentially uninformed) impression that Bio::Graphics and related tools only work with a single coordinate system. I've never seen a multiple sequence alignment example. ( I Google'd for "gbrowse alignment" and hit this: http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi Click the second Example link and you'll see exons mapped out. But zooming all the way in with all the tracks turned on it looks like the AZM tracks are just the coding regions. I don't see any multiple sequence alignment... ) I doubt that helped. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 4 11:28:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 4 Sep 2007 10:28:01 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> Message-ID: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> Probably a bit of a long shot but does anyone have code for >> displaying protein or CDS multiple sequence alignments with the exon >> boundaries of each gene in the alignment? >> >> Something in the bioperl world without funky external dependencies. >> I think >> it would be an awesome addition to the howtos. >> >> Currently, the Bio::Graphics howto has cdna to genome mapping >> scripts or >> blast output scripts, but >> I couldn't find code for dealing with multiple sequence alignments. > > I'm currently under the (potentially uninformed) impression that > Bio::Graphics and related tools only work with a single coordinate > system. I've never seen a multiple sequence alignment example. > > ( > I Google'd for "gbrowse alignment" and hit this: > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > Click the second Example link and you'll see exons mapped out. > > But zooming all the way in with all the tracks turned on it looks > like the AZM tracks are just the coding regions. I don't see any > multiple sequence alignment... > ) > > I doubt that helped. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html chris From avilella at gmail.com Wed Sep 5 05:42:37 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 11:42:37 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> Message-ID: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> A couple of examples: http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 treefam has exon boundary and PFAM domain mappings http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 here the tree is shown as well, but the idea would be to plot the alignment So it's more "show me the multiple CDS/protein alignment" rather than "show my aligned CDS/proteins wrt my reference genome" I think it would be quite neat to have this as a bioperl howto, Comments? Albert. On 9/4/07, Chris Fields wrote: > > > On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> Probably a bit of a long shot but does anyone have code for > >> displaying protein or CDS multiple sequence alignments with the exon > >> boundaries of each gene in the alignment? > >> > >> Something in the bioperl world without funky external dependencies. > >> I think > >> it would be an awesome addition to the howtos. > >> > >> Currently, the Bio::Graphics howto has cdna to genome mapping > >> scripts or > >> blast output scripts, but > >> I couldn't find code for dealing with multiple sequence alignments. > > > > I'm currently under the (potentially uninformed) impression that > > Bio::Graphics and related tools only work with a single coordinate > > system. I've never seen a multiple sequence alignment example. > > > > ( > > I Google'd for "gbrowse alignment" and hit this: > > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > > > Click the second Example link and you'll see exons mapped out. > > > > But zooming all the way in with all the tracks turned on it looks > > like the AZM tracks are just the coding regions. I don't see any > > multiple sequence alignment... > > ) > > > > I doubt that helped. :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/tutorial.html > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From alexl at users.sourceforge.net Wed Sep 5 06:08:14 2007 From: alexl at users.sourceforge.net (Alex Lancaster) Date: Wed, 05 Sep 2007 03:08:14 -0700 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> (Hilmar Lapp's message of "Sat\, 18 Aug 2007 12\:13\:28 -0400") References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: >>>>> "HL" == Hilmar Lapp writes: HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > I imagine the intent of the bioperl >> contributors is that it should be under the same terms as Perl, >> whatever that happens to be (which just happens to be GPL or >> Artistic, which is fine). HL> I fully agree. >> A clarification to that effect would be useful. HL> Agreed, too. Would you mind changing that language on the wiki, HL> since you seem to have a fairly good grasp on the issue? OK, I've updated the wiki in two places: http://www.bioperl.org/wiki/Licensing_BioPerl http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F It would also be nice if the LICENSE and Build.PL files in CVS (so it finds its way into the next release) were also updated to reflect the dual-licensed status, currently they only mention the Artistic license: http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE?rev=HEAD&content-type=text/vnd.viewcvs-markup http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL?rev=HEAD&content-type=text/vnd.viewcvs-markup For Build.PL this is easy: (e.g., license => 'artistic', should be license => 'GPL or Artistic',) Possible solutions for the LICENSE file include: 1) The GPL could be added to LICENSE file at the end (with a note at the top to indicate that GPL is also included); 2) LICENSE could be moved to LICENSE.Artistic and another file "LICENSE.GPL" added with the GPL (version 2+) conditions, and the contents of LICENSE would include a note about each license. I don't have access to the bioperl CVS repository, so I can't make the changes myself). This would also apply to the Build.PL (and LICENSE files if they are present) in bioperl-run and other modules. Thanks, Alex From cjfields at uiuc.edu Wed Sep 5 08:25:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:25:21 -0500 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: On Sep 5, 2007, at 5:08 AM, Alex Lancaster wrote: ... > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/ > FAQ#What_are_the_license_terms_for_BioPerl.3F > > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) > > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); > > 2) LICENSE could be moved to LICENSE.Artistic and another file > "LICENSE.GPL" added with the GPL (version 2+) conditions, and the > contents of LICENSE would include a note about each license. > > I don't have access to the bioperl CVS repository, so I can't make the > changes myself). This would also apply to the Build.PL (and LICENSE > files if they are present) in bioperl-run and other modules. > > Thanks, > Alex Looks like Sendu has done that. There have been recent troubling developments re: Artistic License: http://use.perl.org/article.pl?sid=07/08/26/1541205&from=rss but the case hasn't been conclusively decided yet. chris From bix at sendu.me.uk Wed Sep 5 08:18:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 05 Sep 2007 13:18:35 +0100 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: <46DE9E9B.80107@sendu.me.uk> Alex Lancaster wrote: >>>>>> "HL" == Hilmar Lapp writes: > > HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > >>> I imagine the intent of the bioperl >>> contributors is that it should be under the same terms as Perl, >>> whatever that happens to be (which just happens to be GPL or >>> Artistic, which is fine). > > HL> I fully agree. > >>> A clarification to that effect would be useful. > > HL> Agreed, too. Would you mind changing that language on the wiki, > HL> since you seem to have a fairly good grasp on the issue? > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F Thank you very much for that Alex. > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: [snip] > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) As per the 'license' section of http://search.cpan.org/~kwilliams/Module-Build-0.2808/lib/Module/Build/API.pod, I've changed it to 'perl', which means Artistic or GPL. > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); I took this approach, using your language for the explanation at the top, and including GPL 3.0 at the bottom. I've made these changes for core (live), run, db and network. Thanks again for your help and advice. From cjfields at uiuc.edu Wed Sep 5 08:53:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:53:25 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> Message-ID: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> You mean something like this? http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics chris On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > A couple of examples: > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > treefam has exon boundary and PFAM domain mappings > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > here the tree is shown as well, but the idea would be to plot the > alignment > > So it's more "show me the multiple CDS/protein alignment" rather > than "show > my aligned CDS/proteins wrt my reference genome" > > I think it would be quite neat to have this as a bioperl howto, > > Comments? > > Albert. > > On 9/4/07, Chris Fields wrote: >> >> >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >> >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>> Probably a bit of a long shot but does anyone have code for >>>> displaying protein or CDS multiple sequence alignments with the >>>> exon >>>> boundaries of each gene in the alignment? >>>> >>>> Something in the bioperl world without funky external dependencies. >>>> I think >>>> it would be an awesome addition to the howtos. >>>> >>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>> scripts or >>>> blast output scripts, but >>>> I couldn't find code for dealing with multiple sequence alignments. >>> >>> I'm currently under the (potentially uninformed) impression that >>> Bio::Graphics and related tools only work with a single coordinate >>> system. I've never seen a multiple sequence alignment example. >>> >>> ( >>> I Google'd for "gbrowse alignment" and hit this: >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>> >>> Click the second Example link and you'll see exons mapped out. >>> >>> But zooming all the way in with all the tracks turned on it looks >>> like the AZM tracks are just the coding regions. I don't see any >>> multiple sequence alignment... >>> ) >>> >>> I doubt that helped. :) >>> >>> Jay Hannah >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >> >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html >> >> chris >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Wed Sep 5 09:31:24 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 15:31:24 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Awesome!! Thanks Chris! On 9/5/07, Chris Fields wrote: > > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > > A couple of examples: > > > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > > > treefam has exon boundary and PFAM domain mappings > > > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > > > here the tree is shown as well, but the idea would be to plot the > > alignment > > > > So it's more "show me the multiple CDS/protein alignment" rather > > than "show > > my aligned CDS/proteins wrt my reference genome" > > > > I think it would be quite neat to have this as a bioperl howto, > > > > Comments? > > > > Albert. > > > > On 9/4/07, Chris Fields wrote: > >> > >> > >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > >> > >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >>>> Probably a bit of a long shot but does anyone have code for > >>>> displaying protein or CDS multiple sequence alignments with the > >>>> exon > >>>> boundaries of each gene in the alignment? > >>>> > >>>> Something in the bioperl world without funky external dependencies. > >>>> I think > >>>> it would be an awesome addition to the howtos. > >>>> > >>>> Currently, the Bio::Graphics howto has cdna to genome mapping > >>>> scripts or > >>>> blast output scripts, but > >>>> I couldn't find code for dealing with multiple sequence alignments. > >>> > >>> I'm currently under the (potentially uninformed) impression that > >>> Bio::Graphics and related tools only work with a single coordinate > >>> system. I've never seen a multiple sequence alignment example. > >>> > >>> ( > >>> I Google'd for "gbrowse alignment" and hit this: > >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > >>> > >>> Click the second Example link and you'll see exons mapped out. > >>> > >>> But zooming all the way in with all the tracks turned on it looks > >>> like the AZM tracks are just the coding regions. I don't see any > >>> multiple sequence alignment... > >>> ) > >>> > >>> I doubt that helped. :) > >>> > >>> Jay Hannah > >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > >> > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > >> > >> chris > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From cjfields at uiuc.edu Wed Sep 5 10:17:51 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:17:51 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: <31E25B64-2043-4460-ADC8-9684D01C2468@uiuc.edu> It would be nice to place the labels to the left of the segments. I believe there is a way to do this, but can't remember; if I can find it I'll revise the script. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 5 10:22:44 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 05 Sep 2007 15:22:44 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <46DEBBB4.1030200@sheffield.ac.uk> Chris Fields wrote: > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > Nice! On a similar (well, related to Bio::Graphics) topic, I've written a script that uses markers that have been mapped from a model organism to linkage groups in related species in order to estimate the location of "unknown" markers in those linkage groups. I'm using the Bio::Map::* modules for much of this work and then I use Bio::Graphics to display the linkage groups of the non-model organism with the putative position of the "unknown" markers. However, I've had to do a bit of fudging to get Bio::Graphics to draw this data. The problems I encountered are described below. I also have an open bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2343 1) Linkage maps are measured in cM - which can and are likely to be non-integer values. Bio::Graphics needs integer values, so I simply scaled all my cM measurements prior to drawing by *1000. However, the ruler now doesn't represent the "true scale" - can this be adjusted? 2) Some markers map to 0cM. However, Bio::Graphics requires positions >0. To get round this I simply incremented these positions by 1 (post-scaling), so they display almost in the correct place. Is it possible/likely/wise to support positions starting at zero and float positions? Would such support simply be to internalise what I have already done outside Bio::Graphics into the Bio::Graphics modules and have it display the correctly scaled ruler? Thoughts comments welcome. Cheers, Nath From cjfields at uiuc.edu Wed Sep 5 10:52:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:52:00 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: Updated the page on the web site with the new script. Figured it out; if you pass the parameter -label_position 'left' it will display the label to the left. However it displays them right next to the segment (ala GBrowse). I added a hack to Bio::Graphics::Glyph::generic in CVS which allows 'alignment_left' as an option, displaying it aligned to the far left of the panel; there is probably a way to use a callback here as well. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Sep 5 12:47:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 11:47:46 -0500 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <46DEBBB4.1030200@sheffield.ac.uk> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > ... > On a similar (well, related to Bio::Graphics) topic, I've written a > script that uses markers that have been mapped from a model > organism to > linkage groups in related species in order to estimate the location of > "unknown" markers in those linkage groups. > > I'm using the Bio::Map::* modules for much of this work and then I use > Bio::Graphics to display the linkage groups of the non-model organism > with the putative position of the "unknown" markers. However, I've had > to do a bit of fudging to get Bio::Graphics to draw this data. The > problems I encountered are described below. I also have an open bug: > http://bugzilla.open-bio.org/show_bug.cgi?id=2343 > > 1) Linkage maps are measured in cM - which can and are likely to be > non-integer values. Bio::Graphics needs integer values, so I simply > scaled all my cM measurements prior to drawing by *1000. However, the > ruler now doesn't represent the "true scale" - can this be adjusted? > > 2) Some markers map to 0cM. However, Bio::Graphics requires positions >> 0. To get round this I simply incremented these positions by 1 > (post-scaling), so they display almost in the correct place. > > Is it possible/likely/wise to support positions starting at zero and > float positions? Would such support simply be to internalise what I > have > already done outside Bio::Graphics into the Bio::Graphics modules and > have it display the correctly scaled ruler? > > Thoughts comments welcome. > > Cheers, > Nath There is this section in the GBrowse configure doc, which to me suggests there is a way to do what you want in Bioperl; you may have to delve into the Bio::Graphics or GBrowse code to work it out, though. I think the GBrowse mail list archives also have more on this. chris ..... F. DISPLAYING GENETIC AND RH MAPS GBrowse can be tweaked to make it more suitable for displaying genetic and radiation hybrid maps. The main issue is that the Bio::DB::GFF database expects coordinates to be positive integers, not fractions, but genetic and RH maps use floating point numbers. Working around this is a bit of an ugly hack. Before loading your data you must multiply all your coordinates by a constant power of 10 in order to convert them into integers. For example, if a genetic map uses Morgan units ranging from 0 to 1.80, you would multiple by 100 to create a map in ranging from 0 to 180. Create a GFF file containing the markers in modified coordinates and load it as usual. Now you must tell GBrowse to reverse these changes. Enter the following options into the [GENERAL] section of the configuration file: units = M unit_divider = 100 These two options tell GBrowse to use "M" (Morgan) units, and to divide all coordinates by 100. GBrowse will automatically display the scale using the most appropriate units, so the displayed map will typically be drawn using cM units. From bernd.web at gmail.com Wed Sep 5 13:44:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Wed, 5 Sep 2007 19:44:26 +0200 Subject: [Bioperl-l] SearchIO ResultWriter Message-ID: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Hi, For SearchIO there are ResultWriters to write text, html and BSML (BSMLResultWriter). However, is there also a BLAST xml writer, which writes the original blast xml files. This may have come up before. If there is not, is there interest in having this? Regards, Bernd From sac at bioperl.org Wed Sep 5 16:37:37 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 5 Sep 2007 13:37:37 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Message-ID: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Looks like there is no such functionality in the current repository. If you have implemented such a beast and are willing to contribute it, go for it (or coordinate with a developer if you lack CVS write access). Steve On 9/5/07, Bernd Web wrote: > > Hi, > > For SearchIO there are ResultWriters to write text, html and BSML > (BSMLResultWriter). However, is there also a BLAST xml writer, which > writes the original blast xml files. This may have come up before. If > there is not, is there interest in having this? > > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Wed Sep 5 17:18:17 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 05 Sep 2007 22:18:17 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: <46DF1D19.9010707@sheffield.ac.uk> Chris Fields wrote: > On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > >> ... >> On a similar (well, related to Bio::Graphics) topic, I've written a >> script that uses markers that have been mapped from a model organism to >> linkage groups in related species in order to estimate the location of >> "unknown" markers in those linkage groups. >> >> I'm using the Bio::Map::* modules for much of this work and then I use >> Bio::Graphics to display the linkage groups of the non-model organism >> with the putative position of the "unknown" markers. However, I've had >> to do a bit of fudging to get Bio::Graphics to draw this data. The >> problems I encountered are described below. I also have an open bug: >> http://bugzilla.open-bio.org/show_bug.cgi?id=2343 >> >> 1) Linkage maps are measured in cM - which can and are likely to be >> non-integer values. Bio::Graphics needs integer values, so I simply >> scaled all my cM measurements prior to drawing by *1000. However, the >> ruler now doesn't represent the "true scale" - can this be adjusted? >> >> 2) Some markers map to 0cM. However, Bio::Graphics requires positions >>> 0. To get round this I simply incremented these positions by 1 >> (post-scaling), so they display almost in the correct place. >> >> Is it possible/likely/wise to support positions starting at zero and >> float positions? Would such support simply be to internalise what I have >> already done outside Bio::Graphics into the Bio::Graphics modules and >> have it display the correctly scaled ruler? >> >> Thoughts comments welcome. >> >> Cheers, >> Nath > > There is this section in the GBrowse configure doc, which to me > suggests there is a way to do what you want in Bioperl; you may have > to delve into the Bio::Graphics or GBrowse code to work it out, > though. I think the GBrowse mail list archives also have more on this. > > chris > > ..... > > F. DISPLAYING GENETIC AND RH MAPS > > GBrowse can be tweaked to make it more suitable for displaying genetic > and radiation hybrid maps. > > The main issue is that the Bio::DB::GFF database expects coordinates > to be positive integers, not fractions, but genetic and RH maps use > floating point numbers. Working around this is a bit of an ugly hack. > Before loading your data you must multiply all your coordinates by a > constant power of 10 in order to convert them into integers. For > example, if a genetic map uses Morgan units ranging from 0 to 1.80, > you would multiple by 100 to create a map in ranging from 0 to 180. > > Create a GFF file containing the markers in modified coordinates and > load it as usual. Now you must tell GBrowse to reverse these changes. > Enter the following options into the [GENERAL] section of the > configuration file: > > units = M > unit_divider = 100 > > These two options tell GBrowse to use "M" (Morgan) units, and to > divide all coordinates by 100. GBrowse will automatically display the > scale using the most appropriate units, so the displayed map will > typically be drawn using cM units. > Thanks for for the pointer Chris! >From what you've said, it appears they might have done a similar hack to me - which is always nice to know! It seems then to me, that it may be worth making the Bio::Graphic::* modules slightly more generic and applicable for these situations. It's late, so does anyone have suggestions before I start digging through Bio::Graphic::* modules in the morning? Maybe you guys across the water have something to say by the time I wake up in the morning!? Thanks Nath From jason at bioperl.org Wed Sep 5 17:33:44 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 5 Sep 2007 14:33:44 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Message-ID: I think most ppl aren't that enamored with the NCBI XML Blast format but I guess it is standard if the NCBI puts it out... It should be a pretty easy writer to make at any rate just follow along with what was done for BSMLWriter. -jason On Sep 5, 2007, at 1:37 PM, Steve Chervitz wrote: > Looks like there is no such functionality in the current > repository. If you > have implemented such a beast and are willing to contribute it, go > for it > (or coordinate with a developer if you lack CVS write access). > > Steve > > On 9/5/07, Bernd Web wrote: >> >> Hi, >> >> For SearchIO there are ResultWriters to write text, html and BSML >> (BSMLResultWriter). However, is there also a BLAST xml writer, which >> writes the original blast xml files. This may have come up before. If >> there is not, is there interest in having this? >> >> >> Regards, >> Bernd >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org From jay at jays.net Thu Sep 6 15:50:53 2007 From: jay at jays.net (Jay Hannah) Date: Thu, 6 Sep 2007 15:50:53 -0400 (EDT) Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: On Wed, 5 Sep 2007, Chris Fields wrote: > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics Wow. That's slick. :) Is it possible to zoom in far enough to see the individual bases and gaps?? On Tue, 4 Sep 2007, Chris Fields wrote: > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/tutorial.html Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, this image might be what Albert is looking for: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/figures/segmented_features2.gif He'd need to map his exon boundaries from whatever format he has into a GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to munch on. On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > "Something in the bioperl world without funky external dependencies" There are still things the long arm of BioPerl justice hasn't reached? :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Thu Sep 6 19:39:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Sep 2007 18:39:07 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > On Wed, 5 Sep 2007, Chris Fields wrote: >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > Wow. That's slick. :) Is it possible to zoom in far enough to > see the > individual bases and gaps?? I'm not sure; you can do something like that with GBrowse with some features so there is probably a way to put something together which could do that. > On Tue, 4 Sep 2007, Chris Fields wrote: >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > this image might be what Albert is looking for: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/figures/segmented_features2.gif > > He'd need to map his exon boundaries from whatever format he has > into a > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > munch on. I use segmented SeqFeatures in my example. The HOWTO also uses a variation ('graded_segments'): http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output The subseqfeatures are colored by score. Feasibly one could hack this so that the exons/introns have a different 'score', thus displaying different colors. > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> "Something in the bioperl world without funky external dependencies" > > There are still things the long arm of BioPerl justice hasn't > reached? :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah chris From cain.cshl at gmail.com Thu Sep 6 23:20:04 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 06 Sep 2007 23:20:04 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <1189135204.2560.52.camel@localhost.localdomain> On Thu, 2007-09-06 at 18:39 -0500, Chris Fields wrote: > On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > > > > On Wed, 5 Sep 2007, Chris Fields wrote: > >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > > > Wow. That's slick. :) Is it possible to zoom in far enough to > > see the > > individual bases and gaps?? > > I'm not sure; you can do something like that with GBrowse with some > features so there is probably a way to put something together which > could do that. Yeah, if it were me, I would install GBrowse, hack my data into GFF and use gbrowse_img to generate pictures. It would probably be easier than starting from scratch. > > > On Tue, 4 Sep 2007, Chris Fields wrote: > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > > > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > > this image might be what Albert is looking for: > > > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > > Browser/docs/tutorial/figures/segmented_features2.gif > > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> "Something in the bioperl world without funky external dependencies" > > > > There are still things the long arm of BioPerl justice hasn't > > reached? :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From avilella at gmail.com Fri Sep 7 05:20:01 2007 From: avilella at gmail.com (Albert Vilella) Date: Fri, 7 Sep 2007 10:20:01 +0100 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. The exon boundary could be a vertical line or a triangular tick or something. I don't know if there is a consensus on this kind of cartoons. Does anybody know how exon boundaries are displayed in different browsers/apps? From yangmeng at genomics.org.cn Fri Sep 7 03:57:14 2007 From: yangmeng at genomics.org.cn (=?ISO-8859-1?Q?=D1=EE=C3=CD=A3=A8=D6=D0=D0=C4=CA=B5=D1=E9=CA=D2?= ) Date: Fri, 7 Sep 2007 15:57:14 +0800 Subject: [Bioperl-l] a question Message-ID: <200709071557.AA78971054@genomics.org.cn> I am a student from China.During my learing the bioperl,I encounter a problem as follows: I run the program, use Bio::Perl; $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); But It returns lots of mistake informatiom, ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: WebDBSeqI Request Error: 501 Protocol scheme '' is not supported Content-Type: text/plain Client-Date: Fri, 07 Sep 2007 07:26:06 GMT Client-Warning: Internal response 501 Protocol scheme '' is not supported STACK: Error::throw STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:685 STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:4 91 STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm :275 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:14 5 STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 ----------------------------------------------------------- I don't know the reason of the problem.I have installed the addition perl modules such as bioperl-db,bioperl-network,bioperlgui and almost all "BioPerl Dependencies modules".My network is also OK. It's an annoying promleb to me. I have consulted many experts but didn't got a reply. Could you vacuate in your mass business to give me a help? Thank you! Best regards! YangMeng ________________________________________________________________ Sent via the WebMail system at genomics.org.cn From cjfields at uiuc.edu Fri Sep 7 10:09:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 7 Sep 2007 09:09:18 -0500 Subject: [Bioperl-l] a question In-Reply-To: <200709071557.AA78971054@genomics.org.cn> References: <200709071557.AA78971054@genomics.org.cn> Message-ID: <7F176E39-18A6-4BF9-9247-863D6F3C167D@uiuc.edu> On Sep 7, 2007, at 2:57 AM, ???????????????? wrote: > I am a student from China.During my learing the bioperl,I encounter > a problem as follows: > > I run the program, > > use Bio::Perl; > $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > write_sequence(">roa1.fasta",'fasta',$seq_object); > > But It returns lots of mistake informatiom, First, always preface problems of this sort with the version of BioPerl you are using (there are quite a few versions still being used). > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: WebDBSeqI Request Error: > 501 Protocol scheme '' is not supported > Content-Type: text/plain > Client-Date: Fri, 07 Sep 2007 07:26:06 GMT > Client-Warning: Internal response > 501 Protocol scheme '' is not supported > STACK: Error::throw > STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 > STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:685 > STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:4 > 91 > STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm > :275 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:14 > 5 > STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 > STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 > ----------------------------------------------------------- This works for me using bioperl from CVS. There were a few remote DbFetch server changes if I recall correctly, so updating from CVS may be your best option. > I don't know the reason of the problem.I have installed the > addition perl modules such as bioperl-db,bioperl-network,bioperlgui > and almost all "BioPerl Dependencies modules".My network is also > OK. It's an annoying promleb to me. > I have consulted many experts but didn't got a reply. Could you > vacuate in your mass business to give me a help? > > Thank you! > > Best regards! > > YangMeng I think my 'vacuating' is a private matter, let alone doing so in my mass business... http://www.thefreedictionary.com/Vacuate ;> chris From cjfields at uiuc.edu Mon Sep 10 18:04:14 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 17:04:14 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> Message-ID: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: >>> He'd need to map his exon boundaries from whatever format he has >>> into a >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to >>> munch on. >> >> I use segmented SeqFeatures in my example. The HOWTO also uses a >> variation ('graded_segments'): >> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output >> >> The subseqfeatures are colored by score. Feasibly one could hack >> this so that the exons/introns have a different 'score', thus >> displaying different colors. > > > The exon boundary could be a vertical line or a triangular tick or > something. I don't know if there is a consensus on this kind of > cartoons. > Does anybody know how exon boundaries are displayed in different > browsers/apps? Don't know. BTW, apparently there is something being cooked up as an alignment browser (among other things) for GBrowse: https://www.nescent.org/wg_phyloinformatics/ PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse Acc. to Lincoln (from his last GBrowse post) there will be a testable version within a few weeks or so. You could always ask more questions about it on the GBrowse list. chris From lstein at cshl.edu Mon Sep 10 18:09:41 2007 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 10 Sep 2007 18:09:41 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> Message-ID: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> You can view a simple multiple alignment now. Go to www.wormbase.org, turn on some of the EST tracks and then zoom down to base pair level. In bio::graphics, use the "segments" glyph and turn on the -draw_target option. The features must have DNA attached to them. What's coming soon is support for MAF format, which provides genome-level alignments. Lincoln On 9/10/07, Chris Fields wrote: > > On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: > > >>> He'd need to map his exon boundaries from whatever format he has > >>> into a > >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > >>> munch on. > >> > >> I use segmented SeqFeatures in my example. The HOWTO also uses a > >> variation ('graded_segments'): > >> > >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > >> > >> The subseqfeatures are colored by score. Feasibly one could hack > >> this so that the exons/introns have a different 'score', thus > >> displaying different colors. > > > > > > The exon boundary could be a vertical line or a triangular tick or > > something. I don't know if there is a consensus on this kind of > > cartoons. > > Does anybody know how exon boundaries are displayed in different > > browsers/apps? > > Don't know. BTW, apparently there is something being cooked up as an > alignment browser (among other things) for GBrowse: > > https://www.nescent.org/wg_phyloinformatics/ > PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse > > Acc. to Lincoln (from his last GBrowse post) there will be a testable > version within a few weeks or so. You could always ask more > questions about it on the GBrowse list. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Mon Sep 10 23:00:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 22:00:29 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <885E5E3B-E2F7-4279-8EE3-FC21AF535D7E@uiuc.edu> Doesn't that work only for SeqFeature::SimilarityPair and HSP-like (paired) alignments, or am I mistaken? chris On Sep 10, 2007, at 5:09 PM, Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to > www.wormbase.org, turn > on some of the EST tracks and then zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the - > draw_target > option. The features must have DNA attached to them. > > What's coming soon is support for MAF format, which provides genome- > level > alignments. > > Lincoln From christoph.theunert at web.de Tue Sep 11 06:37:49 2007 From: christoph.theunert at web.de (Christoph Theunert) Date: Tue, 11 Sep 2007 03:37:49 -0700 (PDT) Subject: [Bioperl-l] release of own projects Message-ID: <12611951.post@talk.nabble.com> Hi, I am a bioinformatics student from germany and I need your help Working with perl and bioperl is pretty new to me - currently I am working on a Bioperl project, and I don't know how to release my project when i am finished with it. I want to pack my modules so that other users can download it and install it on their machines. Do I use the command h2xs as to create cpan modules ( makefiles ...) or what is the best way to solve my problem ? thanks for help Christoph -- View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From spiros at lokku.com Tue Sep 11 06:57:14 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 11 Sep 2007 11:57:14 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: Hey, Yes, IMHO the best way would be to create CPANesque modules that people are able to download and install. The installation is pretty straightforward, covers prerequisites and more advanced features if needed and as an approach it is widely supported. Also, it gives you the ability to create and integrate tests seamlessly :) Check out these URL's on how to do it: http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlnewmod.pod http://www.perlmonks.org/?node_id=158999 http://www.perlmonks.org/?node_id=431702 Btw, more friendly and automated tools exist besides h2xs. Be sure to have a look at: http://search.cpan.org/perldoc?ExtUtils::ModuleMaker http://search.cpan.org/perldoc?Module::Starter Hope this helps, Spiros ps. i suggest since its your research work you are going to be handing out to read a bit on the various software licenses which exist and which you prefer to license your code under. On 9/11/07, Christoph Theunert wrote: > > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? > > thanks for help > > Christoph > -- > View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Tue Sep 11 07:12:41 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Sep 2007 12:12:41 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: <46E67829.8060303@sendu.me.uk> Christoph Theunert wrote: > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? You can do it however you like. You can just stick the modules in a folder, .tar.gz it and offer that to people. You can use h2xs to automate certain things. You can use Module::Build. To make your work available via cpan, see http://www.cpan.org/modules/04pause.html If your modules are of general bioinformatic utility you might even consider making them a part of bioperl itself. From jay at jays.net Tue Sep 11 17:15:17 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 11 Sep 2007 16:15:17 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <46E70565.5040503@jays.net> Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to www.wormbase.org > , turn on some of the EST tracks and then > zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the > -draw_target option. The features must have DNA attached to them. Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. > What's coming soon is support for MAF format, which provides > genome-level alignments. I'm looking forward to trying to wrap my head around that. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 11 18:40:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 11 Sep 2007 17:40:55 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <46E70565.5040503@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> <46E70565.5040503@jays.net> Message-ID: On Sep 11, 2007, at 4:15 PM, Jay Hannah wrote: > Lincoln Stein wrote: >> You can view a simple multiple alignment now. Go to www.wormbase.org >> , turn on some of the EST tracks and then >> zoom down to base pair level. >> >> In bio::graphics, use the "segments" glyph and turn on the >> -draw_target option. The features must have DNA attached to them. > > Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. There is a section detailing how this is done in the GBrowse tutorial (though it uses older GFF): http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html >> What's coming soon is support for MAF format, which provides >> genome-level alignments. > > I'm looking forward to trying to wrap my head around that. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah It's easily parsible, which is nice! chris From stephan.roessner at gsf.de Wed Sep 12 04:44:10 2007 From: stephan.roessner at gsf.de (Stephan Roessner) Date: Wed, 12 Sep 2007 10:44:10 +0200 Subject: [Bioperl-l] bug in Bio::SearchIO? Message-ID: <200709121044.11741.stephan.roessner@gsf.de> Hi, I am parsing a BlastN output with Bio::SearchIO and getting an error for some of the hits when retrieving the start and/or the end position with $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for hits which are are of equal length (~ > 0.9) to the query sequences. SearchIO is retrieving the right results, but throws an exemption, in this case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - 760 ..... It seems to me valid range is parsed incorrectly, isn't it? Is this a bug? Does anybody have a similar problem? see code, error, and blastn output below. thanks, Stephan Stephan Roessner MIPS/IBI Inst. for Bioinformatics GSF Research Center for Environment and Health Ingolst?dter Landstr. 1 85764 Neuherberg; Germany phone: +49 (0)89 3187 3583 fax: ? ? ? +49 (0)89 3187 3585 email: stephan.roessner at gsf.de Here is the piece of code I am using: my $blast_report = new Bio::SearchIO ('-format'=>'blast', '-file' => $source); while( my $result=$blast_report->next_result) { while( my $hit= $result->next_hit()) { print "Name: ".$hit->name."\n"; print "S: ".$hit->start('sbjct')."\n"; print "E: ".$hit->end('sbjct')."\n"; print "L: ".$hit->length()."\n"; } } Here's the message: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:359 STACK: Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/HSP/HSPI.pm:691 STACK: Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 STACK: Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:206 STACK: Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/Hit/GenericHit.pm:935 STACK: main::parse /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:82 STACK: /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:51 ----------------------------------------------------------- S: 635 E: 790 L: 2052 This is the BLASTN output I am parsing:: >LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 21623485-21621434 BestGuessTranscript Length = 2052 Score = 95.6 bits (48), Expect = 1e-17 Identities = 106/124 (85%), Gaps = 1/124 (0%) Strand = Plus / Plus Query: 3191 tattaagcataattaatgtatcattagcacatgtagg-ttactgtagcatttaaggctaa 3249 |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| |||||| Sbjct: 635 tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 Query: 3250 tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| ||||| Sbjct: 695 tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 Query: 3310 gttt 3313 |||| Sbjct: 755 gttt 758 Score = 48.1 bits (24), Expect = 0.002 Identities = 57/68 (83%) Strand = Plus / Minus Query: 2253 aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || ||| | Sbjct: 760 aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 Query: 2313 ccatgatt 2320 |||||||| Sbjct: 700 ccatgatt 693 Score = 44.1 bits (22), Expect = 0.038 Identities = 76/94 (80%) Strand = Plus / Minus Query: 1539 atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 ||||||| || |||||||||| | ||| ||||||||||||||| ||||| |||| | Sbjct: 790 atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 ||||| |||| ||||||||||| |||| |||| Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 Score = 44.1 bits (22), Expect = 0.038 Identities = 73/90 (81%) Strand = Plus / Plus Query: 2026 actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 ||||| |||| | ||||||||| ||||| |||| || ||||| ||| ||||||||||| Sbjct: 701 actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 ||| | ||||||||||||| ||||||| Sbjct: 761 cattttatttatatttaatgctccatgcat 790 From cjfields at uiuc.edu Wed Sep 12 10:57:22 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 12 Sep 2007 09:57:22 -0500 Subject: [Bioperl-l] bug in Bio::SearchIO? In-Reply-To: <200709121044.11741.stephan.roessner@gsf.de> References: <200709121044.11741.stephan.roessner@gsf.de> Message-ID: <74CE1BB2-FCEB-43C3-B783-09706C7F55D8@uiuc.edu> Try updating to bioperl from CVS. I believe this issue was fixed but I don't believe it made the 1.5.2 release. chris On Sep 12, 2007, at 3:44 AM, Stephan Roessner wrote: > Hi, > > I am parsing a BlastN output with Bio::SearchIO and getting an > error for some > of the hits when retrieving the start and/or the end position with > $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for > hits which > are are of equal length (~ > 0.9) to the query sequences. > > SearchIO is retrieving the right results, but throws an exemption, > in this > case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - > 760 ..... > > It seems to me valid range is parsed incorrectly, isn't it? Is this > a bug? > > Does anybody have a similar problem? > > see code, error, and blastn output below. > > thanks, > Stephan > > > Stephan Roessner > MIPS/IBI Inst. for Bioinformatics > GSF Research Center for Environment and Health > Ingolst?dter Landstr. 1 > 85764 Neuherberg; Germany > phone: +49 (0)89 3187 3583 > fax: +49 (0)89 3187 3585 > email: stephan.roessner at gsf.de > > > Here is the piece of code I am using: > > my $blast_report = new Bio::SearchIO ('-format'=>'blast', > '-file' => $source); > > while( my $result=$blast_report->next_result) { > while( my $hit= $result->next_hit()) { > print "Name: ".$hit->name."\n"; > print "S: ".$hit->start('sbjct')."\n"; > print "E: ".$hit->end('sbjct')."\n"; > print "L: ".$hit->length()."\n"; > } > } > > > Here's the message: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/ > Root.pm:359 > STACK: > Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/ > Bio/Search/HSP/HSPI.pm:691 > STACK: > Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/ > vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 > STACK: > Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/SearchUtils.pm:206 > STACK: > Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/Hit/GenericHit.pm:935 > STACK: > main::parse /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:82 > STACK: /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:51 > ----------------------------------------------------------- > > S: 635 > E: 790 > L: 2052 > > This is the BLASTN output I am parsing:: > >> LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 > 21623485-21621434 BestGuessTranscript > Length = 2052 > > Score = 95.6 bits (48), Expect = 1e-17 > Identities = 106/124 (85%), Gaps = 1/124 (0%) > Strand = Plus / Plus > > > Query: 3191 tattaagcataattaatgtatcattagcacatgtagg- > ttactgtagcatttaaggctaa 3249 > |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| > |||||| > Sbjct: 635 > tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 > > > Query: 3250 > tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 > |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| > ||||| > Sbjct: 695 > tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 > > > Query: 3310 gttt 3313 > |||| > Sbjct: 755 gttt 758 > > > > Score = 48.1 bits (24), Expect = 0.002 > Identities = 57/68 (83%) > Strand = Plus / Minus > > > Query: 2253 > aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 > ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || > ||| | > Sbjct: 760 > aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 > > > Query: 2313 ccatgatt 2320 > |||||||| > Sbjct: 700 ccatgatt 693 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 76/94 (80%) > Strand = Plus / Minus > > > Query: 1539 > atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 > ||||||| || |||||||||| | ||| ||||||||||||||| ||||| > |||| | > Sbjct: 790 > atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 > > > Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 > ||||| |||| ||||||||||| |||| |||| > Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 73/90 (81%) > Strand = Plus / Plus > > > Query: 2026 > actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 > ||||| |||| | ||||||||| ||||| |||| || ||||| ||| > ||||||||||| > Sbjct: 701 > actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 > > > Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 > ||| | ||||||||||||| ||||||| > Sbjct: 761 cattttatttatatttaatgctccatgcat 790 > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 12 12:34:26 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 12 Sep 2007 17:34:26 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: <46E81512.3090503@sheffield.ac.uk> Is it possible to set the bg colour of glyphs and the panel background to be transparent? If so, which output formats support transparency? Cheers Nath From Kevin.M.Brown at asu.edu Wed Sep 12 14:15:10 2007 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 12 Sep 2007 11:15:10 -0700 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E81512.3090503@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> Message-ID: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> > Is it possible to set the bg colour of glyphs and the panel > background to be transparent? If so, which output formats > support transparency? Not sure if you can, but SVG, PNG, Gif all support a transparent background. From bioperl-list at superfrink.net Thu Sep 13 01:15:39 2007 From: bioperl-list at superfrink.net (bioperl-list at superfrink.net) Date: Wed, 12 Sep 2007 23:15:39 -0600 (MDT) Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: > Is it possible to set the bg colour of glyphs and the panel background > to be transparent? If so, which output formats support transparency? I had a look at the code and I don't believe it is possible. You could produce a PNG file and knowing the red/green/blue values of the background colour run the following script to make an image with the bg colour transparent. For example: ./make-transparent.pl 252 253 252 2004-11-22.png will produce: 2004-11-22.png.new.png with the RGB colour of (252, 253, 252) replaced with transparency. Regards, Chad #!/usr/bin/perl -w # # file: make-transparent.pl # purpose: make a single colour in a PNG file transparent # author: chad c d clark # $Id$ use strict; use GD; # -- subroutines ------------------------------------------------------- sub usage_message(); # -- main() ------------------------------------------------------------ if(scalar @ARGV < 4) { print usage_message(); exit 1; } # get the colour and make sure it is valid my @RGB = splice @ARGV, 0, 3; for my $i (@RGB) { if ( ($i !~ /^[\d]+$/) or (255 < $i) ) { print "Invalid colour '$i'.\n"; print usage_message(); exit 1; } } print "RGB: (@RGB)\n"; # process each file FILE: while (my $filename = shift @ARGV) { # read the file my $image = GD::Image->new($filename); unless(defined $image) { warn "Unable to read image from file. Skipping '$filename'.\n"; next FILE; } # find the colour index my $index = $image->colorExact(@RGB); if(-1 == $index) { warn "Colour not found in file. Skipping '$filename'.\n"; next FILE; } # make the colour index transparent if(-1 == $image->transparent($index)) { warn "Unable to make colour transparent. Skipping '$filename'.\n"; next FILE; } # write the updated image file my $new_filename = $filename . ".new.png"; # my $new_filename = $filename; # use to over-write existing file open FH, ">" . $new_filename or die "can't open $new_filename"; print FH $image->png; close FH; print "Found file '$filename'.\tCreated '$new_filename'.\n"; } exit 0; # -- subroutines ------------------------------------------------------- sub usage_message() { return qq/ Usage: $0 RED GREEN BLUE FILELIST Where: RED - red value in decimal (0 to 255) GREEN - green value in decimal (0 to 255) BLUE - blue value in decimal (0 to 255) FILELIST - list of files to convert Examples: $0 255 255 255 2004-11-22.png $0 252 253 252 2004-11-22.png second.png $0 1 1 200 2004-11-22.png second.png third.png Description: For each file "foo.png" a new file "foo.png.new.png" will be created (and over-written if it existed). The new file will be the same as the original but the colour specified by the RED, GREEN, and BLUE value will be removed and replaced by transparent pixels. /; } From n.haigh at sheffield.ac.uk Thu Sep 13 06:07:46 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 11:07:46 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> Message-ID: <46E90BF2.5010607@sheffield.ac.uk> Kevin Brown wrote: >> Is it possible to set the bg colour of glyphs and the panel >> background to be transparent? If so, which output formats >> support transparency? >> > > Not sure if you can, but SVG, PNG, Gif all support a transparent > background. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Looking at the GD module documentation: http://search.cpan.org/~lds/GD-2.30/GD.pm It appears that you can set a colour as being transparent - so I think it should be possible to get Bio::Graphics to do this = may require some code to be written. Any one got ideas? Cheers, Nath From n.haigh at sheffield.ac.uk Thu Sep 13 07:59:20 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 12:59:20 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E90BF2.5010607@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> Message-ID: <46E92618.7050208@sheffield.ac.uk> Nathan Haigh wrote: > Kevin Brown wrote: > >>> Is it possible to set the bg colour of glyphs and the panel >>> background to be transparent? If so, which output formats >>> support transparency? >>> >>> >> Not sure if you can, but SVG, PNG, Gif all support a transparent >> background. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Looking at the GD module documentation: > http://search.cpan.org/~lds/GD-2.30/GD.pm > > It appears that you can set a colour as being transparent - so I think > it should be possible to get Bio::Graphics to do this = may require some > code to be written. Any one got ideas? > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > I took a look and made a simple change to Bio::Graphics::Panel Please see the following bug for a patch and explanation: http://bugzilla.open-bio.org/show_bug.cgi?id=2365 I'd appreciate any comments, especially regarding the method name! If there aren't any complaints I'll commit it later today. Nath From n.haigh at sheffield.ac.uk Thu Sep 13 08:26:57 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 13:26:57 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution Message-ID: <46E92C91.5020307@sheffield.ac.uk> I want to be able to print my Bio::Graphics image on a poster with good resolution. What can I do to ensure I don't get blocky graphics/text. Altering the width/height of the panel simple increases the size of the canvas on which to draw the image, but the text appears the same size and thus relatively smaller to the rest of the image. So I don't think this would work for printing on a poster. Cheers, Nath From cjfields at uiuc.edu Thu Sep 13 08:46:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 13 Sep 2007 07:46:02 -0500 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <69321A85-8715-43C0-BCB0-CEE8F42D7235@uiuc.edu> Print to SVG instead of PNG (should be resolution-independent); I use Illustrator to fine-tune it but there are several other programs which can do the same. You'll need to install GD::SVG for it to work. The alignment example I posted previously about (http:// www.bioperl.org/wiki/HOWTO_Discussion:Graphics) shows essentially what you need to do: my $panel = Bio::Graphics::Panel->new( -image_class => 'SVG', # and whatever else ); # later... print $panel->svg; chris On Sep 13, 2007, at 7:26 AM, Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with > good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of > the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jonathancrabtree at gmail.com Thu Sep 13 09:09:56 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:09:56 -0400 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E92618.7050208@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> <46E92618.7050208@sheffield.ac.uk> Message-ID: <8e5b8bf80709130609x4be19cf6y60f2440a1ac5d332@mail.gmail.com> Hi Nathan- One problem with your proposed solution is that it won't necessarily work when GD::SVG is being used instead of GD (i.e., via the image_class method of Bio::Graphics::Panel). SVG doesn't handle transparency in the same way as GD. At least when you're compositing multiple SVG images/documents, transparency is the default; if you superimpose one SVG image on another ( e.g., by merging the two into a single SVG document) then the bottom image will be visible through any area of the top image that has not been drawn on. When I'm working in SVG with Bio::Graphics I get a "transparent" background by simply not setting the bgcolor; this ensures that Bio::Graphics::Panel will refrain from drawing a filled background rectangle underneath the drawing area. What I don't know is how to ensure that the background is transparent when you're working with the various methods of embedding SVG in web pages ( i.e., transparent with respect to whatever is _underneath_ the SVG-rendered content); this is probably a slightly different issue that's more a question of what the browser/plugin supports. I'm not sure what to suggest as an alternative, but at the very least this probably warrants a YMMV comment in the documentation for the new method, or perhaps it could even throw a runtime error if called when the $gd object is of type GD::SVG. A final option would be to say that this (setting a transparent background) is something that should get handled outside of Bio::Graphics::Panel; I don't think there's any technical reason why the calling code couldn't be responsible for this. I don't think we can modify your new method to unset the bgcolor when working with GD::SVG, because that might affect the image in other ways. I do it in my code but I'm not sure it's 100% safe, since I think GD::SVG might actually _use_ the bgcolor in some situations (e.g., drawing dashed lines) and I haven't checked the code thoroughly to make sure that there are no unintended consequences. Jonathan p.s. I see that Chris has beaten me to the punch in mentioning SVG as a fix to your blocky font problems. All the more reason to think about how this feature will work in that context! On 9/13/07, Nathan Haigh wrote: > > Nathan Haigh wrote: > > Kevin Brown wrote: > > > >>> Is it possible to set the bg colour of glyphs and the panel > >>> background to be transparent? If so, which output formats > >>> support transparency? > >>> > >>> > >> Not sure if you can, but SVG, PNG, Gif all support a transparent > >> background. > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> > > > > Looking at the GD module documentation: > > http://search.cpan.org/~lds/GD-2.30/GD.pm > > > > > It appears that you can set a colour as being transparent - so I think > > it should be possible to get Bio::Graphics to do this = may require some > > code to be written. Any one got ideas? > > > > Cheers, > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > I took a look and made a simple change to Bio::Graphics::Panel > > Please see the following bug for a patch and explanation: > http://bugzilla.open-bio.org/show_bug.cgi?id=2365 > > I'd appreciate any comments, especially regarding the method name! If > there aren't any complaints I'll commit it later today. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Thu Sep 13 09:03:46 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 13 Sep 2007 14:03:46 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <46E93532.6030505@sendu.me.uk> Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. Output in SVG, which is a vector format == no blockiness. From jonathancrabtree at gmail.com Thu Sep 13 09:20:43 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:20:43 -0400 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <8e5b8bf80709130620r4a24fe8fi5171539f50735bf3@mail.gmail.com> Nathan- As Chris said, you'll want to use GD::SVG instead of GD. However, you're still going to have the issue that you raised that the fonts will be proportionally small with respect to your figure (particularly if you're printing a large region at poster size.) From what I remember GD only gives you a few font sizes to choose from, so even at the largest size you may still have problems. I've worked around this in the past by using scripts to post-process the resulting SVG. I do a global search and replace to increase the font sizes (and, in many cases, to adjust the y-offset of the text accordingly.) You may also need to tweak the amount of vertical whitespace in the image (e.g., between adjacent rows of features) to give yourself space to increase the font size. The same caveat applies to the horizontal dimension, since with a larger font you may have collisions between labels (assuming that the features in your figure are labeled.) To fix this you need to trick Bio::Graphics into thinking the feature labels are longer than they actually are. I forget whether I did this by padding the labels with extra whitespace or actually modifying the code that computes the feature bounding boxes, but something along those lines should work. Essentially you have to trick Bio::Graphics into leaving extra whitespace so that everything looks OK when you bump up the font sizes. Unfortunately I don't have a generic script that does this; after generating a couple of posters this way I switched to direct SVG generation to avoid the constraints imposed by going through GD. Jonathan On 9/13/07, Nathan Haigh wrote: > > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From arareko at campus.iztacala.unam.mx Thu Sep 13 10:59:31 2007 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 13 Sep 2007 09:59:31 -0500 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <46E95053.8090300@campus.iztacala.unam.mx> Try saving the output of your Bio::Graphics image as SVG (with the desired proportions between text & graphics), then, at the moment of printing, set the desired output size (from the SVG file) and everything should be scaled accordingly. That can probably work. Cheers, Mauricio. Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Gen?tica Unidad de Morfofisiolog?a y Funci?n Facultad de Estudios Superiores Iztacala, UNAM From jay at jays.net Fri Sep 14 10:27:38 2007 From: jay at jays.net (Jay Hannah) Date: Fri, 14 Sep 2007 09:27:38 -0500 Subject: [Bioperl-l] [patch] getGenBank.pl Message-ID: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/ examples/db/getGenBank.pl Using this: my $seqio = $gb->get_Stream_by_batch([ qw( 124430577 )]); Throws this warning: $ ./fetch.pl > 124430577.gbk get_Stream_by_batch() is deprecated; use get_Stream_by_id() instead STACK Bio::DB::NCBIHelper::__ANON__ /usr/lib/perl5/Bio/DB/ NCBIHelper.pm:261 STACK toplevel ./fetch.pl:17 Can someone with commit access please change getGenBank.pl? 24,25c24,25 < # if you want to get a bunch of sequences use the batch method < my $seqio = $gb->get_Stream_by_batch([ qw(J00522 AF303112 2981014)]); --- > # feel free to pull multiple sequences > my $seqio = $gb->get_Stream_by_id([ qw(J00522 AF303112 2981014)]); The tweaked version works fine for me and the warning goes away. Thanks, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Fri Sep 14 12:36:56 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 14 Sep 2007 11:36:56 -0500 Subject: [Bioperl-l] [patch] getGenBank.pl In-Reply-To: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> References: <1088BA7F-009A-482E-B15E-80D4D59218BE@jays.net> Message-ID: <5255407E-F19E-45B6-9F66-3DC1AB25C0AC@uiuc.edu> Done. Thanks for the heads up! chris On Sep 14, 2007, at 9:27 AM, Jay Hannah wrote: > http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/ > examples/db/getGenBank.pl > > Using this: > my $seqio = $gb->get_Stream_by_batch([ qw( 124430577 )]); > > Throws this warning: > $ ./fetch.pl > 124430577.gbk > get_Stream_by_batch() is deprecated; use get_Stream_by_id() instead > STACK Bio::DB::NCBIHelper::__ANON__ /usr/lib/perl5/Bio/DB/ > NCBIHelper.pm:261 > STACK toplevel ./fetch.pl:17 > > > Can someone with commit access please change getGenBank.pl? > > 24,25c24,25 > < # if you want to get a bunch of sequences use the batch method > < my $seqio = $gb->get_Stream_by_batch([ qw(J00522 AF303112 > 2981014)]); > --- >> # feel free to pull multiple sequences >> my $seqio = $gb->get_Stream_by_id([ qw(J00522 AF303112 2981014)]); > > > The tweaked version works fine for me and the warning goes away. > > Thanks, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Mon Sep 17 16:15:39 2007 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Mon, 17 Sep 2007 15:15:39 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility Message-ID: Sometime ago, in the ensembl-dev mailing list, mdr wrote: > > I'm always running into bugs in bioperl that have been fixed in more recent > > versions than the version 1-2-3 that the installation document specifies. > > > > Just wondering, are there plans to try to move to 1-5 any time soon, or is that > > not possible for some reason? Or, by any chance, is Ensembl actually compatible > > with 1-5 and it's just a documentation issue? > > "Ewan Birney" replied > Ensembl doesn't make heavy use of Bioperl anymore - most of the critical things > we re-wrote, mainly due to speed/memory issues. I think the short answer is that > it _probably_ works with 1.5, but we don't have a strong desire to move up > as certainly there are no problems with the 1.2.3 release we are using. FWIW, I have just discovered that the round of bioperl changes in service of http://www.bioperl.org/wiki/Feature_Annotation_rollback introduce (additional?) incompatibilities between current bioperl and the Ensembl Core API. The changes bring me to obtain and use Bioperl version 1.2.3 for use in conjunction with Ensemble API application (as is recommended by Ensembl). Until now, the ways I have used the Ensembl API appear not to have been effected by changes in Bioperl; I have successfully used it in conjunction with the bioperl's leading edge. Of course there may be other incompatibilities that I have just not noticed yet. Evidence of the new incompatibility is present in this back trace, which bridges between code in current bioperl-live and current ensembl/modules/Bio: -------------------- EXCEPTION -------------------- MSG: Operator overloading of AnnotationI is deprecated STACK Bio::Annotation::DBLink::__ANON__ /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/DBEntryAdaptor.pm:77 8 Obtaining version 1.2.3 fixes the issue for me. This is just a warning to others.... Your milage may vary.... -- Malcolm Cook Stowers Institute for Medical Research - Kansas City, Missouri From cjfields at uiuc.edu Mon Sep 17 17:52:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 17 Sep 2007 16:52:55 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility In-Reply-To: References: Message-ID: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> Malcolm, I have removed the Bio::Annotation overloading exceptions from bioperl-live; they're just more trouble than they're worth right now. Could you try it out and see if that suffices, and drop us a note if it doesn't or if you run into other odd issues? I'll be busy until the end of the month but I'll do the best I can to help out. The rollbacks were fairly simple and essentially reversed, corrected, or simplified many changes made prior to the 1.5 release (most of which were undocumented and not completely implemented). They pass all current tests and should make BioPerl classes (particularly Annotations and SeqFeatures) behave more like 1.4. Beyond the now- removed exceptions it should be fine unless it is in an area of already-known incompatibility between BioPerl and Ensembl, some of which you've already outlined. chris On Sep 17, 2007, at 3:15 PM, Cook, Malcolm wrote: > ... > FWIW, I have just discovered that the round of bioperl changes in > service of http://www.bioperl.org/wiki/Feature_Annotation_rollback > introduce (additional?) incompatibilities between current bioperl and > the Ensembl Core API. The changes bring me to obtain and use Bioperl > version 1.2.3 for use in conjunction with Ensemble API application (as > is recommended by Ensembl). > > Until now, the ways I have used the Ensembl API appear not to have > been effected by changes in Bioperl; I have successfully used it > in conjunction with the bioperl's leading edge. Of course there > may be > other incompatibilities that I have just not noticed yet. > > Evidence of the new incompatibility is present in this back trace, > which bridges between code in current bioperl-live and current > ensembl/modules/Bio: > > -------------------- EXCEPTION -------------------- > MSG: Operator overloading of AnnotationI is deprecated > STACK Bio::Annotation::DBLink::__ANON__ > /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 > STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type > /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/ > DBEntryAdaptor.pm:77 > 8 > > > Obtaining version 1.2.3 fixes the issue for me. > > This is just a warning to others.... > > Your milage may vary.... > > -- > > Malcolm Cook > Stowers Institute for Medical Research - Kansas City, Missouri From MEC at stowers-institute.org Mon Sep 17 18:14:41 2007 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Mon, 17 Sep 2007 17:14:41 -0500 Subject: [Bioperl-l] Bioperl -- why so old? ... or ... Feature/Annotation rollback breaks Bioperl/Ensembl compatibility In-Reply-To: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> References: <1CAA1977-45AE-4A8F-815C-4C726DB0E6E4@uiuc.edu> Message-ID: Chris, Removing those exceptions makes my application work with current bioperl-live again. Hooray! Thanks. But! I have been warned! Regards, Malcolm Cook Stowers Institute for Medical Research - Kansas City, Missouri > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > Sent: Monday, September 17, 2007 4:53 PM > To: Cook, Malcolm > Cc: bioperl list; ensembl-dev at ebi.ac.uk > Subject: Re: [Bioperl-l] Bioperl -- why so old? ... or ... > Feature/Annotation rollback breaks Bioperl/Ensembl compatibility > > Malcolm, > > I have removed the Bio::Annotation overloading exceptions > from bioperl-live; they're just more trouble than they're > worth right now. Could you try it out and see if that > suffices, and drop us a note if it doesn't or if you run into > other odd issues? I'll be busy until the end of the month > but I'll do the best I can to help out. > > The rollbacks were fairly simple and essentially reversed, > corrected, or simplified many changes made prior to the 1.5 > release (most of which were undocumented and not completely > implemented). They pass all current tests and should make > BioPerl classes (particularly Annotations and SeqFeatures) > behave more like 1.4. Beyond the now- removed exceptions it > should be fine unless it is in an area of already-known > incompatibility between BioPerl and Ensembl, some of which > you've already outlined. > > chris > > On Sep 17, 2007, at 3:15 PM, Cook, Malcolm wrote: > > > ... > > FWIW, I have just discovered that the round of bioperl changes in > > service of http://www.bioperl.org/wiki/Feature_Annotation_rollback > > introduce (additional?) incompatibilities between current > bioperl and > > the Ensembl Core API. The changes bring me to obtain and > use Bioperl > > version 1.2.3 for use in conjunction with Ensemble API > application (as > > is recommended by Ensembl). > > > > Until now, the ways I have used the Ensembl API appear not to have > > been effected by changes in Bioperl; I have successfully used it in > > conjunction with the bioperl's leading edge. Of course > there may be > > other incompatibilities that I have just not noticed yet. > > > > Evidence of the new incompatibility is present in this back trace, > > which bridges between code in current bioperl-live and current > > ensembl/modules/Bio: > > > > -------------------- EXCEPTION -------------------- > > MSG: Operator overloading of AnnotationI is deprecated STACK > > Bio::Annotation::DBLink::__ANON__ > > /home/mec/cvs/bioperl-live/Bio/Annotation/DBLink.pm:59 > > STACK Bio::EnsEMBL::DBSQL::DBEntryAdaptor::_fetch_by_object_type > > /home/mec/cvs/foo/ensembl/modules/Bio/EnsEMBL/DBSQL/ > > DBEntryAdaptor.pm:77 > > 8 > > > > > > Obtaining version 1.2.3 fixes the issue for me. > > > > This is just a warning to others.... > > > > Your milage may vary.... > > > > -- > > > > Malcolm Cook > > Stowers Institute for Medical Research - Kansas City, Missouri > > > From neetisomaiya at gmail.com Tue Sep 18 06:30:34 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 16:00:34 +0530 Subject: [Bioperl-l] A perl regex query Message-ID: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Hi, This isnt really a bioperl query. But does anyone know how I can substitute all special characters (+ some other things) in a string with nothing in perl? I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. -- -Neeti Even my blood says, B positive From spiros at lokku.com Tue Sep 18 06:57:18 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 11:57:18 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: Heya, seperate the items you want to remove by a pipe and add the g regex flag. For example: spiros$ echo Cyclic-2,3-bisphospho-D-glycerate | perl -ne ' $_ =~ s at -D-|Cyclic\-@@g ; print $_ ;' 2,3-bisphosphoglycerate IMHO this is ugly. Best to make an array of all the elements you want to remove and then iterate through the array, calling the regex each time with a different element. This way it will be much more easy to read, debug and maintain. For example my $ra_bad_terms = [ '-D-', 'Cyclic-' ] ; foreach (@$ra_bad_terms) { $string =~ s@$_@@g ; } etc. Dont forget escaping and \Q \E if needed. Spiros On 9/18/07, neeti somaiya wrote: > Hi, > > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > -- > -Neeti > Even my blood says, B positive > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Tue Sep 18 07:44:14 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 18 Sep 2007 12:44:14 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <46EFBA0E.4030104@sheffield.ac.uk> An even better way is to use the array as Spiros suggested, but you should then be able to use that in the regex like this: my @ra_bad_terms = ( '-D-', 'Cyclic-' ); $string =~ s/@ra_bad_terms//g; Again you might need escaping with \Q and \E - can't remember off hand. You might also want to look here: http://www.perl.com/pub/a/2002/06/04/apo5.html?page=15 Cheers Nath Spiros Denaxas wrote: > Heya, seperate the items you want to remove by a pipe and add the g > regex flag. For example: > > spiros$ echo Cyclic-2,3-bisphospho-D-glycerate | perl -ne ' $_ =~ > s at -D-|Cyclic\-@@g ; print $_ ;' > 2,3-bisphosphoglycerate > > IMHO this is ugly. Best to make an array of all the elements you want > to remove and then iterate through the array, calling the regex each > time with a different element. This way it will be much more easy to > read, debug and maintain. > > For example > > my $ra_bad_terms = [ '-D-', 'Cyclic-' ] ; > > foreach (@$ra_bad_terms) { > $string =~ s@$_@@g ; > } > > etc. > > Dont forget escaping and \Q \E if needed. > > Spiros > > > > On 9/18/07, neeti somaiya wrote: > >> Hi, >> >> This isnt really a bioperl query. >> But does anyone know how I can substitute all special characters (+ some >> other things) in a string with nothing in perl? >> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want >> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. >> >> -- >> -Neeti >> Even my blood says, B positive >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From neetisomaiya at gmail.com Tue Sep 18 08:13:42 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 17:43:42 +0530 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFBE8A.6080402@cam.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> Message-ID: <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Thanks. It might work, but not always, because the string could be somthing like Cyclic-2,3-Bisphospho-D-Glycerate. Here I will first convert the full thing to a lower case and would then try to get what I want. Nothing seems to work, when I try to substitute -D- with nothing, "D" and "-" when occuring separately also get substituted with nothing. On 9/18/07, Roy Chaudhuri wrote: > > > This isnt really a bioperl query. > > But does anyone know how I can substitute all special characters (+ some > > other things) in a string with nothing in perl? > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I > want > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > > > A more general approach that might work is to keep lower case words (I > don't know if that will be true for all your cases): > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > print join '', /\b[a-z]+\b/g; > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. > -- -Neeti Even my blood says, B positive From ak at ebi.ac.uk Tue Sep 18 08:20:32 2007 From: ak at ebi.ac.uk (Andreas Kahari) Date: Tue, 18 Sep 2007 13:20:32 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <20070918122032.GV14066@ebi.ac.uk> On Tue, Sep 18, 2007 at 04:00:34PM +0530, neeti somaiya wrote: > Hi, > > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. This is in additions to the suggestions you've already had. If you always want to concatenate the 3rd and 5th part of the string, as delimited by dashes, then you could do this: my $string = 'Cyclic-2,3-bisphospho-D-glycerate'; my $newstring = join( '', ( split( /-/, $string ) )[ 2, 4 ] ); Cheers, Andreas -- Andreas K?h?ri :: Ensembl Software Developer European Bioinformatics Institute (EMBL-EBI) -------------------------------------------- From spiros at lokku.com Tue Sep 18 08:23:49 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 13:23:49 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: Its not impossibe, you just have to use \b to denote the word boundaries :) echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' this-is-a_teststring-D It only gets rid of -D- , all other occurrences of D and - remain intact. Spiros On 9/18/07, neeti somaiya wrote: > Thanks. > It might work, but not always, because the string could be somthing like > Cyclic-2,3-Bisphospho-D-Glycerate. > Here I will first convert the full thing to a lower case and would then try > to get what I want. > > Nothing seems to work, when I try to substitute -D- with nothing, "D" and > "-" when occuring separately also get substituted with nothing. > > On 9/18/07, Roy Chaudhuri wrote: > > > > > This isnt really a bioperl query. > > > But does anyone know how I can substitute all special characters (+ some > > > other things) in a string with nothing in perl? > > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I > > want > > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > > > > > > A more general approach that might work is to keep lower case words (I > > don't know if that will be true for all your cases): > > > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > > print join '', /\b[a-z]+\b/g; > > > > Roy. > > -- > > Dr. Roy Chaudhuri > > Department of Veterinary Medicine > > University of Cambridge, U.K. > > > > > > -- > -Neeti > Even my blood says, B positive > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From rrc22 at cam.ac.uk Tue Sep 18 08:03:22 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 18 Sep 2007 13:03:22 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> Message-ID: <46EFBE8A.6080402@cam.ac.uk> > This isnt really a bioperl query. > But does anyone know how I can substitute all special characters (+ some > other things) in a string with nothing in perl? > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > A more general approach that might work is to keep lower case words (I don't know if that will be true for all your cases): $_='Cyclic-2,3-bisphospho-D-glycerate'; print join '', /\b[a-z]+\b/g; Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. From neetisomaiya at gmail.com Tue Sep 18 08:47:18 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Tue, 18 Sep 2007 18:17:18 +0530 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> My actual problem is a bit more complicated. It is not just one string, nut lakhs of them, they are actually names of chemical compounds. THe problem is there are 2 different data sources, I need to match the compond names between them, but the problem is though the compound may be the same in the two, they use different naming formats for them. eg 1 : Glucose DB1 : D-glucose DB2 : alpha-D-Glucose eg2 : 2,3-bisphosphoglycerate DB1 : Cyclic-2,3-bisphospho-D-Glycerate DB2 : 2,3 bisphoshpglycerate And there are some simple examples, there are even more complicated ones, with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. I just want to see if the basic compond is the same, i.e. the first one will be glucose and second one will be 2,3-biphosphoglycerate (can't take just bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something else). Anyone has any suggestions how to tackle this? Thanks. On 9/18/07, Spiros Denaxas wrote: > > Its not impossibe, you just have to use \b to denote the word boundaries > :) > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > this-is-a_teststring-D > > It only gets rid of -D- , all other occurrences of D and - remain intact. > > Spiros > > > On 9/18/07, neeti somaiya wrote: > > Thanks. > > It might work, but not always, because the string could be somthing like > > Cyclic-2,3-Bisphospho-D-Glycerate. > > Here I will first convert the full thing to a lower case and would then > try > > to get what I want. > > > > Nothing seems to work, when I try to substitute -D- with nothing, "D" > and > > "-" when occuring separately also get substituted with nothing. > > > > On 9/18/07, Roy Chaudhuri wrote: > > > > > > > This isnt really a bioperl query. > > > > But does anyone know how I can substitute all special characters (+ > some > > > > other things) in a string with nothing in perl? > > > > I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and > I > > > want > > > > ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- > etc. > > > > > > > > > > A more general approach that might work is to keep lower case words (I > > > don't know if that will be true for all your cases): > > > > > > $_='Cyclic-2,3-bisphospho-D-glycerate'; > > > print join '', /\b[a-z]+\b/g; > > > > > > Roy. > > > -- > > > Dr. Roy Chaudhuri > > > Department of Veterinary Medicine > > > University of Cambridge, U.K. > > > > > > > > > > > -- > > -Neeti > > Even my blood says, B positive > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > -- -Neeti Even my blood says, B positive From spiros at lokku.com Tue Sep 18 08:56:44 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 13:56:44 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFCA4E.5090605@sendu.me.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <46EFCA4E.5090605@sendu.me.uk> Message-ID: On 9/18/07, Sendu Bala wrote: > Spiros Denaxas wrote: > > Its not impossibe, you just have to use \b to denote the word boundaries :) > > > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > > > this-is-a_teststring-D > > > > It only gets rid of -D- , all other occurrences of D and - remain intact. > > I'm confused. The simpler: > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/-D-//g ; print ;' > > gives the same answer. You'd have to something very strange for a regex > on -D- to match D or - alone. > Its the same thing. He was just mixing up character classes in the regex. Spiros From bix at sendu.me.uk Tue Sep 18 08:53:34 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 18 Sep 2007 13:53:34 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> Message-ID: <46EFCA4E.5090605@sendu.me.uk> Spiros Denaxas wrote: > Its not impossibe, you just have to use \b to denote the word boundaries :) > > echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' > > this-is-a_teststring-D > > It only gets rid of -D- , all other occurrences of D and - remain intact. I'm confused. The simpler: echo 'this-is-a_test-D-string-D' | perl -ne ' s/-D-//g ; print ;' gives the same answer. You'd have to something very strange for a regex on -D- to match D or - alone. From rrc22 at cam.ac.uk Tue Sep 18 09:26:47 2007 From: rrc22 at cam.ac.uk (Roy Chaudhuri) Date: Tue, 18 Sep 2007 14:26:47 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFD217.1030103@cam.ac.uk> > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may > be the same in the two, they use different naming formats for them. Unless you can define in simple and precise terms exactly which parts of the string you need then there is no way that you will be able to code a solution in Perl. Maybe you could look for a database that contains the synonyms for each molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), which is available to download as flat files. Roy. -- Dr. Roy Chaudhuri Department of Veterinary Medicine University of Cambridge, U.K. From js5 at sanger.ac.uk Tue Sep 18 08:58:36 2007 From: js5 at sanger.ac.uk (James Smith) Date: Tue, 18 Sep 2007 13:58:36 +0100 (BST) Subject: [Bioperl-l] A perl regex query In-Reply-To: <20070918122032.GV14066@ebi.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> Message-ID: Neeti, This isn't really a bioperl query - but I will try and explain a simple solution... warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); sub simplify { local $_ = "-$_[0]-"; ## Quick hack add -'s at start and end! as always match "-string-" s/-( Cyclic | # The prefix "cyclic" \d+ | # a single number between two "-"s \d+,\d+| # number,number between two "-"s \w # a single letter between two "-"s )(?=-)//ixg; ## case-insensitive, commented, multiple matches! ## 0-width +ve lookahead assertion - so can match ## multiple consecutive -x- constructions in same regexp! s/-//g; ## remove remaining "-"s from string... } Not sure what other test strings you may want - but most should be able to fit in the () brackets in the first regexp of simplify James On Tue, 18 Sep 2007, Andreas Kahari wrote: > On Tue, Sep 18, 2007 at 04:00:34PM +0530, neeti somaiya wrote: >> Hi, >> >> This isnt really a bioperl query. >> But does anyone know how I can substitute all special characters (+ some >> other things) in a string with nothing in perl? >> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want >> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc. > > This is in additions to the suggestions you've already had. > > If you always want to concatenate the 3rd and 5th part of the string, as > delimited by dashes, then you could do this: > > my $string = 'Cyclic-2,3-bisphospho-D-glycerate'; > my $newstring = join( '', ( split( /-/, $string ) )[ 2, 4 ] ); > > > Cheers, > Andreas > > -- > Andreas K?h?ri :: Ensembl Software Developer > European Bioinformatics Institute (EMBL-EBI) > -------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From stefan.kirov at bms.com Tue Sep 18 09:05:16 2007 From: stefan.kirov at bms.com (Stefan Kirov) Date: Tue, 18 Sep 2007 09:05:16 -0400 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFCD0C.3010306@bms.com> neeti somaiya wrote: > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may be > the same in the two, they use different naming formats for them. > > eg 1 : Glucose > DB1 : D-glucose > DB2 : alpha-D-Glucose > > eg2 : 2,3-bisphosphoglycerate > DB1 : Cyclic-2,3-bisphospho-D-Glycerate > DB2 : 2,3 bisphoshpglycerate > It seems to me you are trying to match 2 collections of chemical compounds. If you need to do this reliably you need to use canonical smiles (perhaps there are other solutions but I am not aware of them). There are many resources for that, including open-source: http://openbabel.sourceforge.net/wiki/Main_Page It is not really bioperl's cup of tea, this is much more a chemi-informatics problem. I am not sure if there is a need for bioperl to be extended this way- any thoughts on that? Hope this helps, regards Stefan > And there are some simple examples, there are even more complicated ones, > with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. > > I just want to see if the basic compond is the same, i.e. the first one will > be glucose and second one will be 2,3-biphosphoglycerate (can't take just > bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something > else). > > Anyone has any suggestions how to tackle this? > > Thanks. > > On 9/18/07, Spiros Denaxas wrote: > >> Its not impossibe, you just have to use \b to denote the word boundaries >> :) >> >> echo 'this-is-a_test-D-string-D' | perl -ne ' s/\b\-D\-\b//g ; print ;' >> >> this-is-a_teststring-D >> >> It only gets rid of -D- , all other occurrences of D and - remain intact. >> >> Spiros >> >> >> On 9/18/07, neeti somaiya wrote: >> >>> Thanks. >>> It might work, but not always, because the string could be somthing like >>> Cyclic-2,3-Bisphospho-D-Glycerate. >>> Here I will first convert the full thing to a lower case and would then >>> >> try >> >>> to get what I want. >>> >>> Nothing seems to work, when I try to substitute -D- with nothing, "D" >>> >> and >> >>> "-" when occuring separately also get substituted with nothing. >>> >>> On 9/18/07, Roy Chaudhuri wrote: >>> >>>>> This isnt really a bioperl query. >>>>> But does anyone know how I can substitute all special characters (+ >>>>> >> some >> >>>>> other things) in a string with nothing in perl? >>>>> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and >>>>> >> I >> >>>> want >>>> >>>>> ouput as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- >>>>> >> etc. >> >>>> A more general approach that might work is to keep lower case words (I >>>> don't know if that will be true for all your cases): >>>> >>>> $_='Cyclic-2,3-bisphospho-D-glycerate'; >>>> print join '', /\b[a-z]+\b/g; >>>> >>>> Roy. >>>> -- >>>> Dr. Roy Chaudhuri >>>> Department of Veterinary Medicine >>>> University of Cambridge, U.K. >>>> >>>> >>> >>> -- >>> -Neeti >>> Even my blood says, B positive >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> > > > > From stephane.teletchea at jouy.inra.fr Tue Sep 18 09:48:05 2007 From: stephane.teletchea at jouy.inra.fr (=?ISO-8859-1?Q?St=E9phane_T=E9letch=E9a?=) Date: Tue, 18 Sep 2007 15:48:05 +0200 Subject: [Bioperl-l] A perl regex query In-Reply-To: <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> Message-ID: <46EFD715.4060308@jouy.inra.fr> neeti somaiya a ?crit : > My actual problem is a bit more complicated. > It is not just one string, nut lakhs of them, they are actually names of > chemical compounds. > > THe problem is there are 2 different data sources, I need to match the > compond names between them, but the problem is though the compound may be > the same in the two, they use different naming formats for them. > > eg 1 : Glucose > DB1 : D-glucose > DB2 : alpha-D-Glucose > > eg2 : 2,3-bisphosphoglycerate > DB1 : Cyclic-2,3-bisphospho-D-Glycerate > DB2 : 2,3 bisphoshpglycerate > > And there are some simple examples, there are even more complicated ones, > with many digits, alhas, betas, hyphens, S, R, cis, trans etc etc. > > I just want to see if the basic compond is the same, i.e. the first one will > be glucose and second one will be 2,3-biphosphoglycerate (can't take just > bisphosphoglycerate because 1,3-bisphosphoglycerate would mean something > else). > > Anyone has any suggestions how to tackle this? > I would use a two step approach : 1 - filter the entries, use a convention, for instance translata all '+' into their 'plus' literal equivalent, change spaces by '_', change all '-' for '_' also, etc 2 - try matching the result, if the match does not work, try to match some characters (for instance, try to remove all non alphabetical characters and see if the resulting produces a match). That's theory, now, you have some time for errors and trials, but i think there is not essay, one shot solution, neither a bioperl facility for handling (bio)chemical compounds. Cheers, St?phane -- St?phane T?letch?a, PhD. http://www.steletch.org Unit? Math?matique Informatique et G?nome http://migale.jouy.inra.fr/mig INRA, Domaine de Vilvert T?l : (33) 134 652 891 78352 Jouy-en-Josas cedex, France Fax : (33) 134 652 901 From puetz at mpipsykl.mpg.de Tue Sep 18 10:12:47 2007 From: puetz at mpipsykl.mpg.de (Benno Puetz) Date: Tue, 18 Sep 2007 16:12:47 +0200 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> Message-ID: <46EFDCDF.6030309@mpipsykl.mpg.de> James Smith wrote: > > Neeti, > > This isn't really a bioperl query - but I will try and explain a simple > solution... > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > sub simplify { > local $_ = "-$_[0]-"; > ## Quick hack add -'s at start and end! as always match > "-string-" > s/-( > Cyclic | # The prefix "cyclic" > \d+ | # a single number between two "-"s > \d+,\d+| # number,number between two "-"s > \w # a single letter between two "-"s > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > ## 0-width +ve lookahead assertion - so can match > ## multiple consecutive -x- constructions in same regexp! > s/-//g; > ## remove remaining "-"s from string... > } > > Not sure what other test strings you may want - but most should be > able to > fit in the () brackets in the first regexp of simplify > > James Along the same line # some test for most of the removals below my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; my @ra_bad_terms = ( '-?(D|R|S)-', '-?([aA]lpha|[bB]eta|[gG]amma)-', '-?([cC]is|[tT]rans)-', '-?[cC]yclic-', # '-?\d+(,\d+)+-', # uncomment to remove numbers, too '(?//g; print lc($string),"\n"; -- Benno P?tz Statistische Genetik Max-Planck-Institut f. Psychiatrie Tel.: +49-89-30622-222 Kraepelinstr. 10 Fax : +49-89-30622-601 80804 M?nchen, Germany From spiros at lokku.com Tue Sep 18 10:41:20 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 18 Sep 2007 15:41:20 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFDCDF.6030309@mpipsykl.mpg.de> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: On 9/18/07, Benno Puetz wrote: > James Smith wrote: > > > > Neeti, > > > > This isn't really a bioperl query - but I will try and explain a simple > > solution... > > > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > > > sub simplify { > > local $_ = "-$_[0]-"; > > ## Quick hack add -'s at start and end! as always match > > "-string-" > > s/-( > > Cyclic | # The prefix "cyclic" > > \d+ | # a single number between two "-"s > > \d+,\d+| # number,number between two "-"s > > \w # a single letter between two "-"s > > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > > ## 0-width +ve lookahead assertion - so can match > > ## multiple consecutive -x- constructions in same regexp! > > s/-//g; > > ## remove remaining "-"s from string... > > } > > > > Not sure what other test strings you may want - but most should be > > able to > > fit in the () brackets in the first regexp of simplify > > > > James > Along the same line > > # some test for most of the removals below > my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; > my @ra_bad_terms = ( '-?(D|R|S)-', > '-?([aA]lpha|[bB]eta|[gG]amma)-', > '-?([cC]is|[tT]rans)-', > '-?[cC]yclic-', > # '-?\d+(,\d+)+-', # uncomment to remove numbers, too > '(? print "$string\n"; > foreach ( @ra_bad_terms ){ > > eval { $string =~ s/$_//g; }; > print "$_:$string\n"; # for feedback only > } > #$string =~ s/<@ra_bad_terms>//g; > > print lc($string),"\n"; > > > -- > Benno P?tz My humble opinion would be to avoid using regular expressions to do your task and try and locate a more valid and centralized information repository to use, be it a database of synonyms or some other indexing code. This will add the required domain knowledge in your solution. Using regular expressions will almost certainly lead to problems and bugs which will be very hard to resolve. Should you decide to go forward and treat everything simply as strings and compare them, I feel this is more of an NLP problem. Spiros From cjfields at uiuc.edu Tue Sep 18 11:24:52 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 10:24:52 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFD217.1030103@cam.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFD217.1030103@cam.ac.uk> Message-ID: <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> On Sep 18, 2007, at 8:26 AM, Roy Chaudhuri wrote: >> My actual problem is a bit more complicated. >> It is not just one string, nut lakhs of them, they are actually >> names of >> chemical compounds. >> >> THe problem is there are 2 different data sources, I need to match >> the >> compond names between them, but the problem is though the compound >> may >> be the same in the two, they use different naming formats for them. > > Unless you can define in simple and precise terms exactly which > parts of > the string you need then there is no way that you will be able to > code a > solution in Perl. > > Maybe you could look for a database that contains the synonyms for > each > molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), > which > is available to download as flat files. > > Roy. > -- > Dr. Roy Chaudhuri > Department of Veterinary Medicine > University of Cambridge, U.K. D'oh! Roy beat me to it; that's what I was going to suggest. I agree; don't trust simple word munging to always get you the correct answer in this case, it's just too complicated to try and catch every case. ChEBI is a good choice; Stefan's suggestion of OpenBabel is also a good one. I would also try not to reinvent the wheel; there may be some modules available via CPAN which do what you need, such as these: http://search.cpan.org/search?query=chem&mode=module or this: http://search.cpan.org/~ghutchis/Chemistry-OpenBabel-1.2.0/ chris From shameer at ncbs.res.in Tue Sep 18 10:57:55 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Tue, 18 Sep 2007 20:27:55 +0530 (IST) Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFDCDF.6030309@mpipsykl.mpg.de> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: <53713.192.168.1.1.1190127475.squirrel@mail.ncbs.res.in> I used this module for my simple chemoinformatics tasks, http://www.perlmol.org/ - PerlMol - Perl Modules for Molecular Chemistry Please explore, you may find something useful. -- > James Smith wrote: >> >> Neeti, >> >> This isn't really a bioperl query - but I will try and explain a simple >> solution... >> >> warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); >> >> sub simplify { >> local $_ = "-$_[0]-"; >> ## Quick hack add -'s at start and end! as always match >> "-string-" >> s/-( >> Cyclic | # The prefix "cyclic" >> \d+ | # a single number between two "-"s >> \d+,\d+| # number,number between two "-"s >> \w # a single letter between two "-"s >> )(?=-)//ixg; ## case-insensitive, commented, multiple matches! >> ## 0-width +ve lookahead assertion - so can match >> ## multiple consecutive -x- constructions in same regexp! >> s/-//g; >> ## remove remaining "-"s from string... >> } >> >> Not sure what other test strings you may want - but most should be >> able to >> fit in the () brackets in the first regexp of simplify >> >> James > Along the same line > > # some test for most of the removals below > my $string = "Alpha-Cyclic-2,3-bi-sphos-1,2,5-pho-D-beta-glycerate"; > my @ra_bad_terms = ( '-?(D|R|S)-', > '-?([aA]lpha|[bB]eta|[gG]amma)-', > '-?([cC]is|[tT]rans)-', > '-?[cC]yclic-', > # '-?\d+(,\d+)+-', # uncomment to remove numbers, > too > '(? print "$string\n"; > foreach ( @ra_bad_terms ){ > > eval { $string =~ s/$_//g; }; > print "$_:$string\n"; # for feedback only > } > #$string =~ s/<@ra_bad_terms>//g; > > print lc($string),"\n"; > > > -- > Benno P?tz > Statistische Genetik > Max-Planck-Institut f. Psychiatrie Tel.: +49-89-30622-222 > Kraepelinstr. 10 Fax : +49-89-30622-601 > 80804 M?nchen, Germany > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From js5 at sanger.ac.uk Tue Sep 18 11:37:57 2007 From: js5 at sanger.ac.uk (James Smith) Date: Tue, 18 Sep 2007 16:37:57 +0100 (BST) Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <20070918122032.GV14066@ebi.ac.uk> <46EFDCDF.6030309@mpipsykl.mpg.de> Message-ID: On Tue, 18 Sep 2007, Spiros Denaxas wrote: > On 9/18/07, Benno Puetz wrote: > James Smith wrote: > > > > Neeti, > > > > This isn't really a bioperl query - but I will try and explain a simple > > solution... > > > > warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' ); > > > > sub simplify { > > local $_ = "-$_[0]-"; > > ## Quick hack add -'s at start and end! as always match > > "-string-" > > s/-( > > Cyclic | # The prefix "cyclic" > > \d+ | # a single number between two "-"s > > \d+,\d+| # number,number between two "-"s > > \w # a single letter between two "-"s > > )(?=-)//ixg; ## case-insensitive, commented, multiple matches! > > ## 0-width +ve lookahead assertion - so can match > > ## multiple consecutive -x- constructions in same regexp! > > s/-//g; > > ## remove remaining "-"s from string... > > } > > > > Not sure what other test strings you may want - but most should be > > able to > > fit in the () brackets in the first regexp of simplify > > > > James > Along the same line > But the point is you don't need to loop over things.... Updated regexp... sub simplify { local $_ = "-$_[0]-"; # Add '-' at start and end! s{-( [cC]yclic | # The prefix "cyclic" [aA]lpha | [bB]eta | [gG]amma | # Alpha/beta/gamma [tT]rans | [cC]is | # Trans/cis [DRS] | # Single letter "D","R" or "S" # \d+(,\d+)* | # list of 1 or more "," separated nos )(?=-)}{}xg; # No. list currently commented out! s/-//g; # remove all "-" s/([^\d,])([\d,])/\1-\2/g; # re-introduce "-" between number/ s/([\d,])([^\d,])/\1-\2/g; # comma and letters s/--/-/g; # remove duplicate "-" signs.. return $_; } -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From cjfields at uiuc.edu Tue Sep 18 11:38:04 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 10:38:04 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFCD0C.3010306@bms.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFCD0C.3010306@bms.com> Message-ID: <68EC5D58-9D84-4692-BD99-F53FC75FD0E7@uiuc.edu> On Sep 18, 2007, at 8:05 AM, Stefan Kirov wrote: > neeti somaiya wrote: >> ... > It seems to me you are trying to match 2 collections of chemical > compounds. If you need to do this reliably you need to use canonical > smiles (perhaps there are other solutions but I am not aware of them). > There are many resources for that, including open-source: > http://openbabel.sourceforge.net/wiki/Main_Page > It is not really bioperl's cup of tea, this is much more a > chemi-informatics problem. I am not sure if there is a need for > bioperl > to be extended this way- any thoughts on that? > Hope this helps, regards > Stefan I would vote nyet myself unless I was convinced that this would be beneficial to bioperl core. Right now I'm not yet there, primarily b/ c of the already available OpenBabel (with available CPAN interface) and other resources, not to mention there are too many areas in bioperl which need more focus (tests, documentation, etc). However, if we do want to incorporate chemi-informatics at some point it could be something which is not integrated into the core architecture and can be installed separately (like network, ext, db, etc). chris From bioperl-list at superfrink.net Tue Sep 18 12:16:48 2007 From: bioperl-list at superfrink.net (bioperl-list at superfrink.net) Date: Tue, 18 Sep 2007 10:16:48 -0600 (MDT) Subject: [Bioperl-l] A perl regex query In-Reply-To: <46EFBA0E.4030104@sheffield.ac.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: On Tue, 18 Sep 2007, Nathan Haigh wrote: > An even better way is to use the array as Spiros suggested, but you > should then be able to use that in the regex like this: > > my @ra_bad_terms = ( '-D-', 'Cyclic-' ); > $string =~ s/@ra_bad_terms//g; I didn't know one could do that. I couldn't get it to work so I asked around. In case anyone else read it and thought about using that code it might only work in Perl 6. Regards, Chad From bix at sendu.me.uk Tue Sep 18 13:21:06 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 18 Sep 2007 18:21:06 +0100 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: <46F00902.9070401@sendu.me.uk> bioperl-list at superfrink.net wrote: > On Tue, 18 Sep 2007, Nathan Haigh wrote: > >> An even better way is to use the array as Spiros suggested, but you >> should then be able to use that in the regex like this: >> >> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >> $string =~ s/@ra_bad_terms//g; > > I didn't know one could do that. I couldn't get it to work so I asked > around. In case anyone else read it and thought about using that code it > might only work in Perl 6. I assumed it was a typo. You can get it to work by adding $" = '|'; before the regex; From cjfields at uiuc.edu Tue Sep 18 13:52:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 12:52:00 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> Message-ID: On Sep 18, 2007, at 11:16 AM, bioperl-list at superfrink.net wrote: > On Tue, 18 Sep 2007, Nathan Haigh wrote: > >> An even better way is to use the array as Spiros suggested, but you >> should then be able to use that in the regex like this: >> >> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >> $string =~ s/@ra_bad_terms//g; > > I didn't know one could do that. I couldn't get it to work so I asked > around. In case anyone else read it and thought about using that > code it > might only work in Perl 6. > > Regards, > Chad I think the problem is what s/@terms//g means. To most it means group substitutions, which you can get by using s/(?:a|b|c|d)//g; to others it means stepwise 's/$old//g for $old (@terms)'. To go from an array of terms to an optimized group regex, use Regexp::List (CPAN to the rescue!): http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/ List.pm chris From cjfields at uiuc.edu Tue Sep 18 14:23:37 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 13:23:37 -0500 Subject: [Bioperl-l] A perl regex query In-Reply-To: <46F00902.9070401@sendu.me.uk> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> Message-ID: <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> On Sep 18, 2007, at 12:21 PM, Sendu Bala wrote: > bioperl-list at superfrink.net wrote: >> On Tue, 18 Sep 2007, Nathan Haigh wrote: >> >>> An even better way is to use the array as Spiros suggested, but you >>> should then be able to use that in the regex like this: >>> >>> my @ra_bad_terms = ( '-D-', 'Cyclic-' ); >>> $string =~ s/@ra_bad_terms//g; >> >> I didn't know one could do that. I couldn't get it to work so I >> asked >> around. In case anyone else read it and thought about using that >> code it >> might only work in Perl 6. > > I assumed it was a typo. You can get it to work by adding > > $" = '|'; > > before the regex; Ah, didn't know that one. Nice, though shouldn't it be localized? The (supposed) advantage of Regexp::List is the regex is optimized for speed; I haven't tried it out myself, so YMMV. chris From stefan.kirov at bms.com Tue Sep 18 14:54:49 2007 From: stefan.kirov at bms.com (Stefan Kirov) Date: Tue, 18 Sep 2007 14:54:49 -0400 Subject: [Bioperl-l] A perl regex query In-Reply-To: <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBE8A.6080402@cam.ac.uk> <764978cf0709180513r4bf44ea3gcd183a5aadcacb23@mail.gmail.com> <764978cf0709180547w69e0fcfbp67976254e19ab253@mail.gmail.com> <46EFD217.1030103@cam.ac.uk> <155C67C0-1F81-4A1C-AD68-A21B4E6918C9@uiuc.edu> Message-ID: <46F01EF9.80003@bms.com> Actually, smiles can be tricky too- you can easily generate non-canonical keys, where InChi is unique (as I understand it at least). It is promoted by IUAPC: http://www.iupac.org/inchi/ and can be generated by OpenBabel. My take is that if you need to map between small molecules InChi might be the best way.. Stefan Chris Fields wrote: > On Sep 18, 2007, at 8:26 AM, Roy Chaudhuri wrote: > > >>> My actual problem is a bit more complicated. >>> It is not just one string, nut lakhs of them, they are actually >>> names of >>> chemical compounds. >>> >>> THe problem is there are 2 different data sources, I need to match >>> the >>> compond names between them, but the problem is though the compound >>> may >>> be the same in the two, they use different naming formats for them. >>> >> Unless you can define in simple and precise terms exactly which >> parts of >> the string you need then there is no way that you will be able to >> code a >> solution in Perl. >> >> Maybe you could look for a database that contains the synonyms for >> each >> molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), >> which >> is available to download as flat files. >> >> Roy. >> -- >> Dr. Roy Chaudhuri >> Department of Veterinary Medicine >> University of Cambridge, U.K. >> > > D'oh! Roy beat me to it; that's what I was going to suggest. I > agree; don't trust simple word munging to always get you the correct > answer in this case, it's just too complicated to try and catch every > case. > > ChEBI is a good choice; Stefan's suggestion of OpenBabel is also a > good one. I would also try not to reinvent the wheel; there may be > some modules available via CPAN which do what you need, such as these: > > http://search.cpan.org/search?query=chem&mode=module > > or this: > > http://search.cpan.org/~ghutchis/Chemistry-OpenBabel-1.2.0/ > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From jason at bioperl.org Tue Sep 18 20:04:05 2007 From: jason at bioperl.org (Jason Stajich) Date: Tue, 18 Sep 2007 17:04:05 -0700 Subject: [Bioperl-l] bioperl + GFF3 audit Message-ID: Something to throw out there for discussion with GFF3 gurus. Maybe we can have a little STATE-OF-GFF3 and compliance at the GMOD workshop after Genome Informatics in Nov? I propose after we get the next stable release out we consider doing a systematic code audit to insure that we can really generate proper GFF3 compliant data from all of our parsers. This would include both good ID/Parent as well as . I'd be happy to also think about making sure we can generate proper GTF/GFF2.5 - whether this means we have a translator that works on these objects or we have to code this into the parser software that creating the sequence features, not sure. The whole Bio::Tools mishmash is a little unsettling when trying to generate standardized output. I'm not really clear if Bio::FeatureIO actually tries to do this properly, but 'gene_id'/'transcript_id' for GTF and ID/Parent 3-level Features for gene->transcript->exon/CDS doesn't really come out properly and I end up writing workarounds on the downstream data. One aspect that is biting is the flat versus multi-level features (genes -> transcripts -> exons) and how we handle them. I think this ought to get fleshed out better so we can really support . A lot of the Bio::Tools parsers are generally pretty laissez fair here about things and we have a variety of non-standard and non-compliant aspects. For example, I am playing with tRNA parsing and I assume that proper GFF3 here is three levels of : gene -> tRNA -> exon with those being the primary_tag names that correspond to the Sequence Ontology. I have modified the code locally to report generic features but which have sub-features that must be extracted. In addition the ID/Parent fields are explicitly filled in and I wonder if we want to do a better job insuring these are meaningfully entered? So if there are interested people out there we can try and hammer out a todo list on the wiki and see if we're generating proper GFF3 in the first place and trying to make sure all the features that get fed out to Bio::FeatureIO or Bio::Tools::GFF can get properly transformed into GFF3 and GTF output. Comments/Volunteers? -jason -- Jason Stajich jason at bioperl.org From cjfields at uiuc.edu Tue Sep 18 23:37:30 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 18 Sep 2007 22:37:30 -0500 Subject: [Bioperl-l] bioperl + GFF3 audit In-Reply-To: References: Message-ID: On Sep 18, 2007, at 7:04 PM, Jason Stajich wrote: > Something to throw out there for discussion with GFF3 gurus. Maybe > we can have a little STATE-OF-GFF3 and compliance at the GMOD > workshop after Genome Informatics in Nov? > > I propose after we get the next stable release out we consider doing > a systematic code audit to insure that we can really generate proper > GFF3 compliant data from all of our parsers. This would include both > good ID/Parent as well as . I'd be happy to also think about making > sure we can generate proper GTF/GFF2.5 - whether this means we have a > translator that works on these objects or we have to code this into > the parser software that creating the sequence features, not sure. > The whole Bio::Tools mishmash is a little unsettling when trying to > generate standardized output. I'm not really clear if Bio::FeatureIO > actually tries to do this properly, but 'gene_id'/'transcript_id' for > GTF and ID/Parent 3-level Features for gene->transcript->exon/CDS > doesn't really come out properly and I end up writing workarounds on > the downstream data. This suggests we should try to get a stable out fairly quickly and work on the next dev straight away. I'm okay with that, though it would be nice to finish up a few loose ends first, the svn move foremost. The Feature/Annotation stuff has been pretty much rolled back so maybe a stable release can be done fairly quickly. My main concern was that any rollback would break FeatureIO or SF::Annotated, but so far FeatureIO and SF::Annotated both pass tests. However, I think both also need better documentation and possibly more/better test coverage. > One aspect that is biting is the flat versus multi-level features > (genes -> transcripts -> exons) and how we handle them. I think this > ought to get fleshed out better so we can really support . A lot of > the Bio::Tools parsers are generally pretty laissez fair here about > things and we have a variety of non-standard and non-compliant > aspects. Agreed. > For example, I am playing with tRNA parsing and I assume that proper > GFF3 here is three levels of : > gene -> tRNA -> exon > with those being the primary_tag names that correspond to the > Sequence Ontology. > > I have modified the code locally to report generic features but which > have sub-features that must be extracted. In addition the ID/Parent > fields are explicitly filled in and I wonder if we want to do a > better job insuring these are meaningfully entered? Would a factory approach work here? For instance, have a Factory which generates the SeqFeature type you want on the fly if passed appropriate parameters and location, say flattened vs unflattened, strictly typed vs lightweight, etc. For that matter, maybe we could reimplement FTHelper in SeqIO to do the same... > So if there are interested people out there we can try and hammer out > a todo list on the wiki and see if we're generating proper GFF3 in > the first place and trying to make sure all the features that get fed > out to Bio::FeatureIO or Bio::Tools::GFF can get properly transformed > into GFF3 and GTF output. > > Comments/Volunteers? > > -jason > > -- > Jason Stajich > jason at bioperl.org I'll be busy 'til mid-Oct but I'll chip in. I'll keep tabs on the wiki. chris From harryzs1981 at yahoo.com.cn Fri Sep 21 10:40:55 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Fri, 21 Sep 2007 22:40:55 +0800 (CST) Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form Message-ID: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Dear all: I have got a set of DAN sequences in FASTA form as following: >gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................................. >gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 /gi=1113120 /ug=Bt.5 /len=405 TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC................................... >gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA................................... I would like to know how to extract CDS sequences from them? Or a Perl program? Thank you for your reply and help. Sincerely yours, Best wishes, Harry --------------------------------- @yahoo.cn ?????????????????????????? From bix at sendu.me.uk Fri Sep 21 11:28:30 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 21 Sep 2007 16:28:30 +0100 Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form In-Reply-To: <815154.22486.qm@web15901.mail.cnb.yahoo.com> References: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Message-ID: <46F3E31E.2010606@sendu.me.uk> sheng zhao wrote: > >gnl|UG|Bt#S37443275 [snip] /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGC [snip] > I would like to know how to extract CDS sequences from them? Or a Perl program? Where did you get the fasta sequences from? It would be easiest to go to the source that originally generated them and get it to give you the CDS coordinates as well. Failing that you can get them from the NCBI database using the gb or gi ids. Someone else will be along to give you the Bioperl code to do that, I'm sure :) From harryzs1981 at yahoo.com.cn Fri Sep 21 12:30:45 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Sat, 22 Sep 2007 00:30:45 +0800 (CST) Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form Message-ID: <352861.19969.qm@web15909.mail.cnb.yahoo.com> Dear sir: Thank you for your relp. I got these sequences from NCBI(ftp.ncbi.nlm.nih.gov/repository/UniGene/Bos_taurus/Bt.seq.uniq.gz). Would you mind to tell me how to get gb or gi ids from this form? Thank you again. Best wishes. Harry Sendu Bala ?????? sheng zhao wrote: > >gnl|UG|Bt#S37443275 [snip] /gb=BC133480 /gi=126717494 /ug=Bt.3 /len=572 > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGC [snip] > I would like to know how to extract CDS sequences from them? Or a Perl program? Where did you get the fasta sequences from? It would be easiest to go to the source that originally generated them and get it to give you the CDS coordinates as well. Failing that you can get them from the NCBI database using the gb or gi ids. Someone else will be along to give you the Bioperl code to do that, I'm sure :) --------------------------------- ???????????????????? From harryzs1981 at yahoo.com.cn Fri Sep 21 12:32:21 2007 From: harryzs1981 at yahoo.com.cn (sheng zhao) Date: Sat, 22 Sep 2007 00:32:21 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20Help=20for=20extracti?= =?gb2312?q?ng=20CDS=20sequences=20from=20FASTA=20form?= In-Reply-To: Message-ID: <548016.71302.qm@web15914.mail.cnb.yahoo.com> Dear Brian O.: Thank you for your reple. I just want to get CDS sequences (DNA sequence) from them according to the CDS information in the title for each sequence. For example, for the first sequence, the information is "complete cds /cds=p(18,452)" . Thank you again! Best wishes! Harry Brian Osborne ?????? Harry, Do you mean find an ORF starting at the first initiation codon and translate that? Or some other approach? Take a look at the translate() method: http://www.bioperl.org/wiki/Bptutorial.pl#Translating Brian O. On 9/21/07 10:40 AM, "sheng zhao" wrote: > Dear all: > I have got a set of DAN sequences in FASTA form as following: > >> gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 >> IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 >> /ug=Bt.3 /len=572 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................ >> gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 >> /gi=1113120 /ug=Bt.5 /len=405 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................. >> gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library >> Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 > > AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA.................. > ................. > > I would like to know how to extract CDS sequences from them? Or a Perl > program? > > Thank you for your reply and help. > > Sincerely yours, > Best wishes, > Harry > > > > > > --------------------------------- > @yahoo.cn ?????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l --------------------------------- ???????????????????? From bosborne11 at verizon.net Fri Sep 21 11:53:46 2007 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 21 Sep 2007 11:53:46 -0400 Subject: [Bioperl-l] Help for extracting CDS sequences from FASTA form In-Reply-To: <815154.22486.qm@web15901.mail.cnb.yahoo.com> Message-ID: Harry, Do you mean find an ORF starting at the first initiation codon and translate that? Or some other approach? Take a look at the translate() method: http://www.bioperl.org/wiki/Bptutorial.pl#Translating Brian O. On 9/21/07 10:40 AM, "sheng zhao" wrote: > Dear all: > I have got a set of DAN sequences in FASTA form as following: > >> gnl|UG|Bt#S37443275 Bos taurus cathelicidin 4, mRNA (cDNA clone MGC:157131 >> IMAGE:8442308), complete cds /cds=p(18,452) /gb=BC133480 /gi=126717494 >> /ug=Bt.3 /len=572 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................ >> gnl|UG|Bt#S11932596 B.taurus mRNA for interleukin-5 /cds=p(1,405) /gb=Z67872 >> /gi=1113120 /ug=Bt.5 /len=405 > > TAGGCAGACTGGGGACCATGCAAACCCAGAGGGCCAGCCTCTCACTGGGGCGGTGGTCAC.................. > ................. >> gnl|UG|Bt#S29311270 Hw_Loin_11_0520_C11 Bos taurus CF-24-HW loin cDNA library >> Bos taurus cDNA, mRNA sequence /gb=DV796078 /gi=82648993 /ug=Bt.10 /len=1332 > > AACCGGGAGCACGCCGTGTACCCGCCAGTGGGGCTTCTGAGGACATGGGGGCCACCGTCA.................. > ................. > > I would like to know how to extract CDS sequences from them? Or a Perl > program? > > Thank you for your reply and help. > > Sincerely yours, > Best wishes, > Harry > > > > > > --------------------------------- > @yahoo.cn ?????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florent.angly at gmail.com Sat Sep 22 01:02:28 2007 From: florent.angly at gmail.com (Florent Angly) Date: Fri, 21 Sep 2007 22:02:28 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> Message-ID: <46F4A1E4.7020002@gmail.com> Hi, I need to quantify how good some overlaps in contigs are. I have extracted the alignment of the overlapping region and only need to score it. I noticed the Bio::Tools::dpAlign has a scoring function. Is it the right tool for the right tool? Is there anything else? Thank you, Florent From florent.angly at gmail.com Sat Sep 22 21:41:40 2007 From: florent.angly at gmail.com (Florent Angly) Date: Sat, 22 Sep 2007 18:41:40 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F4A1E4.7020002@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> Message-ID: <46F5C454.3050005@gmail.com> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it when installing its dependency, bioperl-ext v1.4 or v1.5.1. However all the tests passed when installing the CVS version. So finally, here I am trying to score my alignments. For alignments of 2 small sequences, it works, but as soon as the sequences get bigger than a few dozen nucleotides, it crashes: Segmentation fault (core dumped) I did not find any help in the documentation... Can I fix this? Is this a bug? Thanks for your help, Florent Florent Angly wrote: > Hi, > I need to quantify how good some overlaps in contigs are. I have > extracted the alignment of the overlapping region and only need to > score it. I noticed the Bio::Tools::dpAlign has a scoring function. > Is it the right tool for the right tool? Is there anything else? > Thank you, > Florent > From bioperl-list at superfrink.net Sun Sep 23 12:21:44 2007 From: bioperl-list at superfrink.net (Chad Clark) Date: Sun, 23 Sep 2007 10:21:44 -0600 (MDT) Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F5C454.3050005@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> Message-ID: On Sat, 22 Sep 2007, Florent Angly wrote: > So finally, here I am trying to score my alignments. For alignments of 2 > small sequences, it works, but as soon as the sequences get bigger than > a few dozen nucleotides, it crashes: > Segmentation fault (core dumped) In my experience if a program runs on small sets of data and segfaults on larger sets it is likely running out of stack space. You can try changing the allowed stack size with "ulimit" before running your program and see if it works with more data. [chad at water ~]$ ulimit -a | grep -i stack stack size (kbytes, -s) 10240 [chad at water ~]$ ulimit -s 40960 [chad at water ~]$ ulimit -a | grep -i stack stack size (kbytes, -s) 40960 I don't know the code / algorithm in question but it might require significantly more stack space as the data set grows in size so this change might not help enough. Good luck, Chad From bix at sendu.me.uk Mon Sep 24 05:35:39 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 24 Sep 2007 10:35:39 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? Message-ID: <46F784EB.9050507@sendu.me.uk> Hi, I'm finding that when writing GFF files the version header line gets printed out twice. This is because: sub _initialize { # [snip] if ($arg{-file} =~ /^>.*/ ) { $self->_print("##gff-version " . $self->version() . "\n"); } else { my $directive; while(($directive = $self->_readline()) && ( $directive =~ /^##/ || $directive =~ /^>/)){ $self->_handle_directive($directive); } $self->_pushback($directive); } if ($arg{-file} =~ /^>.*/ ) { $self->_print("##gff-version " . $self->version() . "\n"); } # [snip] } Does it make sense for if ($arg{-file} =~ /^>.*/ ) to appear twice like that? If not, which one should be removed? The independent one, or the if/else one? Cheers, Sendu. From awitney at sgul.ac.uk Mon Sep 24 08:40:07 2007 From: awitney at sgul.ac.uk (Adam Witney) Date: Mon, 24 Sep 2007 13:40:07 +0100 Subject: [Bioperl-l] Minor docs discrepancy? In-Reply-To: Message-ID: Hi, I was just going through the BlastHSP.pm POD and lines 23-24 say "For Bio::SearchIO BLAST parsing usage examples, see the "examples/search-blast" directory of the Bioperl distribution." However in the distribution (1.4 and 1.5.2_102) it looks to be "examples/searchio" Is this in need of updating, or am I looking in the wrong place? Thanks Adam From cjfields at uiuc.edu Mon Sep 24 09:09:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:09:46 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F784EB.9050507@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> Message-ID: <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> It looks like the first is a cut-and-paste revision of the second, so I would say the second independent if block is redundant. Should we be printing output in _initialize()? I would think any output would be handled in a write_* method of some sort and not in a common method used for initializing both input and output stream data. What happens here if you use '-fh' and want output redirected to STDOUT? chris On Sep 24, 2007, at 4:35 AM, Sendu Bala wrote: > Hi, > > I'm finding that when writing GFF files the version header line gets > printed out twice. This is because: > > sub _initialize { > # [snip] > > if ($arg{-file} =~ /^>.*/ ) { > $self->_print("##gff-version " . $self->version() . "\n"); > } > else { > my $directive; > while(($directive = $self->_readline()) && ( $directive =~ / > ^##/ || > $directive =~ /^>/)){ > $self->_handle_directive($directive); > } > $self->_pushback($directive); > } > > if ($arg{-file} =~ /^>.*/ ) { > $self->_print("##gff-version " . $self->version() . "\n"); > } > > # [snip] > } > > Does it make sense for if ($arg{-file} =~ /^>.*/ ) to appear twice > like > that? If not, which one should be removed? The independent one, or the > if/else one? > > > Cheers, > Sendu. From cjfields at uiuc.edu Mon Sep 24 09:13:53 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:13:53 -0500 Subject: [Bioperl-l] Minor docs discrepancy? In-Reply-To: References: Message-ID: I have updated that in CVS. Thanks for pointing that out! chris On Sep 24, 2007, at 7:40 AM, Adam Witney wrote: > > Hi, > > I was just going through the BlastHSP.pm POD and lines 23-24 say > > "For Bio::SearchIO BLAST parsing usage examples, see the > "examples/search-blast" directory of the Bioperl distribution." > > However in the distribution (1.4 and 1.5.2_102) it looks to be > "examples/searchio" > > Is this in need of updating, or am I looking in the wrong place? > > Thanks > > Adam From bix at sendu.me.uk Mon Sep 24 09:20:33 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 24 Sep 2007 14:20:33 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> Message-ID: <46F7B9A1.9080206@sendu.me.uk> Chris Fields wrote: > It looks like the first is a cut-and-paste revision of the second, so I > would say the second independent if block is redundant. I agree. I'll make that change. > Should we be printing output in _initialize()? I would think any output > would be handled in a write_* method of some sort and not in a common > method used for initializing both input and output stream data. What > happens here if you use '-fh' and want output redirected to STDOUT? I think the problem is that the method is write_feature(), which can be called many times for a single output file, but the version should only be printed once at the very start of the file. I suppose it just needs better capturing of when we're intending to write... Hmmm... didn't I fix a method related to that?... Yes, yes I did: Bio::Root::IO->mode ;) Any objections to me replacing the if clause with one using that method? From cjfields at uiuc.edu Mon Sep 24 09:35:22 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 08:35:22 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F7B9A1.9080206@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> Message-ID: On Sep 24, 2007, at 8:20 AM, Sendu Bala wrote: > Chris Fields wrote: >> It looks like the first is a cut-and-paste revision of the second, >> so I would say the second independent if block is redundant. > > I agree. I'll make that change. > > >> Should we be printing output in _initialize()? I would think any >> output would be handled in a write_* method of some sort and not >> in a common method used for initializing both input and output >> stream data. What happens here if you use '-fh' and want output >> redirected to STDOUT? > > I think the problem is that the method is write_feature(), which > can be called many times for a single output file, but the version > should only be printed once at the very start of the file. > > I suppose it just needs better capturing of when we're intending to > write... Hmmm... didn't I fix a method related to that?... > > Yes, yes I did: > Bio::Root::IO->mode > ;) > > Any objections to me replacing the if clause with one using that > method? I think that'll work fine. The other option would be call a print_gff_header() function within write_feature() with the intent to print the header only once, using a flag or similar: if (!$self->header_printed) { $self->print_gff_header; $self->header_printed(1); } chris From hlapp at gmx.net Mon Sep 24 13:41:34 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 24 Sep 2007 13:41:34 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> Message-ID: <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> I'd lean toward this or a similar approach too. Writing stuff out in the constructor doesn't feel like the best design. -hilmar On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: > > On Sep 24, 2007, at 8:20 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> It looks like the first is a cut-and-paste revision of the second, >>> so I would say the second independent if block is redundant. >> >> I agree. I'll make that change. >> >> >>> Should we be printing output in _initialize()? I would think any >>> output would be handled in a write_* method of some sort and not >>> in a common method used for initializing both input and output >>> stream data. What happens here if you use '-fh' and want output >>> redirected to STDOUT? >> >> I think the problem is that the method is write_feature(), which >> can be called many times for a single output file, but the version >> should only be printed once at the very start of the file. >> >> I suppose it just needs better capturing of when we're intending to >> write... Hmmm... didn't I fix a method related to that?... >> >> Yes, yes I did: >> Bio::Root::IO->mode >> ;) >> >> Any objections to me replacing the if clause with one using that >> method? > > I think that'll work fine. The other option would be call a > print_gff_header() function within write_feature() with the intent to > print the header only once, using a flag or similar: > > if (!$self->header_printed) { > $self->print_gff_header; > $self->header_printed(1); > } > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Sep 24 14:11:47 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 13:11:47 -0500 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F5C454.3050005@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> Message-ID: <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> As Chad mentioned it could be a stack issue, but it might be worth filing a bug on. I will note that bioperl-ext has seen very little use in the last few years, so don't expect it to be fixed unless you can contact the ext module author. chris On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: > Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it > when installing its dependency, bioperl-ext v1.4 or v1.5.1. However > all > the tests passed when installing the CVS version. > So finally, here I am trying to score my alignments. For alignments > of 2 > small sequences, it works, but as soon as the sequences get bigger > than > a few dozen nucleotides, it crashes: > Segmentation fault (core dumped) > > I did not find any help in the documentation... > > Can I fix this? Is this a bug? > > Thanks for your help, > > Florent > > Florent Angly wrote: >> Hi, >> I need to quantify how good some overlaps in contigs are. I have >> extracted the alignment of the overlapping region and only need to >> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >> Is it the right tool for the right tool? Is there anything else? >> Thank you, >> Florent >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florent.angly at gmail.com Mon Sep 24 15:07:35 2007 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 24 Sep 2007 12:07:35 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> Message-ID: <46F80AF7.4090109@gmail.com> I see... Thanks for the replies Chad and Chris. Then I have two more questions! 1/ Do you know how to get a core dump that could help debug my segmentation fault? I have produced dumps of binary C programs before with gdb. I have used the Perl debugger for Perl scripts. But how to deal with C functions called by Perl? 2/ Is there an easier method to calculate an alignment score in BioPerl than using Bio::Tools::dpAlign? I didn't seem to locate something else, but who knows... I have workarounds for quantifying the quality of the overlap, so calculating a score is not critical for me (though I believe this would be the most accurate/adapted method). Florent Chris Fields wrote: > As Chad mentioned it could be a stack issue, but it might be worth > filing a bug on. I will note that bioperl-ext has seen very little > use in the last few years, so don't expect it to be fixed unless you > can contact the ext module author. > > chris > > On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: > >> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it >> when installing its dependency, bioperl-ext v1.4 or v1.5.1. However all >> the tests passed when installing the CVS version. >> So finally, here I am trying to score my alignments. For alignments of 2 >> small sequences, it works, but as soon as the sequences get bigger than >> a few dozen nucleotides, it crashes: >> Segmentation fault (core dumped) >> >> I did not find any help in the documentation... >> >> Can I fix this? Is this a bug? >> >> Thanks for your help, >> >> Florent >> >> Florent Angly wrote: >>> Hi, >>> I need to quantify how good some overlaps in contigs are. I have >>> extracted the alignment of the overlapping region and only need to >>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>> Is it the right tool for the right tool? Is there anything else? >>> Thank you, >>> Florent >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From cjfields at uiuc.edu Mon Sep 24 15:26:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 24 Sep 2007 14:26:46 -0500 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <46F80AF7.4090109@gmail.com> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> <46F80AF7.4090109@gmail.com> Message-ID: <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> I suppose if you can find a way to export the contig data into a Bio::SimpleAlign you look at the methods in Bio::Align::DNAStatistics. SimpleAlign also has some builtin methods like average_percentage_identity, percentage_identity, etc, which may be worth a look. chris On Sep 24, 2007, at 2:07 PM, Florent Angly wrote: > I see... Thanks for the replies Chad and Chris. Then I have two more > questions! > 1/ Do you know how to get a core dump that could help debug my > segmentation fault? I have produced dumps of binary C programs before > with gdb. I have used the Perl debugger for Perl scripts. But how to > deal with C functions called by Perl? > 2/ Is there an easier method to calculate an alignment score in > BioPerl > than using Bio::Tools::dpAlign? I didn't seem to locate something > else, > but who knows... > I have workarounds for quantifying the quality of the overlap, so > calculating a score is not critical for me (though I believe this > would > be the most accurate/adapted method). > Florent > > Chris Fields wrote: >> As Chad mentioned it could be a stack issue, but it might be worth >> filing a bug on. I will note that bioperl-ext has seen very little >> use in the last few years, so don't expect it to be fixed unless you >> can contact the ext module author. >> >> chris >> >> On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: >> >>> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck >>> running it >>> when installing its dependency, bioperl-ext v1.4 or v1.5.1. >>> However all >>> the tests passed when installing the CVS version. >>> So finally, here I am trying to score my alignments. For >>> alignments of 2 >>> small sequences, it works, but as soon as the sequences get >>> bigger than >>> a few dozen nucleotides, it crashes: >>> Segmentation fault (core dumped) >>> >>> I did not find any help in the documentation... >>> >>> Can I fix this? Is this a bug? >>> >>> Thanks for your help, >>> >>> Florent >>> >>> Florent Angly wrote: >>>> Hi, >>>> I need to quantify how good some overlaps in contigs are. I have >>>> extracted the alignment of the overlapping region and only need to >>>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>>> Is it the right tool for the right tool? Is there anything else? >>>> Thank you, >>>> Florent >>>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florent.angly at gmail.com Mon Sep 24 15:46:36 2007 From: florent.angly at gmail.com (Florent Angly) Date: Mon, 24 Sep 2007 12:46:36 -0700 Subject: [Bioperl-l] Scoring an overlap In-Reply-To: <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> References: <764978cf0709180330i232c2f04yf033414ad0dba655@mail.gmail.com> <46EFBA0E.4030104@sheffield.ac.uk> <46F00902.9070401@sendu.me.uk> <5C811C68-D049-407A-946D-061DD8951B54@uiuc.edu> <46F4A1E4.7020002@gmail.com> <46F5C454.3050005@gmail.com> <9347CF3A-7F01-4C73-82A4-3B164420EF04@uiuc.edu> <46F80AF7.4090109@gmail.com> <0C6508FB-4765-4E4E-AFD1-6E5BFE8F9368@uiuc.edu> Message-ID: <46F8141C.8010507@gmail.com> Yes, right! That's the methods I use as my workarounds. =) Thanks for suggesting. Florent Chris Fields wrote: > I suppose if you can find a way to export the contig data into a > Bio::SimpleAlign you look at the methods in > Bio::Align::DNAStatistics. SimpleAlign also has some builtin methods > like average_percentage_identity, percentage_identity, etc, which may > be worth a look. > > chris > > On Sep 24, 2007, at 2:07 PM, Florent Angly wrote: > >> I see... Thanks for the replies Chad and Chris. Then I have two more >> questions! >> 1/ Do you know how to get a core dump that could help debug my >> segmentation fault? I have produced dumps of binary C programs before >> with gdb. I have used the Perl debugger for Perl scripts. But how to >> deal with C functions called by Perl? >> 2/ Is there an easier method to calculate an alignment score in BioPerl >> than using Bio::Tools::dpAlign? I didn't seem to locate something else, >> but who knows... >> I have workarounds for quantifying the quality of the overlap, so >> calculating a score is not critical for me (though I believe this would >> be the most accurate/adapted method). >> Florent >> >> Chris Fields wrote: >>> As Chad mentioned it could be a stack issue, but it might be worth >>> filing a bug on. I will note that bioperl-ext has seen very little >>> use in the last few years, so don't expect it to be fixed unless you >>> can contact the ext module author. >>> >>> chris >>> >>> On Sep 22, 2007, at 8:41 PM, Florent Angly wrote: >>> >>>> Eventually, I gave Bio::Tools::dpAlign a try. I had no luck running it >>>> when installing its dependency, bioperl-ext v1.4 or v1.5.1. However >>>> all >>>> the tests passed when installing the CVS version. >>>> So finally, here I am trying to score my alignments. For alignments >>>> of 2 >>>> small sequences, it works, but as soon as the sequences get bigger >>>> than >>>> a few dozen nucleotides, it crashes: >>>> Segmentation fault (core dumped) >>>> >>>> I did not find any help in the documentation... >>>> >>>> Can I fix this? Is this a bug? >>>> >>>> Thanks for your help, >>>> >>>> Florent >>>> >>>> Florent Angly wrote: >>>>> Hi, >>>>> I need to quantify how good some overlaps in contigs are. I have >>>>> extracted the alignment of the overlapping region and only need to >>>>> score it. I noticed the Bio::Tools::dpAlign has a scoring function. >>>>> Is it the right tool for the right tool? Is there anything else? >>>>> Thank you, >>>>> Florent >>>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From bix at sendu.me.uk Tue Sep 25 06:00:20 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 25 Sep 2007 11:00:20 +0100 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> Message-ID: <46F8DC34.6020908@sendu.me.uk> Hilmar Lapp wrote: > On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >> I think that'll work fine. The other option would be call a >> print_gff_header() function within write_feature() with the intent to >> print the header only once, using a flag or similar: >> >> if (!$self->header_printed) { >> $self->print_gff_header; >> $self->header_printed(1); >> } > > I'd lean toward this or a similar approach too. Writing stuff out in the > constructor doesn't feel like the best design. I'd argue that the alternative is just inefficient with no compensating benefit. You have something that must only be done once, and a method (_initialize) that is only called once. The constructor is used to set up the file, getting it into a state ready to add features. This involves opening it for writing with the correct filename and setting the desired GFF version. Why wouldn't it also output what ever else was necessary it initialize the file? Also, what do we expect should happen when we use Bioperl to create a GFF file and don't write any features to it? Should it be an empty file, or should it contain whatever GFF information the user had managed to supply (the version)? From cjfields at uiuc.edu Tue Sep 25 10:14:04 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 25 Sep 2007 09:14:04 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F8DC34.6020908@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> Message-ID: <22FB7AE5-2E1C-450C-A48C-6014CC5EB786@uiuc.edu> On Sep 25, 2007, at 5:00 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>> I think that'll work fine. The other option would be call a >>> print_gff_header() function within write_feature() with the >>> intent to >>> print the header only once, using a flag or similar: >>> >>> if (!$self->header_printed) { >>> $self->print_gff_header; >>> $self->header_printed(1); >>> } >> >> I'd lean toward this or a similar approach too. Writing stuff out >> in the >> constructor doesn't feel like the best design. > > I'd argue that the alternative is just inefficient with no > compensating > benefit. You have something that must only be done once, and a method > (_initialize) that is only called once. The constructor is used to set > up the file, getting it into a state ready to add features. This > involves opening it for writing with the correct filename and setting > the desired GFF version. Why wouldn't it also output what ever else > was > necessary it initialize the file? It's great to have someone picking this up, so anything that works is fine by me, to tell the truth. I'll state my piece, though, and stand out of the way. In my opinion there are a couple of compensating benefits. One is long-term maintenance, primarily being all calls to generate output are contained within the write_features method and are thus easier to find and more maintainable. The logic goes, if there were a bug I would expect to find output in a write_* method or a method called from within write_* method, not in the constructor or something called from the constructor, like _initialize(). I've always been told the constructor in any OO language is typically limited to setting state data and behavior, not generating new data (i.e. generating output). Related to that, the other benefit is expected behavior when calling a method. I don't know of cases in other IO classes in Bioperl which generate output when a new() instance is created; output is expected specifically when calling a particular write_* method. Therefore I wouldn't expect any output be generated until write_feature() were called (or calling similarly named methods where output would be expected, like a print_* method). Saying all that, I'm probably not the best one to bang the 'best practices' drum right now as I haven't had time to finish up several modules I've been working on! Speaking of (going back to work...) > Also, what do we expect should happen when we use Bioperl to create a > GFF file and don't write any features to it? Should it be an empty > file, > or should it contain whatever GFF information the user had managed to > supply (the version)? As mentioned above, I would expect no output generated at all unless explicitly calling write_features(), just like any of the other IOs; the header info would be generated then. BioPerl has traditionally been for whatever works though, which is fine by me. chris From forrest_zhang at 163.com Thu Sep 27 03:41:44 2007 From: forrest_zhang at 163.com (Forrest) Date: Thu, 27 Sep 2007 15:41:44 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error Message-ID: <000501c800d9$dc9c8e90$95d5abb0$@com> Hi, all I install the biosql, and bioperl-db. I want to import swissport data. But the programe show some error as below: ============================================================================ =============================================== >perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat Loading /home/forrest/uniprot/uniprot_sprot.dat ... Could not store Q6DAH5: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium | Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | Proteobacteria | Bacteria') STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 STACK: Bio::Species::classification /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:552 STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: load_seqdatabase.pl:620 ----------------------------------------------------------- at load_seqdatabase.pl line 633 ============================================================================ =============================================== How can I solve it, please help me, Thank you. Thanks Forrest zhang From bix at sendu.me.uk Thu Sep 27 04:38:00 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 27 Sep 2007 09:38:00 +0100 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000501c800d9$dc9c8e90$95d5abb0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> Message-ID: <46FB6BE8.9050203@sendu.me.uk> Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import swissport data. > But the programe show some error as below: > ============================================================================ > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') From: OS Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum). OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; OC Enterobacteriaceae; Pectobacterium. I'm guessing some oddity in the Swissprot parser where in one place it truncates the OS to the first '.', and in another it doesn't? Can someone confirm this with CVS versions of bioperl-live and -db in case Chris already fixed it in recent parser changes? From cjfields at uiuc.edu Thu Sep 27 08:47:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 07:47:00 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <46FB6BE8.9050203@sendu.me.uk> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <46FB6BE8.9050203@sendu.me.uk> Message-ID: On Sep 27, 2007, at 3:38 AM, Sendu Bala wrote: > Forrest wrote: >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> ======= >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') > > From: > OS Erwinia carotovora subsp. atroseptica (Pectobacterium > atrosepticum). > OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; > OC Enterobacteriaceae; Pectobacterium. > > I'm guessing some oddity in the Swissprot parser where in one place it > truncates the OS to the first '.', and in another it doesn't? > > Can someone confirm this with CVS versions of bioperl-live and -db in > case Chris already fixed it in recent parser changes? It looks suspiciously like he isn't using bioperl-live code (I changed the exception to a warning a while back, and I think the '.' truncation was fixed). This is still an outstanding issue with bioperl-db which hasn't been fully fixed yet, though; we may have to move the priority up on this one. chris From bix at sendu.me.uk Thu Sep 27 09:47:06 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 27 Sep 2007 14:47:06 +0100 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question Message-ID: <46FBB45A.10505@sendu.me.uk> I want to create a Bio::SeqFeature::Annotated object where the 'type' is 'conserved_region'. I got the idea that 'conserved_region' might be ok from here: http://song.sourceforge.net/SOterm_tables.html#SO:0000330 However, this doesn't work since: ------------- EXCEPTION ------------- MSG: couldn't find a SOFA term matching type 'conserved_region'. STACK Bio::SeqFeature::Annotated::type /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/Annotated.pm:371 [snip] I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA terms from: http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition I don't know much about this area. Can someone offer a little guidance as to what the significance of these two different files is, why they don't contain the same terms, and why I can't use 'conserved_region'? What's the closest alternative term? From cain.cshl at gmail.com Thu Sep 27 10:20:50 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 27 Sep 2007 10:20:50 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <46FBB45A.10505@sendu.me.uk> References: <46FBB45A.10505@sendu.me.uk> Message-ID: <1190902850.12078.26.camel@localhost.localdomain> Hi Sendu, I believe that BSFA uses SOFA but the growing consensus is that SOFA should be pitched and all of SO should be used where SOFA was being used. I also suspect that BioPerl is using a very old version of SOFA, since at the time BSFA was written, BioPerl couldn't parse OBO files (can it now?), so it was using the very old file format (whose name I can't even remember now) and that file hasn't been updated in a long time (which is why it isn't finding conserved_region). If BioPerl can parse OBO files, we should switch BSFA to validate against http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo Scott On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: > I want to create a Bio::SeqFeature::Annotated object where the 'type' is > 'conserved_region'. > > I got the idea that 'conserved_region' might be ok from here: > http://song.sourceforge.net/SOterm_tables.html#SO:0000330 > > However, this doesn't work since: > > ------------- EXCEPTION ------------- > MSG: couldn't find a SOFA term matching type 'conserved_region'. > STACK Bio::SeqFeature::Annotated::type > /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/Annotated.pm:371 > [snip] > > > I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA > terms from: > http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition > > > I don't know much about this area. Can someone offer a little guidance > as to what the significance of these two different files is, why they > don't contain the same terms, and why I can't use 'conserved_region'? > > What's the closest alternative term? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cjfields at uiuc.edu Thu Sep 27 11:25:13 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 10:25:13 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <1190902850.12078.26.camel@localhost.localdomain> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> Message-ID: <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: > Hi Sendu, > > I believe that BSFA uses SOFA but the growing consensus is that SOFA > should be pitched and all of SO should be used where SOFA was being > used. I also suspect that BioPerl is using a very old version of > SOFA, > since at the time BSFA was written, BioPerl couldn't parse OBO files > (can it now?), so it was using the very old file format (whose name I > can't even remember now) and that file hasn't been updated in a long > time (which is why it isn't finding conserved_region). > > If BioPerl can parse OBO files, we should switch BSFA to validate > against > > http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo > > Scott I agree, this would definitely be for the best. BioPerl can parse obo; not sure how often it's used or what the tests are like, but switching to SO should give it a good workout and might wring out any issues. chris > On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: >> I want to create a Bio::SeqFeature::Annotated object where the >> 'type' is >> 'conserved_region'. >> >> I got the idea that 'conserved_region' might be ok from here: >> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 >> >> However, this doesn't work since: >> >> ------------- EXCEPTION ------------- >> MSG: couldn't find a SOFA term matching type 'conserved_region'. >> STACK Bio::SeqFeature::Annotated::type >> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ >> Annotated.pm:371 >> [snip] >> >> >> I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA >> terms from: >> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition >> >> >> I don't know much about this area. Can someone offer a little >> guidance >> as to what the significance of these two different files is, why they >> don't contain the same terms, and why I can't use 'conserved_region'? >> >> What's the closest alternative term? >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- > ---------------------------------------------------------------------- > -- > Scott Cain, Ph. D. > cain at cshl.edu > GMOD Coordinator (http://www.gmod.org/) > 216-392-3087 > Cold Spring Harbor Laboratory > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Research Associate Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cain.cshl at gmail.com Thu Sep 27 11:34:57 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 27 Sep 2007 11:34:57 -0400 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> Message-ID: <1190907297.12078.32.camel@localhost.localdomain> OK--while I would normal volunteer to do this, I don't think I am going to have time until after the Genome Informatics and GMOD meetings in November :-/ If it is still not done then, somebody poke me and remind me that I said that. Scott On Thu, 2007-09-27 at 10:25 -0500, Chris Fields wrote: > On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: > > > Hi Sendu, > > > > I believe that BSFA uses SOFA but the growing consensus is that SOFA > > should be pitched and all of SO should be used where SOFA was being > > used. I also suspect that BioPerl is using a very old version of > > SOFA, > > since at the time BSFA was written, BioPerl couldn't parse OBO files > > (can it now?), so it was using the very old file format (whose name I > > can't even remember now) and that file hasn't been updated in a long > > time (which is why it isn't finding conserved_region). > > > > If BioPerl can parse OBO files, we should switch BSFA to validate > > against > > > > http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo > > > > Scott > > I agree, this would definitely be for the best. BioPerl can parse > obo; not sure how often it's used or what the tests are like, but > switching to SO should give it a good workout and might wring out any > issues. > > chris > > > On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: > >> I want to create a Bio::SeqFeature::Annotated object where the > >> 'type' is > >> 'conserved_region'. > >> > >> I got the idea that 'conserved_region' might be ok from here: > >> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 > >> > >> However, this doesn't work since: > >> > >> ------------- EXCEPTION ------------- > >> MSG: couldn't find a SOFA term matching type 'conserved_region'. > >> STACK Bio::SeqFeature::Annotated::type > >> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ > >> Annotated.pm:371 > >> [snip] > >> > >> > >> I'm guessing Bio::Ontology::OntologyStore is getting its allowed SOFA > >> terms from: > >> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition > >> > >> > >> I don't know much about this area. Can someone offer a little > >> guidance > >> as to what the significance of these two different files is, why they > >> don't contain the same terms, and why I can't use 'conserved_region'? > >> > >> What's the closest alternative term? > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > > ---------------------------------------------------------------------- > > -- > > Scott Cain, Ph. D. > > cain at cshl.edu > > GMOD Coordinator (http://www.gmod.org/) > > 216-392-3087 > > Cold Spring Harbor Laboratory > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Research Associate > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain.cshl at gmail.com GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory From cjfields at uiuc.edu Thu Sep 27 11:43:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 10:43:06 -0500 Subject: [Bioperl-l] Bio::SeqFeature::Annotated / SOFA question In-Reply-To: <1190907297.12078.32.camel@localhost.localdomain> References: <46FBB45A.10505@sendu.me.uk> <1190902850.12078.26.camel@localhost.localdomain> <29DFE331-33F5-47ED-9A6D-AF488E5F0689@uiuc.edu> <1190907297.12078.32.camel@localhost.localdomain> Message-ID: Actually, I just added 'Sequence Ontology OBO' to Bio::Ontology::DocumentRegistry and switched BSFA over to use that in bioperl-live. So far it still passes tests checking SO using obo. Sendu, does that work or crash-and-burn? chris On Sep 27, 2007, at 10:34 AM, Scott Cain wrote: > OK--while I would normal volunteer to do this, I don't think I am > going > to have time until after the Genome Informatics and GMOD meetings in > November :-/ If it is still not done then, somebody poke me and > remind > me that I said that. > > Scott > > > On Thu, 2007-09-27 at 10:25 -0500, Chris Fields wrote: >> On Sep 27, 2007, at 9:20 AM, Scott Cain wrote: >> >>> Hi Sendu, >>> >>> I believe that BSFA uses SOFA but the growing consensus is that SOFA >>> should be pitched and all of SO should be used where SOFA was being >>> used. I also suspect that BioPerl is using a very old version of >>> SOFA, >>> since at the time BSFA was written, BioPerl couldn't parse OBO files >>> (can it now?), so it was using the very old file format (whose >>> name I >>> can't even remember now) and that file hasn't been updated in a long >>> time (which is why it isn't finding conserved_region). >>> >>> If BioPerl can parse OBO files, we should switch BSFA to validate >>> against >>> >>> http://song.cvs.sourceforge.net/*checkout*/song/ontology/so.obo >>> >>> Scott >> >> I agree, this would definitely be for the best. BioPerl can parse >> obo; not sure how often it's used or what the tests are like, but >> switching to SO should give it a good workout and might wring out any >> issues. >> >> chris >> >>> On Thu, 2007-09-27 at 14:47 +0100, Sendu Bala wrote: >>>> I want to create a Bio::SeqFeature::Annotated object where the >>>> 'type' is >>>> 'conserved_region'. >>>> >>>> I got the idea that 'conserved_region' might be ok from here: >>>> http://song.sourceforge.net/SOterm_tables.html#SO:0000330 >>>> >>>> However, this doesn't work since: >>>> >>>> ------------- EXCEPTION ------------- >>>> MSG: couldn't find a SOFA term matching type 'conserved_region'. >>>> STACK Bio::SeqFeature::Annotated::type >>>> /data/bioinf/home/sb/current/bioperl-core/Bio/SeqFeature/ >>>> Annotated.pm:371 >>>> [snip] >>>> >>>> >>>> I'm guessing Bio::Ontology::OntologyStore is getting its allowed >>>> SOFA >>>> terms from: >>>> http://umn.dl.sourceforge.net/sourceforge/song/sofa.definition >>>> >>>> >>>> I don't know much about this area. Can someone offer a little >>>> guidance >>>> as to what the significance of these two different files is, why >>>> they >>>> don't contain the same terms, and why I can't use >>>> 'conserved_region'? >>>> >>>> What's the closest alternative term? >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> -- >>> -------------------------------------------------------------------- >>> -- >>> -- >>> Scott Cain, Ph. D. >>> cain at cshl.edu >>> GMOD Coordinator (http://www.gmod.org/) >>> 216-392-3087 >>> Cold Spring Harbor Laboratory >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Research Associate >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> > -- > ---------------------------------------------------------------------- > -- > Scott Cain, Ph. D. > cain.cshl at gmail.com > GMOD Coordinator (http://www.gmod.org/) > 216-392-3087 > Cold Spring Harbor Laboratory > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Sep 27 18:17:16 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 27 Sep 2007 18:17:16 -0400 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000501c800d9$dc9c8e90$95d5abb0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> Message-ID: <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From wcnelson at usc.edu Thu Sep 27 15:20:47 2007 From: wcnelson at usc.edu (William C. Nelson) Date: Thu, 27 Sep 2007 15:20:47 -0400 Subject: [Bioperl-l] cpan install Message-ID: <46FC028F.3050000@usc.edu> Hello, I tried to install v1.5.2 using cpan. My urllist looks like: cpan[2]> o conf urllist urllist 0 [ftp://cpan.cs.utah.edu/pub/CPAN/] 1 [ftp://cpan.mirrors.tds.net/pub/CPAN] 2 [ftp://ftp.open-bio.org/pub/bioperl/DIST/] Type 'o conf' to view all configuration items And when I look for bioperl, I see: cpan[1]> d /bioperl/ CPAN: Storable loaded ok (v2.16) Going to read /root/.cpan/Metadata Database was generated on Thu, 27 Sep 2007 18:36:44 GMT Distribution BIRNEY/bioperl-1.2.1.tar.gz Distribution BIRNEY/bioperl-1.2.2.tar.gz Distribution BIRNEY/bioperl-1.2.3.tar.gz Distribution BIRNEY/bioperl-1.2.tar.gz Distribution BIRNEY/bioperl-1.4.tar.gz Distribution BIRNEY/bioperl-db-0.1.tar.gz Distribution BIRNEY/bioperl-ext-1.4.tar.gz Distribution BIRNEY/bioperl-gui-0.7.tar.gz Distribution BIRNEY/bioperl-run-1.2.2.tar.gz Distribution BIRNEY/bioperl-run-1.4.tar.gz Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz 12 items found No v 1.5.2. This may be because it can't see the distribution at ftp://ftp.open-bio.org/pub/bioperl/DIST/. When I try to reload the index, I get messages saying cpan can't find the files ftp://ftp.open-bio.org/pub/bioperl/DIST/authors/01mailrc.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/02packages.details.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/03modlist.data.gz. Am I doing something wrong? Does the FTP site need to be updated? Thanks, Bill -- ----------------------------------------------------- William C. Nelson, PhD Research Asst Professor Wrigely Institute for Environmental Studies University of Southern California LAS/MEB 310-510-4097 wcnelson at usc.edu From wgallin at ualberta.ca Thu Sep 27 22:51:12 2007 From: wgallin at ualberta.ca (Warren Gallin) Date: Thu, 27 Sep 2007 20:51:12 -0600 Subject: [Bioperl-l] A couple Eutilities questions Message-ID: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> I've just started using Bio::DB::Eutilities and I have encountered two things that seem like problems. I am using the latest (retrieved Wednesday September 26, 2007) CVS version, running in an Apple Xserver. Problem 1: When I execute the following code: #Create new EUTILS object for retrieving sets of entries, given an array of accession numbers my $gpeptfactory = Bio::DB::EUtilities -> new( -eutil => 'efetch', -db => 'protein', -rettype =>'genbank', -id => \@pro_acc) ; my $file = 'temp_hold.gb'; $gpeptfactory -> get_Response(-file => $file); my $retr_seq = Bio::SeqIO->new( -file => $file, -format => 'genbank'); I get the following warning, consistently: Use of uninitialized value in concatenation (.) or string at /Library/ Perl/5.8.1/Bio/DB/GenericWebAgent.pm line 92. Also, about half the time I get a crash with the following error message: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Response Error Bad Gateway STACK: Error::throw STACK: Bio::Root::Root::throw /Library/Perl/5.8.1/Bio/Root/Root.pm:357 STACK: Bio::DB::GenericWebAgent::get_Response /Library/Perl/5.8.1/Bio/ DB/GenericWebAgent.pm:184 STACK: gb_update_v4.pl:118 ----------------------------------------------------------- The other half of the time the script runs fine through to the end. I have no idea whether the crash is related to the warning or not. I looked at the line where the warning is generated, and it appears to be the "new" method for the GenericWebAgent.pm . I can't see how the call to Eutilities is can be passing an undefined value through to this method. Problem #2: When the code runs, I retrieve an incorrect record. I am retrieving using accessions, and accession I51532 retrieves two records. One is the record I am after, an ion channel protein, the other comes from a patent application; the problem is that, although the accession number for the unwanted record is AAB76204, the LOCUS entry in the record is I51532. So, is it possible that the efetch function is collecting on the basis of LOCUS, not ACCESSION? I realize that the two are almost always the same, but not apparently in this case. Any advice and/or explanation is appreciated. Warren Gallin From forrest_zhang at 163.com Thu Sep 27 22:50:53 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 10:50:53 +0800 Subject: [Bioperl-l] cpan install In-Reply-To: <46FC028F.3050000@usc.edu> References: <46FC028F.3050000@usc.edu> Message-ID: <000001c8017a$6384bae0$2a8e30a0$@com> Try cpan>install S/SE/SENDU/bioperl-1.5.2_102.tar.gz other question you should browse http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of William C. Nelson Sent: Friday, September 28, 2007 3:21 AM To: bioperl-l at bioperl.org Subject: [Bioperl-l] cpan install Hello, I tried to install v1.5.2 using cpan. My urllist looks like: cpan[2]> o conf urllist urllist 0 [ftp://cpan.cs.utah.edu/pub/CPAN/] 1 [ftp://cpan.mirrors.tds.net/pub/CPAN] 2 [ftp://ftp.open-bio.org/pub/bioperl/DIST/] Type 'o conf' to view all configuration items And when I look for bioperl, I see: cpan[1]> d /bioperl/ CPAN: Storable loaded ok (v2.16) Going to read /root/.cpan/Metadata Database was generated on Thu, 27 Sep 2007 18:36:44 GMT Distribution BIRNEY/bioperl-1.2.1.tar.gz Distribution BIRNEY/bioperl-1.2.2.tar.gz Distribution BIRNEY/bioperl-1.2.3.tar.gz Distribution BIRNEY/bioperl-1.2.tar.gz Distribution BIRNEY/bioperl-1.4.tar.gz Distribution BIRNEY/bioperl-db-0.1.tar.gz Distribution BIRNEY/bioperl-ext-1.4.tar.gz Distribution BIRNEY/bioperl-gui-0.7.tar.gz Distribution BIRNEY/bioperl-run-1.2.2.tar.gz Distribution BIRNEY/bioperl-run-1.4.tar.gz Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz 12 items found No v 1.5.2. This may be because it can't see the distribution at ftp://ftp.open-bio.org/pub/bioperl/DIST/. When I try to reload the index, I get messages saying cpan can't find the files ftp://ftp.open-bio.org/pub/bioperl/DIST/authors/01mailrc.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/02packages.details.txt.gz or ftp://ftp.open-bio.org/pub/bioperl/DIST/modules/03modlist.data.gz. Am I doing something wrong? Does the FTP site need to be updated? Thanks, Bill -- ----------------------------------------------------- William C. Nelson, PhD Research Asst Professor Wrigely Institute for Environmental Studies University of Southern California LAS/MEB 310-510-4097 wcnelson at usc.edu _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Thu Sep 27 23:15:03 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 11:15:03 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> Message-ID: <000101c8017d$c4643360$4d2c9a20$@com> Hilmar, I have already pre-loaded the NCBI taxonomy using load_ncbi_taxonomy.pl yet. The error message show: --------------------- WARNING --------------------- MSG: The supplied lineage does not start near 'Phaseolus aureus' (I was supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I | rosids | core eudicotyledons | eudicotyledons | Magnoliophyta | Euphyllophyta | Embryophyta | Streptophytina | Viridiplantae | Eukaryota') --------------------------------------------------- Could not store Q40784: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: create: object (Bio::Species) failed to insert or to be found by unique key STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: /usr/bin/bp_load_seqdatabase.pl:633 ----------------------------------------------------------- Sigh~~~~~~ Forrest Zhang -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp Sent: Friday, September 28, 2007 6:17 AM To: Forrest Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Thu Sep 27 23:33:21 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 11:33:21 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000101c8017d$c4643360$4d2c9a20$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> Message-ID: <000201c80180$54762650$fd6272f0$@com> I reinstall the bioperl-db, I found some error. t/01dbadaptor.....ok t/02species.......FAILED tests 66-95 Failed 30/65 tests, 53.85% okay t/03simpleseq.....ok t/04swiss.........ok t/05seqfeature....ok t/06comment.......ok t/07dblink........ok t/08genbank.......ok t/09fuzzy2........ok t/10ensembl.......ok t/11locuslink.....ok t/12ontology......ok t/13remove........ok t/14query.........ok t/15cluster.......ok 9/160 --------------------- WARNING --------------------- MSG: failed to store one or more child objects for an instance of class Bio::Cluster::UniGene (PK=320) --------------------------------------------------- t/15cluster.......ok t/16obda..........ok Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 30 66-95 Failed 1/16 test scripts. -30/1423 subtests failed. Files=16, Tests=1423, 35 wallclock secs (16.67 cusr + 0.63 csys = 17.30 CPU) Failed 1/16 test programs. -30/1423 subtests failed. make: *** [test] Error 255 -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Forrest Zhang Sent: Friday, September 28, 2007 11:15 AM To: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Hilmar, I have already pre-loaded the NCBI taxonomy using load_ncbi_taxonomy.pl yet. The error message show: --------------------- WARNING --------------------- MSG: The supplied lineage does not start near 'Phaseolus aureus' (I was supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I | rosids | core eudicotyledons | eudicotyledons | Magnoliophyta | Euphyllophyta | Embryophyta | Streptophytina | Viridiplantae | Eukaryota') --------------------------------------------------- Could not store Q40784: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: create: object (Bio::Species) failed to insert or to be found by unique key STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206 STACK: Bio::DB::Persistent::PersistentObject::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:244 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 STACK: Bio::DB::Persistent::PersistentObject::store /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:271 STACK: /usr/bin/bp_load_seqdatabase.pl:633 ----------------------------------------------------------- Sigh~~~~~~ Forrest Zhang -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp Sent: Friday, September 28, 2007 6:17 AM To: Forrest Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error Forrest, have you preloaded the NCBI taxonomy as suggested in the BioSQL installation guidelines? SwissProt format has NCBI taxon IDs, and the code will try to use it to look up species and their lineage, rather than inserting the lineage from whatever BioPerl parses out of the sequence record. -hilmar On Sep 27, 2007, at 3:41 AM, Forrest wrote: > Hi, all > I install the biosql, and bioperl-db. I want to import > swissport data. > But the programe show some error as below: > ====================================================================== > ====== > =============================================== >> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql > -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat > Loading /home/forrest/uniprot/uniprot_sprot.dat ... > Could not store Q6DAH5: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: The supplied lineage does not start near 'Erwinia carotovora > subsp. > atroseptica' (I was supplied 'Erwinia carotovora subsp. | > Pectobacterium | > Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | > Proteobacteria | Bacteria') > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 > STACK: Bio::Species::classification > /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 > STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:552 > STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:1305 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:973 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:852 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:182 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: load_seqdatabase.pl:620 > ----------------------------------------------------------- > > at load_seqdatabase.pl line 633 > ====================================================================== > ====== > =============================================== > > How can I solve it, please help me, Thank you. > > Thanks > Forrest zhang > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Fri Sep 28 00:58:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 23:58:27 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000201c80180$54762650$fd6272f0$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <000201c80180$54762650$fd6272f0$@com> Message-ID: <9535284F-2DC5-4361-81A2-0B739A7E89E4@uiuc.edu> Bio::Species will have problems if you test on a database with taxonomy loaded (it's mentioned in the install docs I think). The UniGene warning has always popped up and isn't anything to worry about. chris On Sep 27, 2007, at 10:33 PM, Forrest Zhang wrote: > I reinstall the bioperl-db, I found some error. > > t/01dbadaptor.....ok > > t/02species.......FAILED tests 66-95 > > Failed 30/65 tests, 53.85% okay > t/03simpleseq.....ok > > t/04swiss.........ok > > t/05seqfeature....ok > > t/06comment.......ok > > t/07dblink........ok > > t/08genbank.......ok > > t/09fuzzy2........ok > > t/10ensembl.......ok > > t/11locuslink.....ok > > t/12ontology......ok > > t/13remove........ok > > t/14query.........ok > > t/15cluster.......ok 9/160 > > --------------------- WARNING --------------------- > MSG: failed to store one or more child objects for an instance of > class > Bio::Cluster::UniGene (PK=320) > --------------------------------------------------- > t/15cluster.......ok > > t/16obda..........ok > > Failed Test Stat Wstat Total Fail List of Failed > ---------------------------------------------------------------------- > ------ > --- > t/02species.t 65 30 66-95 > Failed 1/16 test scripts. -30/1423 subtests failed. > Files=16, Tests=1423, 35 wallclock secs (16.67 cusr + 0.63 csys = > 17.30 > CPU) > Failed 1/16 test programs. -30/1423 subtests failed. > make: *** [test] Error 255 > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Forrest > Zhang > Sent: Friday, September 28, 2007 11:15 AM > To: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Sep 28 00:57:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 27 Sep 2007 23:57:55 -0500 Subject: [Bioperl-l] A couple Eutilities questions In-Reply-To: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> References: <98B80D80-AF6F-424B-81B7-5B0CFD8D6CB2@ualberta.ca> Message-ID: On Sep 27, 2007, at 9:51 PM, Warren Gallin wrote: > I've just started using Bio::DB::Eutilities and I have encountered > two things that seem like problems. > > I am using the latest (retrieved Wednesday September 26, 2007) CVS > version, running in an Apple Xserver. > > Problem 1: When I execute the following code: > > > #Create new EUTILS object for retrieving sets of entries, given an > array of accession numbers > my $gpeptfactory = Bio::DB::EUtilities -> new( -eutil => 'efetch', > -db => 'protein', > -rettype =>'genbank', > -id => \@pro_acc) ; > my $file = 'temp_hold.gb'; > > $gpeptfactory -> get_Response(-file => $file); > > my $retr_seq = Bio::SeqIO->new( -file => $file, > -format => 'genbank'); > > > I get the following warning, consistently: > > Use of uninitialized value in concatenation (.) or string at /Library/ > Perl/5.8.1/Bio/DB/GenericWebAgent.pm line 92. The above works for me w/o problems. The error itself doesn't make much sense; the line is: $self->ua(LWP::UserAgent->new(env_proxy => 1, agent => ref($self).':'.$self->VERSION)); so either $self isn't a ref (which it appears to be) or there is no version (which is odd but may be a perl bug). What happens if you hard-code the version number to something simple? Also, I noticed you're using perl 5.8.1; which version of Mac OS X are you using? I remember something was off about that perl version but I can't remember what it was... > Also, about half the time I get a crash with the following error > message: > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Response Error > Bad Gateway > STACK: Error::throw > STACK: Bio::Root::Root::throw /Library/Perl/5.8.1/Bio/Root/Root.pm:357 > STACK: Bio::DB::GenericWebAgent::get_Response /Library/Perl/5.8.1/Bio/ > DB/GenericWebAgent.pm:184 > STACK: gb_update_v4.pl:118 > ----------------------------------------------------------- I have seen it sometimes pop up when the NCBI server is under heavy server load. It may also be related to your local ISP or setup; see here: http://www.checkupdown.com/status/E502.html Supposedly this may pop up with mod_perl but I haven't seen/heard anything myself related to this. > The other half of the time the script runs fine through to the end. > I have no idea whether the crash is related to the warning or not. I > looked at the line where the warning is generated, and it appears to > be the "new" method for the GenericWebAgent.pm . I can't see how > the call to Eutilities is can be passing an undefined value through > to this method. EUtilities is-a GenericWebAgent; the new() constructors are chained using SUPER::new(). Also, you can call VERSION from any variable so it could be a problem there if VERSION is undef, though again I can't think why this would fail. Regardless, the 'Use of undefined' warning is not a fatal error. > Problem #2: > > When the code runs, I retrieve an incorrect record. I am retrieving > using accessions, and accession I51532 retrieves two records. One is > the record I am after, an ion channel protein, the other comes from a > patent application; the problem is that, although the accession > number for the unwanted record is AAB76204, the LOCUS entry in the > record is I51532. > > So, is it possible that the efetch function is collecting on the > basis of LOCUS, not ACCESSION? I realize that the two are almost > always the same, but not apparently in this case. > > Any advice and/or explanation is appreciated. > > Warren Gallin The only means NCBI guarantees to retrieve a unique record every time is by using the primary ID, which for sequence records is the GI. The accession works most of the time, and efetch accepts accs in the place of GI (it's the only eutil that does). However, every once in a while you get stung and retrieve multiple seqs. BTW, I entered your sequence into Entrez and it popped up as discontinued (which could be part of the problem); the current acc is Q91781. chris From cjfields at uiuc.edu Fri Sep 28 01:00:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 00:00:18 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <000101c8017d$c4643360$4d2c9a20$@com> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> Message-ID: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From forrest_zhang at 163.com Fri Sep 28 01:34:21 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 13:34:21 +0800 Subject: [Bioperl-l] FW: load_seqdatabase.pl error Message-ID: <000101c80191$3b300e70$b1902b50$@com> Oh, my God! I am tring reinstall bioperl-live using CVS, so many error shown below. biocc bioperl-live # perl Build.PL Checking whether your kit is complete... Looks good Checking prerequisites... Looks good Checking features: BioDBGFF.................enabled BioDBSeqFeature_mysql....enabled Network..................enabled BioDBSeqFeature_BDB......enabled Install [a]ll Bioperl scripts, [n]one, or choose groups [i]nteractively? [a] - will install all scripts Do you want to run the BioDBGFF live database tests? y/n [n] y Which database should I use for testing the mysql driver? [test] On which host is database 'test' running (hostname, ip address or host:port) [localhost] User name for connecting to database 'test'? [undef] root Password for connecting to database 'test'? [undef] - will run the BioDBGFF tests with database driver 'mysql' and these settings: Database test Host localhost DSN dbi:mysql:database=test User root Password undef Do you want to run tests that require connection to servers across the internet (likely to cause some failures)? y/n [n] y - will run internet-requiring tests Deleting Build Removed previous script 'Build' Creating new 'Build' script for 'bioperl' version '1.0050021' biocc bioperl-live # ./Build test Copying Bio/Align/Utilities.pm -> blib/lib/Bio/Align/Utilities.pm Copying Bio/Search/HSP/ModelHSP.pm -> blib/lib/Bio/Search/HSP/ModelHSP.pm Copying Bio/Ontology/DocumentRegistry.pm -> blib/lib/Bio/Ontology/DocumentRegistry.pm Copying Bio/SeqFeature/Annotated.pm -> blib/lib/Bio/SeqFeature/Annotated.pm Copying Bio/SimpleAlign.pm -> blib/lib/Bio/SimpleAlign.pm Copying Bio/AlignIO/stockholm.pm -> blib/lib/Bio/AlignIO/stockholm.pm Copying scripts/utilities/bp_sreformat.PLS -> blib/script/bp_sreformat.PLS Deleting blib/script/bp_sreformat.PLS.bak blib/script/bp_sreformat.PLS -> blib/script/bp_sreformat.pl Copying scripts/graphics/contig_draw.PLS -> blib/script/contig_draw.PLS Deleting blib/script/contig_draw.PLS.bak blib/script/contig_draw.PLS -> blib/script/bp_contig_draw.pl Copying scripts/Bio-DB-GFF/meta_gff.PLS -> blib/script/meta_gff.PLS Deleting blib/script/meta_gff.PLS.bak blib/script/meta_gff.PLS -> blib/script/bp_meta_gff.pl Copying scripts/tree/tree2pag.PLS -> blib/script/tree2pag.PLS Deleting blib/script/tree2pag.PLS.bak blib/script/tree2pag.PLS -> blib/script/bp_tree2pag.pl Copying scripts/Bio-SeqFeature-Store/bp_seqfeature_gff3.PLS -> blib/script/bp_seqfeature_gff3.PLS blib/script/bp_seqfeature_gff3.PLS -> blib/script/bp_seqfeature_gff3.pl Copying scripts/popgen/heterogeneity_test.PLS -> blib/script/heterogeneity_test.PLS Deleting blib/script/heterogeneity_test.PLS.bak blib/script/heterogeneity_test.PLS -> blib/script/bp_heterogeneity_test.pl Copying scripts/DB/flanks.PLS -> blib/script/flanks.PLS Deleting blib/script/flanks.PLS.bak blib/script/flanks.PLS -> blib/script/bp_flanks.pl Copying scripts/graphics/feature_draw.PLS -> blib/script/feature_draw.PLS Deleting blib/script/feature_draw.PLS.bak blib/script/feature_draw.PLS -> blib/script/bp_feature_draw.pl Copying scripts/DB/biogetseq.PLS -> blib/script/biogetseq.PLS Deleting blib/script/biogetseq.PLS.bak blib/script/biogetseq.PLS -> blib/script/bp_biogetseq.pl Copying scripts/Bio-SeqFeature-Store/bp_seqfeature_load.PLS -> blib/script/bp_seqfeature_load.PLS Deleting blib/script/bp_seqfeature_load.PLS.bak blib/script/bp_seqfeature_load.PLS -> blib/script/bp_seqfeature_load.pl Copying scripts/searchio/fastam9_to_table.PLS -> blib/script/fastam9_to_table.PLS Deleting blib/script/fastam9_to_table.PLS.bak blib/script/fastam9_to_table.PLS -> blib/script/bp_fastam9_to_table.pl Copying scripts/utilities/seq_length.PLS -> blib/script/seq_length.PLS Deleting blib/script/seq_length.PLS.bak blib/script/seq_length.PLS -> blib/script/bp_seq_length.pl Copying scripts/Bio-DB-GFF/genbank2gff.PLS -> blib/script/genbank2gff.PLS Deleting blib/script/genbank2gff.PLS.bak blib/script/genbank2gff.PLS -> blib/script/bp_genbank2gff.pl Copying scripts/taxa/taxid4species.PLS -> blib/script/taxid4species.PLS Deleting blib/script/taxid4species.PLS.bak blib/script/taxid4species.PLS -> blib/script/bp_taxid4species.pl Copying scripts/biographics/bp_glyphs1-demo.PLS -> blib/script/bp_glyphs1-demo.PLS Deleting blib/script/bp_glyphs1-demo.PLS.bak blib/script/bp_glyphs1-demo.PLS -> blib/script/bp_glyphs1-demo.pl Copying scripts/tree/blast2tree.PLS -> blib/script/blast2tree.PLS Deleting blib/script/blast2tree.PLS.bak blib/script/blast2tree.PLS -> blib/script/bp_blast2tree.pl Copying scripts/graphics/frend.PLS -> blib/script/frend.PLS Deleting blib/script/frend.PLS.bak blib/script/frend.PLS -> blib/script/bp_frend.pl Copying scripts/taxa/query_entrez_taxa.PLS -> blib/script/query_entrez_taxa.PLS Deleting blib/script/query_entrez_taxa.PLS.bak blib/script/query_entrez_taxa.PLS -> blib/script/bp_query_entrez_taxa.pl Copying scripts/biographics/bp_glyphs2-demo.PLS -> blib/script/bp_glyphs2-demo.PLS Deleting blib/script/bp_glyphs2-demo.PLS.bak blib/script/bp_glyphs2-demo.PLS -> blib/script/bp_glyphs2-demo.pl Copying scripts/taxa/taxonomy2tree.PLS -> blib/script/taxonomy2tree.PLS Deleting blib/script/taxonomy2tree.PLS.bak blib/script/taxonomy2tree.PLS -> blib/script/bp_taxonomy2tree.pl Copying scripts/utilities/search2alnblocks.PLS -> blib/script/search2alnblocks.PLS Deleting blib/script/search2alnblocks.PLS.bak blib/script/search2alnblocks.PLS -> blib/script/bp_search2alnblocks.pl Copying scripts/utilities/mask_by_search.PLS -> blib/script/mask_by_search.PLS Deleting blib/script/mask_by_search.PLS.bak blib/script/mask_by_search.PLS -> blib/script/bp_mask_by_search.pl Copying scripts/seqstats/gccalc.PLS -> blib/script/gccalc.PLS Deleting blib/script/gccalc.PLS.bak blib/script/gccalc.PLS -> blib/script/bp_gccalc.pl Copying scripts/popgen/composite_LD.PLS -> blib/script/composite_LD.PLS Deleting blib/script/composite_LD.PLS.bak blib/script/composite_LD.PLS -> blib/script/bp_composite_LD.pl Copying scripts/seqstats/aacomp.PLS -> blib/script/aacomp.PLS Deleting blib/script/aacomp.PLS.bak blib/script/aacomp.PLS -> blib/script/bp_aacomp.pl Copying scripts/Bio-DB-GFF/process_wormbase.PLS -> blib/script/process_wormbase.PLS Deleting blib/script/process_wormbase.PLS.bak blib/script/process_wormbase.PLS -> blib/script/bp_process_wormbase.pl Copying scripts/taxa/local_taxonomydb_query.PLS -> blib/script/local_taxonomydb_query.PLS Deleting blib/script/local_taxonomydb_query.PLS.bak blib/script/local_taxonomydb_query.PLS -> blib/script/bp_local_taxonomydb_query.pl Copying scripts/biblio/biblio.PLS -> blib/script/biblio.PLS Deleting blib/script/biblio.PLS.bak blib/script/biblio.PLS -> blib/script/bp_biblio.pl Copying scripts/biographics/bp_embl2picture.PLS -> blib/script/bp_embl2picture.PLS Deleting blib/script/bp_embl2picture.PLS.bak blib/script/bp_embl2picture.PLS -> blib/script/bp_embl2picture.pl Copying scripts/Bio-DB-GFF/genbank2gff3.PLS -> blib/script/genbank2gff3.PLS Deleting blib/script/genbank2gff3.PLS.bak blib/script/genbank2gff3.PLS -> blib/script/bp_genbank2gff3.pl Copying scripts/utilities/search2BSML.PLS -> blib/script/search2BSML.PLS Deleting blib/script/search2BSML.PLS.bak blib/script/search2BSML.PLS -> blib/script/bp_search2BSML.pl Copying scripts/seq/seqconvert.PLS -> blib/script/seqconvert.PLS Deleting blib/script/seqconvert.PLS.bak blib/script/seqconvert.PLS -> blib/script/bp_seqconvert.pl Copying scripts/searchio/parse_hmmsearch.PLS -> blib/script/parse_hmmsearch.PLS Deleting blib/script/parse_hmmsearch.PLS.bak blib/script/parse_hmmsearch.PLS -> blib/script/bp_parse_hmmsearch.pl Copying scripts/index/bp_seqret.PLS -> blib/script/bp_seqret.PLS Deleting blib/script/bp_seqret.PLS.bak blib/script/bp_seqret.PLS -> blib/script/bp_seqret.pl Copying scripts/searchio/filter_search.PLS -> blib/script/filter_search.PLS Deleting blib/script/filter_search.PLS.bak blib/script/filter_search.PLS -> blib/script/bp_filter_search.pl Copying scripts/tree/nexus2nh.PLS -> blib/script/nexus2nh.PLS Deleting blib/script/nexus2nh.PLS.bak blib/script/nexus2nh.PLS -> blib/script/bp_nexus2nh.pl Copying scripts/Bio-DB-GFF/generate_histogram.PLS -> blib/script/generate_histogram.PLS Deleting blib/script/generate_histogram.PLS.bak blib/script/generate_histogram.PLS -> blib/script/bp_generate_histogram.pl Copying scripts/seq/split_seq.PLS -> blib/script/split_seq.PLS Deleting blib/script/split_seq.PLS.bak blib/script/split_seq.PLS -> blib/script/bp_split_seq.pl Copying scripts/Bio-DB-GFF/load_gff.PLS -> blib/script/load_gff.PLS Deleting blib/script/load_gff.PLS.bak blib/script/load_gff.PLS -> blib/script/bp_load_gff.pl Copying scripts/index/bp_fetch.PLS -> blib/script/bp_fetch.PLS Deleting blib/script/bp_fetch.PLS.bak blib/script/bp_fetch.PLS -> blib/script/bp_fetch.pl Copying scripts/utilities/mutate.PLS -> blib/script/mutate.PLS Deleting blib/script/mutate.PLS.bak blib/script/mutate.PLS -> blib/script/bp_mutate.pl Copying scripts/Bio-DB-GFF/process_sgd.PLS -> blib/script/process_sgd.PLS Deleting blib/script/process_sgd.PLS.bak blib/script/process_sgd.PLS -> blib/script/bp_process_sgd.pl Copying scripts/index/bp_index.PLS -> blib/script/bp_index.PLS Deleting blib/script/bp_index.PLS.bak blib/script/bp_index.PLS -> blib/script/bp_index.pl Copying scripts/utilities/dbsplit.PLS -> blib/script/dbsplit.PLS Deleting blib/script/dbsplit.PLS.bak blib/script/dbsplit.PLS -> blib/script/bp_dbsplit.pl Copying scripts/seqstats/oligo_count.PLS -> blib/script/oligo_count.PLS Deleting blib/script/oligo_count.PLS.bak blib/script/oligo_count.PLS -> blib/script/bp_oligo_count.pl Copying scripts/searchio/hmmer_to_table.PLS -> blib/script/hmmer_to_table.PLS Deleting blib/script/hmmer_to_table.PLS.bak blib/script/hmmer_to_table.PLS -> blib/script/bp_hmmer_to_table.pl Copying scripts/Bio-DB-GFF/process_gadfly.PLS -> blib/script/process_gadfly.PLS Deleting blib/script/process_gadfly.PLS.bak blib/script/process_gadfly.PLS -> blib/script/bp_process_gadfly.pl Copying scripts/DB/biofetch_genbank_proxy.PLS -> blib/script/biofetch_genbank_proxy.PLS Deleting blib/script/biofetch_genbank_proxy.PLS.bak blib/script/biofetch_genbank_proxy.PLS -> blib/script/bp_biofetch_genbank_proxy.pl Copying scripts/seq/extract_feature_seq.PLS -> blib/script/extract_feature_seq.PLS Deleting blib/script/extract_feature_seq.PLS.bak blib/script/extract_feature_seq.PLS -> blib/script/bp_extract_feature_seq.pl Copying scripts/Bio-DB-GFF/bulk_load_gff.PLS -> blib/script/bulk_load_gff.PLS Deleting blib/script/bulk_load_gff.PLS.bak blib/script/bulk_load_gff.PLS -> blib/script/bp_bulk_load_gff.pl Copying scripts/utilities/search2gff.PLS -> blib/script/search2gff.PLS Deleting blib/script/search2gff.PLS.bak blib/script/search2gff.PLS -> blib/script/bp_search2gff.pl Copying scripts/seq/make_mrna_protein.PLS -> blib/script/make_mrna_protein.PLS Deleting blib/script/make_mrna_protein.PLS.bak blib/script/make_mrna_protein.PLS -> blib/script/bp_make_mrna_protein.pl Copying scripts/seq/unflatten_seq.PLS -> blib/script/unflatten_seq.PLS Deleting blib/script/unflatten_seq.PLS.bak blib/script/unflatten_seq.PLS -> blib/script/bp_unflatten_seq.pl Copying scripts/utilities/search2tribe.PLS -> blib/script/search2tribe.PLS Deleting blib/script/search2tribe.PLS.bak blib/script/search2tribe.PLS -> blib/script/bp_search2tribe.pl Copying scripts/DB/bioflat_index.PLS -> blib/script/bioflat_index.PLS Deleting blib/script/bioflat_index.PLS.bak blib/script/bioflat_index.PLS -> blib/script/bp_bioflat_index.pl Copying scripts/utilities/pairwise_kaks.PLS -> blib/script/pairwise_kaks.PLS Deleting blib/script/pairwise_kaks.PLS.bak blib/script/pairwise_kaks.PLS -> blib/script/bp_pairwise_kaks.pl Copying scripts/Bio-DB-GFF/fast_load_gff.PLS -> blib/script/fast_load_gff.PLS Deleting blib/script/fast_load_gff.PLS.bak blib/script/fast_load_gff.PLS -> blib/script/bp_fast_load_gff.pl Copying scripts/seqstats/chaos_plot.PLS -> blib/script/chaos_plot.PLS Deleting blib/script/chaos_plot.PLS.bak blib/script/chaos_plot.PLS -> blib/script/bp_chaos_plot.pl Copying scripts/utilities/bp_mrtrans.PLS -> blib/script/bp_mrtrans.PLS Deleting blib/script/bp_mrtrans.PLS.bak blib/script/bp_mrtrans.PLS -> blib/script/bp_mrtrans.pl Copying scripts/utilities/bp_nrdb.PLS -> blib/script/bp_nrdb.PLS Deleting blib/script/bp_nrdb.PLS.bak blib/script/bp_nrdb.PLS -> blib/script/bp_nrdb.pl Copying scripts/taxa/classify_hits_kingdom.PLS -> blib/script/classify_hits_kingdom.PLS Deleting blib/script/classify_hits_kingdom.PLS.bak blib/script/classify_hits_kingdom.PLS -> blib/script/bp_classify_hits_kingdom.pl Copying scripts/utilities/remote_blast.PLS -> blib/script/remote_blast.PLS Deleting blib/script/remote_blast.PLS.bak blib/script/remote_blast.PLS -> blib/script/bp_remote_blast.pl Copying scripts/searchio/search2table.PLS -> blib/script/search2table.PLS Deleting blib/script/search2table.PLS.bak blib/script/search2table.PLS -> blib/script/bp_search2table.pl Copying scripts/seq/translate_seq.PLS -> blib/script/translate_seq.PLS Deleting blib/script/translate_seq.PLS.bak blib/script/translate_seq.PLS -> blib/script/bp_translate_seq.pl Copying scripts/graphics/search_overview.PLS -> blib/script/search_overview.PLS Deleting blib/script/search_overview.PLS.bak blib/script/search_overview.PLS -> blib/script/bp_search_overview.pl t/AAChange...................ok t/AAReverseMutate............ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok 1/112 # Failed (TODO) test 'The object isa Bio::Annotation::Comment' # at t/Annotation.t line 214. # The object isn't a 'Bio::Annotation::Comment' it's a 'Bio::Annotation::OntologyTerm' t/Annotation.................ok 1/112 unexpectedly succeeded TODO PASSED test 96 t/AnnotationAdaptor..........ok t/Assembly...................ok 1/35 # Failed (TODO) test 'get_nof_singlets' # at t/Assembly.t line 44. # got: '0' # expected: '1' # Failed (TODO) test 'get_seq_ids' # at t/Assembly.t line 48. # got: '0' # expected: '2' # Failed (TODO) test at t/Assembly.t line 53. # '0' # ne # '0' # Failed test at t/Assembly.t line 145. # got: '_main_contig_feature:106' # expected: '_aligned_coord:sdsu|SDSU_RFPERU_006_E04.x01.phd.1' t/Assembly...................NOK 31/35# Looks like you failed 1 test of 35. t/Assembly...................dubious Test returned status 1 (wstat 256, 0x100) DIED. FAILED test 31 Failed 1/35 tests, 97.14% okay t/Biblio.....................ok t/BiblioReferences...........ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BioDBGFF...................ok 3/277 skipped: various reasons t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok t/Chain......................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/CytoMap....................ok t/DB.........................ok 104/116Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. t/DB.........................ok 107/116Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 372. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/Bio/SeqIO/entrezgene.pm line 565. t/DB.........................ok t/DBCUTG.....................ok t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/ESEfinder..................ok t/EUtilities.................skipped all skipped: Must set BIOPERLDEBUG=1 for network tests t/EncodedSeq.................ok t/Exception..................ok t/Exonerate..................ok 4/45 skipped: various reasons t/FeatureIO..................ok t/FootPrinter................ok t/GDB........................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GbrowseGFF.................ok t/Gel........................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 1/53 # Failed (TODO) test at t/Genewise.t line 79. # got: 'Scaffold_2042.1' # expected: 'SINFRUP00000067802' # Failed (TODO) test at t/Genewise.t line 80. # got: 'SINFRUP00000067802' # expected: 'Scaffold_2042.1' t/Genewise...................NOK 37/53 # Failed test at t/Genewise.t line 82. # got: '' # expected: '2054.68' t/Genewise...................NOK 41/53 # Failed test at t/Genewise.t line 88. # got: '' # expected: '2054.68' t/Genewise...................NOK 45/53 # Failed test at t/Genewise.t line 93. # got: '' # expected: '2054.68' # Looks like you failed 3 tests of 53. t/Genewise...................dubious Test returned status 3 (wstat 768, 0x300) DIED. FAILED tests 37, 41, 45 Failed 3/53 tests, 94.34% okay t/Genomewise.................ok t/Genpred....................ok 1/157Argument "<1" isn't numeric in numeric gt (>) at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/Tools/Glimmer.pm line 519, line 2. t/Genpred....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/HNN........................ok t/Handler....................ok 288/545 # Failed (TODO) test at t/Handler.t line 696. t/Handler....................ok t/HtSNP......................ok t/IUPAC......................ok t/Index......................ok t/InstanceSite...............ok t/InterProParser.............ok t/LargeLocatableSeq..........ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/MK.........................ok 4/46 skipped: various reasons t/Map........................ok t/MapIO......................ok t/Matrix.....................ok t/MeSH.......................ok t/Measure....................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok t/Node.......................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/OddCodes...................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/Pictogram..................ok t/PodSyntax..................skipped all skipped: Test::Pod 1.00 required for testing POD t/PopGen.....................ok 1/99 # Failed (TODO) test at t/PopGen.t line 242. t/PopGen.....................ok 2/99 unexpectedly succeeded TODO PASSED tests 97-98 t/PopGenSims.................ok t/PrimarySeq.................ok t/Primer.....................ok t/Promoterwise...............ok t/ProtDist...................ok t/ProtMatrix.................ok t/ProtPsm....................ok 10/14 skipped: various reasons t/Pseudowise.................ok t/QRNA.......................ok t/RNAChange..................ok t/RNA_SearchIO...............ok 2/496 # Failed (TODO) test 'HSP meta' # at t/RNA_SearchIO.t line 798. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 800. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 802. # undef # ne # undef # Failed (TODO) test 'HSP meta' # at t/RNA_SearchIO.t line 848. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 850. # undef # ne # undef # Failed (TODO) test at t/RNA_SearchIO.t line 852. # undef # ne # undef t/RNA_SearchIO...............ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/RefSeq.....................ok t/Registry...................ok 1/14 --------------------- WARNING --------------------- MSG: Couldn't call new_from_registry() on [Bio::DB::Flat] ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: you must specify an indexing scheme STACK: Error::throw STACK: Bio::Root::Root::throw /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/Root/Root.pm:357 STACK: Bio::DB::Flat::new /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Flat.pm:160 STACK: Bio::DB::Flat::new_from_registry /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Flat.pm:252 STACK: Bio::DB::Registry::_load_registry /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Registry.pm:164 STACK: Bio::DB::Registry::new /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/DB/Registry.pm:95 STACK: t/Registry.t:51 ----------------------------------------------------------- --------------------------------------------------- t/Registry...................ok 6/14 skipped: various reasons t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionIO..............ok 1/15 # Failed (TODO) test at t/RestrictionIO.t line 31. t/RestrictionIO..............ok t/Root-Utilities.............ok 1/50 --------------------- WARNING --------------------- MSG: Not owner of file t/data/test.txt. Compressing to temp file /tmp/MBKEv1uzJB.tmp.bioperl.gz. --------------------------------------------------- t/Root-Utilities.............ok t/RootI......................ok t/RootIO.....................ok t/RootStorable...............ok t/SNP........................ok t/Scansite...................ok 5/14 skipped: various reasons t/SearchDist.................skipped all skipped: The optional module Bio::Ext::Align (or dependencies thereof) was not installed t/SearchIO...................ok 529/1449 # Failed (TODO) test at t/SearchIO.t line 989. # '0.852' # > # '0.9' # Failed (TODO) test at t/SearchIO.t line 990. # '1.599' # <= # '1' t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqEvolution...............ok t/SeqFeatAnnotated...........ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/SeqStats...................ok t/SeqUtils...................ok t/SeqVersion.................ok t/SeqWords...................ok t/SequenceFamily.............ok t/Sigcleave..................ok t/Signalp....................ok t/Signalp2...................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/SiteMatrix.................ok t/Sopma......................ok t/Species....................ok t/Spidey.....................ok t/StandAloneBlast............ok 11/45 skipped: various reasons t/StructIO...................ok t/Structure..................ok t/Symbol.....................ok t/TagHaplotype...............ok t/TandemRepeatsFinder........ok t/TaxonTree..................skipped all skipped: All tests are being skipped, probably because the module(s) being tested here are now deprecated t/Taxonomy...................ok t/Tempfile...................ok t/Term.......................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/WrapperBase................ok t/abi........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/ace........................ok t/alignUtilities.............ok t/asciitree..................ok t/blast_pull.................ok 1/287 # Failed (TODO) test at t/blast_pull.t line 258. # got: '0.946' # expected: '0.943' t/blast_pull.................ok t/bsml_sax...................ok t/chaosxml...................ok t/cigarstring................ok t/consed.....................ok t/ctf........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/dblink.....................ok t/ePCR.......................ok t/embl.......................ok t/entrezgene.................ok 542/1422Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. t/entrezgene.................ok 966/1422Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. Pseudo-hashes are deprecated at /home/forrest/src/bioperl/bioperl-live/blib/lib/Bio/SeqIO/entrezgene.pm line 491. t/entrezgene.................ok t/est2genome.................ok t/exp........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/fasta......................ok t/flat.......................ok t/game.......................ok t/gcg........................ok t/genbank....................ok t/hmmer......................ok t/hmmer_pull.................ok t/interpro...................ok t/kegg.......................ok t/largefasta.................ok t/largepseq..................ok t/lasergene..................ok t/lucy.......................ok t/masta......................ok t/metafasta..................ok t/multiple_fasta.............ok t/obo_parser.................ok t/pICalculator...............ok t/phd........................ok t/pir........................ok t/pln........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed t/primaryqual................ok t/primedseq..................ok t/primer3....................ok t/protgraph..................ok 1/70 # Failed (TODO) test at t/protgraph.t line 55. # Failed (TODO) test at t/protgraph.t line 56. # got: '13' # expected: '14' t/protgraph..................ok 49/70 # Failed (TODO) test at t/protgraph.t line 248. # got: 'Helicobacter pylori' # expected: 'Helicobacter pylori 26695' t/protgraph..................ok t/psm........................ok t/qual.......................ok t/raw........................ok t/rnamotif...................ok t/scf........................ok t/seq_quality................ok t/seqfeaturePrimer...........ok t/seqread_fail...............ok t/sequencetrace..............ok t/seqwithquality.............ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/splicedseq.................ok t/swiss......................ok 1/239 # Failed (TODO) test at t/swiss.t line 47. t/swiss......................ok t/tRNAscanSE.................ok t/tab........................ok t/table......................ok t/targetp....................ok t/tigrxml....................ok t/tinyseq....................ok t/trim.......................ok t/ztr........................skipped all skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/Assembly.t 1 256 35 1 31 t/Genewise.t 3 768 53 3 37 41 45 (3 subtests UNEXPECTEDLY SUCCEEDED), 9 tests and 43 subtests skipped. Failed 2/248 test scripts. 4/15415 subtests failed. Files=248, Tests=15415, 973 wallclock secs (123.75 cusr + 8.58 csys = 132.33 CPU) Failed 2/248 test programs. 4/15415 subtests failed. -----Original Message----- From: Chris Fields [mailto:cjfields at uiuc.edu] Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Sep 28 02:46:48 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 28 Sep 2007 07:46:48 +0100 Subject: [Bioperl-l] FW: load_seqdatabase.pl error In-Reply-To: <000101c80191$3b300e70$b1902b50$@com> References: <000101c80191$3b300e70$b1902b50$@com> Message-ID: <46FCA358.9070202@sendu.me.uk> Forrest Zhang wrote: > Oh, my God! I am tring reinstall bioperl-live using CVS, so many error > shown below. [snip] > t/Assembly.t 1 256 35 1 31 > t/Genewise.t 3 768 53 3 37 41 45 You failed 4 tests, and this is CVS. Don't worry about the failures if they're in tests of modules you're not using. Do you use the Assembly or Genewise modules? From bix at sendu.me.uk Fri Sep 28 02:40:04 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 28 Sep 2007 07:40:04 +0100 Subject: [Bioperl-l] cpan install In-Reply-To: <000001c8017a$6384bae0$2a8e30a0$@com> References: <46FC028F.3050000@usc.edu> <000001c8017a$6384bae0$2a8e30a0$@com> Message-ID: <46FCA1C4.7030101@sendu.me.uk> Forrest Zhang wrote: > Try > cpan>install S/SE/SENDU/bioperl-1.5.2_102.tar.gz > > other question you should browse > http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix Yes, the explanation for 1.5.2 not showing up in d /bioperl/ being 'only stable versions appear in that list'. From forrest_zhang at 163.com Fri Sep 28 07:26:56 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 19:26:56 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <000301c801c2$7deb6080$79c22180$@com> Yes, it is happened using CVS. -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From forrest_zhang at 163.com Fri Sep 28 07:28:41 2007 From: forrest_zhang at 163.com (Forrest Zhang) Date: Fri, 28 Sep 2007 19:28:41 +0800 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <000401c801c2$ba0bae30$2e230a90$@com> Message using CVS. biocc bioperl-db # ./Build test Copying scripts/biosql/terms/importrelation.pl -> blib/script/importrelation.pl blib/script/importrelation.pl -> blib/script/bp_importrelation.pl Copying scripts/biosql/merge-unique-ann.pl -> blib/script/merge-unique-ann.pl blib/script/merge-unique-ann.pl -> blib/script/bp_merge-unique-ann.pl Copying scripts/biosql/update-on-new-date.pl -> blib/script/update-on-new-date.pl blib/script/update-on-new-date.pl -> blib/script/bp_update-on-new-date.pl Copying scripts/biosql/terms/add-term-annot.pl -> blib/script/add-term-annot.pl Deleting blib/script/add-term-annot.pl.bak blib/script/add-term-annot.pl -> blib/script/bp_add-term-annot.pl Copying scripts/corba/caching_corba_server.pl -> blib/script/caching_corba_server.pl Deleting blib/script/caching_corba_server.pl.bak blib/script/caching_corba_server.pl -> blib/script/bp_caching_corba_server.pl Copying scripts/biosql/load_ontology.pl -> blib/script/load_ontology.pl Deleting blib/script/load_ontology.pl.bak blib/script/load_ontology.pl -> blib/script/bp_load_ontology.pl Copying scripts/biosql/load_seqdatabase.pl -> blib/script/load_seqdatabase.pl Deleting blib/script/load_seqdatabase.pl.bak blib/script/load_seqdatabase.pl -> blib/script/bp_load_seqdatabase.pl Copying scripts/biosql/terms/interpro2go.pl -> blib/script/interpro2go.pl blib/script/interpro2go.pl -> blib/script/bp_interpro2go.pl Copying scripts/biosql/clean_ontology.pl -> blib/script/clean_ontology.pl blib/script/clean_ontology.pl -> blib/script/bp_clean_ontology.pl Copying scripts/corba/test_bioenv.pl -> blib/script/test_bioenv.pl Deleting blib/script/test_bioenv.pl.bak blib/script/test_bioenv.pl -> blib/script/bp_test_bioenv.pl Copying scripts/biosql/update-on-new-version.pl -> blib/script/update-on-new-version.pl blib/script/update-on-new-version.pl -> blib/script/bp_update-on-new-version.pl Copying scripts/biosql/bioentry2flat.pl -> blib/script/bioentry2flat.pl Deleting blib/script/bioentry2flat.pl.bak blib/script/bioentry2flat.pl -> blib/script/bp_bioentry2flat.pl Copying scripts/corba/bioenv_server.pl -> blib/script/bioenv_server.pl Deleting blib/script/bioenv_server.pl.bak blib/script/bioenv_server.pl -> blib/script/bp_bioenv_server.pl Copying scripts/biosql/load_interpro.pl -> blib/script/load_interpro.pl blib/script/load_interpro.pl -> blib/script/bp_load_interpro.pl Copying scripts/biosql/cgi-bin/getentry.pl -> blib/script/getentry.pl Deleting blib/script/getentry.pl.bak blib/script/getentry.pl -> blib/script/bp_getentry.pl Copying scripts/biosql/del-assocs-sql.pl -> blib/script/del-assocs-sql.pl blib/script/del-assocs-sql.pl -> blib/script/bp_del-assocs-sql.pl Copying scripts/biosql/freshen-annot.pl -> blib/script/freshen-annot.pl blib/script/freshen-annot.pl -> blib/script/bp_freshen-annot.pl t/01dbadaptor.....ok t/02species.......FAILED tests 66-95 Failed 30/65 tests, 53.85% okay t/03simpleseq.....ok t/04swiss.........ok t/05seqfeature....ok t/06comment.......ok t/07dblink........ok t/08genbank.......ok t/09fuzzy2........ok t/10ensembl.......ok t/11locuslink.....ok t/12ontology......ok t/13remove........ok t/14query.........ok t/15cluster.......ok 9/160 --------------------- WARNING --------------------- MSG: failed to store one or more child objects for an instance of class Bio::Cluster::UniGene (PK=366) --------------------------------------------------- t/15cluster.......ok t/16obda..........ok Failed Test Stat Wstat Total Fail List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 30 66-95 Failed 1/16 test scripts. -30/1423 subtests failed. Files=16, Tests=1423, 36 wallclock secs (16.64 cusr + 0.65 csys = 17.29 CPU) Failed 1/16 test programs. -30/1423 subtests failed. -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields Sent: Friday, September 28, 2007 1:00 PM To: Forrest Zhang Cc: bioperl-l at bioperl.org Subject: Re: [Bioperl-l] load_seqdatabase.pl error If this is occurring using bioperl from CVS then I'll try taking a look at it. chris On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > Hilmar, > I have already pre-loaded the NCBI taxonomy using > load_ncbi_taxonomy.pl yet. The error message show: > > --------------------- WARNING --------------------- > MSG: The supplied lineage does not start near 'Phaseolus aureus' (I > was > supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I > | rosids > | core eudicotyledons | eudicotyledons | Magnoliophyta | > Euphyllophyta | > Embryophyta | Streptophytina | Viridiplantae | Eukaryota') > --------------------------------------------------- > Could not store Q40784: > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: create: object (Bio::Species) failed to insert or to be found > by unique > key > STACK: Error::throw > STACK: Bio::Root::Root::throw > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:206 > STACK: Bio::DB::Persistent::PersistentObject::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:169 > STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:251 > STACK: Bio::DB::Persistent::PersistentObject::store > /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ > PersistentObject.pm:271 > STACK: /usr/bin/bp_load_seqdatabase.pl:633 > ----------------------------------------------------------- > Sigh~~~~~~ > > Forrest Zhang > > > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Friday, September 28, 2007 6:17 AM > To: Forrest > Cc: bioperl-l at bioperl.org > Subject: Re: [Bioperl-l] load_seqdatabase.pl error > > Forrest, > > have you preloaded the NCBI taxonomy as suggested in the BioSQL > installation guidelines? SwissProt format has NCBI taxon IDs, and the > code will try to use it to look up species and their lineage, rather > than inserting the lineage from whatever BioPerl parses out of the > sequence record. > > -hilmar > > On Sep 27, 2007, at 3:41 AM, Forrest wrote: > >> Hi, all >> I install the biosql, and bioperl-db. I want to import >> swissport data. >> But the programe show some error as below: >> ===================================================================== >> = >> ====== >> =============================================== >>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver mysql >> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >> Could not store Q6DAH5: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: The supplied lineage does not start near 'Erwinia carotovora >> subsp. >> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >> Pectobacterium | >> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >> Proteobacteria | Bacteria') >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >> STACK: Bio::Species::classification >> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:552 >> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:1305 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:973 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:852 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:182 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: load_seqdatabase.pl:620 >> ----------------------------------------------------------- >> >> at load_seqdatabase.pl line 633 >> ===================================================================== >> = >> ====== >> =============================================== >> >> How can I solve it, please help me, Thank you. >> >> Thanks >> Forrest zhang >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Fri Sep 28 11:36:39 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 11:36:39 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <46F8DC34.6020908@sendu.me.uk> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> Message-ID: <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> You do have a point here. From a design perspective, it feels odd if instantiating an object can fail with an I/O exception. But in reality that's how it's done all the time, from Bio::SeqIO to java.io.*, if I'm not mistaken. I also agree that asking a conditional that we know will be false every time except once violates the sense of elegance. So upon second consideration, I think I agree with you. And a GFF3 file with zero features in it should still be a valid GFF3 file, i.e., have the mandatory headers. Does that make sense? -hilmar On Sep 25, 2007, at 6:00 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>> I think that'll work fine. The other option would be call a >>> print_gff_header() function within write_feature() with the >>> intent to >>> print the header only once, using a flag or similar: >>> >>> if (!$self->header_printed) { >>> $self->print_gff_header; >>> $self->header_printed(1); >>> } > > >> I'd lean toward this or a similar approach too. Writing stuff out >> in the constructor doesn't feel like the best design. > > I'd argue that the alternative is just inefficient with no > compensating benefit. You have something that must only be done > once, and a method (_initialize) that is only called once. The > constructor is used to set up the file, getting it into a state > ready to add features. This involves opening it for writing with > the correct filename and setting the desired GFF version. Why > wouldn't it also output what ever else was necessary it initialize > the file? > > Also, what do we expect should happen when we use Bioperl to create > a GFF file and don't write any features to it? Should it be an > empty file, or should it contain whatever GFF information the user > had managed to supply (the version)? -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Fri Sep 28 11:53:33 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 11:53:33 -0400 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> Message-ID: <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Chris let me know if you get stumped. I'm surprised that the special ranks ('eurosids I' etc) show up in the lineage (has NCBI started to assign ranks to them? I thought I filter them out. Needs to be looked into too.), but at any rate I don't understand why they aren't being accepted. Also, maybe we need a more verbose output here - Forrest, can you run this with adding a --printerror argument. (I'm embarrassed to find that this doesn't seem to be documented. Sigh.) -hilmar On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: > If this is occurring using bioperl from CVS then I'll try taking a > look at it. > > chris > > On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: > >> Hilmar, >> I have already pre-loaded the NCBI taxonomy using >> load_ncbi_taxonomy.pl yet. The error message show: >> >> --------------------- WARNING --------------------- >> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >> was >> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >> | rosids >> | core eudicotyledons | eudicotyledons | Magnoliophyta | >> Euphyllophyta | >> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >> --------------------------------------------------- >> Could not store Q40784: >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: create: object (Bio::Species) failed to insert or to be found >> by unique >> key >> STACK: Error::throw >> STACK: Bio::Root::Root::throw >> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:206 >> STACK: Bio::DB::Persistent::PersistentObject::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:244 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:169 >> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >> BasePersistenceAdaptor.pm:251 >> STACK: Bio::DB::Persistent::PersistentObject::store >> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >> PersistentObject.pm:271 >> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >> ----------------------------------------------------------- >> Sigh~~~~~~ >> >> Forrest Zhang >> >> >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >> Lapp >> Sent: Friday, September 28, 2007 6:17 AM >> To: Forrest >> Cc: bioperl-l at bioperl.org >> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >> >> Forrest, >> >> have you preloaded the NCBI taxonomy as suggested in the BioSQL >> installation guidelines? SwissProt format has NCBI taxon IDs, and the >> code will try to use it to look up species and their lineage, rather >> than inserting the lineage from whatever BioPerl parses out of the >> sequence record. >> >> -hilmar >> >> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >> >>> Hi, all >>> I install the biosql, and bioperl-db. I want to import >>> swissport data. >>> But the programe show some error as below: >>> ==================================================================== >>> = >>> = >>> ====== >>> =============================================== >>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>> mysql >>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>> Could not store Q6DAH5: >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>> subsp. >>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>> Pectobacterium | >>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>> Proteobacteria | Bacteria') >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>> STACK: Bio::Species::classification >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:552 >>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:1305 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:973 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:852 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:182 >>> STACK: Bio::DB::Persistent::PersistentObject::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:244 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:169 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:251 >>> STACK: Bio::DB::Persistent::PersistentObject::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:271 >>> STACK: load_seqdatabase.pl:620 >>> ----------------------------------------------------------- >>> >>> at load_seqdatabase.pl line 633 >>> ==================================================================== >>> = >>> = >>> ====== >>> =============================================== >>> >>> How can I solve it, please help me, Thank you. >>> >>> Thanks >>> Forrest zhang >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 12:04:08 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 11:04:08 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> Message-ID: <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> On Sep 28, 2007, at 10:36 AM, Hilmar Lapp wrote: > You do have a point here. From a design perspective, it feels odd > if instantiating an object can fail with an I/O exception. But in > reality that's how it's done all the time, from Bio::SeqIO to > java.io.*, if I'm not mistaken. I also agree that asking a > conditional that we know will be false every time except once > violates the sense of elegance. I agree with the lack of elegance from a design perspective on both counts, but when have we ever been worried about that? ;> In general I don't think SeqIO classes generates actual output (like the GFF header information) in the constructor, they just initialize IO and other state data. It makes sense to fail in this case if an error pops up. Regardless, one could argue ad infinitum that either proposed fix has its benefits/deficiencies, however both will work, so I'm happy with either. chris > So upon second consideration, I think I agree with you. And a GFF3 > file with zero features in it should still be a valid GFF3 file, > i.e., have the mandatory headers. > > Does that make sense? > > -hilmar > > On Sep 25, 2007, at 6:00 AM, Sendu Bala wrote: > >> Hilmar Lapp wrote: >>> On Sep 24, 2007, at 9:35 AM, Chris Fields wrote: >>>> I think that'll work fine. The other option would be call a >>>> print_gff_header() function within write_feature() with the >>>> intent to >>>> print the header only once, using a flag or similar: >>>> >>>> if (!$self->header_printed) { >>>> $self->print_gff_header; >>>> $self->header_printed(1); >>>> } >> > >>> I'd lean toward this or a similar approach too. Writing stuff out >>> in the constructor doesn't feel like the best design. >> >> I'd argue that the alternative is just inefficient with no >> compensating benefit. You have something that must only be done >> once, and a method (_initialize) that is only called once. The >> constructor is used to set up the file, getting it into a state >> ready to add features. This involves opening it for writing with >> the correct filename and setting the desired GFF version. Why >> wouldn't it also output what ever else was necessary it initialize >> the file? >> >> Also, what do we expect should happen when we use Bioperl to >> create a GFF file and don't write any features to it? Should it be >> an empty file, or should it contain whatever GFF information the >> user had managed to supply (the version)? > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Sep 28 12:10:59 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 11:10:59 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Message-ID: I'm actually getting some odd recursion issues again; not sure what's causing it, but a reinstall of both bioperl and bioperl-db fixed it last time. It may be related to the rollback, just not sure yet. I'll try tracking it down if it persists (bad pun). t/04swiss....ok 3/52 --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object --------------------------------------------------- t/04swiss....ok All tests successful. Files=1, Tests=52, 2 wallclock secs ( 1.33 cusr + 0.18 csys = 1.51 CPU) The specific error under verbose running is: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:680 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:691 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:691 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:677 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:658 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:629 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:586 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:252 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/ cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/ core/bioperl-db/blib/lib/Bio/DB/BioSQL/SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:213 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/ src/core/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 STACK toplevel t/04swiss.t:37 --------------------------------------------------- chris On Sep 28, 2007, at 10:53 AM, Hilmar Lapp wrote: > Chris let me know if you get stumped. I'm surprised that the special > ranks ('eurosids I' etc) show up in the lineage (has NCBI started to > assign ranks to them? I thought I filter them out. Needs to be looked > into too.), but at any rate I don't understand why they aren't being > accepted. > > Also, maybe we need a more verbose output here - Forrest, can you run > this with adding a --printerror argument. (I'm embarrassed to find > that this doesn't seem to be documented. Sigh.) > > -hilmar > > On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: > >> If this is occurring using bioperl from CVS then I'll try taking a >> look at it. >> >> chris >> >> On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: >> >>> Hilmar, >>> I have already pre-loaded the NCBI taxonomy using >>> load_ncbi_taxonomy.pl yet. The error message show: >>> >>> --------------------- WARNING --------------------- >>> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >>> was >>> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >>> | rosids >>> | core eudicotyledons | eudicotyledons | Magnoliophyta | >>> Euphyllophyta | >>> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >>> --------------------------------------------------- >>> Could not store Q40784: >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: create: object (Bio::Species) failed to insert or to be found >>> by unique >>> key >>> STACK: Error::throw >>> STACK: Bio::Root::Root::throw >>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:206 >>> STACK: Bio::DB::Persistent::PersistentObject::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:244 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:169 >>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>> BasePersistenceAdaptor.pm:251 >>> STACK: Bio::DB::Persistent::PersistentObject::store >>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>> PersistentObject.pm:271 >>> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >>> ----------------------------------------------------------- >>> Sigh~~~~~~ >>> >>> Forrest Zhang >>> >>> >>> -----Original Message----- >>> From: bioperl-l-bounces at lists.open-bio.org >>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >>> Lapp >>> Sent: Friday, September 28, 2007 6:17 AM >>> To: Forrest >>> Cc: bioperl-l at bioperl.org >>> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >>> >>> Forrest, >>> >>> have you preloaded the NCBI taxonomy as suggested in the BioSQL >>> installation guidelines? SwissProt format has NCBI taxon IDs, and >>> the >>> code will try to use it to look up species and their lineage, rather >>> than inserting the lineage from whatever BioPerl parses out of the >>> sequence record. >>> >>> -hilmar >>> >>> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >>> >>>> Hi, all >>>> I install the biosql, and bioperl-db. I want to import >>>> swissport data. >>>> But the programe show some error as below: >>>> =================================================================== >>>> = >>>> = >>>> = >>>> ====== >>>> =============================================== >>>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>>> mysql >>>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>>> Could not store Q6DAH5: >>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>>> subsp. >>>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>>> Pectobacterium | >>>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>>> Proteobacteria | Bacteria') >>>> STACK: Error::throw >>>> STACK: Bio::Root::Root::throw >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>>> STACK: Bio::Species::classification >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:552 >>>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:1305 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:973 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:852 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:182 >>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:244 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:169 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:251 >>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:271 >>>> STACK: load_seqdatabase.pl:620 >>>> ----------------------------------------------------------- >>>> >>>> at load_seqdatabase.pl line 633 >>>> =================================================================== >>>> = >>>> = >>>> = >>>> ====== >>>> =============================================== >>>> >>>> How can I solve it, please help me, Thank you. >>>> >>>> Thanks >>>> Forrest zhang >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Sep 28 17:09:28 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 17:09:28 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> Message-ID: <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> On Sep 28, 2007, at 12:04 PM, Chris Fields wrote: > In general I don't think SeqIO classes generates actual output > (like the GFF header information) in the constructor There's probably two reasons they don't (if really all of them don't): i) unless you explicitly test (how?) whether the file has been opened for writing, you actually don't know in the SeqIO constructor whether someone's going to write to the file or read from it. ii) off hand, I don't know of a sequence file format that would require a particular header being written just once. Though thinking about this, I start asking myself whether i) wouldn't also apply to FeatureIO (are we not reading gff too in this class?), and I'm wondering that there must be a header (or at least an enclosing tag) for SeqIO XML formats - so how is that dealt with there? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 17:34:13 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 16:34:13 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> Message-ID: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: > > On Sep 28, 2007, at 12:04 PM, Chris Fields wrote: > >> In general I don't think SeqIO classes generates actual output >> (like the GFF header information) in the constructor > > There's probably two reasons they don't (if really all of them > don't): i) unless you explicitly test (how?) whether the file has > been opened for writing, you actually don't know in the SeqIO > constructor whether someone's going to write to the file or read from > it. ii) off hand, I don't know of a sequence file format that would > require a particular header being written just once. > > Though thinking about this, I start asking myself whether i) wouldn't > also apply to FeatureIO (are we not reading gff too in this class?), > and I'm wondering that there must be a header (or at least an > enclosing tag) for SeqIO XML formats - so how is that dealt with > there? > > -hilmar Re: (i) and FeatureIO: I believe most FeatureIO classes read/write to/ from specific feature files (bed, gff, ptt, interpro, etc), which is one reason I thought everything I(nput) should go into next_feature (), everything O(utput) into write_feature(). The section writing the gff header info in _initialize() checks the file specifically for '>' prior to output; I think Sendu planned on changing that to use mode() instead. Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML output hasn't been tested very well. If I could go to the Nov. GMOD meeting to help hammer out some of the GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling on my own dime. Maybe I'll see what I can come up with and stay at the no-tell motel... chris From cjfields at uiuc.edu Fri Sep 28 18:03:06 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 17:03:06 -0500 Subject: [Bioperl-l] load_seqdatabase.pl error In-Reply-To: References: <000501c800d9$dc9c8e90$95d5abb0$@com> <2461576A-1A15-45E2-A14F-AEED6ACD6007@gmx.net> <000101c8017d$c4643360$4d2c9a20$@com> <8C9CDFBD-200C-4CD6-8295-7833D2CD3758@uiuc.edu> <9917490A-B7AF-4AE6-9C78-AD516700155B@gmx.net> Message-ID: Okay, fixed the recursion (extra copy of a BasePersistentAdaptor module I was working which tripped it, so nothing in CVS). Forrest, I get all tests passing. I used a database without taxonomy loaded with bioperl-db and bioperl from cvs and it worked w/o problems. I'll try working with your sequence when I have time this weekend. chris On Sep 28, 2007, at 11:10 AM, Chris Fields wrote: > I'm actually getting some odd recursion issues again; not sure what's > causing it, but a reinstall of both bioperl and bioperl-db fixed it > last time. It may be related to the rollback, just not sure yet. > > I'll try tracking it down if it persists (bad pun). > > t/04swiss....ok 3/52 > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > --------------------------------------------------- > t/04swiss....ok > All tests successful. > Files=1, Tests=52, 2 wallclock secs ( 1.33 cusr + 0.18 csys = 1.51 > CPU) > > The specific error under verbose running is: > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:680 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:691 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:691 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:677 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:658 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:629 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create_persistent / > Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:586 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:252 > STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/ > cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > PrimarySeqAdaptor.pm:229 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/ > core/bioperl-db/blib/lib/Bio/DB/BioSQL/SeqAdaptor.pm:217 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:213 > STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/ > src/core/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm:244 > STACK toplevel t/04swiss.t:37 > --------------------------------------------------- > > > chris > > On Sep 28, 2007, at 10:53 AM, Hilmar Lapp wrote: > >> Chris let me know if you get stumped. I'm surprised that the special >> ranks ('eurosids I' etc) show up in the lineage (has NCBI started to >> assign ranks to them? I thought I filter them out. Needs to be looked >> into too.), but at any rate I don't understand why they aren't being >> accepted. >> >> Also, maybe we need a more verbose output here - Forrest, can you run >> this with adding a --printerror argument. (I'm embarrassed to find >> that this doesn't seem to be documented. Sigh.) >> >> -hilmar >> >> On Sep 28, 2007, at 1:00 AM, Chris Fields wrote: >> >>> If this is occurring using bioperl from CVS then I'll try taking a >>> look at it. >>> >>> chris >>> >>> On Sep 27, 2007, at 10:15 PM, Forrest Zhang wrote: >>> >>>> Hilmar, >>>> I have already pre-loaded the NCBI taxonomy using >>>> load_ncbi_taxonomy.pl yet. The error message show: >>>> >>>> --------------------- WARNING --------------------- >>>> MSG: The supplied lineage does not start near 'Phaseolus aureus' (I >>>> was >>>> supplied 'Vigna | Papilionoideae | Fabaceae | Fabales | eurosids I >>>> | rosids >>>> | core eudicotyledons | eudicotyledons | Magnoliophyta | >>>> Euphyllophyta | >>>> Embryophyta | Streptophytina | Viridiplantae | Eukaryota') >>>> --------------------------------------------------- >>>> Could not store Q40784: >>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>> MSG: create: object (Bio::Species) failed to insert or to be found >>>> by unique >>>> key >>>> STACK: Error::throw >>>> STACK: Bio::Root::Root::throw >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:206 >>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:244 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:169 >>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>> BasePersistenceAdaptor.pm:251 >>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>> PersistentObject.pm:271 >>>> STACK: /usr/bin/bp_load_seqdatabase.pl:633 >>>> ----------------------------------------------------------- >>>> Sigh~~~~~~ >>>> >>>> Forrest Zhang >>>> >>>> >>>> -----Original Message----- >>>> From: bioperl-l-bounces at lists.open-bio.org >>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar >>>> Lapp >>>> Sent: Friday, September 28, 2007 6:17 AM >>>> To: Forrest >>>> Cc: bioperl-l at bioperl.org >>>> Subject: Re: [Bioperl-l] load_seqdatabase.pl error >>>> >>>> Forrest, >>>> >>>> have you preloaded the NCBI taxonomy as suggested in the BioSQL >>>> installation guidelines? SwissProt format has NCBI taxon IDs, and >>>> the >>>> code will try to use it to look up species and their lineage, >>>> rather >>>> than inserting the lineage from whatever BioPerl parses out of the >>>> sequence record. >>>> >>>> -hilmar >>>> >>>> On Sep 27, 2007, at 3:41 AM, Forrest wrote: >>>> >>>>> Hi, all >>>>> I install the biosql, and bioperl-db. I want to import >>>>> swissport data. >>>>> But the programe show some error as below: >>>>> ================================================================== >>>>> = >>>>> = >>>>> = >>>>> = >>>>> ====== >>>>> =============================================== >>>>>> perl load_seqdatabase.pl -dbuser root -dbname bioseqdb -driver >>>>>> mysql >>>>> -namespace swissprot -format swiss ~/uniprot/uniprot_sprot.dat >>>>> Loading /home/forrest/uniprot/uniprot_sprot.dat ... >>>>> Could not store Q6DAH5: >>>>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>>>> MSG: The supplied lineage does not start near 'Erwinia carotovora >>>>> subsp. >>>>> atroseptica' (I was supplied 'Erwinia carotovora subsp. | >>>>> Pectobacterium | >>>>> Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | >>>>> Proteobacteria | Bacteria') >>>>> STACK: Error::throw >>>>> STACK: Bio::Root::Root::throw >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359 >>>>> STACK: Bio::Species::classification >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Species.pm:174 >>>>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:552 >>>>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/SpeciesAdaptor.pm:281 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:1305 >>>>> STACK: >>>>> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:973 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:852 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:182 >>>>> STACK: Bio::DB::Persistent::PersistentObject::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:244 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:169 >>>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/BioSQL/ >>>>> BasePersistenceAdaptor.pm:251 >>>>> STACK: Bio::DB::Persistent::PersistentObject::store >>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Persistent/ >>>>> PersistentObject.pm:271 >>>>> STACK: load_seqdatabase.pl:620 >>>>> ----------------------------------------------------------- >>>>> >>>>> at load_seqdatabase.pl line 633 >>>>> ================================================================== >>>>> = >>>>> = >>>>> = >>>>> = >>>>> ====== >>>>> =============================================== >>>>> >>>>> How can I solve it, please help me, Thank you. >>>>> >>>>> Thanks >>>>> Forrest zhang >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> -- >>>> =========================================================== >>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>>> =========================================================== >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Sep 28 18:20:37 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 28 Sep 2007 18:20:37 -0400 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> On Sep 28, 2007, at 5:34 PM, Chris Fields wrote: > The section writing the gff header info in _initialize() checks the > file specifically for '>' prior to output; I think Sendu planned on > changing that to use mode() instead. What if we pass in a file handle? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Sep 28 19:04:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 28 Sep 2007 18:04:21 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> <5FC8F92C-42DD-4DAF-8008-0F8C545065B5@gmx.net> Message-ID: On Sep 28, 2007, at 5:20 PM, Hilmar Lapp wrote: > > On Sep 28, 2007, at 5:34 PM, Chris Fields wrote: > >> The section writing the gff header info in _initialize() checks the >> file specifically for '>' prior to output; I think Sendu planned on >> changing that to use mode() instead. > > What if we pass in a file handle? > > -hilmar The old way def. wouldn't work with filehandles. Not sure if checking Root::IO::mode() would work as expected in this case, but it's certainly worth a try. chris From jay at jays.net Fri Sep 28 18:50:29 2007 From: jay at jays.net (Jay Hannah) Date: Fri, 28 Sep 2007 17:50:29 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: On Sep 28, 2007, at 4:34 PM, Chris Fields wrote: > On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: >> and I'm wondering that there must be a header (or at least an >> enclosing tag) for SeqIO XML formats - so how is that dealt with >> there? > > Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML > output hasn't been tested very well. For good or ill I stole what I found in other Bio/SeqIO/* classes when I started writing my Bio::SeqIO::solrxml. It is not ready, but you can poke it with a stick here if you like: http://vc.jays.net/viewvc.cgi/SolrGene/solrxml.pm? revision=26&root=CLAB&view=markup In _initialize() it sends XML header goo into $self->_print(), and it uses DESTROY to drop an XML closure tag into $self->_print(). Constructing it $out_solr = Bio::SeqIO->new(-file => ">seq.solr.xml", -format => 'solrxml'); without writing any sequences to it created a seq.solr.xml file with the XML header and footer and nothing in the middle. If this is not The Right Way I'm happy to change it to do whatever. :) > If I could go to the Nov. GMOD meeting to help hammer out some of the > GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling > on my own dime. Maybe I'll see what I can come up with and stay at > the no-tell motel... I've decided to spend many of my own dimes and make the trek. I hope to meet many BioPerl'ers. I'll buy you dinner if you show up. :) Take care, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Sat Sep 29 14:28:16 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 29 Sep 2007 13:28:16 -0500 Subject: [Bioperl-l] Bio::FeatureIO::gff bug? In-Reply-To: References: <46F784EB.9050507@sendu.me.uk> <7C7D7FB3-86B9-43CF-B506-E66FA8264CFC@uiuc.edu> <46F7B9A1.9080206@sendu.me.uk> <234C109D-85FE-4161-8CBA-8E24BE34C5B5@gmx.net> <46F8DC34.6020908@sendu.me.uk> <8283EF43-2AF0-4B3B-8A00-4DE615186EC7@gmx.net> <5298B700-EFDE-45E4-A8F3-674FA673A0C7@uiuc.edu> <41A20518-63BC-4D01-8FFC-01C903ADD423@gmx.net> <530A0322-A3BC-471D-AE91-17AD8F0EB237@uiuc.edu> Message-ID: On Sep 28, 2007, at 5:50 PM, Jay Hannah wrote: > On Sep 28, 2007, at 4:34 PM, Chris Fields wrote: >> On Sep 28, 2007, at 4:09 PM, Hilmar Lapp wrote: >>> and I'm wondering that there must be a header (or at least an >>> enclosing tag) for SeqIO XML formats - so how is that dealt with >>> there? >> >> Re: (ii): I'm not sure, actually; I wouldn't be surprised if XML >> output hasn't been tested very well. > > For good or ill I stole what I found in other Bio/SeqIO/* classes > when I started writing my Bio::SeqIO::solrxml. It is not ready, but > you can poke it with a stick here if you like: > > http://vc.jays.net/viewvc.cgi/SolrGene/solrxml.pm? > revision=26&root=CLAB&view=markup > > In _initialize() it sends XML header goo into $self->_print(), and it > uses DESTROY to drop an XML closure tag into $self->_print(). > > Constructing it > > $out_solr = Bio::SeqIO->new(-file => ">seq.solr.xml", > -format => 'solrxml'); > > without writing any sequences to it created a seq.solr.xml file with > the XML header and footer and nothing in the middle. > > If this is not The Right Way I'm happy to change it to do > whatever. :) If you do it this way you should probably run a check on the mode() the object state is in ('r'=read, 'w'=write, '?'=unknown), and only _print() on write mode. Might also be a good idea to implement a next_seq with an exception ('Module is write only'). >> If I could go to the Nov. GMOD meeting to help hammer out some of the >> GFF3/FeatureIO/SF::Annotated stuff I would, but I would be traveling >> on my own dime. Maybe I'll see what I can come up with and stay at >> the no-tell motel... > > I've decided to spend many of my own dimes and make the trek. I hope > to meet many BioPerl'ers. I'll buy you dinner if you show up. :) > > Take care, > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah I may see what I can scrape up myself but it doesn't look good (lab's closing down soon, so money's pretty tight). If I knew about the meeting a while in advance I would probably have made it. Oh well! chris From cjfields at uiuc.edu Sun Sep 30 16:39:23 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 30 Sep 2007 15:39:23 -0500 Subject: [Bioperl-l] DB::SeqFeature::Store error Message-ID: I'm getting the following error on my local MySQL (v 5.0.41) with bp_seqfeature_load: -------------------- EXCEPTION -------------------- MSG: The used table type doesn't support FULLTEXT indexes STACK Bio::DB::SeqFeature::Store::DBI::mysql::_init_database /Library/ Perl/5.8.6/Bio/DB/SeqFeature/Store/DBI/mysql.pm:414 STACK Bio::DB::SeqFeature::Store::init_database /Library/Perl/5.8.6/ Bio/DB/SeqFeature/Store.pm:382 STACK Bio::DB::SeqFeature::Store::DBI::mysql::init /Library/Perl/ 5.8.6/Bio/DB/SeqFeature/Store/DBI/mysql.pm:218 STACK Bio::DB::SeqFeature::Store::new /Library/Perl/5.8.6/Bio/DB/ SeqFeature/Store.pm:345 STACK toplevel /usr/local/bin/bp_seqfeature_load.pl:57 ------------------------------------------- The default setting for storage is InnoDB; switching to MyISAM fixes the issue. Should we specify TYPE = MyISAM with the various CREATE TABLE queries in Bio::DB::SeqFeature::Store::DBI::mysql to be on the safe side? chris From alan at tll.org.sg Sun Sep 30 21:53:07 2007 From: alan at tll.org.sg (alan) Date: Mon, 1 Oct 2007 09:53:07 +0800 Subject: [Bioperl-l] exonerate References: <034FB11C-B4E9-4E4E-B213-D4AC6A397B1B@tll.org.sg> Message-ID: <29C4D729-6715-4C19-9872-3B1AF90EAFA3@tll.org.sg> Hi, >> I am calling exonerate.pm within my script while attempting to >> align cDNA to multiple genomic fragments. After processing about >> 120+ genomic fragments my code crashes with the following error: >> >> ** ERROR **: Could not open [/tmp/tlInatbOED] : Too many open files >> aborting... >> MSG: Exonerate call (/usr/local/bin/exonerate /tmp/8X9jQuHUGF / >> tmp/tlInatbOED > /tmp/EolF5qCNLZ/cIf0HfIRf5) crashed: 34304 >> STACK Bio::Tools::Run::Alignment::Exonerate::_run /nfs1/alan/ >> cvs_src/bioperl-run/Bio/Tools/Run/Alignment/Exonerate.pm:214 >> STACK Bio::Tools::Run::Alignment::Exonerate::run /nfs1/alan/ >> cvs_src/bioperl-run/Bio/Tools/Run/Alignment/Exonerate.pm:174 >> >> The code in Exonerate.pm closes the tmpfile at the end of the >> routine yet I get the error message about "too many open files". >> Any suggestions on how I should be closing these files? >> >> >> Extract from my code that runs exonerate is listed below. >> >> foreach my $f(@files) { >> next unless (-f "$dir/$f"); >> my $q_in = Bio::SeqIO->new(-file=>$query, -format=>"Fasta"); >> my $query_obj = $q_in->next_seq(); >> my $target_in = Bio::SeqIO->new(-file=>"$dir/$f", - >> format=>"Fasta"); >> my $target_obj = $target_in->next_seq(); >> my $run = Bio::Tools::Run::Alignment::Exonerate->new(); >> my $exonerate_io = $run->run($query_obj, $target_obj); >> >> [code for parsing the data.......] >> >> $exonerate_io->close; #tried this line out of desperation but it >> did not help :-) >> } >> >> thanks >> alan >> >> >> >> Alan Christoffels >> Computational Biology Group >> Temasek LifeSciences Laboratory >> 1 Research Link >> National University of Singapore >> Singapore >> 117604 >> Tel: +65 68744945 >> Fax: +65 68727007 >> Lab webpage: http://www.tll.org.sg/alan.asp >> >> > From ewijaya at gmail.com Sun Sep 30 10:10:25 2007 From: ewijaya at gmail.com (Edward Wijaya) Date: Sun, 30 Sep 2007 22:10:25 +0800 Subject: [Bioperl-l] Bio::Graphics - Howto draw graded segments overlap with line track Message-ID: <3521d3670709300710l4a41c47es1c72cc5a450a3736@mail.gmail.com> Hi, I want to draw a binding sites hits on sequence of various length. What I have now is a graded segments only. Is there a way to draw segments overlapping with the line ? (see attached figure). -- Edward -------------- next part -------------- A non-text attachment was scrubbed... Name: hits.PNG Type: image/png Size: 15613 bytes Desc: not available URL: From cjfields at uiuc.edu Sun Sep 2 19:54:54 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:54:54 -0500 Subject: [Bioperl-l] (no subject) Message-ID: Posted this to biosql-l already but felt it needed posting here as well. Sorry if you get this twice. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 19:57:59 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 18:57:59 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db Message-ID: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Apologies if you get this more than once; the first post appeared to get sent w/o a proper subject line. Posted this to biosql-l already but felt it needed posting here as well. I noticed some critical recursion issues with bioperl-db when working in Bio::Ontology changes. This was using bioperl-live (post-feature/ annotation fixes). Bug report is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2355 It seems to be Bio:Taxon related; this is from 03swiss.t: --------------------- WARNING --------------------- MSG: recursion detected for Bio::Taxon object STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:681 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:692 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:630 ... /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:587 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:253 STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ PrimarySeqAdaptor.pm:229 STACK Bio::DB::BioSQL::SeqAdaptor::store_children /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ SeqAdaptor.pm:217 STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm:214 STACK Bio::DB::Persistent::PersistentObject::create /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ PersistentObject.pm:244 STACK toplevel t/04swiss.t:36 --------------------------------------------------- Also, seeing this with 13remove.t and 15.cluster.t, both of which appear to infinitely recurse: Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 587, line 1. Deep recursion on subroutine "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ BasePersistenceAdaptor.pm line 630, line 1. chris From cjfields at uiuc.edu Sun Sep 2 21:40:48 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 2 Sep 2007 20:40:48 -0500 Subject: [Bioperl-l] recursion issues with bioperl-db In-Reply-To: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> References: <2E14450C-C135-42DD-A9DE-EB47EB80E6AC@uiuc.edu> Message-ID: <25CFD36D-D921-4F5F-BADF-D858A2FE76D4@uiuc.edu> Okay, we can the previous posts! Odd, but I started from scratch and can't reproduce the issue; there may have been some cross-talk with different bioperl installations on my laptop. Anyway, everything passes now w/o recursion so I'll mark the bug as invalid. chris On Sep 2, 2007, at 6:57 PM, Chris Fields wrote: > Apologies if you get this more than once; the first post appeared to > get sent w/o a proper subject line. Posted this to biosql-l already > but felt it needed posting here as well. > > I noticed some critical recursion issues with bioperl-db when working > in Bio::Ontology changes. This was using bioperl-live (post-feature/ > annotation fixes). Bug report is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2355 > > It seems to be Bio:Taxon related; this is from 03swiss.t: > > --------------------- WARNING --------------------- > MSG: recursion detected for Bio::Taxon object > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:681 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:692 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:630 > ... > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:587 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:253 > STACK Bio::DB::BioSQL::PrimarySeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > PrimarySeqAdaptor.pm:229 > STACK Bio::DB::BioSQL::SeqAdaptor::store_children > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > SeqAdaptor.pm:217 > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm:214 > STACK Bio::DB::Persistent::PersistentObject::create > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/Persistent/ > PersistentObject.pm:244 > STACK toplevel t/04swiss.t:36 > --------------------------------------------------- > > Also, seeing this with 13remove.t and 15.cluster.t, both of which > appear to infinitely recurse: > > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_create_persistent" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 587, line 1. > Deep recursion on subroutine > "Bio::DB::BioSQL::BasePersistenceAdaptor::_process_child" at > /Users/cjfields/src/core/bioperl-db/blib/lib/Bio/DB/BioSQL/ > BasePersistenceAdaptor.pm > line 630, line 1. > > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bernd.web at gmail.com Mon Sep 3 08:43:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 3 Sep 2007 14:43:26 +0200 Subject: [Bioperl-l] Fh::flush warning Message-ID: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Hi, Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: (in cleanup) Undefined subroutine Fh::flush, at /lib/perl/Bio/Root/IO.pm line 541. This occurs in a rather large script and have not been able to isolate a small example where I also get this warning. Does someone know more about this warning and why it is thrown? Regards, Bernd From cjfields at uiuc.edu Mon Sep 3 10:41:49 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 09:41:49 -0500 Subject: [Bioperl-l] Fh::flush warning In-Reply-To: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> References: <716af09c0709030543w79f83368gf0ac74d220a96f8c@mail.gmail.com> Message-ID: <98A9D081-2570-4D4E-A8F8-D03282D41E0C@uiuc.edu> Could you give a bit more info (bioperl version, OS, etc)? I'm guessing a recent version as the error coincides with a call to flush() in Root::IO (which is probably called indirectly via DESTROY) and that you're probably using a tied filehandle somewhere for output, e.g. Bio::AlignIO::newFh() or Bio::AlignIO::fh(), so knowing the input/output formats could help. chris On Sep 3, 2007, at 7:43 AM, Bernd Web wrote: > Hi, > > Sometimes with Bio::SimpleAlign/AlignIO, I get the following warning: > (in cleanup) Undefined subroutine Fh::flush, at > /lib/perl/Bio/Root/IO.pm line 541. > > This occurs in a rather large script and have not been able to isolate > a small example where I also get this warning. Does someone know more > about this warning and why it is thrown? > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Mon Sep 3 22:11:09 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 10:11:09 +0800 (CST) Subject: [Bioperl-l] question about Bio::DB::GFF Message-ID: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Hi, I tried to load the gff3 file with load_gff.pl and extrac some information with Bio::DB::GFF. Althougth this code work properly under windows xp, the $seg got nothing when i run it under Linux. Here is my code and the gff3 file, #################################################################### #!/usr/local/bin/perl -w use strict; use Bio::SeqIO; use Bio::DB::GFF; my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', -dsn => 'dbi:mysql:test', -aggregator => ['coding'], -user => "lixr", -pass => "123456" ); my $seg = $in_gff->segment'BGIOSIBCE000001.1'); print $seg->abs_start."\n"; ################################################################## ##gff-version 3 ##sequence-region Chr01 1 43037 Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 ################################################################# I would appreaciate if any one can give me some clues/link to accomplish this. thanks in advance , Xianran Li --------------------------------- ???????????????????????????????????????????? From cjfields at uiuc.edu Tue Sep 4 00:04:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 3 Sep 2007 23:04:29 -0500 Subject: [Bioperl-l] question about Bio::DB::GFF In-Reply-To: <361239.6752.qm@web15309.mail.cnb.yahoo.com> References: <361239.6752.qm@web15309.mail.cnb.yahoo.com> Message-ID: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ?????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From xianranli78 at yahoo.com.cn Tue Sep 4 00:58:48 2007 From: xianranli78 at yahoo.com.cn (xianran li) Date: Tue, 4 Sep 2007 12:58:48 +0800 (CST) Subject: [Bioperl-l] =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20question=20about=20Bi?= =?gb2312?q?o::DB::GFF?= In-Reply-To: <37BE6493-B49B-47DF-8047-37D616B669A8@uiuc.edu> Message-ID: <866169.66154.qm@web15309.mail.cnb.yahoo.com> Hi, everybody, It looks like for the different perl version(5.8.8 of windows and 5.8.5 for linux). And I fixed this problem by adding ";Name=XXXX" after each line with "mRNA" ############################################################################## Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1;Name=BGIOSIBCE000001.1 Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 ############################################################################## This time my code works properly. Xianran Chris Fields ?????? Not sure if the gff3 you show was modified for demonstration here but it should always be tab-delimited. Also, I have had problems myself when using files with Windows/Mac Classic line endings on UNIX'y systems (Excel and a few other Mac OS X programs insist on adding \r instead of \n, which plays havoc with parsers sometimes even with readline fixes). chris On Sep 3, 2007, at 9:11 PM, xianran li wrote: > > Hi, > > I tried to load the gff3 file with load_gff.pl and extrac some > information with Bio::DB::GFF. Althougth this code work properly > under windows xp, the $seg got nothing when i run it under Linux. > > Here is my code and the gff3 file, > #################################################################### > > #!/usr/local/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::DB::GFF; > > my $in_gff = Bio::DB::GFF->new( -adaptor => 'dbi::mysqlopt', > -dsn => 'dbi:mysql:test', > -aggregator => ['coding'], > -user => "lixr", > -pass => "123456" > ); > my $seg = $in_gff->segment'BGIOSIBCE000001.1'); > print $seg->abs_start."\n"; > > > ################################################################## > ##gff-version 3 > ##sequence-region Chr01 1 43037 > Chr01 bgf mRNA 18113 20165 . + . ID=BGIOSIBCE000001.1 > Chr01 bgf CDS 18113 19150 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf CDS 19344 20165 . + 0 Parent=BGIOSIBCE000001.1 > Chr01 bgf mRNA 30220 36442 . + . ID=BGIOSIBCE000002.1 > Chr01 bgf CDS 30220 30387 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 31128 31226 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 32228 32331 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 33907 34715 . + 1 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 34799 34921 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35003 35091 . + 2 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35179 35379 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf CDS 35981 36442 . + 0 Parent=BGIOSIBCE000002.1 > Chr01 bgf mRNA 38143 39015 . - . ID=BGIOSIBCE000003.1 > Chr01 bgf CDS 38143 38541 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38649 38813 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf CDS 38917 39015 . - 0 Parent=BGIOSIBCE000003.1 > Chr01 bgf mRNA 39545 42080 . + . ID=BGIOSIBCE000004.1 > Chr01 bgf CDS 39545 40584 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 40677 41042 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41130 41208 . + 1 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 41740 41920 . + 0 Parent=BGIOSIBCE000004.1 > Chr01 bgf CDS 42037 42080 . + 2 Parent=BGIOSIBCE000004.1 > ################################################################# > > > I would appreaciate if any one can give me some clues/link to > accomplish this. > > thanks in advance , > > Xianran Li > > > --------------------------------- > ???????????????????????????????????????????? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign --------------------------------- ???????????????????????????????? From jay at jays.net Tue Sep 4 10:31:36 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 4 Sep 2007 09:31:36 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> Message-ID: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > Probably a bit of a long shot but does anyone have code for > displaying protein or CDS multiple sequence alignments with the exon > boundaries of each gene in the alignment? > > Something in the bioperl world without funky external dependencies. > I think > it would be an awesome addition to the howtos. > > Currently, the Bio::Graphics howto has cdna to genome mapping > scripts or > blast output scripts, but > I couldn't find code for dealing with multiple sequence alignments. I'm currently under the (potentially uninformed) impression that Bio::Graphics and related tools only work with a single coordinate system. I've never seen a multiple sequence alignment example. ( I Google'd for "gbrowse alignment" and hit this: http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi Click the second Example link and you'll see exons mapped out. But zooming all the way in with all the tracks turned on it looks like the AZM tracks are just the coding regions. I don't see any multiple sequence alignment... ) I doubt that helped. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 4 11:28:01 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 4 Sep 2007 10:28:01 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> Message-ID: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> Probably a bit of a long shot but does anyone have code for >> displaying protein or CDS multiple sequence alignments with the exon >> boundaries of each gene in the alignment? >> >> Something in the bioperl world without funky external dependencies. >> I think >> it would be an awesome addition to the howtos. >> >> Currently, the Bio::Graphics howto has cdna to genome mapping >> scripts or >> blast output scripts, but >> I couldn't find code for dealing with multiple sequence alignments. > > I'm currently under the (potentially uninformed) impression that > Bio::Graphics and related tools only work with a single coordinate > system. I've never seen a multiple sequence alignment example. > > ( > I Google'd for "gbrowse alignment" and hit this: > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > Click the second Example link and you'll see exons mapped out. > > But zooming all the way in with all the tracks turned on it looks > like the AZM tracks are just the coding regions. I don't see any > multiple sequence alignment... > ) > > I doubt that helped. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html chris From avilella at gmail.com Wed Sep 5 05:42:37 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 11:42:37 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> Message-ID: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> A couple of examples: http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 treefam has exon boundary and PFAM domain mappings http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 here the tree is shown as well, but the idea would be to plot the alignment So it's more "show me the multiple CDS/protein alignment" rather than "show my aligned CDS/proteins wrt my reference genome" I think it would be quite neat to have this as a bioperl howto, Comments? Albert. On 9/4/07, Chris Fields wrote: > > > On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> Probably a bit of a long shot but does anyone have code for > >> displaying protein or CDS multiple sequence alignments with the exon > >> boundaries of each gene in the alignment? > >> > >> Something in the bioperl world without funky external dependencies. > >> I think > >> it would be an awesome addition to the howtos. > >> > >> Currently, the Bio::Graphics howto has cdna to genome mapping > >> scripts or > >> blast output scripts, but > >> I couldn't find code for dealing with multiple sequence alignments. > > > > I'm currently under the (potentially uninformed) impression that > > Bio::Graphics and related tools only work with a single coordinate > > system. I've never seen a multiple sequence alignment example. > > > > ( > > I Google'd for "gbrowse alignment" and hit this: > > http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > > > > Click the second Example link and you'll see exons mapped out. > > > > But zooming all the way in with all the tracks turned on it looks > > like the AZM tracks are just the coding regions. I don't see any > > multiple sequence alignment... > > ) > > > > I doubt that helped. :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/tutorial.html > > chris > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From alexl at users.sourceforge.net Wed Sep 5 06:08:14 2007 From: alexl at users.sourceforge.net (Alex Lancaster) Date: Wed, 05 Sep 2007 03:08:14 -0700 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> (Hilmar Lapp's message of "Sat\, 18 Aug 2007 12\:13\:28 -0400") References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: >>>>> "HL" == Hilmar Lapp writes: HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > I imagine the intent of the bioperl >> contributors is that it should be under the same terms as Perl, >> whatever that happens to be (which just happens to be GPL or >> Artistic, which is fine). HL> I fully agree. >> A clarification to that effect would be useful. HL> Agreed, too. Would you mind changing that language on the wiki, HL> since you seem to have a fairly good grasp on the issue? OK, I've updated the wiki in two places: http://www.bioperl.org/wiki/Licensing_BioPerl http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F It would also be nice if the LICENSE and Build.PL files in CVS (so it finds its way into the next release) were also updated to reflect the dual-licensed status, currently they only mention the Artistic license: http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE?rev=HEAD&content-type=text/vnd.viewcvs-markup http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL?rev=HEAD&content-type=text/vnd.viewcvs-markup For Build.PL this is easy: (e.g., license => 'artistic', should be license => 'GPL or Artistic',) Possible solutions for the LICENSE file include: 1) The GPL could be added to LICENSE file at the end (with a note at the top to indicate that GPL is also included); 2) LICENSE could be moved to LICENSE.Artistic and another file "LICENSE.GPL" added with the GPL (version 2+) conditions, and the contents of LICENSE would include a note about each license. I don't have access to the bioperl CVS repository, so I can't make the changes myself). This would also apply to the Build.PL (and LICENSE files if they are present) in bioperl-run and other modules. Thanks, Alex From cjfields at uiuc.edu Wed Sep 5 08:25:21 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:25:21 -0500 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: On Sep 5, 2007, at 5:08 AM, Alex Lancaster wrote: ... > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/ > FAQ#What_are_the_license_terms_for_BioPerl.3F > > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/LICENSE? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > http://code.open-bio.org/cgi/viewcvs.cgi/bioperl-live/Build.PL? > rev=HEAD&content-type=text/vnd.viewcvs-markup > > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) > > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); > > 2) LICENSE could be moved to LICENSE.Artistic and another file > "LICENSE.GPL" added with the GPL (version 2+) conditions, and the > contents of LICENSE would include a note about each license. > > I don't have access to the bioperl CVS repository, so I can't make the > changes myself). This would also apply to the Build.PL (and LICENSE > files if they are present) in bioperl-run and other modules. > > Thanks, > Alex Looks like Sendu has done that. There have been recent troubling developments re: Artistic License: http://use.perl.org/article.pl?sid=07/08/26/1541205&from=rss but the case hasn't been conclusively decided yet. chris From bix at sendu.me.uk Wed Sep 5 08:18:35 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 05 Sep 2007 13:18:35 +0100 Subject: [Bioperl-l] Clarifying license of bioperl In-Reply-To: References: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu> <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> <8td4xlyt4h.fsf@allele2.localdomain> <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net> Message-ID: <46DE9E9B.80107@sendu.me.uk> Alex Lancaster wrote: >>>>>> "HL" == Hilmar Lapp writes: > > HL> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote: > >>> I imagine the intent of the bioperl >>> contributors is that it should be under the same terms as Perl, >>> whatever that happens to be (which just happens to be GPL or >>> Artistic, which is fine). > > HL> I fully agree. > >>> A clarification to that effect would be useful. > > HL> Agreed, too. Would you mind changing that language on the wiki, > HL> since you seem to have a fairly good grasp on the issue? > > OK, I've updated the wiki in two places: > > http://www.bioperl.org/wiki/Licensing_BioPerl > > http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F Thank you very much for that Alex. > It would also be nice if the LICENSE and Build.PL files in CVS (so it > finds its way into the next release) were also updated to reflect the > dual-licensed status, currently they only mention the Artistic > license: [snip] > For Build.PL this is easy: > > (e.g., license => 'artistic', should be > license => 'GPL or Artistic',) As per the 'license' section of http://search.cpan.org/~kwilliams/Module-Build-0.2808/lib/Module/Build/API.pod, I've changed it to 'perl', which means Artistic or GPL. > Possible solutions for the LICENSE file include: > > 1) The GPL could be added to LICENSE file at the end (with a note at > the top to indicate that GPL is also included); I took this approach, using your language for the explanation at the top, and including GPL 3.0 at the bottom. I've made these changes for core (live), run, db and network. Thanks again for your help and advice. From cjfields at uiuc.edu Wed Sep 5 08:53:25 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 07:53:25 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> Message-ID: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> You mean something like this? http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics chris On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > A couple of examples: > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > treefam has exon boundary and PFAM domain mappings > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > here the tree is shown as well, but the idea would be to plot the > alignment > > So it's more "show me the multiple CDS/protein alignment" rather > than "show > my aligned CDS/proteins wrt my reference genome" > > I think it would be quite neat to have this as a bioperl howto, > > Comments? > > Albert. > > On 9/4/07, Chris Fields wrote: >> >> >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >> >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>> Probably a bit of a long shot but does anyone have code for >>>> displaying protein or CDS multiple sequence alignments with the >>>> exon >>>> boundaries of each gene in the alignment? >>>> >>>> Something in the bioperl world without funky external dependencies. >>>> I think >>>> it would be an awesome addition to the howtos. >>>> >>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>> scripts or >>>> blast output scripts, but >>>> I couldn't find code for dealing with multiple sequence alignments. >>> >>> I'm currently under the (potentially uninformed) impression that >>> Bio::Graphics and related tools only work with a single coordinate >>> system. I've never seen a multiple sequence alignment example. >>> >>> ( >>> I Google'd for "gbrowse alignment" and hit this: >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>> >>> Click the second Example link and you'll see exons mapped out. >>> >>> But zooming all the way in with all the tracks turned on it looks >>> like the AZM tracks are just the coding regions. I don't see any >>> multiple sequence alignment... >>> ) >>> >>> I doubt that helped. :) >>> >>> Jay Hannah >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >> >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html >> >> chris >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Wed Sep 5 09:31:24 2007 From: avilella at gmail.com (Albert Vilella) Date: Wed, 5 Sep 2007 15:31:24 +0200 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Awesome!! Thanks Chris! On 9/5/07, Chris Fields wrote: > > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > > A couple of examples: > > > > http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 > > > > treefam has exon boundary and PFAM domain mappings > > > > http://www.ensembl.org/Homo_sapiens/genetreeview?gene=ENSG00000139618 > > > > here the tree is shown as well, but the idea would be to plot the > > alignment > > > > So it's more "show me the multiple CDS/protein alignment" rather > > than "show > > my aligned CDS/proteins wrt my reference genome" > > > > I think it would be quite neat to have this as a bioperl howto, > > > > Comments? > > > > Albert. > > > > On 9/4/07, Chris Fields wrote: > >> > >> > >> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: > >> > >>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >>>> Probably a bit of a long shot but does anyone have code for > >>>> displaying protein or CDS multiple sequence alignments with the > >>>> exon > >>>> boundaries of each gene in the alignment? > >>>> > >>>> Something in the bioperl world without funky external dependencies. > >>>> I think > >>>> it would be an awesome addition to the howtos. > >>>> > >>>> Currently, the Bio::Graphics howto has cdna to genome mapping > >>>> scripts or > >>>> blast output scripts, but > >>>> I couldn't find code for dealing with multiple sequence alignments. > >>> > >>> I'm currently under the (potentially uninformed) impression that > >>> Bio::Graphics and related tools only work with a single coordinate > >>> system. I've never seen a multiple sequence alignment example. > >>> > >>> ( > >>> I Google'd for "gbrowse alignment" and hit this: > >>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi > >>> > >>> Click the second Example link and you'll see exons mapped out. > >>> > >>> But zooming all the way in with all the tracks turned on it looks > >>> like the AZM tracks are just the coding regions. I don't see any > >>> multiple sequence alignment... > >>> ) > >>> > >>> I doubt that helped. :) > >>> > >>> Jay Hannah > >>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > >> > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > >> > >> chris > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > From cjfields at uiuc.edu Wed Sep 5 10:17:51 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:17:51 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: <31E25B64-2043-4460-ADC8-9684D01C2468@uiuc.edu> It would be nice to place the labels to the left of the segments. I believe there is a way to do this, but can't remember; if I can find it I'll revise the script. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 5 10:22:44 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 05 Sep 2007 15:22:44 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <46DEBBB4.1030200@sheffield.ac.uk> Chris Fields wrote: > You mean something like this? > > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > chris > > On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: > > Nice! On a similar (well, related to Bio::Graphics) topic, I've written a script that uses markers that have been mapped from a model organism to linkage groups in related species in order to estimate the location of "unknown" markers in those linkage groups. I'm using the Bio::Map::* modules for much of this work and then I use Bio::Graphics to display the linkage groups of the non-model organism with the putative position of the "unknown" markers. However, I've had to do a bit of fudging to get Bio::Graphics to draw this data. The problems I encountered are described below. I also have an open bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2343 1) Linkage maps are measured in cM - which can and are likely to be non-integer values. Bio::Graphics needs integer values, so I simply scaled all my cM measurements prior to drawing by *1000. However, the ruler now doesn't represent the "true scale" - can this be adjusted? 2) Some markers map to 0cM. However, Bio::Graphics requires positions >0. To get round this I simply incremented these positions by 1 (post-scaling), so they display almost in the correct place. Is it possible/likely/wise to support positions starting at zero and float positions? Would such support simply be to internalise what I have already done outside Bio::Graphics into the Bio::Graphics modules and have it display the correctly scaled ruler? Thoughts comments welcome. Cheers, Nath From cjfields at uiuc.edu Wed Sep 5 10:52:00 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 09:52:00 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <358f4d650709050631t136901e6v6280a44089999bde@mail.gmail.com> Message-ID: Updated the page on the web site with the new script. Figured it out; if you pass the parameter -label_position 'left' it will display the label to the left. However it displays them right next to the segment (ala GBrowse). I added a hack to Bio::Graphics::Glyph::generic in CVS which allows 'alignment_left' as an option, displaying it aligned to the far left of the panel; there is probably a way to use a callback here as well. chris On Sep 5, 2007, at 8:31 AM, Albert Vilella wrote: > Awesome!! > > Thanks Chris! > > On 9/5/07, Chris Fields wrote: >> >> You mean something like this? >> >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics >> >> chris >> >> On Sep 5, 2007, at 4:42 AM, Albert Vilella wrote: >> >>> A couple of examples: >>> >>> http://www.treefam.org/cgi-bin/alnview.pl?ac=TF105041 >>> >>> treefam has exon boundary and PFAM domain mappings >>> >>> http://www.ensembl.org/Homo_sapiens/genetreeview? >>> gene=ENSG00000139618 >>> >>> here the tree is shown as well, but the idea would be to plot the >>> alignment >>> >>> So it's more "show me the multiple CDS/protein alignment" rather >>> than "show >>> my aligned CDS/proteins wrt my reference genome" >>> >>> I think it would be quite neat to have this as a bioperl howto, >>> >>> Comments? >>> >>> Albert. >>> >>> On 9/4/07, Chris Fields wrote: >>>> >>>> >>>> On Sep 4, 2007, at 9:31 AM, Jay Hannah wrote: >>>> >>>>> On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >>>>>> Probably a bit of a long shot but does anyone have code for >>>>>> displaying protein or CDS multiple sequence alignments with the >>>>>> exon >>>>>> boundaries of each gene in the alignment? >>>>>> >>>>>> Something in the bioperl world without funky external >>>>>> dependencies. >>>>>> I think >>>>>> it would be an awesome addition to the howtos. >>>>>> >>>>>> Currently, the Bio::Graphics howto has cdna to genome mapping >>>>>> scripts or >>>>>> blast output scripts, but >>>>>> I couldn't find code for dealing with multiple sequence >>>>>> alignments. >>>>> >>>>> I'm currently under the (potentially uninformed) impression that >>>>> Bio::Graphics and related tools only work with a single >>>>> coordinate >>>>> system. I've never seen a multiple sequence alignment example. >>>>> >>>>> ( >>>>> I Google'd for "gbrowse alignment" and hit this: >>>>> http://maizeapache.danforthcenter.org/cgi-bin/gbrowse.cgi >>>>> >>>>> Click the second Example link and you'll see exons mapped out. >>>>> >>>>> But zooming all the way in with all the tracks turned on it >>>>> looks >>>>> like the AZM tracks are just the coding regions. I don't see any >>>>> multiple sequence alignment... >>>>> ) >>>>> >>>>> I doubt that helped. :) >>>>> >>>>> Jay Hannah >>>>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah >>>> >>>> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >>>> >>>> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >>>> Browser/docs/tutorial/tutorial.html >>>> >>>> chris >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Sep 5 12:47:46 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 5 Sep 2007 11:47:46 -0500 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: <46DEBBB4.1030200@sheffield.ac.uk> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > ... > On a similar (well, related to Bio::Graphics) topic, I've written a > script that uses markers that have been mapped from a model > organism to > linkage groups in related species in order to estimate the location of > "unknown" markers in those linkage groups. > > I'm using the Bio::Map::* modules for much of this work and then I use > Bio::Graphics to display the linkage groups of the non-model organism > with the putative position of the "unknown" markers. However, I've had > to do a bit of fudging to get Bio::Graphics to draw this data. The > problems I encountered are described below. I also have an open bug: > http://bugzilla.open-bio.org/show_bug.cgi?id=2343 > > 1) Linkage maps are measured in cM - which can and are likely to be > non-integer values. Bio::Graphics needs integer values, so I simply > scaled all my cM measurements prior to drawing by *1000. However, the > ruler now doesn't represent the "true scale" - can this be adjusted? > > 2) Some markers map to 0cM. However, Bio::Graphics requires positions >> 0. To get round this I simply incremented these positions by 1 > (post-scaling), so they display almost in the correct place. > > Is it possible/likely/wise to support positions starting at zero and > float positions? Would such support simply be to internalise what I > have > already done outside Bio::Graphics into the Bio::Graphics modules and > have it display the correctly scaled ruler? > > Thoughts comments welcome. > > Cheers, > Nath There is this section in the GBrowse configure doc, which to me suggests there is a way to do what you want in Bioperl; you may have to delve into the Bio::Graphics or GBrowse code to work it out, though. I think the GBrowse mail list archives also have more on this. chris ..... F. DISPLAYING GENETIC AND RH MAPS GBrowse can be tweaked to make it more suitable for displaying genetic and radiation hybrid maps. The main issue is that the Bio::DB::GFF database expects coordinates to be positive integers, not fractions, but genetic and RH maps use floating point numbers. Working around this is a bit of an ugly hack. Before loading your data you must multiply all your coordinates by a constant power of 10 in order to convert them into integers. For example, if a genetic map uses Morgan units ranging from 0 to 1.80, you would multiple by 100 to create a map in ranging from 0 to 180. Create a GFF file containing the markers in modified coordinates and load it as usual. Now you must tell GBrowse to reverse these changes. Enter the following options into the [GENERAL] section of the configuration file: units = M unit_divider = 100 These two options tell GBrowse to use "M" (Morgan) units, and to divide all coordinates by 100. GBrowse will automatically display the scale using the most appropriate units, so the displayed map will typically be drawn using cM units. From bernd.web at gmail.com Wed Sep 5 13:44:26 2007 From: bernd.web at gmail.com (Bernd Web) Date: Wed, 5 Sep 2007 19:44:26 +0200 Subject: [Bioperl-l] SearchIO ResultWriter Message-ID: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Hi, For SearchIO there are ResultWriters to write text, html and BSML (BSMLResultWriter). However, is there also a BLAST xml writer, which writes the original blast xml files. This may have come up before. If there is not, is there interest in having this? Regards, Bernd From sac at bioperl.org Wed Sep 5 16:37:37 2007 From: sac at bioperl.org (Steve Chervitz) Date: Wed, 5 Sep 2007 13:37:37 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> Message-ID: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Looks like there is no such functionality in the current repository. If you have implemented such a beast and are willing to contribute it, go for it (or coordinate with a developer if you lack CVS write access). Steve On 9/5/07, Bernd Web wrote: > > Hi, > > For SearchIO there are ResultWriters to write text, html and BSML > (BSMLResultWriter). However, is there also a BLAST xml writer, which > writes the original blast xml files. This may have come up before. If > there is not, is there interest in having this? > > > Regards, > Bernd > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From n.haigh at sheffield.ac.uk Wed Sep 5 17:18:17 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 05 Sep 2007 22:18:17 +0100 Subject: [Bioperl-l] Bio::Graphics support for floating point positions In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <46DEBBB4.1030200@sheffield.ac.uk> Message-ID: <46DF1D19.9010707@sheffield.ac.uk> Chris Fields wrote: > On Sep 5, 2007, at 9:22 AM, Nathan Haigh wrote: > >> ... >> On a similar (well, related to Bio::Graphics) topic, I've written a >> script that uses markers that have been mapped from a model organism to >> linkage groups in related species in order to estimate the location of >> "unknown" markers in those linkage groups. >> >> I'm using the Bio::Map::* modules for much of this work and then I use >> Bio::Graphics to display the linkage groups of the non-model organism >> with the putative position of the "unknown" markers. However, I've had >> to do a bit of fudging to get Bio::Graphics to draw this data. The >> problems I encountered are described below. I also have an open bug: >> http://bugzilla.open-bio.org/show_bug.cgi?id=2343 >> >> 1) Linkage maps are measured in cM - which can and are likely to be >> non-integer values. Bio::Graphics needs integer values, so I simply >> scaled all my cM measurements prior to drawing by *1000. However, the >> ruler now doesn't represent the "true scale" - can this be adjusted? >> >> 2) Some markers map to 0cM. However, Bio::Graphics requires positions >>> 0. To get round this I simply incremented these positions by 1 >> (post-scaling), so they display almost in the correct place. >> >> Is it possible/likely/wise to support positions starting at zero and >> float positions? Would such support simply be to internalise what I have >> already done outside Bio::Graphics into the Bio::Graphics modules and >> have it display the correctly scaled ruler? >> >> Thoughts comments welcome. >> >> Cheers, >> Nath > > There is this section in the GBrowse configure doc, which to me > suggests there is a way to do what you want in Bioperl; you may have > to delve into the Bio::Graphics or GBrowse code to work it out, > though. I think the GBrowse mail list archives also have more on this. > > chris > > ..... > > F. DISPLAYING GENETIC AND RH MAPS > > GBrowse can be tweaked to make it more suitable for displaying genetic > and radiation hybrid maps. > > The main issue is that the Bio::DB::GFF database expects coordinates > to be positive integers, not fractions, but genetic and RH maps use > floating point numbers. Working around this is a bit of an ugly hack. > Before loading your data you must multiply all your coordinates by a > constant power of 10 in order to convert them into integers. For > example, if a genetic map uses Morgan units ranging from 0 to 1.80, > you would multiple by 100 to create a map in ranging from 0 to 180. > > Create a GFF file containing the markers in modified coordinates and > load it as usual. Now you must tell GBrowse to reverse these changes. > Enter the following options into the [GENERAL] section of the > configuration file: > > units = M > unit_divider = 100 > > These two options tell GBrowse to use "M" (Morgan) units, and to > divide all coordinates by 100. GBrowse will automatically display the > scale using the most appropriate units, so the displayed map will > typically be drawn using cM units. > Thanks for for the pointer Chris! >From what you've said, it appears they might have done a similar hack to me - which is always nice to know! It seems then to me, that it may be worth making the Bio::Graphic::* modules slightly more generic and applicable for these situations. It's late, so does anyone have suggestions before I start digging through Bio::Graphic::* modules in the morning? Maybe you guys across the water have something to say by the time I wake up in the morning!? Thanks Nath From jason at bioperl.org Wed Sep 5 17:33:44 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 5 Sep 2007 14:33:44 -0700 Subject: [Bioperl-l] SearchIO ResultWriter In-Reply-To: <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> References: <716af09c0709051044t1cb9e857uc4b91ad64c9ef22a@mail.gmail.com> <8f200b4c0709051337u532804d6r27712b05faaeea7d@mail.gmail.com> Message-ID: I think most ppl aren't that enamored with the NCBI XML Blast format but I guess it is standard if the NCBI puts it out... It should be a pretty easy writer to make at any rate just follow along with what was done for BSMLWriter. -jason On Sep 5, 2007, at 1:37 PM, Steve Chervitz wrote: > Looks like there is no such functionality in the current > repository. If you > have implemented such a beast and are willing to contribute it, go > for it > (or coordinate with a developer if you lack CVS write access). > > Steve > > On 9/5/07, Bernd Web wrote: >> >> Hi, >> >> For SearchIO there are ResultWriters to write text, html and BSML >> (BSMLResultWriter). However, is there also a BLAST xml writer, which >> writes the original blast xml files. This may have come up before. If >> there is not, is there interest in having this? >> >> >> Regards, >> Bernd >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason at bioperl.org From jay at jays.net Thu Sep 6 15:50:53 2007 From: jay at jays.net (Jay Hannah) Date: Thu, 6 Sep 2007 15:50:53 -0400 (EDT) Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: On Wed, 5 Sep 2007, Chris Fields wrote: > http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics Wow. That's slick. :) Is it possible to zoom in far enough to see the individual bases and gaps?? On Tue, 4 Sep 2007, Chris Fields wrote: > Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/tutorial.html Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, this image might be what Albert is looking for: http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome-Browser/docs/tutorial/figures/segmented_features2.gif He'd need to map his exon boundaries from whatever format he has into a GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to munch on. On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > "Something in the bioperl world without funky external dependencies" There are still things the long arm of BioPerl justice hasn't reached? :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Thu Sep 6 19:39:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 6 Sep 2007 18:39:07 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> Message-ID: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > On Wed, 5 Sep 2007, Chris Fields wrote: >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > Wow. That's slick. :) Is it possible to zoom in far enough to > see the > individual bases and gaps?? I'm not sure; you can do something like that with GBrowse with some features so there is probably a way to put something together which could do that. > On Tue, 4 Sep 2007, Chris Fields wrote: >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- >> Browser/docs/tutorial/tutorial.html > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > this image might be what Albert is looking for: > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > Browser/docs/tutorial/figures/segmented_features2.gif > > He'd need to map his exon boundaries from whatever format he has > into a > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > munch on. I use segmented SeqFeatures in my example. The HOWTO also uses a variation ('graded_segments'): http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output The subseqfeatures are colored by score. Feasibly one could hack this so that the exons/introns have a different 'score', thus displaying different colors. > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: >> "Something in the bioperl world without funky external dependencies" > > There are still things the long arm of BioPerl justice hasn't > reached? :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah chris From cain.cshl at gmail.com Thu Sep 6 23:20:04 2007 From: cain.cshl at gmail.com (Scott Cain) Date: Thu, 06 Sep 2007 23:20:04 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <1189135204.2560.52.camel@localhost.localdomain> On Thu, 2007-09-06 at 18:39 -0500, Chris Fields wrote: > On Sep 6, 2007, at 2:50 PM, Jay Hannah wrote: > > > > > On Wed, 5 Sep 2007, Chris Fields wrote: > >> http://www.bioperl.org/wiki/HOWTO_Discussion:Graphics > > > > Wow. That's slick. :) Is it possible to zoom in far enough to > > see the > > individual bases and gaps?? > > I'm not sure; you can do something like that with GBrowse with some > features so there is probably a way to put something together which > could do that. Yeah, if it were me, I would install GBrowse, hack my data into GFF and use gbrowse_img to generate pictures. It would probably be easier than starting from scratch. > > > On Tue, 4 Sep 2007, Chris Fields wrote: > >> Acc. to the Gbrowse tutorial GBrowse can deal with alignment data: > >> http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > >> Browser/docs/tutorial/tutorial.html > > > > Yes, indeed. GBrowse graphs all sorts of amazing things. Specifically, > > this image might be what Albert is looking for: > > > > http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- > > Browser/docs/tutorial/figures/segmented_features2.gif > > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. > > > On Aug 31, 2007, at 4:29 AM, Albert Vilella wrote: > >> "Something in the bioperl world without funky external dependencies" > > > > There are still things the long arm of BioPerl justice hasn't > > reached? :) > > > > Jay Hannah > > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain at cshl.edu GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From avilella at gmail.com Fri Sep 7 05:20:01 2007 From: avilella at gmail.com (Albert Vilella) Date: Fri, 7 Sep 2007 10:20:01 +0100 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> Message-ID: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> > > > He'd need to map his exon boundaries from whatever format he has > > into a > > GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > > munch on. > > I use segmented SeqFeatures in my example. The HOWTO also uses a > variation ('graded_segments'): > > http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > > The subseqfeatures are colored by score. Feasibly one could hack > this so that the exons/introns have a different 'score', thus > displaying different colors. The exon boundary could be a vertical line or a triangular tick or something. I don't know if there is a consensus on this kind of cartoons. Does anybody know how exon boundaries are displayed in different browsers/apps? From yangmeng at genomics.org.cn Fri Sep 7 03:57:14 2007 From: yangmeng at genomics.org.cn (=?ISO-8859-1?Q?=D1=EE=C3=CD=A3=A8=D6=D0=D0=C4=CA=B5=D1=E9=CA=D2?= ) Date: Fri, 7 Sep 2007 15:57:14 +0800 Subject: [Bioperl-l] a question Message-ID: <200709071557.AA78971054@genomics.org.cn> I am a student from China.During my learing the bioperl,I encounter a problem as follows: I run the program, use Bio::Perl; $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); write_sequence(">roa1.fasta",'fasta',$seq_object); But It returns lots of mistake informatiom, ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: WebDBSeqI Request Error: 501 Protocol scheme '' is not supported Content-Type: text/plain Client-Date: Fri, 07 Sep 2007 07:26:06 GMT Client-Warning: Internal response 501 Protocol scheme '' is not supported STACK: Error::throw STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:685 STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:4 91 STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm :275 STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/WebDBSeqI.pm:14 5 STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 ----------------------------------------------------------- I don't know the reason of the problem.I have installed the addition perl modules such as bioperl-db,bioperl-network,bioperlgui and almost all "BioPerl Dependencies modules".My network is also OK. It's an annoying promleb to me. I have consulted many experts but didn't got a reply. Could you vacuate in your mass business to give me a help? Thank you! Best regards! YangMeng ________________________________________________________________ Sent via the WebMail system at genomics.org.cn From cjfields at uiuc.edu Fri Sep 7 10:09:18 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 7 Sep 2007 09:09:18 -0500 Subject: [Bioperl-l] a question In-Reply-To: <200709071557.AA78971054@genomics.org.cn> References: <200709071557.AA78971054@genomics.org.cn> Message-ID: <7F176E39-18A6-4BF9-9247-863D6F3C167D@uiuc.edu> On Sep 7, 2007, at 2:57 AM, ???????????????? wrote: > I am a student from China.During my learing the bioperl,I encounter > a problem as follows: > > I run the program, > > use Bio::Perl; > $seq_object = get_sequence('swissprot',"ROA1_HUMAN"); > write_sequence(">roa1.fasta",'fasta',$seq_object); > > But It returns lots of mistake informatiom, First, always preface problems of this sort with the version of BioPerl you are using (there are quite a few versions still being used). > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: WebDBSeqI Request Error: > 501 Protocol scheme '' is not supported > Content-Type: text/plain > Client-Date: Fri, 07 Sep 2007 07:26:06 GMT > Client-Warning: Internal response > 501 Protocol scheme '' is not supported > STACK: Error::throw > STACK: Bio::Root::Root::throw D:/perl/site/lib/Bio/Root/Root.pm:359 > STACK: Bio::DB::WebDBSeqI::_request D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:685 > STACK: Bio::DB::WebDBSeqI::get_seq_stream D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:4 > 91 > STACK: Bio::DB::WebDBSeqI::get_Stream_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm > :275 > STACK: Bio::DB::WebDBSeqI::get_Seq_by_id D:/perl/site/lib/Bio/DB/ > WebDBSeqI.pm:14 > 5 > STACK: Bio::Perl::get_sequence D:/perl/site/lib/Bio/Perl.pm:510 > STACK: C:\DOCUME~1\yangmeng\LOCALS~1\Temp\dir13D.tmp\Untitled.pl:6 > ----------------------------------------------------------- This works for me using bioperl from CVS. There were a few remote DbFetch server changes if I recall correctly, so updating from CVS may be your best option. > I don't know the reason of the problem.I have installed the > addition perl modules such as bioperl-db,bioperl-network,bioperlgui > and almost all "BioPerl Dependencies modules".My network is also > OK. It's an annoying promleb to me. > I have consulted many experts but didn't got a reply. Could you > vacuate in your mass business to give me a help? > > Thank you! > > Best regards! > > YangMeng I think my 'vacuating' is a private matter, let alone doing so in my mass business... http://www.thefreedictionary.com/Vacuate ;> chris From cjfields at uiuc.edu Mon Sep 10 18:04:14 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 17:04:14 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> Message-ID: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: >>> He'd need to map his exon boundaries from whatever format he has >>> into a >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to >>> munch on. >> >> I use segmented SeqFeatures in my example. The HOWTO also uses a >> variation ('graded_segments'): >> >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output >> >> The subseqfeatures are colored by score. Feasibly one could hack >> this so that the exons/introns have a different 'score', thus >> displaying different colors. > > > The exon boundary could be a vertical line or a triangular tick or > something. I don't know if there is a consensus on this kind of > cartoons. > Does anybody know how exon boundaries are displayed in different > browsers/apps? Don't know. BTW, apparently there is something being cooked up as an alignment browser (among other things) for GBrowse: https://www.nescent.org/wg_phyloinformatics/ PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse Acc. to Lincoln (from his last GBrowse post) there will be a testable version within a few weeks or so. You could always ask more questions about it on the GBrowse list. chris From lstein at cshl.edu Mon Sep 10 18:09:41 2007 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 10 Sep 2007 18:09:41 -0400 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> Message-ID: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> You can view a simple multiple alignment now. Go to www.wormbase.org, turn on some of the EST tracks and then zoom down to base pair level. In bio::graphics, use the "segments" glyph and turn on the -draw_target option. The features must have DNA attached to them. What's coming soon is support for MAF format, which provides genome-level alignments. Lincoln On 9/10/07, Chris Fields wrote: > > On Sep 7, 2007, at 4:20 AM, Albert Vilella wrote: > > >>> He'd need to map his exon boundaries from whatever format he has > >>> into a > >>> GFF file (or DAS/BioSQL/Chado/? server, or...?) for GBrowse to > >>> munch on. > >> > >> I use segmented SeqFeatures in my example. The HOWTO also uses a > >> variation ('graded_segments'): > >> > >> http://www.bioperl.org/wiki/HOWTO:Graphics#Parsing_Real_BLAST_Output > >> > >> The subseqfeatures are colored by score. Feasibly one could hack > >> this so that the exons/introns have a different 'score', thus > >> displaying different colors. > > > > > > The exon boundary could be a vertical line or a triangular tick or > > something. I don't know if there is a consensus on this kind of > > cartoons. > > Does anybody know how exon boundaries are displayed in different > > browsers/apps? > > Don't know. BTW, apparently there is something being cooked up as an > alignment browser (among other things) for GBrowse: > > https://www.nescent.org/wg_phyloinformatics/ > PhyloSoC:Phylogenetic_and_Haplotype_Displays_for_GBrowse > > Acc. to Lincoln (from his last GBrowse post) there will be a testable > version within a few weeks or so. You could always ask more > questions about it on the GBrowse list. > > chris > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From cjfields at uiuc.edu Mon Sep 10 23:00:29 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 10 Sep 2007 22:00:29 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <885E5E3B-E2F7-4279-8EE3-FC21AF535D7E@uiuc.edu> Doesn't that work only for SeqFeature::SimilarityPair and HSP-like (paired) alignments, or am I mistaken? chris On Sep 10, 2007, at 5:09 PM, Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to > www.wormbase.org, turn > on some of the EST tracks and then zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the - > draw_target > option. The features must have DNA attached to them. > > What's coming soon is support for MAF format, which provides genome- > level > alignments. > > Lincoln From christoph.theunert at web.de Tue Sep 11 06:37:49 2007 From: christoph.theunert at web.de (Christoph Theunert) Date: Tue, 11 Sep 2007 03:37:49 -0700 (PDT) Subject: [Bioperl-l] release of own projects Message-ID: <12611951.post@talk.nabble.com> Hi, I am a bioinformatics student from germany and I need your help Working with perl and bioperl is pretty new to me - currently I am working on a Bioperl project, and I don't know how to release my project when i am finished with it. I want to pack my modules so that other users can download it and install it on their machines. Do I use the command h2xs as to create cpan modules ( makefiles ...) or what is the best way to solve my problem ? thanks for help Christoph -- View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. From spiros at lokku.com Tue Sep 11 06:57:14 2007 From: spiros at lokku.com (Spiros Denaxas) Date: Tue, 11 Sep 2007 11:57:14 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: Hey, Yes, IMHO the best way would be to create CPANesque modules that people are able to download and install. The installation is pretty straightforward, covers prerequisites and more advanced features if needed and as an approach it is widely supported. Also, it gives you the ability to create and integrate tests seamlessly :) Check out these URL's on how to do it: http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlnewmod.pod http://www.perlmonks.org/?node_id=158999 http://www.perlmonks.org/?node_id=431702 Btw, more friendly and automated tools exist besides h2xs. Be sure to have a look at: http://search.cpan.org/perldoc?ExtUtils::ModuleMaker http://search.cpan.org/perldoc?Module::Starter Hope this helps, Spiros ps. i suggest since its your research work you are going to be handing out to read a bit on the various software licenses which exist and which you prefer to license your code under. On 9/11/07, Christoph Theunert wrote: > > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? > > thanks for help > > Christoph > -- > View this message in context: http://www.nabble.com/release-of-own-projects-tf4421681.html#a12611951 > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Tue Sep 11 07:12:41 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 11 Sep 2007 12:12:41 +0100 Subject: [Bioperl-l] release of own projects In-Reply-To: <12611951.post@talk.nabble.com> References: <12611951.post@talk.nabble.com> Message-ID: <46E67829.8060303@sendu.me.uk> Christoph Theunert wrote: > Hi, I am a bioinformatics student from germany and I need your help > > Working with perl and bioperl is pretty new to me - > currently I am working on a Bioperl project, and I don't know how to release > my project when i am finished with it. > > I want to pack my modules so that other users can download it and install it > on their machines. > > Do I use the command h2xs as to create cpan modules ( makefiles ...) or what > is the best way to solve my > problem ? You can do it however you like. You can just stick the modules in a folder, .tar.gz it and offer that to people. You can use h2xs to automate certain things. You can use Module::Build. To make your work available via cpan, see http://www.cpan.org/modules/04pause.html If your modules are of general bioinformatic utility you might even consider making them a part of bioperl itself. From jay at jays.net Tue Sep 11 17:15:17 2007 From: jay at jays.net (Jay Hannah) Date: Tue, 11 Sep 2007 16:15:17 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> Message-ID: <46E70565.5040503@jays.net> Lincoln Stein wrote: > You can view a simple multiple alignment now. Go to www.wormbase.org > , turn on some of the EST tracks and then > zoom down to base pair level. > > In bio::graphics, use the "segments" glyph and turn on the > -draw_target option. The features must have DNA attached to them. Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. > What's coming soon is support for MAF format, which provides > genome-level alignments. I'm looking forward to trying to wrap my head around that. :) Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Tue Sep 11 18:40:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 11 Sep 2007 17:40:55 -0500 Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with exon boundaries In-Reply-To: <46E70565.5040503@jays.net> References: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com> <96BB820A-9199-43FC-BA88-5F45F2D16D56@jays.net> <59E877A9-B2A1-4E62-A291-FDE21CF586E6@uiuc.edu> <358f4d650709050242u7743a532u3b460fcc55e1867e@mail.gmail.com> <08E1C429-0686-49E2-825A-D98C709166F7@uiuc.edu> <9AD575D1-A56C-415F-9C52-73105E846040@uiuc.edu> <358f4d650709070220q3680c5a2kbd01adb6f2fca3dc@mail.gmail.com> <3D1D9D7E-4162-4694-A3D6-23E926B07C0E@uiuc.edu> <6dce9a0b0709101509x5b1a18cfx7c567e40dffb4947@mail.gmail.com> <46E70565.5040503@jays.net> Message-ID: On Sep 11, 2007, at 4:15 PM, Jay Hannah wrote: > Lincoln Stein wrote: >> You can view a simple multiple alignment now. Go to www.wormbase.org >> , turn on some of the EST tracks and then >> zoom down to base pair level. >> >> In bio::graphics, use the "segments" glyph and turn on the >> -draw_target option. The features must have DNA attached to them. > > Wow. *http://tinyurl.com/yuz8bq* I hadn't seen that done before. There is a section detailing how this is done in the GBrowse tutorial (though it uses older GFF): http://gmod.cvs.sourceforge.net/*checkout*/gmod/Generic-Genome- Browser/docs/tutorial/tutorial.html >> What's coming soon is support for MAF format, which provides >> genome-level alignments. > > I'm looking forward to trying to wrap my head around that. :) > > Jay Hannah > http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah It's easily parsible, which is nice! chris From stephan.roessner at gsf.de Wed Sep 12 04:44:10 2007 From: stephan.roessner at gsf.de (Stephan Roessner) Date: Wed, 12 Sep 2007 10:44:10 +0200 Subject: [Bioperl-l] bug in Bio::SearchIO? Message-ID: <200709121044.11741.stephan.roessner@gsf.de> Hi, I am parsing a BlastN output with Bio::SearchIO and getting an error for some of the hits when retrieving the start and/or the end position with $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for hits which are are of equal length (~ > 0.9) to the query sequences. SearchIO is retrieving the right results, but throws an exemption, in this case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - 760 ..... It seems to me valid range is parsed incorrectly, isn't it? Is this a bug? Does anybody have a similar problem? see code, error, and blastn output below. thanks, Stephan Stephan Roessner MIPS/IBI Inst. for Bioinformatics GSF Research Center for Environment and Health Ingolst?dter Landstr. 1 85764 Neuherberg; Germany phone: +49 (0)89 3187 3583 fax: ? ? ? +49 (0)89 3187 3585 email: stephan.roessner at gsf.de Here is the piece of code I am using: my $blast_report = new Bio::SearchIO ('-format'=>'blast', '-file' => $source); while( my $result=$blast_report->next_result) { while( my $hit= $result->next_hit()) { print "Name: ".$hit->name."\n"; print "S: ".$hit->start('sbjct')."\n"; print "E: ".$hit->end('sbjct')."\n"; print "L: ".$hit->length()."\n"; } } Here's the message: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 STACK: Error::throw STACK: Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:359 STACK: Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/HSP/HSPI.pm:691 STACK: Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 STACK: Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:206 STACK: Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/5.8.8/Bio/Search/Hit/GenericHit.pm:935 STACK: main::parse /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:82 STACK: /home/users/roessner/workspace/GeneSimilarity/similarity_analysis.pl:51 ----------------------------------------------------------- S: 635 E: 790 L: 2052 This is the BLASTN output I am parsing:: >LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 21623485-21621434 BestGuessTranscript Length = 2052 Score = 95.6 bits (48), Expect = 1e-17 Identities = 106/124 (85%), Gaps = 1/124 (0%) Strand = Plus / Plus Query: 3191 tattaagcataattaatgtatcattagcacatgtagg-ttactgtagcatttaaggctaa 3249 |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| |||||| Sbjct: 635 tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 Query: 3250 tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| ||||| Sbjct: 695 tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 Query: 3310 gttt 3313 |||| Sbjct: 755 gttt 758 Score = 48.1 bits (24), Expect = 0.002 Identities = 57/68 (83%) Strand = Plus / Minus Query: 2253 aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || ||| | Sbjct: 760 aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 Query: 2313 ccatgatt 2320 |||||||| Sbjct: 700 ccatgatt 693 Score = 44.1 bits (22), Expect = 0.038 Identities = 76/94 (80%) Strand = Plus / Minus Query: 1539 atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 ||||||| || |||||||||| | ||| ||||||||||||||| ||||| |||| | Sbjct: 790 atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 ||||| |||| ||||||||||| |||| |||| Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 Score = 44.1 bits (22), Expect = 0.038 Identities = 73/90 (81%) Strand = Plus / Plus Query: 2026 actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 ||||| |||| | ||||||||| ||||| |||| || ||||| ||| ||||||||||| Sbjct: 701 actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 ||| | ||||||||||||| ||||||| Sbjct: 761 cattttatttatatttaatgctccatgcat 790 From cjfields at uiuc.edu Wed Sep 12 10:57:22 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 12 Sep 2007 09:57:22 -0500 Subject: [Bioperl-l] bug in Bio::SearchIO? In-Reply-To: <200709121044.11741.stephan.roessner@gsf.de> References: <200709121044.11741.stephan.roessner@gsf.de> Message-ID: <74CE1BB2-FCEB-43C3-B783-09706C7F55D8@uiuc.edu> Try updating to bioperl from CVS. I believe this issue was fixed but I don't believe it made the 1.5.2 release. chris On Sep 12, 2007, at 3:44 AM, Stephan Roessner wrote: > Hi, > > I am parsing a BlastN output with Bio::SearchIO and getting an > error for some > of the hits when retrieving the start and/or the end position with > $hit->end('sbjct') , $hit->start('sbjct'). I want to filter for > hits which > are are of equal length (~ > 0.9) to the query sequences. > > SearchIO is retrieving the right results, but throws an exemption, > in this > case: MSG:Undefined sub-sequence (1633,760). Valid range = 693 - > 760 ..... > > It seems to me valid range is parsed incorrectly, isn't it? Is this > a bug? > > Does anybody have a similar problem? > > see code, error, and blastn output below. > > thanks, > Stephan > > > Stephan Roessner > MIPS/IBI Inst. for Bioinformatics > GSF Research Center for Environment and Health > Ingolst?dter Landstr. 1 > 85764 Neuherberg; Germany > phone: +49 (0)89 3187 3583 > fax: +49 (0)89 3187 3585 > email: stephan.roessner at gsf.de > > > Here is the piece of code I am using: > > my $blast_report = new Bio::SearchIO ('-format'=>'blast', > '-file' => $source); > > while( my $result=$blast_report->next_result) { > while( my $hit= $result->next_hit()) { > print "Name: ".$hit->name."\n"; > print "S: ".$hit->start('sbjct')."\n"; > print "E: ".$hit->end('sbjct')."\n"; > print "L: ".$hit->length()."\n"; > } > } > > > Here's the message: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: Undefined sub-sequence (1633,760). Valid range = 693 - 760 > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/ > Root.pm:359 > STACK: > Bio::Search::HSP::HSPI::matches /usr/lib/perl5/vendor_perl/5.8.8/ > Bio/Search/HSP/HSPI.pm:691 > STACK: > Bio::Search::SearchUtils::_adjust_contigs /usr/lib/perl5/ > vendor_perl/5.8.8/Bio/Search/SearchUtils.pm:489 > STACK: > Bio::Search::SearchUtils::tile_hsps /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/SearchUtils.pm:206 > STACK: > Bio::Search::Hit::GenericHit::start /usr/lib/perl5/vendor_perl/ > 5.8.8/Bio/Search/Hit/GenericHit.pm:935 > STACK: > main::parse /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:82 > STACK: /home/users/roessner/workspace/GeneSimilarity/ > similarity_analysis.pl:51 > ----------------------------------------------------------- > > S: 635 > E: 790 > L: 2052 > > This is the BLASTN output I am parsing:: > >> LOC_Os11g37470.1 chr11_pseudomolecule_TIGR r_jap version0 > 21623485-21621434 BestGuessTranscript > Length = 2052 > > Score = 95.6 bits (48), Expect = 1e-17 > Identities = 106/124 (85%), Gaps = 1/124 (0%) > Strand = Plus / Plus > > > Query: 3191 tattaagcataattaatgtatcattagcacatgtagg- > ttactgtagcatttaaggctaa 3249 > |||||||| |||||||| | ||||| ||||||||||| |||||||| || ||| > |||||| > Sbjct: 635 > tattaagcctaattaatctgtcattggcacatgtagggttactgtaacacttatggctaa 694 > > > Query: 3250 > tcatagagtaactagacttaaaagactcgtctcgcgattttcaaccaaactgtgtaatta 3309 > |||| || ||| |||||| |||||| || |||||||||||||| ||||| ||| > ||||| > Sbjct: 695 > tcatggactaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaatta 754 > > > Query: 3310 gttt 3313 > |||| > Sbjct: 755 gttt 758 > > > > Score = 48.1 bits (24), Expect = 0.002 > Identities = 57/68 (83%) > Strand = Plus / Minus > > > Query: 2253 > aaaaactaattacacaatttacctgtacatcgcgagatgaatcttttaagtttagttact 2312 > ||||||||||| ||| ||| | || | ||||||||||||||||||| ||| || > ||| | > Sbjct: 760 > aaaaactaattgcacggtttgcatgaaaatcgcgagatgaatcttttgagtctatttagt 701 > > > Query: 2313 ccatgatt 2320 > |||||||| > Sbjct: 700 ccatgatt 693 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 76/94 (80%) > Strand = Plus / Minus > > > Query: 1539 > atgcatgtagtattaaatatagacgaaaataaaaactaattgcacagtttggtcgaaatt 1598 > ||||||| || |||||||||| | ||| ||||||||||||||| ||||| > |||| | > Sbjct: 790 > atgcatggagcattaaatataaataaaatgaaaaactaattgcacggtttgcatgaaaat 731 > > > Query: 1599 gtcgagacgaattttttgagtctagttaggccat 1632 > ||||| |||| ||||||||||| |||| |||| > Sbjct: 730 cgcgagatgaatcttttgagtctatttagtccat 697 > > > > Score = 44.1 bits (22), Expect = 0.038 > Identities = 73/90 (81%) > Strand = Plus / Plus > > > Query: 2026 > actaactagaattaaaagattcgtctcgtcatttacagacaaactgtgtaattagttttt 2085 > ||||| |||| | ||||||||| ||||| |||| || ||||| ||| > ||||||||||| > Sbjct: 701 > actaaatagactcaaaagattcatctcgcgattttcatgcaaaccgtgcaattagttttt 760 > > > Query: 2086 gttttcgtctatatttaatgcttcatgcat 2115 > ||| | ||||||||||||| ||||||| > Sbjct: 761 cattttatttatatttaatgctccatgcat 790 > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Sep 12 12:34:26 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 12 Sep 2007 17:34:26 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: <46E81512.3090503@sheffield.ac.uk> Is it possible to set the bg colour of glyphs and the panel background to be transparent? If so, which output formats support transparency? Cheers Nath From Kevin.M.Brown at asu.edu Wed Sep 12 14:15:10 2007 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 12 Sep 2007 11:15:10 -0700 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E81512.3090503@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> Message-ID: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> > Is it possible to set the bg colour of glyphs and the panel > background to be transparent? If so, which output formats > support transparency? Not sure if you can, but SVG, PNG, Gif all support a transparent background. From bioperl-list at superfrink.net Thu Sep 13 01:15:39 2007 From: bioperl-list at superfrink.net (bioperl-list at superfrink.net) Date: Wed, 12 Sep 2007 23:15:39 -0600 (MDT) Subject: [Bioperl-l] Bio::Graphics transparent background Message-ID: > Is it possible to set the bg colour of glyphs and the panel background > to be transparent? If so, which output formats support transparency? I had a look at the code and I don't believe it is possible. You could produce a PNG file and knowing the red/green/blue values of the background colour run the following script to make an image with the bg colour transparent. For example: ./make-transparent.pl 252 253 252 2004-11-22.png will produce: 2004-11-22.png.new.png with the RGB colour of (252, 253, 252) replaced with transparency. Regards, Chad #!/usr/bin/perl -w # # file: make-transparent.pl # purpose: make a single colour in a PNG file transparent # author: chad c d clark # $Id$ use strict; use GD; # -- subroutines ------------------------------------------------------- sub usage_message(); # -- main() ------------------------------------------------------------ if(scalar @ARGV < 4) { print usage_message(); exit 1; } # get the colour and make sure it is valid my @RGB = splice @ARGV, 0, 3; for my $i (@RGB) { if ( ($i !~ /^[\d]+$/) or (255 < $i) ) { print "Invalid colour '$i'.\n"; print usage_message(); exit 1; } } print "RGB: (@RGB)\n"; # process each file FILE: while (my $filename = shift @ARGV) { # read the file my $image = GD::Image->new($filename); unless(defined $image) { warn "Unable to read image from file. Skipping '$filename'.\n"; next FILE; } # find the colour index my $index = $image->colorExact(@RGB); if(-1 == $index) { warn "Colour not found in file. Skipping '$filename'.\n"; next FILE; } # make the colour index transparent if(-1 == $image->transparent($index)) { warn "Unable to make colour transparent. Skipping '$filename'.\n"; next FILE; } # write the updated image file my $new_filename = $filename . ".new.png"; # my $new_filename = $filename; # use to over-write existing file open FH, ">" . $new_filename or die "can't open $new_filename"; print FH $image->png; close FH; print "Found file '$filename'.\tCreated '$new_filename'.\n"; } exit 0; # -- subroutines ------------------------------------------------------- sub usage_message() { return qq/ Usage: $0 RED GREEN BLUE FILELIST Where: RED - red value in decimal (0 to 255) GREEN - green value in decimal (0 to 255) BLUE - blue value in decimal (0 to 255) FILELIST - list of files to convert Examples: $0 255 255 255 2004-11-22.png $0 252 253 252 2004-11-22.png second.png $0 1 1 200 2004-11-22.png second.png third.png Description: For each file "foo.png" a new file "foo.png.new.png" will be created (and over-written if it existed). The new file will be the same as the original but the colour specified by the RED, GREEN, and BLUE value will be removed and replaced by transparent pixels. /; } From n.haigh at sheffield.ac.uk Thu Sep 13 06:07:46 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 11:07:46 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> Message-ID: <46E90BF2.5010607@sheffield.ac.uk> Kevin Brown wrote: >> Is it possible to set the bg colour of glyphs and the panel >> background to be transparent? If so, which output formats >> support transparency? >> > > Not sure if you can, but SVG, PNG, Gif all support a transparent > background. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Looking at the GD module documentation: http://search.cpan.org/~lds/GD-2.30/GD.pm It appears that you can set a colour as being transparent - so I think it should be possible to get Bio::Graphics to do this = may require some code to be written. Any one got ideas? Cheers, Nath From n.haigh at sheffield.ac.uk Thu Sep 13 07:59:20 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 12:59:20 +0100 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E90BF2.5010607@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> Message-ID: <46E92618.7050208@sheffield.ac.uk> Nathan Haigh wrote: > Kevin Brown wrote: > >>> Is it possible to set the bg colour of glyphs and the panel >>> background to be transparent? If so, which output formats >>> support transparency? >>> >>> >> Not sure if you can, but SVG, PNG, Gif all support a transparent >> background. >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> > > Looking at the GD module documentation: > http://search.cpan.org/~lds/GD-2.30/GD.pm > > It appears that you can set a colour as being transparent - so I think > it should be possible to get Bio::Graphics to do this = may require some > code to be written. Any one got ideas? > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > I took a look and made a simple change to Bio::Graphics::Panel Please see the following bug for a patch and explanation: http://bugzilla.open-bio.org/show_bug.cgi?id=2365 I'd appreciate any comments, especially regarding the method name! If there aren't any complaints I'll commit it later today. Nath From n.haigh at sheffield.ac.uk Thu Sep 13 08:26:57 2007 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 13 Sep 2007 13:26:57 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution Message-ID: <46E92C91.5020307@sheffield.ac.uk> I want to be able to print my Bio::Graphics image on a poster with good resolution. What can I do to ensure I don't get blocky graphics/text. Altering the width/height of the panel simple increases the size of the canvas on which to draw the image, but the text appears the same size and thus relatively smaller to the rest of the image. So I don't think this would work for printing on a poster. Cheers, Nath From cjfields at uiuc.edu Thu Sep 13 08:46:02 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 13 Sep 2007 07:46:02 -0500 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <69321A85-8715-43C0-BCB0-CEE8F42D7235@uiuc.edu> Print to SVG instead of PNG (should be resolution-independent); I use Illustrator to fine-tune it but there are several other programs which can do the same. You'll need to install GD::SVG for it to work. The alignment example I posted previously about (http:// www.bioperl.org/wiki/HOWTO_Discussion:Graphics) shows essentially what you need to do: my $panel = Bio::Graphics::Panel->new( -image_class => 'SVG', # and whatever else ); # later... print $panel->svg; chris On Sep 13, 2007, at 7:26 AM, Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with > good > resolution. What can I do to ensure I don't get blocky graphics/text. > > Altering the width/height of the panel simple increases the size of > the > canvas on which to draw the image, but the text appears the same size > and thus relatively smaller to the rest of the image. So I don't think > this would work for printing on a poster. > > Cheers, > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jonathancrabtree at gmail.com Thu Sep 13 09:09:56 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:09:56 -0400 Subject: [Bioperl-l] Bio::Graphics transparent background In-Reply-To: <46E92618.7050208@sheffield.ac.uk> References: <46E81512.3090503@sheffield.ac.uk> <1A4207F8295607498283FE9E93B775B403AEFF97@EX02.asurite.ad.asu.edu> <46E90BF2.5010607@sheffield.ac.uk> <46E92618.7050208@sheffield.ac.uk> Message-ID: <8e5b8bf80709130609x4be19cf6y60f2440a1ac5d332@mail.gmail.com> Hi Nathan- One problem with your proposed solution is that it won't necessarily work when GD::SVG is being used instead of GD (i.e., via the image_class method of Bio::Graphics::Panel). SVG doesn't handle transparency in the same way as GD. At least when you're compositing multiple SVG images/documents, transparency is the default; if you superimpose one SVG image on another ( e.g., by merging the two into a single SVG document) then the bottom image will be visible through any area of the top image that has not been drawn on. When I'm working in SVG with Bio::Graphics I get a "transparent" background by simply not setting the bgcolor; this ensures that Bio::Graphics::Panel will refrain from drawing a filled background rectangle underneath the drawing area. What I don't know is how to ensure that the background is transparent when you're working with the various methods of embedding SVG in web pages ( i.e., transparent with respect to whatever is _underneath_ the SVG-rendered content); this is probably a slightly different issue that's more a question of what the browser/plugin supports. I'm not sure what to suggest as an alternative, but at the very least this probably warrants a YMMV comment in the documentation for the new method, or perhaps it could even throw a runtime error if called when the $gd object is of type GD::SVG. A final option would be to say that this (setting a transparent background) is something that should get handled outside of Bio::Graphics::Panel; I don't think there's any technical reason why the calling code couldn't be responsible for this. I don't think we can modify your new method to unset the bgcolor when working with GD::SVG, because that might affect the image in other ways. I do it in my code but I'm not sure it's 100% safe, since I think GD::SVG might actually _use_ the bgcolor in some situations (e.g., drawing dashed lines) and I haven't checked the code thoroughly to make sure that there are no unintended consequences. Jonathan p.s. I see that Chris has beaten me to the punch in mentioning SVG as a fix to your blocky font problems. All the more reason to think about how this feature will work in that context! On 9/13/07, Nathan Haigh wrote: > > Nathan Haigh wrote: > > Kevin Brown wrote: > > > >>> Is it possible to set the bg colour of glyphs and the panel > >>> background to be transparent? If so, which output formats > >>> support transparency? > >>> > >>> > >> Not sure if you can, but SVG, PNG, Gif all support a transparent > >> background. > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >> > >> > > > > Looking at the GD module documentation: > > http://search.cpan.org/~lds/GD-2.30/GD.pm > > > > > It appears that you can set a colour as being transparent - so I think > > it should be possible to get Bio::Graphics to do this = may require some > > code to be written. Any one got ideas? > > > > Cheers, > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > I took a look and made a simple change to Bio::Graphics::Panel > > Please see the following bug for a patch and explanation: > http://bugzilla.open-bio.org/show_bug.cgi?id=2365 > > I'd appreciate any comments, especially regarding the method name! If > there aren't any complaints I'll commit it later today. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Thu Sep 13 09:03:46 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 13 Sep 2007 14:03:46 +0100 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <46E93532.6030505@sendu.me.uk> Nathan Haigh wrote: > I want to be able to print my Bio::Graphics image on a poster with good > resolution. What can I do to ensure I don't get blocky graphics/text. Output in SVG, which is a vector format == no blockiness. From jonathancrabtree at gmail.com Thu Sep 13 09:20:43 2007 From: jonathancrabtree at gmail.com (Jonathan Crabtree) Date: Thu, 13 Sep 2007 09:20:43 -0400 Subject: [Bioperl-l] Bio::Graphics Resolution In-Reply-To: <46E92C91.5020307@sheffield.ac.uk> References: <46E92C91.5020307@sheffield.ac.uk> Message-ID: <8e5b8bf80709130620r4a24fe8fi5171539f50735bf3@mail.gmail.com> Nathan- As Chris said, you'll want to use GD::SVG instead of GD. However, you're still going to have the issue that you raised that the fonts will be proportionally small with respect to your figure (particularly if you're printing a large region at poster size.) From what I remember GD only gives you a few font sizes to choose from, so even at the largest size you may still have problems. I've worked around this in the past by using scripts to post-process the resulting SVG. I do a global search and replace to increase the font sizes (and, in many cases, to adjust the y-offset of the text accordingly.) You may also need to tweak the amount of vertical whitespace in the image (e.g., between adjacent rows of features) to give yourself space to increase the font size. The same caveat applies to the horizontal dimension, since with a larger font you may have collisions between labels (assuming that the features in your figure are labeled.) To fix this you need to trick Bio::Graphics into thinking the feature labels are longer than they actually are. I forget whether I did this by padding the labels with extra whitespace or actually modifying the code that computes the feature bounding boxes, but something along those lines should work. Essentially you have to trick Bio::Graphics into leaving extra whitespace so that everything looks OK when you bump up the font sizes. Unfortunately I don't have a generic script that does this; after generating a couple of posters this way I switched to direct SVG generation to avoid the constraints imposed by going through GD. Jonathan On 9/13/07, Nathan Haigh