From shameer at ncbs.res.in Wed Aug 1 01:45:45 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Wed, 1 Aug 2007 11:15:45 +0530 (IST) Subject: [Bioperl-l] Perl 3D OpenGL In-Reply-To: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu> References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE> <25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net> <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu> Message-ID: <49637.192.168.1.1.1185947145.squirrel@mail.ncbs.res.in> Hi, Open-GL/3D contributions are always welcome !!! What about Perl-OpenGL/3D implimentation of a web-based 3D-Viewer like Jmol. http://jmol.sourceforge.net/ (So we dont need to worry about Java installation and stuffs :) develop it and deploy it in Perl - eternal happiness !!!) -- SK > > On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote: > >> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote: >>> If this posting is inappropriate, please let me know - my apologies. >> >> Not at all. AFAIK this is the perfect place to discuss any >> contributions you're motivated to make to the BioPerl project. >> >>> I recently came across an article on BioPerl, and it occurred to me >>> that >>> there might be some need for 3D rendering within your BioPerl >>> project. >>> >>> I released a number of new/updated Perl OpenGL (POGL) modules this >>> year, >>> along with benchmarks that demonstrate that it performs comparably >>> to C. >>> >>> If there's a need for 3D features within BioPerl, and if I can be >>> of any >>> assistance in helping to add such features, I would enjoy the >>> opportunity. >> >> I know nothing about 3D modeling in biology, nor do I hang out with >> any protein structure folks, but 3D always sounds sexy. -grin- >> >> If you're new to bioinformatics (I certainly am) you might want to >> read this: >> >> http://en.wikipedia.org/wiki/Protein_structure >> >> Because that's probably where your 3D work would be used. Especially >> note the "Software" section, where you'll find some of the >> "competition". 
:) >> >> There's some cool stuff out there. I don't know what all would or >> wouldn't be time well spent in Perl / BioPerl. >> >> HTH, >> >> Jay Hannah >> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah > > I agree that protein structure is the best place for something like > this. > > It's a wide open area as far as I'm concerned; in fact I would say > that Bio::Structure is getting pretty dated, so if anyone wants to > take it over, refactor the code, and so on I don't have a problem. > > chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Shameer Khadar Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in From Alicia.Amadoz at uv.es Wed Aug 1 03:13:11 2007 From: Alicia.Amadoz at uv.es (Alicia Amadoz) Date: Wed, 1 Aug 2007 09:13:11 +0200 (CEST) Subject: [Bioperl-l] trying to save blast hit sequences to fasta file Message-ID: <1664224328amadoz@uv.es> Hi, I would like to save my hit sequences from a blast result in a fasta file. I am trying some things but I have problems using Bio::SearchIO and Bio::SeqIO. Hope anyone could help me with this. Here is my current code: # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta"); my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" => "fasta"); while(my $result = $blast_report->next_result()) { while(my $hit = $result->next_hit()) { while(my $hsp = $hit->next_hsp()) { my $hseq = $hsp->hit_string(); # $seq_out->write_seq($hseq); $seq_out->write_result($hseq); } } } Here the error is, ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: ResultWriter not defined. I couldn't find any kind of documentation about ResultWriter. 
Thanks in advance,
Alicia

From xianranli78 at yahoo.com.cn Wed Aug 1 04:11:53 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Wed, 1 Aug 2007 16:11:53 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <1664224328amadoz@uv.es>
Message-ID: <001101c7d413$a0d79aa0$ed07a8c0@BGI.LOCAL>

$hsp->hit_string() returns the hit sequence as a plain string, rather than a Bio::Seq object. So maybe you should construct an object first; then you can use $seq_out->write_seq($hseq_obj) to write the sequence into a fasta file.

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" => "fasta");
while(my $result = $blast_report->next_result()) {
  while(my $hit = $result->next_hit()) {
    while(my $hsp = $hit->next_hsp()) {
      my $hseq = $hsp->hit_string();
      $hseq =~ s/-//g; #### remove the gaps within the alignment
      my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
      # $seq_out->write_seq($hseq);
      $seq_out->write_result($hseq_obj);
    }
  }
}

Xianran

----- Original Message -----
From: "Alicia Amadoz"
To:
Sent: Wednesday, August 01, 2007 3:13 PM
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file

> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this.
Here is my current
> code:
>
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
> while(my $hit = $result->next_hit()) {
> while(my $hsp = $hit->next_hsp()) {
> my $hseq = $hsp->hit_string();
> # $seq_out->write_seq($hseq);
> $seq_out->write_result($hseq);
> }
> }
> }
>
> Here the error is,
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: ResultWriter not defined.
>
> I couldn't find any kind of documentation about ResultWriter.
> Thanks in advance,
> Alicia
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From Alicia.Amadoz at uv.es Wed Aug 1 06:25:29 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 12:25:29 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <5927683277amadoz@uv.es>

Hi, I have tried what you suggested and I get also some errors.
With this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" => "fasta");
while(my $result = $blast_report->next_result()) {
  while(my $hit = $result->next_hit()) {
    while(my $hsp = $hit->next_hsp()) {
      my $hseq = $hsp->hit_string();
      $hseq =~ s/-//g; #### remove the gaps within the alignment
      my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
      $seq_out->write_seq($hseq_obj);
    }
  }
}

I get the following error:

Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

And using the write_result method with this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" => "fasta");
while(my $result = $blast_report->next_result()) {
  while(my $hit = $result->next_hit()) {
    while(my $hsp = $hit->next_hsp()) {
      my $hseq = $hsp->hit_string();
      $hseq =~ s/-//g; #### remove the gaps within the alignment
      my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
      $seq_out->write_result($hseq_obj);
    }
  }
}

I again get this kind of error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.
STACK: Error::throw

So, what else can I try?
Thanks in advance, Alicia From neetisomaiya at gmail.com Wed Aug 1 07:28:40 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Wed, 1 Aug 2007 16:58:40 +0530 Subject: [Bioperl-l] URGENT : Problem in OMIM parser Message-ID: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> I have downloaded the omim.txt file from NCBI ftp site and I am running my attached parser on this file, the parser run stops in between with this :- ------------- EXCEPTION ------------- MSG: a part/organism must be assigned STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 STACK toplevel parse_omim_original.pl:47 -------------------------------------- What is the reason for this? Can anyone guide me please. 
-- -Neeti Even my blood says, B positive From jay at jays.net Wed Aug 1 09:30:50 2007 From: jay at jays.net (Jay Hannah) Date: Wed, 1 Aug 2007 09:30:50 -0400 (EDT) Subject: [Bioperl-l] trying to save blast hit sequences to fasta file In-Reply-To: <5927683277amadoz@uv.es> References: <5927683277amadoz@uv.es> Message-ID: On Wed, 1 Aug 2007, Alicia Amadoz wrote: > Hi, I have tried what you suggested and I get also some errors. 
> With this code, > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" > => "fasta"); > while(my $result = $blast_report->next_result()) { > while(my $hit = $result->next_hit()) { > while(my $hsp = $hit->next_hsp()) { > my $hseq = $hsp->hit_string(); > $hseq =~ s/-//g; #### remove the gap within the aligment > my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); > $seq_out->write_seq($hseq_obj); > } > } > } > > I have the following error: > > Can't locate object method "write_seq" via package "Bio::SearchIO::fasta" You don't want to write_seq() to a SearchIO, you want to write_seq() to a SeqIO. Try this: my $seq_out = Bio::SeqIO->new(-file => ">$fasfilename", -format => "fasta"); while(my $result = $blast_report->next_result()) { while(my $hit = $result->next_hit()) { while(my $hsp = $hit->next_hsp()) { my $hseq = $hsp->hit_string(); $hseq =~ s/-//g; #### remove the gap within the aligment my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); $seq_out->write_seq($hseq_obj); } } } (Untested.) HTH, Jay Hannah http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah From cjfields at uiuc.edu Wed Aug 1 11:02:07 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 1 Aug 2007 10:02:07 -0500 Subject: [Bioperl-l] URGENT : Problem in OMIM parser In-Reply-To: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> Message-ID: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu> Neeti, Only post to one list email address, namely the one I'm responding to and the one shown here: http://bioperl.org/mailman/listinfo/bioperl-l The others are aliases so you essentially posted three times. As for your question: there was no attached script or any additional information (bioperl version would have also been nice), so we can't help you until we have something more to work with. 
chris On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: > I have downloaded the omim.txt file from NCBI ftp site and I am > running my > attached parser on this file, the parser run stops in between with > this :- > > ------------- EXCEPTION ------------- > MSG: a part/organism must be assigned > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 > STACK toplevel parse_omim_original.pl:47 > > -------------------------------------- > > What is the reason for this? > Can anyone guide me please. > > -- > -Neeti > Even my blood says, B positive > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Wed Aug 1 20:50:06 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Thu, 2 Aug 2007 10:50:06 +1000 Subject: [Bioperl-l] trying to save blast hit sequences to fasta file In-Reply-To: <1664224328amadoz@uv.es> References: <1664224328amadoz@uv.es> Message-ID: Alicia, > Hi, I would like to save my hit sequences from a blast result in a fasta > file. I am trying some things but I have problems using Bio::SearchIO > and Bio::SeqIO. Hope anyone could help me with this. 
Here is my current > code: > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => > "fasta"); > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" > => "fasta"); > ... > my $hseq = $hsp->hit_string(); > # $seq_out->write_seq($hseq); > $seq_out->write_result($hseq); You have encountered two common problems for BioPerl beginners: 1. "fasta" means two different things! In SearchIO it refers to the output format of the "fasta" sequence alignment software. In SeqIO it refers to a file format that stores just sequences. Confusing, I know. You need SeqIO and write_seq, not SearchIO and write_result. 2. $hseq is a STRING which has the raw sequence letters in it. However, the write_seq() method needs a Bio::Seq object (which has extra details like the name and ID) not a raw string. The example code Jay Hannah supplied in his reply looks pretty good, you should try it. -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University From Alicia.Amadoz at uv.es Thu Aug 2 03:06:54 2007 From: Alicia.Amadoz at uv.es (Alicia Amadoz) Date: Thu, 2 Aug 2007 09:06:54 +0200 (CEST) Subject: [Bioperl-l] trying to save blast hit sequences to fasta file In-Reply-To: References: Message-ID: <3579584634amadoz@uv.es> Hi, thanks for your help and suggestions. I have tried the example code of Jay Hannah and it works perfectly. But what I need to save in fasta format is the whole sequence in the database that is similar to my query sequence. I don't understand very well the difference between hit_string() and query_string(), are they the whole sequence that is similiar (about hit_string), a part of the whole sequence or just the part that is aligned to my query string? With the previous code what I have are different sequences in length with the same id as my query string, so I am not sure that I am doing what I need to do. Any light on this point? Thank you very much for your help. 
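
On the whole-sequence question: hit_string() only holds the aligned region of each HSP, so one option is to look each hit up in the FASTA file that the BLAST database was built from. What follows is only a minimal sketch under several assumptions: the database came from a local FASTA file, the hit names in the report match the IDs in that file, and the file names (report.blast, database.fasta, hits.fasta) are hypothetical placeholders. It uses Bio::DB::Fasta for the lookup, which was not mentioned earlier in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::DB::Fasta;

# Hypothetical file names; substitute your own.
my $blast_file = 'report.blast';     # plain-text BLAST report
my $db_file    = 'database.fasta';   # FASTA file the BLAST database was built from
my $out_file   = 'hits.fasta';       # output file

my $report  = Bio::SearchIO->new(-file => $blast_file, -format => 'blast');
my $db      = Bio::DB::Fasta->new($db_file);   # indexes the file on first use
my $seq_out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');

my %seen;   # hit names already written
while (my $result = $report->next_result) {
    while (my $hit = $result->next_hit) {
        my $id = $hit->name;
        next if $seen{$id}++;            # write each database sequence only once

        # Full-length sequence as a string, or undef if the ID is not indexed.
        my $full = $db->seq($id);
        unless (defined $full) {
            warn "No entry for '$id' in $db_file\n";
            next;
        }

        # write_seq() needs an object, so wrap the string in a Bio::Seq.
        my $seq_obj = Bio::Seq->new(
            -display_id => $id,
            -desc       => $hit->description,
            -seq        => $full,
        );
        $seq_out->write_seq($seq_obj);
    }
}
```

Using $hit->name as the display_id also gives every record the ID of the database sequence rather than the ID of the query.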
Alicia > Alicia, > > > Hi, I would like to save my hit sequences from a blast result in a fasta > > file. I am trying some things but I have problems using Bio::SearchIO > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current > > code: > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => > > "fasta"); > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" > > => "fasta"); > > ... > > my $hseq = $hsp->hit_string(); > > # $seq_out->write_seq($hseq); > > $seq_out->write_result($hseq); > > You have encountered two common problems for BioPerl beginners: > > 1. "fasta" means two different things! In SearchIO it refers to the > output format of the "fasta" sequence alignment software. In SeqIO it > refers to a file format that stores just sequences. Confusing, I know. > You need SeqIO and write_seq, not SearchIO and write_result. > > 2. $hseq is a STRING which has the raw sequence letters in it. > However, the write_seq() method needs a Bio::Seq object (which has > extra details like the name and ID) not a raw string. > > The example code Jay Hannah supplied in his reply looks pretty good, > you should try it. > > -- > --Torsten Seemann > --Victorian Bioinformatics Consortium, Monash University > > From xianranli78 at yahoo.com.cn Thu Aug 2 04:56:04 2007 From: xianranli78 at yahoo.com.cn (Xianran Li) Date: Thu, 2 Aug 2007 16:56:04 +0800 Subject: [Bioperl-l] trying to save blast hit sequences to fasta file References: <3579584634amadoz@uv.es> Message-ID: <003701c7d4e2$f7a34bc0$ed07a8c0@BGI.LOCAL> ----- Original Message ----- From: "Alicia Amadoz" To: "Torsten Seemann" ; Cc: Sent: Thursday, August 02, 2007 3:06 PM Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file > Hi, thanks for your help and suggestions. I have tried the example code > of Jay Hannah and it works perfectly. 
But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string?

hit_string() returns the aligned portion of the subject sequence from your database, and query_string() returns the aligned portion of the query. The two will be identical unless there are mismatches and/or gaps within the alignment.

>
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?

Did you set $id before this line?

my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);

If you didn't, then all the sequences retrieved will get the same id. The following is a simple way to avoid this problem.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $i = 0;
while(my $result = $blast_report->next_result()) {
  while(my $hit = $result->next_hit()) {
    while(my $hsp = $hit->next_hsp()) {
      $i++;
      my $hseq = $hsp->hit_string();
      $hseq =~ s/-//g; #### remove the gaps within the alignment
      my $id = $i; ###### specify a unique id for each HSP
      my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
      $seq_out->write_seq($hseq_obj); #### a SeqIO stream writes with write_seq()
    }
  }
}

Xianran

>
> Thank you very much for your help.
> Alicia
>
> > Alicia,
> >
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this.
Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > > my $hseq = $hsp->hit_string();
> > > # $seq_out->write_seq($hseq);
> > > $seq_out->write_result($hseq);
> >
> > You have encountered two common problems for BioPerl beginners:
> >
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> >
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> >
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> >
> > --
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> >
> >
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From neetisomaiya at gmail.com Thu Aug 2 02:20:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 11:50:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
Message-ID: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>

Hi,

The script is attached with this mail.
I am using bioperl-1.4.

Regards,
Neeti.
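
Since the attached script was scrubbed from the archive, here is a minimal sketch of the kind of loop that can help locate the record that triggers the "a part/organism must be assigned" exception. It is not the attached parse_omim_original.pl. The paths are hypothetical placeholders, and the method names (next_phenotype, MIM_number, title, each_gene_symbol) are the ones I believe the Bio::Phenotype::OMIM::OMIMparser documentation lists; check them against your installed version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Phenotype::OMIM::OMIMparser;

# Hypothetical paths; point these at your own downloaded files.
my $parser = Bio::Phenotype::OMIM::OMIMparser->new(
    -genemap  => 'genemap',
    -omimtext => 'omim.txt',
);

my $last_ok = 'none';   # MIM number of the last entry parsed without error

# The exception is thrown from inside next_phenotype(), so the whole loop
# sits in an eval and we report where parsing stopped.
my $ok = eval {
    while (my $entry = $parser->next_phenotype()) {
        $last_ok = $entry->MIM_number();

        my $title = $entry->title();
        $title = '' unless defined $title;

        my @symbols = $entry->each_gene_symbol();   # filled from genemap, if loaded

        print join("\t", $last_ok, $title, join(',', @symbols)), "\n";
    }
    1;
};
warn "Parsing stopped after MIM entry $last_ok: $@" unless $ok;
```

The record that follows the last MIM number printed in omim.txt is the one to inspect, in particular its *FIELD* CS section, since the stack trace points at add_clinical_symptoms.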
On 8/1/07, Chris Fields wrote: > > Neeti, > > Only post to one list email address, namely the one I'm responding to > and the one shown here: > > http://bioperl.org/mailman/listinfo/bioperl-l > > The others are aliases so you essentially posted three times. As for > your question: there was no attached script or any additional > information (bioperl version would have also been nice), so we can't > help you until we have something more to work with. > > chris > > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: > > > I have downloaded the omim.txt file from NCBI ftp site and I am > > running my > > attached parser on this file, the parser run stops in between with > > this :- > > > > ------------- EXCEPTION ------------- > > MSG: a part/organism must be assigned > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 > > STACK toplevel parse_omim_original.pl:47 > > > > -------------------------------------- > > > > What is the reason for this? > > Can anyone guide me please. > > > > -- > > -Neeti > > Even my blood says, B positive > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- -Neeti Even my blood says, B positive -------------- next part -------------- A non-text attachment was scrubbed... 
Name: parse_omim_original.pl Type: application/x-perl Size: 5998 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/fbbee8db/attachment.bin From neetisomaiya at gmail.com Thu Aug 2 09:00:33 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 2 Aug 2007 18:30:33 +0530 Subject: [Bioperl-l] URGENT : Problem in OMIM parser In-Reply-To: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com> References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu> <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com> Message-ID: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com> Also, As per the following links we can fetch data from the genemap file as well :- http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm But when I am trying to do so in the exact manner as given in the above link, I get no data. As in there are OMIM ids which are present in both the omim.txt and genemap files, and for such cases when I parse and fetch data, data from both files should be obtained, but I aint getting it. For eg. while running the attached script, for OMIM id 100790, I get all data from omim.txt but the cytoposition, gene symbol etc from genemap is not coming, though it is present in the genemap file. Please help me find what could be going wrong. On 8/2/07, neeti somaiya wrote: > > Hi, > > The script is attached with this mail. > I am using bioperl-1.4. > > Regards, > Neeti. > > On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote: > > > > Neeti, > > > > Only post to one list email address, namely the one I'm responding to > > and the one shown here: > > > > http://bioperl.org/mailman/listinfo/bioperl-l > > > > The others are aliases so you essentially posted three times. 
As for > > your question: there was no attached script or any additional > > information (bioperl version would have also been nice), so we can't > > help you until we have something more to work with. > > > > chris > > > > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: > > > > > I have downloaded the omim.txt file from NCBI ftp site and I am > > > running my > > > attached parser on this file, the parser run stops in between with > > > this :- > > > > > > ------------- EXCEPTION ------------- > > > MSG: a part/organism must be assigned > > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 > > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 > > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 > > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 > > > STACK toplevel parse_omim_original.pl:47 > > > > > > -------------------------------------- > > > > > > What is the reason for this? > > > Can anyone guide me please. > > > > > > -- > > > -Neeti > > > Even my blood says, B positive > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > > > > -- > -Neeti > Even my blood says, B positive > > -- -Neeti Even my blood says, B positive -------------- next part -------------- A non-text attachment was scrubbed... 
Name: parse_omim_original.pl Type: application/x-perl Size: 8750 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/6bdb009c/attachment.bin From cjfields at uiuc.edu Thu Aug 2 13:05:55 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 2 Aug 2007 12:05:55 -0500 Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml References: <38B65B2C-A36D-41FB-83C9-7D7B55156CCD@uiuc.edu> Message-ID: For archiving purposes; of course I forgot to cc the list! -c Begin forwarded message: > From: Chris Fields > Date: August 2, 2007 12:04:59 PM CDT > To: gyang at plantbio.uga.edu > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast > with xml > > Guojun, > > Make sure to keep this on the mail list for archiving purposes. > > It could be that the RID is not being removed properly (if it isn't > removed then you will repeatedly retrieve your BLAST report). The > new error you are seeing may be coming from whatever XML::SAX > backend parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat, > etc); it doesn't look bioperl-related and there is an eval which > catches this stuff in SearchIO::blastxml. Does text parsing work? > > Could you directly send me your script or add it to a new bug > report as an attachment? 
> > http://www.bioperl.org/wiki/Bugs > > chris > > On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote: > >> Hi,Chris, >> I installed the latest version of bioperl, in addition to the >> repeated output problem, there are new problems with parsing: >> >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> No close tag marker [Ln: 4126, Col: 0] >> >> --------------------------------------------------- >> >> Would you please kindly give me a hint on this, >> Thanks a lot, >> Guojun >> >> >> ----- Original Message ----- >> From: Chris Fields [mailto:cjfields at uiuc.edu] >> To: gyang at plantbio.uga.edu >> Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org] >> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast >> with xml >> >> >>> Make sure to keep responses on the ail list. >>>> You might want to run a full install, just in case. If I remember >>> correctly Sendu made some changes a while back in the BLAST-related >>> modules which may be related to this. At the very least install/ >>> upgrade all modules in Bio::Tools::Run. >>>> chris >>>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote: >>>>> Thanks, Chris, >>>> But when I replaced the old RemoteBlast.pm with the new one, I got >>>> "can't locate the object method "retrieve_parameter"". Does this >>>> mean I need to install something else? >>>> Guojun >>>> >>>> ----- Original Message ----- >>>> From: Chris Fields [mailto:cjfields at uiuc.edu] >>>> To: gyang at plantbio.uga.edu >>>> Cc: bioperl-l at bioperl.org >>>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast >>>> with xml >>>> >>>> >>>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote: >>>>>>> I am running remoteblast and using readmethod "xml", I >>>>>>> noticed that >>>>>> it is printing the output repeatedly nonstop. It's like in a >>>>>> loop. >>>>>> Did anybody notice this before? Can anybody help me getting >>>>>> out of >>>>>> this? 
>>>>>> Thanks a lot, >>>>>> >>>>>> >>>>>> Guojun Yang >>>>>> University of Georgia >>>>>> Not seeing that using bioperl-live; you may need to update >>>>> RemoteBlast.pm as this sounds similar to an issue that popped up >>>>> earlier in the spring. >>>>>> chris >>>>> >>>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>>>>> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Aug 2 13:51:27 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 2 Aug 2007 12:51:27 -0500 Subject: [Bioperl-l] URGENT : Problem in OMIM parser In-Reply-To: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com> References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu> <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com> <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com> Message-ID: <921F31D6-3CA9-483A-8AFF-B3555E9768C4@uiuc.edu> Neeti, The genemap wasn't loaded in all cases; don't know what the reasoning for it was, but it is fixed in CVS now (Bio::Phenotype::OMIM::OMIMparser, specifically). I would recommend that you install a full upgrade to at least bioperl 1.5.2 before using this; I can't guarantee it will work with bioperl 1.4. chris On Aug 2, 2007, at 8:00 AM, neeti somaiya wrote: > Also, > As per the following links we can fetch data from the genemap file > as well > :- > http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/ > OMIMparser.pm > > But when I am trying to do so in the exact manner as given in the > above > link, I get no data. 
As in there are OMIM ids which are present in > both the > omim.txt and genemap files, and for such cases when I parse and > fetch data, > data from both files should be obtained, but I aint getting it. > > For eg. while running the attached script, for OMIM id 100790, I > get all > data from omim.txt but the cytoposition, gene symbol etc from > genemap is not > coming, though it is present in the genemap file. > > Please help me find what could be going wrong. > > On 8/2/07, neeti somaiya wrote: >> >> Hi, >> >> The script is attached with this mail. >> I am using bioperl-1.4. >> >> Regards, >> Neeti. >> >> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote: >>> >>> Neeti, >>> >>> Only post to one list email address, namely the one I'm >>> responding to >>> and the one shown here: >>> >>> http://bioperl.org/mailman/listinfo/bioperl-l >>> >>> The others are aliases so you essentially posted three times. As >>> for >>> your question: there was no attached script or any additional >>> information (bioperl version would have also been nice), so we can't >>> help you until we have something more to work with. 
>>> >>> chris >>> >>> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: >>> >>>> I have downloaded the omim.txt file from NCBI ftp site and I am >>>> running my >>>> attached parser on this file, the parser run stops in between with >>>> this :- >>>> >>>> ------------- EXCEPTION ------------- >>>> MSG: a part/organism must be assigned >>>> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 >>>> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 >>>> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 >>>> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype >>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 >>>> STACK toplevel parse_omim_original.pl:47 >>>> >>>> -------------------------------------- >>>> >>>> What is the reason for this? >>>> Can anyone guide me please. >>>> >>>> -- >>>> -Neeti >>>> Even my blood says, B positive >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> Christopher Fields >>> Postdoctoral Researcher >>> Lab of Dr. Robert Switzer >>> Dept of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>> >>> >>> >> >> >> -- >> -Neeti >> Even my blood says, B positive >> >> > > > -- > -Neeti > Even my blood says, B positive > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Aug 2 14:16:56 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 2 Aug 2007 13:16:56 -0500 Subject: [Bioperl-l] URGENT : Problem in OMIM parser In-Reply-To: <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com> References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu> <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com> <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com> Message-ID: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu> Neeti, Keep this on the list please. I am unable to reproduce this using your script with or without using the optional genemap file. You really should upgrade bioperl to 1.5.2 and try the fix first; this is something that may have been fixed post-bioperl 1.4. chris On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote: > Waiting for your reply on the exception I had mentioned in my first > mail. > > Thanks. > > ---------- Forwarded message ---------- > From: neeti somaiya < neetisomaiya at gmail.com> > Date: Aug 2, 2007 11:50 AM > Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser > To: bioperl-l at lists.open-bio.org > > Hi, > > The script is attached with this mail. > I am using bioperl-1.4. > > Regards, > Neeti. > > > On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:Neeti, > > Only post to one list email address, namely the one I'm responding to > and the one shown here: > > http://bioperl.org/mailman/listinfo/bioperl-l > > The others are aliases so you essentially posted three times. As for > your question: there was no attached script or any additional > information (bioperl version would have also been nice), so we can't > help you until we have something more to work with. 
> > chris > > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: > > > I have downloaded the omim.txt file from NCBI ftp site and I am > > running my > > attached parser on this file, the parser run stops in between with > > this :- > > > > ------------- EXCEPTION ------------- > > MSG: a part/organism must be assigned > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 > > STACK toplevel parse_omim_original.pl:47 > > > > -------------------------------------- > > > > What is the reason for this? > > Can anyone guide me please. > > > > -- > > -Neeti > > Even my blood says, B positive > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > > > > -- > -Neeti > Even my blood says, B positive > > > > -- > -Neeti > Even my blood says, B positive > Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Thu Aug 2 21:03:36 2007 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 3 Aug 2007 11:03:36 +1000 Subject: [Bioperl-l] trying to save blast hit sequences to fasta file In-Reply-To: <3579584634amadoz@uv.es> References: <3579584634amadoz@uv.es> Message-ID: Alicia, > Hi, thanks for your help and suggestions. I have tried the example code > of Jay Hannah and it works perfectly. But what I need to save in fasta > format is the whole sequence in the database that is similar to my query > sequence. Unfortunately the hit_string is only that part of the sequence in the database that was similar enough to your query sequence. The BLAST report does not have the whole hit sequence in it, only the locally aligned part. SearchIO can only give you what it can get from the BLAST report. You will need to record the IDs of the database sequences you are interested in, and write extra code to retrieve the WHOLE hit sequence from your database. --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University From neetisomaiya at gmail.com Fri Aug 3 01:46:32 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Fri, 3 Aug 2007 11:16:32 +0530 Subject: [Bioperl-l] URGENT : Problem in OMIM parser In-Reply-To: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu> References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com> <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu> <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com> <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com> <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu> Message-ID: <764978cf0708022246v98abed6ue41233f6b27c5674@mail.gmail.com> Hi, Thanks a lot. The exception is not coming after upgrade to bioperl-1.5.2 But the genemap data is still a problem. You had mentioned that I should take Bio::Phenotype::OMIM::OMIMparser, specifically from cvs. 
Where exactly can I get it? Thanks, Neeti. On 8/2/07, Chris Fields wrote: > > Neeti, > > Keep this on the list please. I am unable to reproduce this using > your script with or without using the optional genemap file. You > really should upgrade bioperl to 1.5.2 and try the fix first; this is > something that may have been fixed post-bioperl 1.4. > > chris > > On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote: > > > Waiting for your reply on the exception I had mentioned in my first > > mail. > > > > Thanks. > > > > ---------- Forwarded message ---------- > > From: neeti somaiya < neetisomaiya at gmail.com> > > Date: Aug 2, 2007 11:50 AM > > Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser > > To: bioperl-l at lists.open-bio.org > > > > Hi, > > > > The script is attached with this mail. > > I am using bioperl-1.4. > > > > Regards, > > Neeti. > > > > > > On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:Neeti, > > > > Only post to one list email address, namely the one I'm responding to > > and the one shown here: > > > > http://bioperl.org/mailman/listinfo/bioperl-l > > > > The others are aliases so you essentially posted three times. As for > > your question: there was no attached script or any additional > > information (bioperl version would have also been nice), so we can't > > help you until we have something more to work with. 
> > > > chris > > > > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote: > > > > > I have downloaded the omim.txt file from NCBI ftp site and I am > > > running my > > > attached parser on this file, the parser run stops in between with > > > this :- > > > > > > ------------- EXCEPTION ------------- > > > MSG: a part/organism must be assigned > > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566 > > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555 > > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536 > > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272 > > > STACK toplevel parse_omim_original.pl:47 > > > > > > -------------------------------------- > > > > > > What is the reason for this? > > > Can anyone guide me please. > > > > > > -- > > > -Neeti > > > Even my blood says, B positive > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > > > > > > > -- > > -Neeti > > Even my blood says, B positive > > > > > > > > -- > > -Neeti > > Even my blood says, B positive > > > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. 
Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- -Neeti Even my blood says, B positive From jay at jays.net Fri Aug 3 10:23:11 2007 From: jay at jays.net (Jay Hannah) Date: Fri, 03 Aug 2007 09:23:11 -0500 Subject: [Bioperl-l] trying to save blast hit sequences to fasta file In-Reply-To: References: <3579584634amadoz@uv.es> Message-ID: <46B33A4F.2010403@jays.net> Torsten Seemann wrote: >> Hi, thanks for your help and suggestions. I have tried the example code >> of Jay Hannah and it works perfectly. But what I need to save in fasta >> format is the whole sequence in the database that is similar to my query >> sequence. >> > > Unfortunately the hit_string is only that part of the sequence in the > database that was similar enough to your query sequence. The BLAST > report does not have the whole hit sequence in it, only the locally > aligned part. SearchIO can only give you what it can get from the > BLAST report. > > You will need to record the IDs of the database sequences you are > interested in, and write extra code to retrieve the WHOLE hit sequence > from your database. > This probably won't help, but my (extremely poorly documented) "SeqLab.net" project http://seqlab.net is a framework that sits on top of BioPerl. The current cross_blast() stuff (http://seqlab.net/pods2html/tutorial.html) does this: GenBank -> FASTA -> formatdb -> "stand alone" NCBI BLAST -> reports When the reports run they have simultaneous access to both the original Bio::Seq objects from the GenBank file and the Bio::SearchIO objects from the BLAST results, so it can kick out reports that understand the relationships between (and details of) the original sequences and HSPs simultaneously... If you get stuck trying to do what Torsten suggests and have questions about SeqLab.net you could open a ticket with my group http://clab.ist.unomaha.edu/CLAB/index.php/RT and I'll try to help. 
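
A rough sketch of what Torsten describes above (record the hit IDs, then pull the whole sequences from the database file) might look like the following. It assumes the FASTA file that formatdb was run on is still available and that the hit names in the report match the IDs in that file, which is not always the case. The three file names come from the command line, and the sub names unique_ids and save_full_hits are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep the first occurrence of each ID, preserving report order.
sub unique_ids {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Sketch only: read hit names from a BLAST report, then look each one up in
# an indexed copy of the FASTA file the database was built from.
sub save_full_hits {
    my ($report, $db_fasta, $out_fasta) = @_;

    # Loaded lazily so the helper above can be used without BioPerl.
    require Bio::SearchIO;
    require Bio::SeqIO;
    require Bio::DB::Fasta;

    my $in = Bio::SearchIO->new(-format => 'blast', -file => $report);
    my @names;
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            push @names, $hit->name;
        }
    }

    my $db  = Bio::DB::Fasta->new($db_fasta);    # builds an index on first use
    my $out = Bio::SeqIO->new(-file => ">$out_fasta", -format => 'fasta');
    for my $id (unique_ids(@names)) {
        my $seq = $db->get_Seq_by_id($id);
        if ($seq) { $out->write_seq($seq) }
        else      { warn "no entry for '$id' in $db_fasta\n" }
    }
    return;
}

# Usage (hypothetical file names): perl save_hits.pl report.bls db.fasta hits.fasta
save_full_hits(@ARGV) if @ARGV == 3;
```

If the report IDs look like "gi|...|ref|...|" while the FASTA index uses a shorter key, the lookup will warn for every hit, and the names would need to be trimmed first.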
Cheers,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

From mbasu at mail.nih.gov Fri Aug 3 14:55:57 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 03 Aug 2007 14:55:57 -0400
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <46B33A4F.2010403@jays.net>
References: <3579584634amadoz@uv.es> <46B33A4F.2010403@jays.net>
Message-ID: <46B37A3D.4070606@mail.nih.gov>

Jay Hannah wrote:
> Torsten Seemann wrote:
>>> Hi, thanks for your help and suggestions. I have tried the example code
>>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>>> format is the whole sequence in the database that is similar to my query
>>> sequence.
>>>
>> Unfortunately the hit_string is only that part of the sequence in the
>> database that was similar enough to your query sequence. The BLAST
>> report does not have the whole hit sequence in it, only the locally
>> aligned part. SearchIO can only give you what it can get from the
>> BLAST report.
>>
>> You will need to record the IDs of the database sequences you are
>> interested in, and write extra code to retrieve the WHOLE hit sequence
>> from your database.

I am not sure whether this has already been suggested, but you can retrieve the full sequence from any BLAST database using "fastacmd", which is part of the NCBI toolbox. Parse the "description" string from the BLAST report and run:

fastacmd -d -s

where the argument of -s can be any unique string for the database.

-Malay

From cjfields at uiuc.edu Mon Aug 6 13:49:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 6 Aug 2007 12:49:08 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu>
Message-ID:

Wasn't paying attention! Forwarding this to the mail list in case anyone wanted the answer...
chris Begin forwarded message: > From: Chris Fields > Date: August 6, 2007 12:10:37 PM CDT > To: gyang at plantbio.uga.edu > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast > with xml > > Guojun, > > Sorry about the long wait on this. At this time RemoteBlast > doesn't automatically set the retrieval header to return XML when > setting the -reporttype parameter to 'xml' or 'blastxml'. The > default is text output, so you are retrieving regular text BLAST > reports instead of XML, hence the reported XML parser failure (BTW, > you can see the plain text being returned in the debugging > output). I'll look into a fix for that. > > In the meantime, you can do this manually by setting the following > key prior to submitting the BLAST run: > > $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML'; > > When I run your example with the above line added it works fine. > As an additional note, the CVS version of Bio::SearchIO::blastxml > now supports newer versions of XML::SAX::Expat; the problem there > was a bug in XML::SAX::Expat that killed parsing. > > Additional rant before I go back to work (you can skip this if > needed): RemoteBlast is one of the most used modules in BioPerl, > but it is also the most problematic as NCBI keeps changing things > on their end (BLAST text output, prompts when returning RIDs, > etc). It frankly isn't as well-maintained as we would like; this > is partly due to plans we have (but unfortunately haven't acted > upon) to merge RemoteBlast/StandAloneBlast so they have a similar > API and can be used for any BLAST program, including netblast. If > someone wants to take this on at some point then they are more than > welcome! > > chris > > On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote: > >> Thanks, Chris, >> Attached are my script and the query file. I suspected that we >> need to add "remove RID... 
in the code", I tried putting romoving >> RID at the end of the parsing coding, but it seemed it removed it >> even before the output was processed. I installed >> XML::SAX::Expat, the error became "XML::SAX::Expat is no longer >> supported...", so I installed ExpatXS, the error message becomes: >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> no element found at line 4126, column 1, byte 186628 at /usr/lib/ >> perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304 >> >> >> Would you please try the script with the query file with the >> following input parameters, to see what happens on your machine (I >> want to make sure there is no installation problem on my machine). >> The search subroutine is where blast is performed, I did not >> include a romove RID there. Thanks again! >> >> master:/home/guojun # perl makcgi07.txt >> Query file name: >> kiddo.txt >> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator. >> 1 >> Type in the name of an organism, e.g. Oryza sativa. >> Oryza sativa >> Type in the organism to search for RES: >> Your E_value: >> 0.001 >> Size limit for ancestor element: >> 4000 >> Flanking size for retrieved members: >> 50 >> Tolerance for end mismatch: >> 0 >> >> >> >> Guojun From: Chris Fields [mailto:cjfields at uiuc.edu] >> To: gyang at plantbio.uga.edu >> Sent: Thu, 02 Aug 2007 13:04:59 -0400 >> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast >> with xml >> >> Guojun, >> >> Make sure to keep this on the mail list for archiving purposes. >> >> It could be that the RID is not being removed properly (if it isn't >> removed then you will repeatedly retrieve your BLAST report). The >> new error you are seeing may be coming from whatever XML::SAX backend >> parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat, etc); it >> doesn't look bioperl-related and there is an eval which catches this >> stuff in SearchIO::blastxml. Does text parsing work? 
>> >> Could you directly send me your script or add it to a new bug report >> as an attachment? >> >> http://www.bioperl.org/wiki/Bugs >> >> chris >> >> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote: >> >> > Hi,Chris, >> > I installed the latest version of bioperl, in addition to the >> > repeated output problem, there are new problems with parsing: >> > >> > >> > -------------------- WARNING --------------------- >> > MSG: error in parsing a report: >> > No close tag marker [Ln: 4126, Col: 0] >> > >> > --------------------------------------------------- >> > >> > Would you please kindly give me a hint on this, >> > Thanks a lot, >> > Guojun >> > >> > >> > ----- Original Message ----- >> > From: Chris Fields [mailto:cjfields at uiuc.edu] >> > To: gyang at plantbio.uga.edu >> > Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org] >> > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast >> > with xml >> > >> > >> >> Make sure to keep responses on the ail list. >> >>> You might want to run a full install, just in case. If I remember >> >> correctly Sendu made some changes a while back in the BLAST- >> related >> >> modules which may be related to this. At the very least install/ >> >> upgrade all modules in Bio::Tools::Run. >> >>> chris >> >>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote: >> >>>> Thanks, Chris, >> >>> But when I replaced the old RemoteBlast.pm with the new one, I >> got >> >>> "can't locate the object method "retrieve_parameter"". Does this >> >>> mean I need to install something else? 
>> >>> Guojun >> >>> >> >>> ----- Original Message ----- >> >>> From: Chris Fields [mailto:cjfields at uiuc.edu] >> >>> To: gyang at plantbio.uga.edu >> >>> Cc: bioperl-l at bioperl.org >> >>> Subject: Re: [Bioperl-l] nonstop repeated output from >> Remote_blast >> >>> with xml >> >>> >> >>> >> >>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote: >> >>>>>> I am running remoteblast and using readmethod "xml", I noticed >> >>>>>> that >> >>>>> it is printing the output repeatedly nonstop. It's like in a >> loop. >> >>>>> Did anybody notice this before? Can anybody help me getting >> out of >> >>>>> this? >> >>>>> Thanks a lot, >> >>>>> >> >>>>> >> >>>>> Guojun Yang >> >>>>> University of Georgia >> >>>>> Not seeing that using bioperl-live; you may need to update >> >>>> RemoteBlast.pm as this sounds similar to an issue that popped up >> >>>> earlier in the spring. >> >>>>> chris >> >>>> >> >>> Christopher Fields >> >> Postdoctoral Researcher >> >> Lab of Dr. Robert Switzer >> >> Dept of Biochemistry >> >> University of Illinois Urbana-Champaign >> >>>>> >> >> Christopher Fields >> Postdoctoral Researcher >> Lab of Dr. Robert Switzer >> Dept of Biochemistry >> University of Illinois Urbana-Champaign >> >> >> >> >> >> >> > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From Alicia.Amadoz at uv.es Tue Aug 7 04:20:12 2007 From: Alicia.Amadoz at uv.es (Alicia Amadoz) Date: Tue, 7 Aug 2007 10:20:12 +0200 (CEST) Subject: [Bioperl-l] error using standaloneblast through webserver, part II Message-ID: <1387114447amadoz@uv.es> Hi again, i'm trying to run a bioperl script in linux with standaloneblast from a webserver but i now have another error. 
It is the following:

[blastall] WARNING: Unable to open outfile_allseq.nin
[blastall] WARNING: 101: Unable to open outfile_allseq.nin

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d "/outfile_allseq" -e 10 -i /tmp//alicia_2007_07_20/result_search_alicia_12_03_40.fasta -o /tmp//alicia_2007_08_07/101_result_Local_Blast_alicia_09_56_47.out -p blastn

My perl code is the following:

my $blastdatadir = $ARGV[9];   # -> Here the value of the variable is ok

BEGIN {
    $ENV{PATH} .= ':/usr/local/blast-2.2.16/bin';     # path where blastall bin is located
    $ENV{BLASTDIR} = '/usr/local/blast-2.2.16/bin';   # path where blastall bin is located
    $ENV{BLASTDATADIR} = $blastdatadir;               # path where formatted local databases are located -> Here the value is empty
}

I have tried without BEGIN { }, so the $ENV var has a correct value for $blastdatadir, but I get the same error. I have checked that formatdb was done and all the files are correct.

Any idea or help to solve this problem? Thanks in advance.

Regards,
Alicia

From mheusel at gmail.com Tue Aug 7 04:45:33 2007
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 7 Aug 2007 10:45:33 +0200
Subject: [Bioperl-l] error using standaloneblast through webserver, part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <6127fc200708070145keb750acycce8a43edd0f724d@mail.gmail.com>

> MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
> "/outfile_allseq" -e 10 -i

I'm not familiar with all this, but it seems your script tries to write in the system's root directory /

-d "/outfile_allseq"

which is normally not writable for normal users. Is this the problem?
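
On the empty BLASTDATADIR inside BEGIN in the code quoted above: BEGIN blocks run at compile time, before the ordinary assignment on the preceding line has executed, so the variable exists but is still undefined there. A minimal pure-Perl sketch of the effect and of one possible workaround follows. The fallback path '/path/to/blast/db' is a placeholder, not a real location, and this only addresses the empty variable, not necessarily the blastall crash itself.

```perl
use strict;
use warnings;

# Problem: a plain assignment runs at run time, after every BEGIN block.
my $late = 'assigned at run time';
my $seen_in_begin;
BEGIN { $seen_in_begin = defined $late ? 'defined' : 'undefined' }
print "In BEGIN, \$late was $seen_in_begin\n";

# One possible workaround: do the assignment inside BEGIN as well.
# @ARGV is already populated when BEGIN blocks run.
# '/path/to/blast/db' is a hypothetical fallback used when no argument is given.
my $blastdatadir;
BEGIN {
    $blastdatadir = defined $ARGV[9] ? $ARGV[9] : '/path/to/blast/db';
    $ENV{BLASTDATADIR} = $blastdatadir;
}
print "BLASTDATADIR=$ENV{BLASTDATADIR}\n";
```

The first print reports "undefined", which matches the "value is empty" note in the original code.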
cu Martin -- + openid: http://mhe.myopenid.com/ + gpg : http://user.cs.tu-berlin.de/~mhe/pub/martin.gpg + gpg fp: 4844 71B5 B4E4 3892 69CA 6EA5 6598 61BE 0021 94A2 From Alicia.Amadoz at uv.es Tue Aug 7 07:08:12 2007 From: Alicia.Amadoz at uv.es (Alicia Amadoz) Date: Tue, 7 Aug 2007 13:08:12 +0200 (CEST) Subject: [Bioperl-l] error using standaloneblast through webserver, part II In-Reply-To: <1387114447amadoz@uv.es> References: <1387114447amadoz@uv.es> Message-ID: <5825345446amadoz@uv.es> Hi, i thought that it was enough with setting $ENV{BLASTDATADIR} and standaloneblast would find the database. I have change it, setting -database option of params with path_to_database+name_of_database and it works ok. Thanks for your help. Regards, Alicia From jason at bioperl.org Wed Aug 8 15:16:07 2007 From: jason at bioperl.org (Jason Stajich) Date: Wed, 8 Aug 2007 14:16:07 -0500 Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com> Message-ID: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org> Young - I'm forwarding to the list for more help. Begin forwarded message: > From: "Young Song" > Date: August 8, 2007 1:48:29 PM CDT > To: jason at bioperl.org > Subject: Question regarding Bio::GenBank module > > Hello, > > I am currently located in Vancouver, Canada, and I actually have > some > question based on the Bio::GenBank module for bioperl. I read in the > online document for the module ( > http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm), that we are > not > supposed to spam the NCBI with multiple requests, which lead me to > think > about the script that I wrote. I am trying to extract some > information > based on the fasta protein files located in the NCBI's database. 
> The > script reads each '.faa' (Fasta Protein) file and takes in the > 'gi' ID > for each sequence, and extracts several information, which looks like > following output (please note that there are lot more gi's then I > am showing > you right now): > > 10954456 > accesstion number: NP_047185.1 > dbsource: GenBank: NC_001911.1 > NP_047185.1 > starting pos. at genomic seq: 1488 > ending pos. at genomic seq: 1991 > strand: + > description: putative membrane-associated protein > organism: Buchnera aphidicola > MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL > VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV > IIHLTFVLSAFGMAYIDKMSKKKHVLH > ************************************************ > 10954457 > accesstion number: NP_047186.1 > dbsource: GenBank: NC_001911.1 > NP_047186.1 > starting pos. at genomic seq: 2158 > ending pos. at genomic seq: 2913 > strand: + > description: putative replication-associated protein > organism: Buchnera aphidicola > MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA > CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK > FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI > LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK > > > > Because there are lot of sequences I am dealing with here, I am > little bit > worried that I may be causing harm to the NCBI server. I just need > to know > if this is the right approach to take, or if there is another > solution (I am > little bit confused what you mean by "multiple requests" in the > document). > Your reply would be very much appreciated. Thank you in advance. > > Sincerely, > > Young C. 
Song

--
Jason Stajich
jason at bioperl.org

From cjfields at uiuc.edu Wed Aug 8 15:41:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 8 Aug 2007 14:41:34 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
In-Reply-To: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com> <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
Message-ID:

NCBI eUtils (which Bio::DB::GenBank uses to get sequence data) has a list of user requirements:

http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements

The most important one is the 3 second timeout between requests, but the module already implements that policy so there isn't a real issue unless you deliberately mess with that setting. NCBI has been known to block IPs which don't follow that particular rule. Also, if you are planning on making hundreds of requests you should consider running the script during low traffic times as indicated in the above link.

chris

On Aug 8, 2007, at 2:16 PM, Jason Stajich wrote:

> Young -
> I'm forwarding to the list for more help.
>
> Begin forwarded message:
>
>> From: "Young Song"
>> Date: August 8, 2007 1:48:29 PM CDT
>> To: jason at bioperl.org
>> Subject: Question regarding Bio::GenBank module
>>
>> Hello,
>>
>> I am currently located in Vancouver, Canada, and I actually have some
>> question based on the Bio::GenBank module for bioperl. I read in the
>> online document for the module (
>> http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm), that we are not
>> supposed to spam the NCBI with multiple requests, which lead me to think
>> about the script that I wrote. I am trying to extract some information
>> based on the fasta protein files located in the NCBI's database.
>> The >> script reads each '.faa' (Fasta Protein) file and takes in the >> 'gi' ID >> for each sequence, and extracts several information, which looks >> like >> following output (please note that there are lot more gi's then I >> am showing >> you right now): >> >> 10954456 >> accesstion number: NP_047185.1 >> dbsource: GenBank: NC_001911.1 >> NP_047185.1 >> starting pos. at genomic seq: 1488 >> ending pos. at genomic seq: 1991 >> strand: + >> description: putative membrane-associated protein >> organism: Buchnera aphidicola >> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGL >> L >> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLC >> V >> IIHLTFVLSAFGMAYIDKMSKKKHVLH >> ************************************************ >> 10954457 >> accesstion number: NP_047186.1 >> dbsource: GenBank: NC_001911.1 >> NP_047186.1 >> starting pos. at genomic seq: 2158 >> ending pos. at genomic seq: 2913 >> strand: + >> description: putative replication-associated protein >> organism: Buchnera aphidicola >> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHR >> A >> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKR >> K >> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKK >> I >> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK >> >> >> >> Because there are lot of sequences I am dealing with here, I am >> little bit >> worried that I may be causing harm to the NCBI server. I just need >> to know >> if this is the right approach to take, or if there is another >> solution (I am >> little bit confused what you mean by "multiple requests" in the >> document). >> Your reply would be very much appreciated. Thank you in advance. >> >> Sincerely, >> >> Young C. 
Song > > -- > Jason Stajich > jason at bioperl.org > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From gyang at plantbio.uga.edu Thu Aug 9 15:03:21 2007 From: gyang at plantbio.uga.edu (Guojun Yang) Date: Thu, 09 Aug 2007 15:03:21 -0400 Subject: [Bioperl-l] standalone blastall call crashed, please help In-Reply-To: 1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu Message-ID: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu> Hi, Chris, Thanks a lot for your efforts. With your help, I am gaining more confidence to fix the cgi code. While the remoteblast problem is fixed now, I am caught in a local blast problem (see the error message and subroutine). The line starting with * is line 593 in the error message. I tried command line blastall, it works fine. I set the permission to all the blast folders and files, it did not help much. The same sequence and database works OK if I use command line blastall. I used the seq object ref $query as query, the error message gives "-i /tmp/...", does this look like an input problem? The subroutine was working before early 2006 (on a different machine), I am wondering whether this is due to changes in the StandAloneBlast.pm? Best, Guojun I set the blast env variables: BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; } BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';} BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';} $PROGRAMDIR = $ENV{'BLASTDIR'} || ''; ...... 
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d "/usr/blast-2.2.10/data/swissprot" -e 0.001 -i /tmp/3cjvQyodxg -o /tmp/4qSSO16EZP -p blastx
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;

    my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                                  -id=>"test");
    print $query->seq();
    my $len=$query->length();
    my $long_name=$_[1];
    my $long_start=$_[2];
    my $long_end=$_[3];
    @db=('swissprot');
    foreach my $db (@db) {
        my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                            -database => "$db",
                                                            -e => 1e-3,
                                                            );
*       my $blast_report = $factory->blastall($query);
        while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                    $AN=0;
                    $replica=0;
                    while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                        $AN+=1;
                    }
                    if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                    }
                }
            }
    }}
    return @ancestor_name, @ancestor_start, @ancestor_end, @desc;
}

From harijay at gmail.com Thu Aug 9 17:47:50 2007
From: harijay at gmail.com (hari jayaram)
Date: Thu, 9 Aug 2007 17:47:50 -0400
Subject: [Bioperl-l] newbie wants install help
Message-ID: Hi I am trying to install bioperl as a non root user since I dont have root access on the machine. I was following the instructions as given on the wiki at http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix I started from scratch using perl version v5.8.5 and used cpan to install the bioperl module prerequisites bundle Bundle::BioPerl since I thought it was needed. Everything worked just fine I could use cpan as a non root user following instructions given at http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html But when I try to install bioperl using the instructions for non-root I get an error when I build Module::Build because I am not root. Iget the same Module::Build error when I try to install without CPAN using command line script perl Build.PL --install_base option as given on the wiki. Is there a way out Thanks for your help in advance harijay Brandeis University Installing /usr/share/man/man3/Module::Build::Platform::VMS.3pm Installing /usr/share/man/man3/Module::Build::Base.3pm Installing /usr/share/man/man3/Module::Build::Authoring.3pm Installing /usr/share/man/man3/Module::Build::Compat.3pm mkdir /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/Module: Permission denied at /usr/lib/perl5/5.8.5/ExtUtils/Install.pm line 207 Installing /usr/bin/config_data make: *** [install] Error 255 /usr/bin/make install -- NOT OK You may have to su to root to install the package Couldn't install Module::Build, giving up. make: *** No targets specified and no makefile found. Stop. 
/usr/bin/make -- NOT OK Running make test Can't test without successful make Running make install make had returned bad status, install seems impossible From bix at sendu.me.uk Thu Aug 9 18:23:24 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 09 Aug 2007 23:23:24 +0100 Subject: [Bioperl-l] newbie wants install help In-Reply-To: References: Message-ID: <46BB93DC.9010608@sendu.me.uk> hari jayaram wrote: > Hi I am trying to install bioperl as a non root user since I dont have root > access on the machine. > > I was following the instructions as given on the wiki at > http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix > I started from scratch using perl version v5.8.5 and used cpan to install > the bioperl module prerequisites bundle Bundle::BioPerl since I thought it > was needed. Everything worked just fine > I could use cpan as a non root user following instructions given at > http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html > > But when I try to install bioperl using the instructions for non-root I get > an error when I build Module::Build because I am not root. > Iget the same Module::Build error when I try to install without CPAN using > command line script perl Build.PL --install_base option as given on the > wiki. Follow the cpan instructions you found to install as non-root: Bundle::CPAN Failing that, you require at least: Module::Build Failing that: http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_MODULES_THE_HARD_WAY (it's actually the easiest way, go figure) From bix at sendu.me.uk Fri Aug 10 03:41:29 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 10 Aug 2007 08:41:29 +0100 Subject: [Bioperl-l] newbie wants install help In-Reply-To: References: <46BB93DC.9010608@sendu.me.uk> Message-ID: <46BC16A9.7090709@sendu.me.uk> hari jayaram wrote: > Hi Sendu , Hi, please post back to the list as well, so others can benefit. 
> Well after going through a few attempts at installing Bundle::CPAN I > gave up. > It always had weird timeout issues . ANd kept re-installing everything > on restarting the CPAN shell > After a while I thought it did complete - since it retunred me to the shell > > I tried the CPAN install of bioperl at that point > > ANd bingo I got booted out at the exact same point when the Bioperl > install tried to re-install(?) Module:Build which failed as non root Did you follow steps 7 and 8 of http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ? If you managed to install Bundle::CPAN, when you now run 'cpan' it should start up and tell you its version number, which should be v1.9102 or higher. If its lower, you didn't manage to install the latest CPAN, or you haven't managed to tell Perl where your newly installed modules are. > I guess for all future modules I will adopt the option 3 you detailed , > i.e just have the modules sitting somewhere and use them from there > > But I am still interested in getting it done right via CPAN. From n.haigh at sheffield.ac.uk Fri Aug 10 06:14:06 2007 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 10 Aug 2007 11:14:06 +0100 Subject: [Bioperl-l] newbie wants install help In-Reply-To: <46BC16A9.7090709@sendu.me.uk> References: <46BB93DC.9010608@sendu.me.uk> <46BC16A9.7090709@sendu.me.uk> Message-ID: <46BC3A6E.80302@sheffield.ac.uk> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sendu Bala wrote: > hari jayaram wrote: >> Hi Sendu , > > Hi, please post back to the list as well, so others can benefit. > > >> Well after going through a few attempts at installing Bundle::CPAN I >> gave up. >> It always had weird timeout issues . 
ANd kept re-installing everything >> on restarting the CPAN shell >> After a while I thought it did complete - since it retunred me to the shell >> >> I tried the CPAN install of bioperl at that point >> >> ANd bingo I got booted out at the exact same point when the Bioperl >> install tried to re-install(?) Module:Build which failed as non root > > Did you follow steps 7 and 8 of > http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ? > > If you managed to install Bundle::CPAN, when you now run 'cpan' it > should start up and tell you its version number, which should be v1.9102 > or higher. If its lower, you didn't manage to install the latest CPAN, > or you haven't managed to tell Perl where your newly installed modules are. > > >> I guess for all future modules I will adopt the option 3 you detailed , >> i.e just have the modules sitting somewhere and use them from there >> >> But I am still interested in getting it done right via CPAN. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l It will probably also help, if you post the commands you have run and any output (truncated if it's really long), then we can follow what you have tried and make some better suggestions. Cheers Nath -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGvDpuczuW2jkwy2gRAjFjAJ0eG90cMfHrrIh7LbKWx1JN94kbXgCdGSbi tMjQrZ/8EPc0wLiNAhYTr4Y= =kXZ2 -----END PGP SIGNATURE----- From mbasu at mail.nih.gov Fri Aug 10 11:25:35 2007 From: mbasu at mail.nih.gov (Malay) Date: Fri, 10 Aug 2007 11:25:35 -0400 Subject: [Bioperl-l] newbie wants install help In-Reply-To: References: Message-ID: <46BC836F.7010906@mail.nih.gov> hari jayaram wrote: > Hi I am trying to install bioperl as a non root user since I dont have root > access on the machine. 
> > I was following the instructions as given on the wiki at > http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix > I started from scratch using perl version v5.8.5 and used cpan to install > the bioperl module prerequisites bundle Bundle::BioPerl since I thought it > was needed. Everything worked just fine > I could use cpan as a non root user following instructions given at > http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html > > But when I try to install bioperl using the instructions for non-root I get > an error when I build Module::Build because I am not root. > Iget the same Module::Build error when I try to install without CPAN using > command line script perl Build.PL --install_base option as given on the > wiki. > > Is there a way out > > Thanks for your help in advance > harijay > Brandeis University > This is related your situation and broadly applicable to all perl users in a non root situation. I can tell from my own experience the best way to handle your situation is to use your own Perl, if you are a dedicated perl developer. Just compile and install your own perl installation in any directory of you choice and put the "bin" directory in front of you path and off you go. The advantages are several fold. First, you get a very optimized, fast perl. The sysadmin might have just installed a binary run-of-the-mill perl version. Second, you get all the freedom of installing the very latest updates of all the modules. The sysadmins may be too busy man to update perl frequently. Third, a very common problem with production machine is that they follow strictly the perl installation instruction and avoid threaded perl, which clips your wings particularly, when almost all machines contain multiple processors. The drawbacks are related to finding "/usr/bin/perl" in the shebang line. If you follow the perl way of installing any script, it will take care of it. 
When you develop, use the more portable way of

#!/usr/bin/env perl
BEGIN { $^W = 1 } # Use this to switch on compile-time warnings (-w)

All the best,
Malay

--
Malay K Basu
www.malaybasu.net

From gyang at plantbio.uga.edu Fri Aug 10 11:23:36 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Fri, 10 Aug 2007 11:23:36 -0400
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed from StandAloneBlast
In-Reply-To: 20070809190321.191d0d4a@dogwood.plantbio.uga.edu
Message-ID: <20070810152336.898c3979@dogwood.plantbio.uga.edu>

Hi, Chris,
Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run. If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.
Thanks,
Guojun

From cjfields at uiuc.edu Fri Aug 10 12:17:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 11:17:38 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <56186844-3CB9-4968-B16F-FD5EE72865A2@uiuc.edu>

This should be filed as a bug if possible; could you do that?

http://www.bioperl.org/wiki/Bugs

Suggestions have been made many times previously that StandAloneBlast, RemoteBlast, etc be combined to use a common API, incorporate other BLAST implementations (i.e. WU-BLAST, NCBI's netblast, etc), and maybe utilize other cross-platform compatible means of running programs and passing off reports to parsers. In fact, Jason, Roger Hall, Torsten, and I discussed tentative plans for plugin-able BLAST wrappers:

http://www.bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast

Though they have never been acted upon. If I get time towards the end of fall and manage to finish up some other projects I may try taking this on, maybe using the wiki to track progress.

chris

On Aug 10, 2007, at 10:23 AM, Guojun Yang wrote:

> Hi, Chris,
> Interestingly, I found the message in bioperl-l from Matthew Laird
> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES
> run. If one comments out this line in StandAloneBlast.pm, the
> execution succeeds perfectly fine". It seemed to be mysterious when
> I uncommented the " $self->throw("$executable call crashed: $? $!
> $commandstring\n") unless ($status==0) ;" line, the blastall runs.
> The only difference from what Matthew saw is that, when I did not
> uncomment the line, blastall DID NOT run.
> Thanks,
> Guojun

Christopher Fields Postdoctoral
Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign

From harijay at gmail.com Fri Aug 10 13:09:32 2007
From: harijay at gmail.com (hari jayaram)
Date: Fri, 10 Aug 2007 13:09:32 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <46BB93DC.9010608@sendu.me.uk> <46BC16A9.7090709@sendu.me.uk>
Message-ID:

Hey all,

Thanks for your help. It's working really well now. It turns out I had not set my PERL5LIB environment variable correctly, so Perl was not finding the installed modules (thanks Sendu).

The steps I followed were:

1) Installed CPAN as myself, as detailed at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
The important part is the line which tells CPAN what prefix to use for all module installs:
PREFIX=~/perl5lib/ LIB=~/perl5lib/lib INSTALLMAN1DIR=~/perl5lib/man1 INSTALLMAN3DIR=~/perl5lib/man3

2) Set PERL5LIB to /home/perl5lib/lib (and not just /home/perl5lib) in the shell. I use csh, so I edited .cshrc:
setenv PERL5LIB /home/hari/perl5lib/lib
setenv MANPATH ${MANPATH}:/home/hari/perl5lib

3) Updated the system CPAN to the latest version. This worked very well once the perl5lib was installed, only it took a while and sometimes stalled with messages like
done 31/34
but a CTRL-C got it going again.

4) Made sure I was using the new CPAN v1.9102.

5) Installed BioPerl with the command
install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

And I was good to go.

I am thinking I will screencast this process for everyone's benefit and put it up on bioscreencast.com, if that will be useful for others.

Thanks to everyone on the group. Now the journey begins.

Hari Jayaram

On 8/10/07, Sendu Bala wrote:
> hari jayaram wrote:
> > Hi Sendu ,
>
> Hi, please post back to the list as well, so others can benefit.
>
>
> > Well after going through a few attempts at installing Bundle::CPAN I
> > gave up.
> > It always had weird timeout issues . ANd kept re-installing everything
> > on restarting the CPAN shell
> > After a while I thought it did complete - since it retunred me to the shell
> >
> > I tried the CPAN install of bioperl at that point
> >
> > ANd bingo I got booted out at the exact same point when the Bioperl
> > install tried to re-install(?) Module:Build which failed as non root
>
> Did you follow steps 7 and 8 of
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
>
> If you managed to install Bundle::CPAN, when you now run 'cpan' it
> should start up and tell you its version number, which should be v1.9102
> or higher. If its lower, you didn't manage to install the latest CPAN,
> or you haven't managed to tell Perl where your newly installed modules are.
>
>
> > I guess for all future modules I will adopt the option 3 you detailed ,
> > i.e just have the modules sitting somewhere and use them from there
> >
> > But I am still interested in getting it done right via CPAN.
>

From torsten.seemann at infotech.monash.edu.au Fri Aug 10 17:48:56 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 11 Aug 2007 07:48:56 +1000
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu> <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID:

> Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run. If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.
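
For reference, a minimal sketch of how the status printed by that throw line can be read. It assumes the wrapper launches blastall through Perl's system(), which this thread does not show. Per "perldoc -f system", a status of -1 means the program could not be started or its exit status could not be collected, and $! then carries the reason; otherwise the exit code is $? >> 8. The helper name describe_status is hypothetical, and the perl one-liners are only stand-ins for the real blastall command line.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: describe a child status ($?) and the saved $! text,
# following the rules given in "perldoc -f system".
sub describe_status {
    my ($status, $os_error) = @_;
    if ($status == -1) {
        # The child could not be started, or its status could not be collected.
        return "failed to execute: $os_error";
    }
    if ($status & 127) {
        # Low 7 bits hold the signal number that ended the child.
        return sprintf "died with signal %d", $status & 127;
    }
    # High byte holds the exit code of the child.
    return sprintf "exited with value %d", $status >> 8;
}

# Stand-ins for the blastall command line (assumption): the running perl
# itself ($^X) exiting with a chosen code, plus a path that does not exist.
my @commands = (
    [ $^X, '-e', 'exit 0' ],
    [ $^X, '-e', 'exit 3' ],
    [ '/no/such/dir/blastall' ],
);

for my $cmd (@commands) {
    no warnings 'exec';              # silence the "Can't exec" warning
    system { $cmd->[0] } @$cmd;      # list form, no shell involved
    my ($status, $err) = ($?, "$!"); # save both before anything resets them
    print "@$cmd => ", describe_status($status, $err), "\n";
}
```

"perldoc -f wait" also notes that -1 can appear when child processes are being reaped automatically. That would be consistent with blastall running while the status still reads -1, but whether it is the cause here is only a guess.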
Yes, Matthew is one of the authors of PSORTB and I spent a bit of time last year trying to fix this problem (unsuccessfully). The PSORTB docs http://www.psort.org/downloads/index.html explain how to get around this problem just as Guojun describes. I use a custom BioPerl installation just for PSORTB! I was under the impression it was already filed as a bug, but my searching indicates this is not so. -- --Torsten Seemann --Victorian Bioinformatics Consortium, Monash University From cjfields at uiuc.edu Fri Aug 10 18:04:20 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 10 Aug 2007 17:04:20 -0500 Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed from StandAloneBlast In-Reply-To: References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu> <20070810152336.898c3979@dogwood.plantbio.uga.edu> Message-ID: <41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu> On Aug 10, 2007, at 4:48 PM, Torsten Seemann wrote: >> Interestingly, I found the message in bioperl-l from Matthew Laird >> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast >> DOES run. If one comments out this line in StandAloneBlast.pm, >> the execution succeeds perfectly fine". It seemed to be mysterious >> when I uncommented the " $self->throw("$executable call crashed: >> $? $! $commandstring\n") unless ($status==0) ;" line, the blastall >> runs. The only difference from what Matthew saw is that, when I >> did not uncomment the line, blastall DID NOT run. > > Yes, Matthew is one of the authors of PSORTB and I spent a bit of time > last year trying to fix this problem (unsuccessfully). The PSORTB docs > http://www.psort.org/downloads/index.html > explain how to get around this problem just as Guojun describes. I use > a custom BioPerl installation just for PSORTB! > > I was under the impression it was already filed as a bug, but my > searching indicates this is not so. 
> > -- > --Torsten Seemann > --Victorian Bioinformatics Consortium, Monash University > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Might be wise to go ahead and add it to bugzilla so we can track it, along with the workaround. chris From neetisomaiya at gmail.com Mon Aug 13 06:29:39 2007 From: neetisomaiya at gmail.com (neeti somaiya) Date: Mon, 13 Aug 2007 15:59:39 +0530 Subject: [Bioperl-l] Homologene parser? Message-ID: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com> Hi, Does anyone know of any Homologene parser, if available? Please let me know. Thanks and Regards, Neeti. -- -Neeti Even my blood says, B positive From shameer at ncbs.res.in Mon Aug 13 07:07:45 2007 From: shameer at ncbs.res.in (Shameer Khadar) Date: Mon, 13 Aug 2007 16:37:45 +0530 (IST) Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add direction to SeqFeature In-Reply-To: <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com> References: <10259461.post@talk.nabble.com> <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in> <1178028249.2644.13.camel@localhost.localdomain> <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in> <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com> Message-ID: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in> Dear All, I am generating images based on Transcription Factor binding site data using bio::graphics module. I created my images using program : version-2 [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L. Stein ). I attaching one of the image with this mail. I need to make 3 changes to this image 1. to color the 'scale' Color the scale in two different colors ie, from start 1.0k - color blue from 101 - till end of the scale green (I thoroghly checked the Bio::Graphics document, I couldnt find an option to do this ) 2. 
to sort the Transcription factors based on the z_score 3. to give forward/reverse [> or < ]direction for the black boxes I would appreaciate if any one can give me some clues/link to accomplish this :). thanks in advance , Shameer -- Shameer Khadar Lab (# 25) The Computational Biology Group National Centre for Biological Sciences (TIFR) GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India T - 91-080-23666001 EXT - 6251 W - http://www.ncbs.res.in -------------- next part -------------- A non-text attachment was scrubbed... Name: TF_top3.png Type: image/png Size: 2188 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070813/6a4423bd/attachment.png From bix at sendu.me.uk Mon Aug 13 09:11:50 2007 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 13 Aug 2007 14:11:50 +0100 Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add direction to SeqFeature In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in> References: <10259461.post@talk.nabble.com> <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in> <1178028249.2644.13.camel@localhost.localdomain> <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in> <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com> <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in> Message-ID: <46C05896.1010002@sendu.me.uk> Shameer Khadar wrote: > Dear All, > > I am generating images based on Transcription Factor binding site data > using bio::graphics module. > I created my images using program : version-2 > [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L. > Stein ). I attaching one of the image with this mail. > > I need to make 3 changes to this image > > 1. 
to color the 'scale' > Color the scale in two different colors ie, from start 1.0k - color blue > from 101 - till end of the scale green (I thoroghly checked the > Bio::Graphics document, I couldnt find an option to do this ) The scale is just a scale and shouldn't need colouring. You can do what you want by having a blue 'upstream' feature and a green 'gene' feature in the first row. > 2. to sort the Transcription factors based on the z_score I don't know Bio::Graphics well enough, but am interested in the answer... > 3. to give forward/reverse [> or < ]direction for the black boxes Presumably you just change the glyph type of your binding sites to something that shows direction, like 'processed_transcript'. Someone else may have a more appropriate suggestion. However, do your binding sites really have a direction? That is, do you really know which strand your transcription factor bound to? From cjfields at uiuc.edu Mon Aug 13 10:39:11 2007 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 13 Aug 2007 09:39:11 -0500 Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add direction to SeqFeature In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in> References: <10259461.post@talk.nabble.com> <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in> <1178028249.2644.13.camel@localhost.localdomain> <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in> <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com> <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in> Message-ID: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote: > Dear All, > > I am generating images based on Transcription Factor binding site data > using bio::graphics module. > I created my images using program : version-2 > [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L. > Stein ). I attaching one of the image with this mail. > > I need to make 3 changes to this image > > 1. 
to color the 'scale'
> Color the scale in two different colors ie, from start 1.0k - color
> blue
> from 101 - till end of the scale green (I thoroghly checked the
> Bio::Graphics document, I couldnt find an option to do this )

Much of the documentation you need is available via 'perldoc Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes. The above may be possible using two seqfeatures instead of one or maybe a split location with a callback (not sure, haven't tried either, mileage may vary, batteries not included, warranty void if packaging is opened, etc). Might be worth checking out the POD for the arrow glyph to see what's possible.

> 2. to sort the Transcription factors based on the z_score

In Bio::Graphics::Panel POD under 'Glyph Options', there is documentation for 'sort_order' which accepts callbacks. According to the docs you would basically do something like the following (the prototype is required; note the score):

    -sort_order => sub ($$) {
        my ($glyph1,$glyph2) = @_;
        my $a = $glyph1->feature;
        my $b = $glyph2->feature;
        ( $b->score/log($b->length)
              <=>
          $a->score/log($a->length) )
              ||
        ( $a->start <=> $b->start )
    }

Again, haven't tried.

> 3. to give forward/reverse [> or < ]direction for the black boxes

I think you first need to ensure the glyph will accept strandedness, though I think most do. Then you would set either the 'strand_arrow' or 'stranded' option to 1 (they are synonyms). Again, see Bio::Graphics::Panel POD under Glyph Options, specifically the parameter 'stranded' or 'strand_arrow'.

> I would appreaciate if any one can give me some clues/link to
> accomplish
> this :).
> thanks in advance ,
> Shameer

No problem!
chris

> --
> Shameer Khadar
> Lab (# 25) The Computational Biology Group
> National Centre for Biological Sciences (TIFR)
> GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
> T - 91-080-23666001 EXT - 6251
> W - http://www.ncbs.res.in

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From shameer at ncbs.res.in Mon Aug 13 10:47:35 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:17:35 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <46C05896.1010002@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
 <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
 <1178028249.2644.13.camel@localhost.localdomain>
 <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
 <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
 <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
 <46C05896.1010002@sendu.me.uk>
Message-ID: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>

Dear Sendu,

Thanks for your reply.

>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color blue
>> from 101 - till end of the scale green (I thoroughly checked the
>> Bio::Graphics document, I couldn't find an option to do this )
>
> The scale is just a scale and shouldn't need colouring. You can do what
> you want by having a blue 'upstream' feature and a green 'gene' feature
> in the first row.

Thanks for the point : 'The scale is just a scale...'. But my idea is to
divide the scale into three, to differentiate between the 100bp upstream
region, UTR and gene start site. starting point of scale till 0k is the
100bp upstream.
From 0k till end of the current_scale is UTR, from the end of scale gene
starts, since this is a bit tough to distinguish, we thought of this
coloring option. Adding an extra track is an alternative option (I tried
to convince our experimental team by adding an extra track, but they want
it this way :(..)

>
>> 2. to sort the Transcription factors based on the z_score
> I don't know Bio::Graphics well enough, but am interested in the answer...
>
It is possible: the sort_order option is available. I tried it a couple of
times but it is not working.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> Presumably you just change the glyph type of your binding sites to
> something that shows direction, like 'processed_transcript'. Someone
> else may have a more appropriate suggestion.
Thanks, I will look in to it.
>
> However, do your binding sites really have a direction? That is, do you
> really know which strand your transcription factor bound to?
Yes, these info we collated from various experimental datasets.
--
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in

From bix at sendu.me.uk Mon Aug 13 11:01:43 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:01:43 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
 <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
 <1178028249.2644.13.camel@localhost.localdomain>
 <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
 <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
 <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
 <46C05896.1010002@sendu.me.uk>
 <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
Message-ID: <46C07257.1000308@sendu.me.uk>

Shameer Khadar wrote:
>> However, do your binding sites really have a direction? That is, do you
>> really know which strand your transcription factor bound to?
>
> Yes, these info we collated from various experimental datasets.

Well, those datasets I'd like to see... What I was getting at is the
strand probably isn't known at the experimental level, but to describe
the site a strand has to be arbitrarily picked so you can write the
sequence of the site down as a single string. It's probably the case that
the strand information you have is just the way it happened to be
reported in the literature and has no biological meaning.
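
To make the strand point above concrete: a site has to be written down
from one strand, and the other strand reads as its reverse complement.
The following is only a minimal sketch in plain Perl (no BioPerl
required). The helper name revcomp and the two example sites (an EcoRI
site and a -10 box consensus) are illustrative assumptions, not data
from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): reverse-complement a DNA string.
sub revcomp {
    my ($site) = @_;
    my $rc = scalar reverse $site;   # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

# Compare each site with how it reads on the opposite strand.
for my $site (qw(GAATTC TATAAT)) {
    my $rc = revcomp($site);
    printf "%s  other strand: %s  %s\n", $site, $rc,
        $site eq $rc
            ? '(palindromic, strand label is arbitrary)'
            : '(written site depends on the strand chosen)';
}
```

A palindromic site reads the same on both strands, so a strand label
attached to it carries no information; a non-palindromic site at least
has a strand-specific spelling, though whether that matters biologically
is a separate question.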
From shameer at ncbs.res.in Mon Aug 13 11:16:33 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:46:33 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
References: <10259461.post@talk.nabble.com>
 <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
 <1178028249.2644.13.camel@localhost.localdomain>
 <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
 <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
 <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
 <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
Message-ID: <42833.192.168.1.1.1187018193.squirrel@mail.ncbs.res.in>

Chris,

Thanks for your detailed reply. I will read up the docs and try different
options using your code snippets as starting point. I will get back to the
list with my results.

Thanks
--
Shameer

>
> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:
>
>> Dear All,
>>
>> I am generating images based on Transcription Factor binding site data
>> using bio::graphics module.
>> I created my images using program : version-2
>> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtesy : L.
>> Stein ). I am attaching one of the images with this mail.
>>
>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color
>> blue
>> from 101 - till end of the scale green (I thoroughly checked the
>> Bio::Graphics document, I couldn't find an option to do this )
>
> Much of the documentation you need is available via 'perldoc
> Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.
> The above may be possible using two seqfeatures instead of one or
> maybe a split location with a callback (not sure, haven't tried
> either, mileage may vary, batteries not included, warranty void if
> packaging is opened, etc).
> Might be worth checking out the POD for
> the arrow glyph to see what's possible.
>
>> 2. to sort the Transcription factors based on the z_score
>
> In Bio::Graphics::Panel POD under 'Glyph Options', there is
> documentation for 'sort_order' which accepts callbacks. According to
> the docs you would basically do something like the following (the
> prototype is required; note the score):
>
>     -sort_order => sub ($$) {
>         my ($glyph1,$glyph2) = @_;
>         my $a = $glyph1->feature;
>         my $b = $glyph2->feature;
>         ( $b->score/log($b->length)
>               <=>
>           $a->score/log($a->length) )
>               ||
>         ( $a->start <=> $b->start )
>     }
>
> Again, haven't tried.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> I think you first need to ensure the glyph will accept strandedness,
> though I think most do. Then you would set either the 'strand_arrow'
> or 'stranded' option to 1 (they are synonyms). Again, see
> Bio::Graphics::Panel POD under Glyph Options, specifically the
> parameter 'stranded' or 'strand_arrow'.
>

--
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in

From bix at sendu.me.uk Mon Aug 13 11:47:10 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:47:10 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To:
References: <46BB93DC.9010608@sendu.me.uk> <46BC16A9.7090709@sendu.me.uk>
Message-ID: <46C07CFE.7020105@sendu.me.uk>

hari jayaram wrote:
> Hey all ,
> Thanks for your help. Its working real well now.
[snip]
> I am thinking I will screencast this process for everyones benefit and
> put it up on bioscreencast.com . If that will
> be useful for others.

I'm certain it will. That's a very interesting website. Thanks for
taking the time, and I hope you find Bioperl useful.
From cjfields at uiuc.edu Mon Aug 13 12:24:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 11:24:15 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <46C07257.1000308@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
 <41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
 <1178028249.2644.13.camel@localhost.localdomain>
 <42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
 <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
 <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
 <46C05896.1010002@sendu.me.uk>
 <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
 <46C07257.1000308@sendu.me.uk>
Message-ID:

On Aug 13, 2007, at 10:01 AM, Sendu Bala wrote:

> Shameer Khadar wrote:
>>> However, do your binding sites really have a direction? That is,
>>> do you
>>> really know which strand your transcription factor bound to?
>>
>> Yes, these info we collated from various experimental datasets.
>
> Well, those datasets I'd like to see... What I was getting at is the
> strand probably isn't known at the experimental level, but to describe
> the site a strand has to be arbitrarily picked so you can write the
> sequence of the site down as a single string. It's probably the case
> that
> the strand information you have is just the way it happened to be
> reported in the literature and has no biological meaning.

It's subjective. I can think of several cases where strandedness does
matter and has meaning. If the motif is related to how the gene is
transcribed or post-transcriptionally regulated, for instance;
elements which indicate start of transcription (-10/-35 or any sigma-
factor-related promoter element in prokaryotes), end of transcription
(poly-A signal, transcription terminators), modulation of translation
(SECIS, IRES), or conserved DNA motifs which are transcribed prior to
regulation (RNA-binding proteins like IRE).
chris

From amacgregor at ccg.murdoch.edu.au Mon Aug 13 20:52:10 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 08:52:10 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
Message-ID: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>

On 13/08/2007, at 6:29 PM, neeti somaiya wrote:

> Hi,
>
> Does anyone know of any Homologene parser, if available?
> Please let me know.
>
> Thanks and Regards,
> Neeti.

Hi Neeti,

Quite a long time ago now I wrote a Homologene parser and posted it
to the mailing list:

I don't know if this still works but you could use it as a starting
point. There may also be something newer out there too, I don't know.
If you search the mailing list archives you'll get a few messages
around the topic.

Cheers, Andrew.

Andrew Macgregor
Centre for Comparative Genomics, Murdoch University
Email: amacgregor at ccg.murdoch.edu.au
Tel: (08) 9360 2961

From cjfields at uiuc.edu Mon Aug 13 23:21:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 22:21:54 -0500
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
 <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>

It looks like Heikki responded and thought a good place for it would
be Bio::SeqIO, but it didn't go anywhere I suppose. I see that a few
other posts suggest it could be placed in Bio::Cluster as well, which
I'm not familiar with. We could add it in if you were still
interested, just need to find a good place for it; might be nice to
have a Parse::RecDescent-based parser.
chris

On Aug 13, 2007, at 7:52 PM, Andrew Macgregor wrote:

> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
>> Hi,
>>
>> Does anyone know of any Homologene parser, if available?
>> Please let me know.
>>
>> Thanks and Regards,
>> Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote a Homologene parser and posted it
> to the mailing list:
>
>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From n.haigh at sheffield.ac.uk Tue Aug 14 03:46:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 14 Aug 2007 08:46:19 +0100
Subject: [Bioperl-l] Warnings/errors generated by Eclipse
Message-ID: <46C15DCB.80603@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've just been setting up Eclipse with the EPIC plugin, and it's
generating some errors and warnings about bioperl-live that I'd like to
pass by you.

I think most of the errors are along the lines of:
"Can't find 'build_params' in _build in
/usr/local/share/perl/5.8.8/Module/Build/Base.pm line 1011"

This occurs with files like:
t/Biblio_biofetch.t
t/seqread_fail.t

I think it's to do with the parameters passed to test_begin() or it
could be my setup of Eclipse?

Other highlighted problems are some of the scripts in the examples dir.
Some require modules that reside in the bioperl-run package.
Would it be wise to move these to the bioperl-run examples dir?

There may also be some problems with XML files in t/data e.g.
t/data/interpro_ebi.xml
There appears to be a typo on line 2. However, I'm not sure this is
up-to-date?

I can comment on the others later if required.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwV3KczuW2jkwy2gRApM/AJ9abWl02CAJqDK2sEXEUEg8nGRC4ACdHcAb
nZmh+1dmtc1W9mThkUVKitw=
=5eXZ
-----END PGP SIGNATURE-----

From amacgregor at ccg.murdoch.edu.au Tue Aug 14 01:14:58 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 13:14:58 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
 <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
 <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
Message-ID:

On 14/08/2007, at 11:21 AM, Chris Fields wrote:

> It looks like Heikki responded and thought a good place for it
> would be Bio::SeqIO, but it didn't go anywhere I suppose. I see
> that a few other posts suggest it could be placed in Bio::Cluster
> as well which I'm not familiar with. We could add it in if you
> were still interested, just need to find a good place for it; might
> be nice to have a Parse::RecDescent-based parser.
>
> chris
>

Hi Chris,

I was also doing some parsing of UniGene at the time but found
RecDescent was too slow and went back to regexes. That code found its
way into Bio::Cluster. Occasionally I see a message with someone looking
for a Homologene parser but not very often, so I'm not sure it is worth
the effort of moving the code into bioperl.

Cheers, Andrew.

From neetisomaiya at gmail.com Tue Aug 14 09:24:07 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 14 Aug 2007 18:54:07 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
 <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>

Hi Andrew,

I think the homologene data files on the ftp have changed from the ones
you used. They are now homologene.data and homologene.xml.
I tried using your parser, but because it was written for the file
hmlg.trip.ftp, it doesn't work anymore.

I came across a parser
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml .
I am looking at it to see if it works for me. Not sure if it will.

~Neeti.

On 8/14/07, Andrew Macgregor wrote:
>
> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of any Homologene parser, if available?
> > Please let me know.
> >
> > Thanks and Regards,
> > Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote a Homologene parser and posted it
> to the mailing list:
>
>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
>

--
-Neeti
Even my blood says, B positive

From bix at sendu.me.uk Tue Aug 14 10:57:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 15:57:29 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
 columns?
Message-ID: <46C1C2D9.6050409@sendu.me.uk>

I'm looking at what looks like a pretty major bug in Bio::SimpleAlign,
but before I commit the fix I wanted to check my sanity/understanding.
My understanding is that an alignment may be built from just sub-parts
of a number of sequences. So you give each sequence in the alignment a
start and stop so you can later map back the aligned region to the
original sequence.

So, for example, the following should all pass:

diff -r1.56 SimpleAlign.t
459a460,540
>
>
> # is _remove_col really working correctly?
> my $a = Bio::LocatableSeq->new(-id => 'a', -seq => 'atcgatcgatcgatcg', -start => 5, -end => 20);
> my $b = Bio::LocatableSeq->new(-id => 'b', -seq => '-tcgatc-atcgatcg', -start => 30, -end => 43);
> my $c = Bio::LocatableSeq->new(-id => 'c', -seq => 'atcgatcgatc-atc-', -start => 50, -end => 63);
> my $d = Bio::LocatableSeq->new(-id => 'd', -seq => '--cgatcgatcgat--', -start => 80, -end => 91);
> my $e = Bio::LocatableSeq->new(-id => 'e', -seq => '-t-gatcgatcga-c-', -start => 100, -end => 111);
> $aln = Bio::SimpleAlign->new();
> $aln->add_seq($a);
> $aln->add_seq($b);
> $aln->add_seq($c);
>
> my $gapless = $aln->remove_gaps();
> foreach my $seq ($gapless->each_seq) {
>     if ($seq->id eq 'a') {
>         is $seq->start, 6;
>         is $seq->end, 19;
>         is $seq->seq, 'tcgatcatcatc';
>     }
>     elsif ($seq->id eq 'b') {
>         is $seq->start, 30;
>         is $seq->end, 42;
>         is $seq->seq, 'tcgatcatcatc';
>     }
>     elsif ($seq->id eq 'c') {
>         is $seq->start, 51;
>         is $seq->end, 63;
>         is $seq->seq, 'tcgatcatcatc';
>     }
> }
>
> $aln->add_seq($d);
> $aln->add_seq($e);
> $gapless = $aln->remove_gaps();
> foreach my $seq ($gapless->each_seq) {
>     if ($seq->id eq 'a') {
>         is $seq->start, 8;
>         is $seq->end, 17;
>         is $seq->seq, 'gatcatca';
>     }
>     elsif ($seq->id eq 'b') {
>         is $seq->start, 32;
>         is $seq->end, 40;
>         is $seq->seq, 'gatcatca';
>     }
>     elsif ($seq->id eq 'c') {
>         is $seq->start, 53;
>         is $seq->end, 61;
>         is $seq->seq, 'gatcatca';
>     }
>     elsif ($seq->id eq 'd') {
>         is $seq->start, 81;
>         is $seq->end, 90;
>         is $seq->seq, 'gatcatca';
>     }
>     elsif ($seq->id eq 'e') {
>         is $seq->start, 101;
>         is $seq->end, 110;
>         is $seq->seq, 'gatcatca';
>     }
> }
>
> my $f
= Bio::LocatableSeq->new(-id => 'f', -seq => 'a-cgatcgatcgat-g', -start => 30, -end => 43);
> $aln = Bio::SimpleAlign->new();
> $aln->add_seq($a);
> $aln->add_seq($f);
>
> $gapless = $aln->remove_gaps();
> foreach my $seq ($gapless->each_seq) {
>     if ($seq->id eq 'a') {
>         is $seq->start, 5;
>         is $seq->end, 20;
>         is $seq->seq, 'acgatcgatcgatg';
>     }
>     elsif ($seq->id eq 'f') {
>         is $seq->start, 30;
>         is $seq->end, 43;
>         is $seq->seq, 'acgatcgatcgatg';
>     }
> }

But they don't. Once you remove certain columns the start and stop of
the sequences in the alignment are no longer correct coordinates for the
sub-sequence in the original sequence.

I propose the following patch to resolve this issue:

diff -r1.136 SimpleAlign.pm
1116c1116,1118
<
---
>
>     my $gap = $self->gap_char;
>
1129,1137c1131,1147
<             my $spliced;
<             $spliced .= $start > 0 ? substr($sequence,0,$start) : '';
<             $spliced .= substr($sequence,$end+1,$seq->length-$end+1);
<             $sequence = $spliced;
<             if ($start == 1) {
<                 $new_seq->start($end);
<             }
<             else {
<                 $new_seq->start( $seq->start);
---
>             my $orig = $sequence;
>             my $head = $start > 0 ? substr($sequence, 0, $start) : '';
>             my $tail = ($end + 1) >= length($sequence) ? '' : substr($sequence, $end + 1);
>             $sequence = $head.$tail;
>             # start
>             unless (defined $new_seq->start) {
>                 if ($start == 0) {
>                     my $start_adjust = () = substr($orig, 0, $end + 1) =~ /$gap/g;
>                     $new_seq->start($seq->start + $end + 1 - $start_adjust);
>                 }
>                 else {
>                     my $start_adjust = $orig =~ /$gap+/;
>                     if ($start_adjust) {
>                         $start_adjust = $+[0] - 1 < $start;
>