[Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blastoutput
Joel Steele
injunjoel at hotmail.com
Thu Feb 9 00:54:26 UTC 2006
Greetings,
I'm not well versed in Bio::SearchIO but there are a few comments about your
code that may or may not be relevant...
first thing:
=-=-=-=-=code snippet=-=-=-=-=
#!/usr/bin/perl -w
use strict; #save yourself the headaches and force yourself to write
            #clean code.
=-=-=-=-=code snippet=-=-=-=-=
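To show what strict buys you, here is a minimal sketch (not taken from your script). A misspelled variable becomes a compile-time error instead of a silent undef. The helper name compile_error_for and the variable names are made up for illustration, and the string eval is only there so the error can be captured without killing the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compile a snippet of Perl under strict and return the error text,
# or an empty string if the snippet compiled and ran cleanly.
# (compile_error_for is a hypothetical helper, for illustration only.)
sub compile_error_for {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# $cutof_len is a deliberate typo for $cutoff_len.
my $error = compile_error_for('my $cutoff_len = 10; my $x = $cutof_len + 1');
print "strict caught: $error" if $error;
```

Without strict the typo would quietly evaluate to undef and the comparison against the cutoff would misbehave with no hint as to why.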
next thing:
when you are reading the files from the directory you are not doing any sort
of filtering as to what is returned. Your readdir(DIR) call also returns the
'.' and '..' entries. On top of that, readdir returns bare file names without
the directory part, so -file => $file only finds the file if the script runs
from inside that directory. I would suggest placing a grep in there somewhere
to get only blast files.
something like:
=-=-=-=-=code snippet=-=-=-=-=
#assuming the file extension for blast files is .bls
#-e and -f are filetests; -f alone is enough, since a plain file
#must exist. Here is a link for reference on the filetests available in Perl.
#
# http://www.perlmonks.org/?node_id=370
#
#readdir returns bare names, so prepend $directory before testing the file
my @files_to_parse = grep { /\.bls$/ && -f "$directory/$_" } readdir(DIR);
closedir(DIR);
#then proceed with your foreach but over @files_to_parse
foreach my $file (@files_to_parse) {
    #do cool stuff here, e.g. pass "$directory/$file" as -file to Bio::SearchIO
}
=-=-=-=-=code snippet=-=-=-=-=
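If you would rather keep the directory handling in one place, the same idea can be sketched as a small sub that hands back full paths. This is only a sketch under assumptions: the sub name blast_files_in is made up, and the .bls extension is a guess at how your blast files are named.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return the full paths of the plain files in $dir whose names end in $ext.
# blast_files_in is a hypothetical helper; ".bls" is an assumed extension.
sub blast_files_in {
    my ($dir, $ext) = @_;
    $ext = '.bls' unless defined $ext;

    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @paths;
    while (defined(my $name = readdir($dh))) {
        # skips '.', '..' and anything without the wanted extension
        next unless $name =~ /\Q$ext\E$/;
        # readdir gives bare names, so build the full path before testing
        my $path = File::Spec->catfile($dir, $name);
        push @paths, $path if -f $path;
    }
    closedir($dh);

    @paths = sort @paths;
    return @paths;
}

# Usage: list the matching files of a directory given on the command line.
if (@ARGV) {
    print "$_\n" for blast_files_in($ARGV[0]);
}
```

Each returned path can then go straight into the -file argument of your Bio::SearchIO constructor, no matter which directory the script is started from.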
Hope that helps.
-Joel Steele
"The surest way to corrupt a youth is to instruct him to hold in higher
regard those who think alike than those who think differently." -Nietzsche
"I do not feel obliged to believe that the same God who endowed us with
sense, reason and intellect has intended us to forego their use." -Galileo
>From: Hubert Prielinger <hubert.prielinger at gmx.at>
>To: Chris Fields <cjfields at uiuc.edu>, bioperl-l at bioperl.org,
>rahall2 at ualr.edu
>Subject: Re: [Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing
>Blastoutput
>Date: Wed, 08 Feb 2006 16:22:44 -0600
>
>hi,
>I have installed the Core, Run and Ext packages from the following page:
>http://news.open-bio.org/archives/2005_10.html
>I'm using only SearchIO, without the remoteblast module, because I
>already have all my Blast output files.
>My operating system is fedora core 9.
>
>Code:
>
>#!/usr/bin/perl -w
>
>use Bio::SearchIO;
>
>print "start program\n";
>my $directory =
>"/home/Hubert/installed/eclipse/workspace/Database_Search/result_4";
>opendir(DIR, $directory) || die("Cannot open directory");
>print "opened directory\n";
>
>foreach my $file (readdir(DIR)) {
>print "read file\n";
>
>my $search = new Bio::SearchIO (-format => 'blast',
> -file => $file);
>
>my $cutoff_len = 10;
>
>
>
>#iterate over each query sequence
>while (my $result = $search->next_result) {
>print "entered 1st while loop\n";
>
> #iterate over each hit on the query sequence
> while (my $hit = $result->next_hit) {
>
> #iterate over each HSP in the hit
> while (my $hsp = $hit->next_hsp) {
>
> if ($hsp->length('sbjct') <= $cutoff_len) {
> #print $hsp->hit_string, "\n";
> for ($hsp->hit_string) {
>
>
> if (tr/K// >= 2 || tr/R// >= 2 && tr/W// >= 2 ||
>tr/K// == 1 && tr/R// == 1 && tr/W// >= 2) {
>
> # Print some tab-delimited data about this HSP
>
> open (bigShot, ">>BlastOutputTrial.txt") ||
>die ("Could not open file. $!");
> #print $result->query_name, "\t";
>
># print $hit->significance, "\t";
> print bigShot $hit->name, "-->";
> print bigShot $hit->description, "\n";
> #print bigShot "Query: ",
>$hsp->start('query'), " ", $hsp->query_string, " ",
>$hsp->end('query'), "\n";
> print bigShot "Seq: ", $hsp->start('hit'),
>" ", $hsp->hit_string, " ", $hsp->end('hit'), "\n";
>
># print $hsp->rank, "\t";
># print $hsp->percent_identity, "\t";
># print $hsp->evalue, "\t";
># print $hsp->hsp_length, "\n";
>
> close (bigShot);
>
> };
>
>
> }
> }
> }
> }
>}
>
>}
>
>closedir(DIR);
>
>
>Chris Fields wrote:
>
> >Make sure you ran a full installation of bioperl-1.5.1 or bioperl-live
> >(not just the modules you want; mixing bioperl versions might work, but
> >you might run into interoperability problems). Then replace the
> >Bio::SearchIO::blast with the one in Bugzilla. The 'other option' you
> >mentioned might be trying XML instead of text, which is more stable in
> >the long run. You will still need to run a full upgrade to bioperl 1.5.1
> >for that; make sure you read this:
> >
> >http://bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast
> >
> >If you're using SearchIO directly instead of Remoteblast, you should be
> >able to set the '-readmethod' flag to 'blastxml'.
> >
> >It also wouldn't hurt to know what OS you're using or see some code.
> >Roger is out there somewhere (I think) and may also have some input.
> >
> >Christopher Fields
> >Postdoctoral Researcher - Switzer Lab
> >Dept. of Biochemistry
> >University of Illinois Urbana-Champaign
> >
> >
> >
> >>-----Original Message-----
> >>From: Hubert Prielinger [mailto:hubert.prielinger at gmx.at]
> >>Sent: Wednesday, February 08, 2006 3:41 PM
> >>To: Chris Fields; bioperl-l at bioperl.org
> >>Subject: Re: [Bioperl-l] bioperl 1.4 SearchIO doesn't work
> >>parsing Blast output
> >>
> >>hi chris,
> >>thanks, I have upgraded to version 1.5.1 but it still isn't
> >>working, do you have any other idea, the problem I have is
> >>that I have to parse a lot of textfiles....
> >>or shall I look for another option to parse those files...
> >>
> >>regards
> >>Hubert
> >>
> >>
> >>
> >>Chris Fields wrote:
> >>
> >>
> >>
> >>>My guess is you're running into text parsing problems in
> >>>Bio::SearchIO::blast. Upgrade to the latest developer version (1.5.1)
> >>>or bioperl-live (CVS), then see the bug below.
> >>>
> >>>http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> >>>
> >>>I think the first problem you ran into is solved in bioperl 1.5.1, the
> >>>last problem (more recent, not related to the first) has been fixed but
> >>>hasn't been committed to bioperl-live yet. The fixed SearchIO::blast
> >>>is available in the link above, but realize it hasn't been committed
> >>>yet and may change.
> >>>
> >>>Christopher Fields
> >>>Postdoctoral Researcher - Switzer Lab
> >>>Dept. of Biochemistry
> >>>University of Illinois Urbana-Champaign
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>-----Original Message-----
> >>>>From: bioperl-l-bounces at lists.open-bio.org
> >>>>[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hubert
> >>>>Prielinger
> >>>>Sent: Wednesday, February 08, 2006 2:52 PM
> >>>>To: bioperl-l at bioperl.org
> >>>>Subject: [Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blast
> >>>>output
> >>>>
> >>>>Hi,
> >>>>If I want to parse a Blast Output (Version 2.2.12) with Bio::SearchIO,
> >>>>I get the following error message:
> >>>>
> >>>>MSG: no data for midline Query 1 WWWKWRW 7
> >>>>STACK Bio::SearchIO::blast::next_result
> >>>>/usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
> >>>>STACK toplevel
> >>>>/home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21
> >>>>
> >>>>is that a bug......
> >>>>
> >>>>If I want to parse Blast Output (version 2.2.13), I don't get
> >>>>anything.....
> >>>>I'm using bioperl 1.4
> >>>>
> >>>>before I installed bioperl 1.4, it worked fine parsing Blast
> >>>>Output (version 2.2.12), but I don't remember which bioperl version I
> >>>>had installed
> >>>>
> >>>>thanks in advance
> >>>>
> >>>>Hubert
> >>>>
> >>>>
> >>>>
> >>>>_______________________________________________
> >>>>Bioperl-l mailing list
> >>>>Bioperl-l at lists.open-bio.org
> >>>>http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >
> >
> >
> >
>