An Error Occurred

 element
##    in an HTML page.

sub remoteBLAST
  {
    my $input = $_[0];  # a BioPerl Seq object
    my $fh    = $_[1];  # a file handle
    my $v = 1;          #  'verbose';
    my $prog = 'blastn';
    my $db   = 'nr';
    my $e_val= '1e-2';

    my @params = ( -prog => $prog,
		   -data => $db,
		   -expect => $e_val,
		   -readmethod => 'SearchIO',
		   -report_type => 'blastn',
		   -m => '3');

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    my $r = $factory->submit_blast($input);
    print $fh $r->table();

    print STDERR "waiting..." if( $v > 0 );

    while ( my @rids = $factory->each_rid ) {
      foreach my $rid ( @rids ) {
	my $rc = $factory->retrieve_blast($rid);
	if( !ref($rc) ) {
	  if( $rc < 0 ) {
	    $factory->remove_rid($rid);
	  }
	  print STDERR "." if ( $v > 0 );
	  sleep 10;
	} else {
	  my $result = $rc->next_result();

	  $factory->remove_rid($rid);
	  print $fh "\nQuery: ", $result->query_name(), "\t",
	  $result->query_description(),"\n";
	  while ( my $hit = $result->next_hit ) {

	    print $fh "\thit:", $hit->name, "\t",
	      $hit->accession(), "\t", $hit->description(),"\t";

	    while( my $hsp = $hit->next_hsp ) {
	      print $fh "\te-val is ", $hsp->evalue, "\n";
	      last;  ## Just print most significant e value
	    }
	  }
	}
      }
    }
  }

From jason.stajich at duke.edu  Tue Jan 10 14:02:52 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue Jan 10 13:59:19 2006
Subject: [Bioperl-l] Fwd: [Bioperl-guts-l] Can BPLite parse HTML formatted
	BLAST output
References: <48256290-D78A-47F6-8E0A-5F3F411E09DC@duke.edu>
Message-ID: 

Forwarding to the main bioperl list too so others can see.

Begin forwarded message:

> From: Jason Stajich 
> Date: January 10, 2006 2:02:14 PM EST
> To: Richard Francis 
> Cc: bioperl-guts-l@bioperl.org
> Subject: Re: [Bioperl-guts-l] Can BPLite parse HTML formatted BLAST  
> output
>
>
> http://bioperl.org/Core/Latest/faq.html#Q3.8
>
> But do not rely on this always working in the future. NCBI has made
> it explicit that they cannot promise to keep the HTML output that
> comes from the website BLAST server parseable. According to them,
> XML is the only blessed format that is guaranteed to always be
> consistently parseable.
>
> -jason
> On Jan 10, 2006, at 1:21 PM, Richard Francis wrote:
>
>> Dear all,
>>
>> I currently use BPLite to parse my text based BLAST outputs.
>> I was wondering if BPLite or another BioPerl tool can parse an HTML
>> formatted BLAST output to pull out the same types of information.
>>
>> Kind regards and many thanks for any help in advance,
>>
>> Richard Francis
>>
>>
>> _______________________________________________
>> Bioperl-guts-l mailing list
>> Bioperl-guts-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-guts-l
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From christoph.gille at charite.de  Tue Jan 10 16:03:37 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue Jan 10 16:07:29 2006
Subject: [Bioperl-l] internet proxy 
Message-ID: <47893.192.168.220.203.1136927017.squirrel@webmail.charite.de>

Please excuse my stupid question:
How do I tell BioPerl that I have an HTTP proxy?

I am trying Bio/Tools/Analysis/Protein/Sopma.pm,
which computes secondary structure predictions using an HTTP server.
Since I do not have direct Internet access, I've set the variables
http_proxy and HTTP_PROXY to the proxy server.

Linux$ echo  $http_proxy $HTTP_PROXY
http://realproxy.charite.de:888 http://realproxy.charite.de:888

But Sopma is not able to connect to the Internet.
At home with direct Internet it worked fine.
All other Internet programs can cope with the proxy without problem.
For example wget http://www.google.de fetches the data.

Sopma will be my first test case of how to bring Java and BioPerl together.

Many thanks Christoph

From torsten.seemann at infotech.monash.edu.au  Tue Jan 10 21:20:02 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue Jan 10 21:17:19 2006
Subject: [Bioperl-l] internet proxy
In-Reply-To: <47893.192.168.220.203.1136927017.squirrel@webmail.charite.de>
References: <47893.192.168.220.203.1136927017.squirrel@webmail.charite.de>
Message-ID: <43C46B52.7060600@infotech.monash.edu.au>

Dr. Christoph Gille wrote:
> Please excuse my stupid question:
> How do I tell BioPerl that I have an HTTP proxy?
> I am trying Bio/Tools/Analysis/Protein/Sopma.pm,
> which computes secondary structure predictions using an HTTP server.
> Since I do not have direct Internet access, I've set the variables
> http_proxy and HTTP_PROXY to the proxy server.
> But Sopma is not able to connect to the Internet.
> At home with direct Internet it worked fine.
> All other Internet programs can cope with the proxy without problem.
> For example wget http://www.google.de fetches the data.

Sopma.pm ISA Bio::WebAgent ISA LWP::UserAgent.
LWP::UserAgent does the actual HTTP work.
Try:

  my $sopma = Bio::Tools::Analysis::Protein::Sopma->new( ... );
  $sopma->env_proxy;  # tell LWP::UserAgent to use proxy env. vars.
  $sopma->run;

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
From iain.m.wallace at gmail.com  Wed Jan 11 07:01:24 2006
From: iain.m.wallace at gmail.com (Iain Wallace)
Date: Wed Jan 11 07:05:09 2006
Subject: [Bioperl-l] Problem with Graphics
In-Reply-To: <4BF51350-BB17-4A50-9ABD-357EE920439F@duke.edu>
References: <8cff3eb80512140759w1f434423t4cbd4939e5fe798c@mail.gmail.com>
	<4BF51350-BB17-4A50-9ABD-357EE920439F@duke.edu>
Message-ID: <8cff3eb80601110401o508a89e0n5991d5fb6124f483@mail.gmail.com>

Thanks Jason,
That works for me even though I am on linux (not sure why, but it does)
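
One possible explanation, offered as a guess rather than a diagnosis: binmode() also removes any I/O layers on the handle (for example a UTF-8 layer that some Perl 5.8 builds pick up from the locale), not just the CRLF translation used on Windows. A minimal sketch of writing binary data safely follows; the byte string and the temporary file stand in for $panel->png and the real output file, and are purely illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for $panel->png: the 8-byte PNG file signature.
my $png_bytes = "\x89PNG\r\n\x1a\n";

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);              # raw bytes: no CRLF translation, no encoding layer
print {$out} $png_bytes;
close($out) or die "close failed: $!";

# Read the file back in binary mode to confirm nothing was altered.
open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);
my $read_back = do { local $/; <$in> };
close($in);

print "bytes written: ", length($png_bytes),
      ", bytes on disk: ", length($read_back), "\n";
```

The same pattern applies to STDOUT: call binmode(STDOUT) once, before the first print of image data.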

Iain

On 12/14/05, Jason Stajich  wrote:
> Are you on windows?
>
> Try adding this before calling print.
> binmode (STDOUT);
>
> On Dec 14, 2005, at 10:59 AM, Iain Wallace wrote:
>
> > Hi,
> >
> > I am trying to use the Bio::Graphics module, but am unable to view my
> > output file. When I try to view the file I am told the file is
> > corrupt.
> >
> > Below is the code that I tried and it seems to work (i.e. it doesn't
> > crash and generates an output file)
> >
> > Unfortunately I have no idea what the error could be.
> > Any help/pointers would be greatly appreciated
> >
> > Thanks
> >
> > Iain
> > ---- Code from the How To ---
> >
> > #!/usr/bin/perl
> >
> >     use strict;
> >
> >     use Bio::Graphics;
> >     use Bio::SeqFeature::Generic;
> >
> >   my $panel = Bio::Graphics::Panel->new(-length => 1000,-width  =>
> > 800);
> >     my $track = $panel->add_track(-glyph => 'generic',-label  => 1);
> >
> >     while (<>) { # read blast file
> >       chomp;
> >       next if /^\#/;  # ignore comments
> >      my($name,$score,$start,$end) = split /\t+/;
> >      my $feature =
> > Bio::SeqFeature::Generic->new(-display_name=>$name,-score=>$score,
> >                                                  -start=>$start,-
> > end=>$end);
> >      $track->add_feature($feature);
> >    }
> >
> >    print $panel->png;
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
>
>

From akarger at CGR.Harvard.edu  Wed Jan 11 16:13:41 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Wed Jan 11 16:20:55 2006
Subject: [Bioperl-l] blast output -> blast -m8 output
Message-ID: <339D68B133EAD311971E009027DC479703DB404F@montecarlo.cgr.harvard.edu>

> From: Jason Stajich [mailto:jason.stajich@duke.edu] 
> 
> The existing search2table script in scripts/searchio does this for  
> you - I don't think there is a writer plugin but there could be.

Ah nice. but:
-------------------
>perl bioperl-1.5.0-RC1/scripts/searchio/search2table.PLS seqs.blp > zzz
>more zzz
Bacteriophage_1[M19348] ref|NP_037061.1|        40.32   62      27      4
28      89      1050    1107    6e-05   46.6
Bacteriophage_1[M19348] ref|XP_193814.5|        48.89   45      16      6
57      95      320     364     0.001   42.7
Bacteriophage_1[M19348] ref|XP_912463.1|        48.89   45      16      6
57      95      866     910     0.001   42.7
Bacteriophage_1[M19348] ref|XP_619329.2|        48.89   45      16      6
57      95      676     720     0.001   42.7
C.elegans_1_[Z49071]    ref|XP_917828.1|        29.61   412     183     48
40      410     52      456     6e-43   173
C.elegans_1_[Z49071]    gb|AAI10184.1|  31.99   347     147     23      40
373     53      389     6e-42   169
>more seqs.m8
Bacteriophage_1[M19348] gi|6978677|ref|NP_037061.1|     40.32   62      33
1       28      89      1050    1107    6e-05   46.6
Bacteriophage_1[M19348] gi|82958039|ref|XP_193814.5|    48.89   45      17
1       57      95      320     364     0.001   42.7
Bacteriophage_1[M19348] gi|82958037|ref|XP_912463.1|    48.89   45      17
1       57      95      866     910     0.001   42.7
Bacteriophage_1[M19348] gi|82957449|ref|XP_619329.2|    48.89   45      17
1       57      95      676     720     0.001   42.7
C.elegans_1_[Z49071]    gi|82802536|ref|XP_917828.1|    29.61   412     242
9       40      410     52      456     6e-43    173
C.elegans_1_[Z49071]    gi|82571607|gb|AAI10184.1|      31.99   347     213
11      40      373     53      389     6e-42    169
-----------------

I know we can't get around the problem of the IDs, since blast & blast -m8
give different IDs. But columns 5 and 6 (mismatches, gap openings) are
consistently different. Is search2table not trying to mimic -m8 exactly, or
is this a bug?
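
For reference while comparing the two outputs, the twelve tab-separated columns of -m8 output can be named as in the sketch below. The hash keys are my own labels, not identifiers from BLAST or BioPerl, and the sample line is the first seqs.m8 row shown above.

```perl
use strict;
use warnings;

# Column labels for BLAST -m8 tabular output, in order (names are illustrative).
my @M8_COLUMNS = qw(query subject pct_identity aln_length mismatches
                    gap_openings q_start q_end s_start s_end evalue bit_score);

# Split one -m8 line into a hash keyed by the labels above.
sub parse_m8_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    die "expected 12 columns, got " . scalar(@fields) . "\n"
        unless @fields == 12;
    my %row;
    @row{@M8_COLUMNS} = @fields;
    return \%row;
}

my $row = parse_m8_line(
    join "\t", 'Bacteriophage_1[M19348]', 'gi|6978677|ref|NP_037061.1|',
               qw(40.32 62 33 1 28 89 1050 1107 6e-05 46.6)
);
printf "%s vs %s: mismatches=%d gap_openings=%d\n",
    @{$row}{qw(query subject mismatches gap_openings)};
```

Running both files through something like this makes it easy to diff only the mismatches and gap_openings fields.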

Apologies if this is due to using bioperl 1.4 and the PLS script from
1.5.0-RC1. That's what I have on hand.

> 
> Note that if you just using BLAST you will find that the blast2table  
> script that is included in the BLAST book (see the O'Reilly website  
> for the book and download the code examples) will also generate this  
> sort of thing for you and will be many times faster than SearchIO  
> code. 

I could steal that. But I was thinking that if NCBI changes the BLAST
format, bioperl may upgrade while the dead trees code won't.

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
617-496-0626

> There is also an equivalent hmmer_to_table and  
> fastam9_to_table which are very fast re-formatters that don't  
> actually use SearchIO since one is just trying to get the 
> very simple  
> data out.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12/
> 
> 
From cjfields at uiuc.edu  Wed Jan 11 16:26:18 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Jan 11 16:34:17 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: <000001c612f7$5f1a77b0$15327e82@pyrimidine>
Message-ID: <000001c616f5$a92e24d0$15327e82@pyrimidine>

Hilmar, 

As an update on what's going on:

I've run into a few problems with load_seqdatabase.pl and bioperl-db on
cygwin which I'll try to hash through this week; I'll post if I can't figure
it out soon.  It's not as buggy as trying to run it using the latest
ActivePerl on WinXP, but it still has issues.  

I'm also looking through the ActiveState documentation for the latest
version of perl they have (5.8.7), which I am running.  AFAIK, they enable
dynamic loading when building.  I'll send them an email directly to see what
they say.  There may be some Win32-specific way of configuring a script for
dynamic loading of perl modules which isn't needed in other environments. 

There was also this previous email on bioperl-l:

http://portal.open-bio.org/pipermail/bioperl-l/2005-May/018937.html

Baohua Wang seemed to narrow it down somewhat, but I'm not sure if changing
the modules is a solution until I figure out why he made the changes.  They
seem mainly geared towards getting load_seqdatabase to work with MsSQL, but
if he got it to work on Windows, then he may be onto something.  The
modified Bio* modules can be found at:

ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows 

I'll check them out to see if they work out and see what specific
modifications he made (they're not detailed).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris Fields
Sent: Friday, January 06, 2006 1:28 PM
To: 'Hilmar Lapp'
Cc: bioperl-l@portal.open-bio.org
Subject: RE: [Bioperl-l] error running load_seqdatabase.pl

I'll try installing bioperl-db using Cygwin.  I know that I can connect to
the native Windows mysql database from inside cygwin, so perhaps this will
do as a short term workaround.  I'll also try using a different native win32
Perl version (maybe 5.6) and look into the dynamic loading issue.  I know
that the AS Perl has given errors like this before and not had problems (I
think it was also cranky with older versions of bioperl), but this one is
pretty serious.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
-----Original Message-----
From: Hilmar Lapp [mailto:hlapp@gmx.net] 
Sent: Friday, January 06, 2006 12:02 PM
To: Chris Fields
Cc: bioperl-l@portal.open-bio.org
Subject: Re: [Bioperl-l] error running load_seqdatabase.pl

On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:

> Hilmar,
>
> Did this ever get resolved?  I tried to reinstall a biosql database using
> bioperl-db and got the same problems.  I'll list out everything I ran into
> and what I plan on trying, as it's been a long time since I've tried this.
>
> Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and MySQL 
> 4.1.14.
> Using nmake and installing worked fine.  Loading the biosql schema and
> loading taxonomy info also worked fine, although I had to manually 
> untar the
> taxonomy archive so load_ncbi_taxonomy.pl could find the files (stupid
> windows).  However, this is what happens when using 
> load_seqdatabase.pl:
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -dbuser root
> NP_249092.gpt
> Loading NP_249092.gpt ...
> Undefined subroutine &Bio::Root::Root::debug called at
> C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line 1537, 
> 
> line 65.
>
> If I removed all args except the sequence file, it gives the same 
> response,
> which means it happens before the connection is made to the database:
>

This happens indeed before a connection is made because it happens at 
the point it tries to dynamically load the BioSQL driver for the 
adaptor:

	$self->debug("attempting to load driver for adaptor class
$class\n");

The BioSQL driver is loaded before the DBD driver is loaded.

The module in which this happens (i.e., the persistence adaptor) has 
been loaded dynamically as well.

Bio::Root::Root is in the 'use' statements, and the debug() method 
clearly exists. I'm at a loss as to why perl complains on certain 
Windows platforms. If somebody can tell me what, if anything, can be 
done to make this work on those platforms too I'll be glad to implement 
it.
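
For readers unfamiliar with the term, loading a module dynamically means requiring it by a name that is only known at run time. A generic sketch of that idiom follows. It is not the bioperl-db code, and load_class is a hypothetical helper written for this illustration.

```perl
use strict;
use warnings;

# Try to load a module whose name is only known at run time.
# Returns 1 on success and 0 if the module cannot be loaded.
sub load_class {
    my ($class) = @_;
    die "bad class name '$class'\n"
        unless $class =~ /\A\w+(?:::\w+)*\z/;
    (my $file = "$class.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $class (qw(File::Spec No::Such::Module::Here)) {
    printf "%-24s %s\n", $class,
        load_class($class) ? "loaded" : "not available";
}
```

When the load fails, $@ holds the reason, which is usually the first thing worth printing when debugging a problem like the one above.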

> [...]
> Here's the error messages from that first test (warning it's very 
> messy):
>
> C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0, 
> 'bl
> ib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t t\08genbank.t
> t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t t\13remove.t
> t\14query.t t\15cluster.t
> t\01dbadaptor.....ok 1/19Subroutine new redefined at
> [...]
> Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line 
> 356.

So obviously it is there, right? So why doesn't perl see it a minute 
later?

> [...]
> I'll end with that.  At this moment, I can't see it working with the 
> current
> setup.  I was using perl 5.8 with the old setup but I upgraded mysql 
> at some
> point when working with gbrowse (I can't remember what the old version 
> was);
> I'll try upgrading to the newest ActiveState version to see what 
> happens.
> Could it be the MySQL version?

I don't think it has anything to do with the MySQL version, or the DBD 
driver for that matter. Instead, it looks like an issue with dynamic 
loading of perl modules on your particular platform.

	-hilmar

>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From cain at cshl.edu  Thu Jan 12 11:34:34 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu Jan 12 11:31:28 2006
Subject: [Bioperl-l] Patch for GFF.pm for BioPerl 1.5.1
Message-ID: <1137083674.3033.35.camel@localhost.localdomain>

Hello,

I created a patch for GFF.pm that is in BioPerl 1.5.1.  This fixes a bug
that caused GFF file loading to fail when using bp_load_gff.pl.  The
patch is now part of the GBrowse 1.64 release:

  http://sourceforge.net/project/showfiles.php?group_id=27707&package_id=34513

and there are release notes as part of the release that describe how to
apply the patch:

  http://sourceforge.net/project/shownotes.php?release_id=374912&group_id=27707

Thanks to Don Gilbert and Jason Stajich for pointing out what needed to
be patched and Tobias Straub for insisting that there really was
something wrong even though it took me a long time to see it.

Sorry for the hassle,
Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From golharam at umdnj.edu  Thu Jan 12 13:38:50 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Thu Jan 12 14:34:28 2006
Subject: [Bioperl-l] Current Version on Web Site?
Message-ID: <002001c617a7$6f89ffd0$2f01a8c0@GOLHARMOBILE1>

What is the official current version of BioPerl?  1.5.1?

The website still has a lot of stuff pointing to 1.4 as the current
version...

Ryan

From bmoore at genetics.utah.edu  Thu Jan 12 14:45:29 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Thu Jan 12 14:40:26 2006
Subject: [Bioperl-l] Current Version on Web Site?
Message-ID: 

1.4 is the current "stable" release.  1.6 and all even numbered releases
will be the stable releases, and 1.5 and all odd numbered releases are
developer releases.
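
Expressed as a rule of thumb in code, assuming the version string has the form major.minor[.patch] (is_stable_release is a name made up for this sketch, not a BioPerl function):

```perl
use strict;
use warnings;

# True when the minor version number is even (stable series),
# false when it is odd (developer series).
sub is_stable_release {
    my ($version) = @_;
    my ($major, $minor) = $version =~ /\A(\d+)\.(\d+)/
        or die "unrecognised version string '$version'\n";
    return $minor % 2 == 0 ? 1 : 0;
}

for my $v (qw(1.4 1.5.1 1.6)) {
    printf "%-6s %s\n", $v, is_stable_release($v) ? "stable" : "developer";
}
```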

Barry

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Ryan Golhar
> Sent: Thursday, January 12, 2006 11:39 AM
> To: 'bioperl-l'
> Subject: [Bioperl-l] Current Version on Web Site?
> 
> What is the official current version of BioPerl?  1.5.1?
> 
> The website still has a lot of stuff pointing to 1.4 as the current
> version...
> 
> Ryan

From cain at cshl.edu  Thu Jan 12 15:06:41 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu Jan 12 19:27:58 2006
Subject: [Bioperl-l] Current Version on Web Site?
In-Reply-To: 
References: 
Message-ID: <1137096401.3033.52.camel@localhost.localdomain>

While 1.4 is the current stable release, it is quite old.  Release 1.5.1
is fairly 'stable' for an unstable release, and is required for some
popular packages, like GBrowse.

Scott

On Thu, 2006-01-12 at 12:45 -0700, Barry Moore wrote:
> 1.4 is the current "stable" release.  1.6 and all even numbered releases
> will be the stable releases, and 1.5 and all odd numbered releases are
> developer releases.
> 
> Barry
> 
> > -----Original Message-----
> > From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> > bounces@portal.open-bio.org] On Behalf Of Ryan Golhar
> > Sent: Thursday, January 12, 2006 11:39 AM
> > To: 'bioperl-l'
> > Subject: [Bioperl-l] Current Version on Web Site?
> > 
> > What is the official current version of BioPerl?  1.5.1?
> > 
> > The website still has a lot of stuff pointing to 1.4 as the current
> > version...
> > 
> > Ryan
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From hubert.prielinger at gmx.at  Thu Jan 12 18:48:30 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Thu Jan 12 19:42:22 2006
Subject: [Bioperl-l] remoteblast
Message-ID: <43C6EACE.8010708@gmx.at>

Hi,
I encountered an error while my remoteblast script was running.
I have been using that script for two weeks, and all of a sudden I got the
following error message:

-------------------- WARNING ---------------------
MSG: req was POST http://www.ncbi.nlm.nih.gov/blast/Blast.cgi
User-Agent: bioperl-Bio_Tools_Run_RemoteBlast/1.5
Content-Length: 210
Content-Type: application/x-www-form-urlencoded

GAPCOSTS=9+1&DATABASE=nr&QUERY=%3E+%0AKWRRWKRR&COMPOSITION_BASED_STATISTICS=off&EXPECT=20000&WORD_SIZE=2&SERVICE=plain&FORMAT_OBJECT=Alignment&CMD=Put&MATRIX_NAME=PAM30&FILTER=L&DESCRIPTIONS=1000&PROGRAM=blastp

An Error Occurred

An Error Occurred
302 Found

---------------------------------------------------

I hope somebody can help....thanks in advance

regards
From hubert.prielinger at gmx.at  Thu Jan 12 18:57:16 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Thu Jan 12 19:51:08 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
Message-ID: <43C6ECDC.7050308@gmx.at>

Hello,
I want to know, if there is a possibility to get from a Blast Outputfile 
the whole Sequence of a protein not only the best local alignment...
for example:

 >ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica 
cultivar-group)]
dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica 
cultivar-group)]
         Length=95

Score = 24.1 bits (47),  Expect =   493
Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)

Query  2   KKRRRWW  8
                K+RRRWW
Sbjct  87  KRRRRWW  93

and now, if I parse the file, I want to get the whole Sequence of this 
hypothetical protein....is that possible with hsp for example, or any 
other way....

my second question is:
I do my blast search with bioperl and the remoteblast module.....each 
parameter is working very well, except the composition based statistics 
parameter....
it looks like that:

my $factory = 
$Bio::Tools::Run::RemoteBlast::HEADER{'COMPOSITION_BASED_STATISTICS'} = 
'yes';

it should work like that, but it doesn't....

Thanks for your help in advance......

regards
Hubert
From jbikandi at gmail.com  Wed Jan 11 13:07:03 2006
From: jbikandi at gmail.com (Joseba Bikandi)
Date: Thu Jan 12 20:55:30 2006
Subject: [Bioperl-l] Biophp.org released
Message-ID: <36b6cfbd0601111007k61aa33d3s15c20452730a8683@mail.gmail.com>

BioPHP.org has been released.
It is an open source project, and the basic idea is to create an online
repository of functions and minitools. Both types of code are editable
using a wiki-like service, so new code can be easily developed. Minitools
are one-page, copy-and-paste complete scripts intended to be used for basic
computation.
Due to the similarity between Perl and PHP, we encourage the Bioperl
community to visit our site and to participate in this project.

From jason.stajich at duke.edu  Thu Jan 12 20:50:33 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu Jan 12 21:05:27 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
In-Reply-To: <43C6ECDC.7050308@gmx.at>
References: <43C6ECDC.7050308@gmx.at>
Message-ID: <2CF48095-DF0E-4BB5-AAB8-3B8DBC813E76@duke.edu>

(please don't try and post to bioperl-announce, it is not for  
questions.)

On Jan 12, 2006, at 6:57 PM, Hubert Prielinger wrote:

> Hello,
> I want to know, if there is a possibility to get from a Blast  
> Outputfile the whole Sequence of a protein not only the best local  
> alignment...
> for example:
>
No. The parser can only return what is in the report file.
Use Bio::DB::GenPept to retrieve the sequence via the web or
(recommended) use a locally indexed sequence database like
Bio::DB::Fasta.
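
A sketch of the local-index route, assuming a FASTA file whose sequence IDs match the hit names in the report. The helper name full_hit_sequence and the file name nr.fasta are placeholders invented for this example.

```perl
use strict;
use warnings;

# Return the full sequence string for one hit ID from a local FASTA file,
# or undef if the ID is not in the index.
sub full_hit_sequence {
    my ($fasta_path, $hit_id) = @_;
    require Bio::DB::Fasta;                      # loaded on first use
    my $db = Bio::DB::Fasta->new($fasta_path);   # builds or reuses an index
    return $db->seq($hit_id);
}

# Usage (placeholders; needs BioPerl and a local FASTA file):
#   my $seq = full_hit_sequence('nr.fasta', $hit->name);
#   print defined $seq ? "$seq\n" : "not found\n";
```

In a real script the database object would be created once and reused for every hit rather than reopened on each call.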
> >ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica  
> cultivar-group)]
> dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica  
> cultivar-group)]
>         Length=95
>
> Score = 24.1 bits (47),  Expect =   493
> Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)
>
> Query  2   KKRRRWW  8
>                K+RRRWW
> Sbjct  87  KRRRRWW  93
>
> and now, if I parse the file, I want to get the whole Sequence of  
> this hypothetical protein....is that possible with hsp for example,  
> or any other way....
>
> my second question is:
> I do my blast search with bioperl and the remoteblast  
> module.....each parameter is working very well, except the  
> composition based statistics parameter....
> it looks like that:
>
> my $factory = $Bio::Tools::Run::RemoteBlast::HEADER 
> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>
uh no that is not how you would do it.
You can make it the default for any factories you use in the script
by doing this:
$Bio::Tools::Run::RemoteBlast::HEADER{'COMPOSITION_BASED_STATISTICS'} = 'yes';
then
$factory = Bio::Tools::Run::RemoteBlast->new();

  =OR=
Once you have a factory object you can set the parameter explicitly:
$factory->submit_parameter('COMPOSITION_BASED_STATISTICS', 'yes');

> it should work like that, but it doesn't....
>
> Thanks for your help in advance......
>
> regards
> Hubert

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Thu Jan 12 22:27:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu Jan 12 22:25:41 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: 
Message-ID: <000c01c617f1$3aeee610$15327e82@pyrimidine>

Looks like the below modification Baohua Wang made to Root.pm works.  I did
run into another weird issue, but I think it is a sequence formatting
problem.  I try loading in a file with protein sequences in GenPept format
(pulled from BLASTP output using Bio::DB::GenPept and saved in a file using
SeqIO) after changing Root.pm:
______________________________________________________________________

C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
****** -format genbank -safe NP_252217.gpt
Loading NP_252217.gpt ...

C:\Perl\Scripts>
______________________________________________________________________

Good!

The strangeness comes in when using Genpept seqs NOT passed through SeqIO
(pulled directly from NCBI, saved in a similar file).  Most sequences will
load, but a number of them will not:

______________________________________________________________________
C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
crackers
ol -format genbank -safe NP_249092.gpt
Loading NP_249092.gpt ...

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::DBLinkAdaptor (driver) failed, values were
("","HAMAPMF_00220","0") FKs ()
Column 'dbname' cannot be null
---------------------------------------------------
Could not store Q59712:
------------- EXCEPTION  -------------
MSG: create: object (Bio::Annotation::DBLink) failed to insert or to be
found by
 unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:208
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
STACK Bio::DB::Persistent::PersistentObject::store
C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:272
STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
C:/Perl/site/lib/Bio\DB\BioSQL\AnnotationCollectionAdaptor.pm:219
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:216
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
STACK Bio::DB::Persistent::PersistentObject::store
C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:272
STACK Bio::DB::BioSQL::SeqAdaptor::store_children
C:/Perl/site/lib/Bio\DB\BioSQL\SeqAdaptor.pm:226
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:216
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
STACK Bio::DB::Persistent::PersistentObject::store
C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:272
STACK (eval) C:\Perl\Scripts\load_seqdatabase.pl:620
STACK toplevel C:\Perl\Scripts\load_seqdatabase.pl:603

--------------------------------------

 at C:\Perl\Scripts\load_seqdatabase.pl line 633

....

at C:\Perl\Scripts\load_seqdatabase.pl line 633
Could not store AAU82296:
------------- EXCEPTION  -------------
MSG: create: object (Bio::Species) failed to insert or to be found by unique
key

STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:208
STACK Bio::DB::Persistent::PersistentObject::create
C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:245
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:171
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
STACK Bio::DB::Persistent::PersistentObject::store
C:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm:272
STACK (eval) C:\Perl\Scripts\load_seqdatabase.pl:620
STACK toplevel C:\Perl\Scripts\load_seqdatabase.pl:603

--------------------------------------

 at C:\Perl\Scripts\load_seqdatabase.pl line 633
______________________________________________________________________

I'll check them out to try and derive what the differences are.  I will also
pass the above file through SeqIO to see what happens.  I think it could be
some of the GenPept formatted stuff is clogging up the works since I saved
everything in Genbank format through SeqIO.  For now, though, bioperl-db on
Windows works!  Any idea why the 'throw' change works?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
-----Original Message-----
From: drycafe@gmail.com [mailto:drycafe@gmail.com] On Behalf Of Hilmar Lapp
Sent: Wednesday, January 11, 2006 5:13 PM
To: Chris Fields; Steve Chervitz
Cc: bioperl-l@portal.open-bio.org
Subject: Re: [Bioperl-l] error running load_seqdatabase.pl

Interesting. That posting didn't receive much attention did it. So he
states:

The script failed on throw() in loading Bio/Root/Root.pm on Windows.
The problem lines are those "throw $class (...".   After I put a comma
after $class, as in "throw $class, (...", the BioSQL tests and load
scripts succeeded.

Can anyone of those who wrote the Root exception and warning code
comment? Maybe Steve?

   -hilmar

On 1/11/06, Chris Fields  wrote:
> Hilmar,
>
> As an update on what's going on:
>
> I've run into a few problems with load_seqdatabase.pl and bioperl-db on
> cygwin which I'll try to hash through this week; I'll post if I can't
figure
> it out soon.  It's not as buggy as trying to run it using the latest
> ActivePerl on WinXP, but it still has issues.
>
> I'm also looking through the ActiveState documentation for the latest
> version of perl they have (5.8.7), which I am running.  AFAIK, they enable
> dynamic loading when building.  I'll send them an email directly to see
what
> they say.  There may be some Win32-specific way of configuring a script
for
> dynamic loading of perl modules which isn't needed in other environments.
>
> There was also this previous email on bioperl-l:
>
> http://portal.open-bio.org/pipermail/bioperl-l/2005-May/018937.html
>
> Baohua Wang seemed to narrow it down somewhat, but I'm not sure if
changing
> the modules is a solution until I figure out why he made the changes.
They
> seem mainly geared towards getting load_seqdatabase to work with MsSQL,
but
> if he got it to work on Windows, then he may be onto something.  The
> modified Bio* modules can be found at:
>
> ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows
>
> I'll check them out to see if they work out and see what specific
> modifications he made (they're not detailed).
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org
> [mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris Fields
> Sent: Friday, January 06, 2006 1:28 PM
> To: 'Hilmar Lapp'
> Cc: bioperl-l@portal.open-bio.org
> Subject: RE: [Bioperl-l] error running load_seqdatabase.pl
>
> I'll try installing bioperl-db using Cygwin.  I know that I can connect to
> the native Windows mysql database from inside cygwin, so perhaps this will
> do as a short term workaround.  I'll also try using a different native
win32
> Perl version (maybe 5.6) and look into the dynamic loading issue.  I know
> that the AS Perl has given errors like this before and not had problems (I
> think it was also cranky with older versions bioperl), but this one is
> pretty serious.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp@gmx.net]
> Sent: Friday, January 06, 2006 12:02 PM
> To: Chris Fields
> Cc: bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
>
>
> On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:
>
> > Hilmar,
> >
> > Did this ever get resolved?  I tried to reinstall a biosql database
> > using
> > bioperl-db and got the same problems.  I'll list out everything I ran
> > into
> > and what I pan on trying, as it's been a long time since I've tried
> > this.
> >
> > Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and MySQL
> > 4.1.14.
> > Using nmake and installing worked fine.  Loading the biosql schema and
> > loading taxonomy info also worked fine, although I had to manually
> > untar the
> > taxonomy archive so load_ncbi_taxonomy.pl could find the files (stupid
> > windows).  However, this is what happens when using
> > load_seqdatabase.pl:
> >
> > C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -dbuser root
> > NP_249092.gpt
> > Loading NP_249092.gpt ...
> > Undefined subroutine &Bio::Root::Root::debug called at
> > C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line 1537,
> > 
> > line 65.
> >
> > If I removed all args except the sequence file, it gives the same
> > response,
> > which means it happens before the connection is made to the database:
> >
>
> This happens indeed before a connection is made because it happens at
> the point it tries to dynamically load the BioSQL driver for the
> adaptor:
>
>         $self->debug("attempting to load driver for adaptor class
> $class\n");
>
> The BioSQL driver is loaded before the DBD driver is loaded.
>
> The module in which this happens (i.e., the persistence adaptor) has
> been loaded dynamically as well.
>
> Bio::Root::Root is in the 'use' statements, and the debug() method
> clearly exists. I'm at a loss as to why perl complains on certain
> Windows platforms. If somebody can tell me what, if anything, can be
> done to make this work on those platforms too I'll be glad to implement
> it.
>
> > [...]
> > Here's the error messages from that first test (warning it's very
> > messy):
> >
> > C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0,
> > 'bl
> > ib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> > t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t t\08genbank.t
> > t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t t\13remove.t
> > t\14query.t t\15cluster.t
> > t\01dbadaptor.....ok 1/19Subroutine new redefined at
> > [...]
> > Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line
> > 356.
>
> So obviously it is there, right? So why doesn't perl see it a minute
> later?
>
> > [...]
> > I'll end with that.  At this moment, I can't see it working with the
> > current
> > setup.  I was using perl 5.8 with the old setup but I upgraded mysql
> > at some
> > point when working with gbrowse (I can't remember what the old version
> > was);
> > I'll try upgrading to the newest ActiveState version to see what
> > happens.
> > Could it be the MySQL version?
>
> I don't think it has anything to do with the MySQL version, or the DBD
> driver for that matter. Instead, it looks like on issue with dynamic
> loading of perl modules on your particular platform.
>
>         -hilmar
>
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From hlapp at gmx.net  Thu Jan 12 23:28:14 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Jan 13 00:39:40 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: <000c01c617f1$3aeee610$15327e82@pyrimidine>
References: 
	<000c01c617f1$3aeee610$15327e82@pyrimidine>
Message-ID: 

On 1/12/06, Chris Fields  wrote:
> Looks like the below modification Baohua Wang made to Root.pm works.  I did
> run into another weird issue, but I think it is a sequence formatting
> problem.  I try loading in a file with protein sequences in GenPept format
> (pulled from BLASTP output using Bio::DB::GenPept and saved in a file using
> SeqIO) after changing Root.pm:
> ______________________________________________________________________
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
> ****** -format genbank -safe NP_252217.gpt
> Loading NP_252217.gpt ...
>
> C:\Perl\Scripts>
> ______________________________________________________________________
>
> Good!

Great! We'll have to test that adding that comma has no negative effect
on Unix platforms, but I suspect it is in fact required by the syntax,
and maybe perl on Windows is less lenient? Odd at any rate.
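
For readers unfamiliar with the construct under discussion: "throw $class
(...)" is Perl's indirect object syntax, which perlobj describes as relying
on parser heuristics, whereas "$class->throw(...)" is the explicit method
call. The sketch below uses hypothetical stand-in packages (My::Exception
and My::Root), not the actual Bio::Root::Root or Error.pm code, and only
illustrates the explicit arrow form. Whether that form would also resolve
the Windows failure is an assumption that has not been tested here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an Error.pm-style exception class.
package My::Exception;
sub new   { my ($class, %args) = @_; return bless { %args }, $class }
sub throw { my ($class, @args) = @_; die $class->new(@args) }
sub text  { return $_[0]{-text} }

# Hypothetical stand-in for a base class that defines its own throw().
package My::Root;
sub new { return bless {}, shift }
sub throw {
    my ($self, @args) = @_;
    my $class = 'My::Exception';
    # Explicit method call: the parser does not have to guess, as it
    # would with the indirect object form "throw $class (@args)".
    $class->throw(@args);
}

package main;

my $obj = My::Root->new;
eval { $obj->throw(-text => 'could not load driver') };
my $err = $@;
print ref($err), ": ", $err->text, "\n";
# prints "My::Exception: could not load driver"
```

The arrow form names the invocant and the method unambiguously, so it does
not depend on which subroutine names perl has already seen in the current
package at compile time.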

>
> The strangeness comes in when using Genpept seqs NOT passed through SeqIO
> (pulled directly from NCBI, saved in a similar file).  Most sequences will
> load, but a number of them will not:
>
> ______________________________________________________________________
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
> ****** -format genbank -safe NP_249092.gpt
> Loading NP_249092.gpt ...
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::DBLinkAdaptor (driver) failed, values were
> ("","HAMAPMF_00220","0") FKs ()
> Column 'dbname' cannot be null
> ---------------------------------------------------
> Could not store Q59712:

Are you sure you pulled this from NCBI using NP_249092 as the
accession? I'm asking because NP_249092 is a perfectly sane-looking
RefSeq record and in fact does not contain the string HAMAPMF, whereas
Q59712 is in reality a UniProt record moulded into GenPept format.
Some of its db_xrefs come out odd; for the one above (HAMAPMF_00220)
there is no dbname, most likely because dbname and accession got
concatenated, as they did for the InterPro db_xref that follows it.

So I don't think this is worrisome unless you insist you used the
NP_249092 entry ...

I would generally advise against taking Uniprot/Swissprot entries from
their GenPept reincarnation. The formats are incompatible in some
aspects (e.g., Swissprot, like EMBL, has first-level db_xrefs, whereas
GenBank format doesn't; instead it puts db_xrefs into the feature
table).
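
To make the empty dbname in the warning above concrete, here is a minimal
sketch of splitting a db_xref value of the form "dbname:accession". The
helper split_db_xref() is hypothetical and is not how BioPerl itself parses
these values; it only shows that a value without a colon leaves nothing to
use as the database name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a "dbname:accession"
# db_xref value into its two parts.
sub split_db_xref {
    my ($xref) = @_;
    if ($xref =~ /^([^:]+):(.+)$/) {
        return ($1, $2);
    }
    # No colon, so there is no database name to extract.
    return ('', $xref);
}

# The first value is made up for illustration; the second is the one
# reported in the warning.
for my $xref ('SomeDB:ABC123', 'HAMAPMF_00220') {
    my ($dbname, $accession) = split_db_xref($xref);
    printf "%-15s dbname=%-8s accession=%s\n",
        $xref, ($dbname eq '' ? '(empty)' : $dbname), $accession;
}
```

With the second value the dbname comes back empty, which matches the
("","HAMAPMF_00220","0") tuple that the insert rejected.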

> [...]
> at C:\Perl\Scripts\load_seqdatabase.pl line 633
> Could not store AAU82296:
> ------------- EXCEPTION  -------------
> MSG: create: object (Bio::Species) failed to insert or to be found by unique
> key

"uncultured archaeon GZfos13E1" is not something Bioperl will parse
correctly into the appropriate Bio::Species structure (not that I
would even know what that would have to look like ;).

However, if you preload your BioSQL instance with the NCBI taxonomy
database, then this is not a problem, because the species will be looked
up correctly by its NCBI taxon ID (which the genbank SeqIO parser
extracts from the feature table if it's there - and it is in this
case).

> [...]
> I'll check them out to try and derive what the differences are.  I will also
> pass the above file through SeqIO to see what happens.

Note that everything you pull down through Bio::DB::GenPept does get
parsed by Bio::SeqIO::genbank - if there is any difference it must be
because the input files aren't identical.

> I think it could be some of the GenPept formatted stuff is clogging up the works since I saved
> everything in Genbank format through SeqIO.

Ah - meaning you got the file by calling $seqio->write_seq($seq)?
That could cause its own problems (even though theoretically it
shouldn't, so if it does, it counts as a bug).

>  For now, though, bioperl-db on
> Windows works!  Any idea why the 'throw' change works?

No, no idea - but great that you found out.

   -hilmar

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From hlapp at gmx.net  Wed Jan 11 18:12:45 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Jan 13 02:52:28 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: <000001c616f5$a92e24d0$15327e82@pyrimidine>
References: <000001c612f7$5f1a77b0$15327e82@pyrimidine>
	<000001c616f5$a92e24d0$15327e82@pyrimidine>
Message-ID: 

Interesting. That posting didn't receive much attention, did it? So he states:

The script failed on throw() in loading Bio/Root/Root.pm on Windows.
The problem lines are those "throw $class (...".   After I put comma
after $class as "throw $class, (...", the BioSQL tests and load scripts
are succeeded

Can any of those who wrote the Root exception and warning code
comment? Maybe Steve?

   -hilmar

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From Steve_Chervitz at affymetrix.com  Fri Jan 13 05:25:34 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Fri Jan 13 05:34:15 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: 
Message-ID: 

Looks like the trouble occurs when Bio::Root::Root::throw() tries to call
Error::throw(). Perhaps there is some Windows-specific problem with
Error.pm? Can't say I've seen this before, since I don't use perl on
Windows.

Some things to try, in this order:

* Verify that Error.pm is installed for perl on your system.
* Try running t/Exception.t and the examples/root/exceptions[1-4].pl
  scripts and see if they produce the expected behavior.
* Try changing the 'throw $class ...' statements in Root.pm to
  'Error::throw $class ...'
* If Error.pm seems to be installed but isn't working right, either
  uninstall it or get in the habit of putting this line in your main
  scripts: INIT { $DONT_USE_ERROR=1; }
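
The first check in the list above could be sketched as follows. The helper
module_status() is a hypothetical name, and the script only reports whether
each module can be loaded by the perl that runs it; it does not tell you
whether Error.pm behaves correctly once loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: return the module's version string (or 'unknown')
# if it can be loaded, and undef if it cannot.
sub module_status {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(Error Bio::Root::Root)) {
    my $status = module_status($module);
    print "$module: ",
        (defined $status ? "loaded (version $status)" : "not available"),
        "\n";
}
```

Run it with the same perl.exe that runs load_seqdatabase.pl, since a
module installed for one perl is not necessarily visible to another.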

Steve

On Wed, 11 Jan 2006, Hilmar Lapp wrote:

> Date: Wed, 11 Jan 2006 15:12:45 -0800
> From: Hilmar Lapp 
> To: Chris Fields ,
>      Steve Chervitz 
> Cc: bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
>
> Interesting. That posting didn't receive much attention did it. So he states:
>
> 
> The script failed on throw() in loading Bio/Root/Root.pm on Windows.
> The problem lines are those "throw $class (...".   After I put comma
> after $class as "throw $class, (...", the BioSQL tests and load scripts
> are succeeded
> 
>
> Can anyone of those who wrote the Root exception and warning code
> comment? Maybe Steve?
>
>    -hilmar
>
> On 1/11/06, Chris Fields  wrote:
> > Hilmar,
> >
> > As an update on what's going on:
> >
> > I've run into a few problems with load_seqdatabase.pl and bioperl-db on
> > cygwin which I'll try to hash through this week; I'll post if I can't figure
> > it out soon.  It's not as buggy as trying to run it using the latest
> > ActivePerl on WinXP, but it still has issues.
> >
> > I'm also looking through the ActiveState documentation for the latest
> > version of perl they have (5.8.7), which I am running.  AFAIK, they enable
> > dynamic loading when building.  I'll send them an email directly to see what
> > they say.  There may be some Win32-specific way of configuring a script for
> > dynamic loading of perl modules which isn't needed in other environments.
> >
> > There was also this previous email on bioperl-l:
> >
> > http://portal.open-bio.org/pipermail/bioperl-l/2005-May/018937.html
> >
> > Baohua Wang seemed to narrow it down somewhat, but I'm not sure if changing
> > the modules is a solution until I figure out why he made the changes.  They
> > seem mainly geared towards getting load_seqdatabase to work with MsSQL, but
> > if he got it to work on Windows, then he may be onto something.  The
> > modified Bio* modules can be found at:
> >
> > ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows
> >
> > I'll check them out to see if they work out and see what specific
> > modifications he made (they're not detailed).
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> > -----Original Message-----
> > From: bioperl-l-bounces@portal.open-bio.org
> > [mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris Fields
> > Sent: Friday, January 06, 2006 1:28 PM
> > To: 'Hilmar Lapp'
> > Cc: bioperl-l@portal.open-bio.org
> > Subject: RE: [Bioperl-l] error running load_seqdatabase.pl
> >
> > I'll try installing bioperl-db using Cygwin.  I know that I can connect to
> > the native Windows mysql database from inside cygwin, so perhaps this will
> > do as a short term workaround.  I'll also try using a different native win32
> > Perl version (maybe 5.6) and look into the dynamic loading issue.  I know
> > that the AS Perl has given errors like this before and not had problems (I
> > think it was also cranky with older versions bioperl), but this one is
> > pretty serious.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> > -----Original Message-----
> > From: Hilmar Lapp [mailto:hlapp@gmx.net]
> > Sent: Friday, January 06, 2006 12:02 PM
> > To: Chris Fields
> > Cc: bioperl-l@portal.open-bio.org
> > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> >
> >
> > On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:
> >
> > > Hilmar,
> > >
> > > Did this ever get resolved?  I tried to reinstall a biosql database
> > > using
> > > bioperl-db and got the same problems.  I'll list out everything I ran
> > > into
> > > and what I pan on trying, as it's been a long time since I've tried
> > > this.
> > >
> > > Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and MySQL
> > > 4.1.14.
> > > Using nmake and installing worked fine.  Loading the biosql schema and
> > > loading taxonomy info also worked fine, although I had to manually
> > > untar the
> > > taxonomy archive so load_ncbi_taxonomy.pl could find the files (stupid
> > > windows).  However, this is what happens when using
> > > load_seqdatabase.pl:
> > >
> > > C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -dbuser root
> > > NP_249092.gpt
> > > Loading NP_249092.gpt ...
> > > Undefined subroutine &Bio::Root::Root::debug called at
> > > C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line 1537,
> > > 
> > > line 65.
> > >
> > > If I removed all args except the sequence file, it gives the same
> > > response,
> > > which means it happens before the connection is made to the database:
> > >
> >
> > This happens indeed before a connection is made because it happens at
> > the point it tries to dynamically load the BioSQL driver for the
> > adaptor:
> >
> >         $self->debug("attempting to load driver for adaptor class
> > $class\n");
> >
> > The BioSQL driver is loaded before the DBD driver is loaded.
> >
> > The module in which this happens (i.e., the persistence adaptor) has
> > been loaded dynamically as well.
> >
> > Bio::Root::Root is in the 'use' statements, and the debug() method
> > clearly exists. I'm at a loss as to why perl complains on certain
> > Windows platforms. If somebody can tell me what, if anything, can be
> > done to make this work on those platforms too I'll be glad to implement
> > it.
> >
> > > [...]
> > > Here's the error messages from that first test (warning it's very
> > > messy):
> > >
> > > C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0,
> > > 'bl
> > > ib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> > > t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t t\08genbank.t
> > > t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t t\13remove.t
> > > t\14query.t t\15cluster.t
> > > t\01dbadaptor.....ok 1/19Subroutine new redefined at
> > > [...]
> > > Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line
> > > 356.
> >
> > So obviously it is there, right? So why doesn't perl see it a minute
> > later?
> >
> > > [...]
> > > I'll end with that.  At this moment, I can't see it working with the
> > > current
> > > setup.  I was using perl 5.8 with the old setup but I upgraded mysql
> > > at some
> > > point when working with gbrowse (I can't remember what the old version
> > > was);
> > > I'll try upgrading to the newest ActiveState version to see what
> > > happens.
> > > Could it be the MySQL version?
> >
> > I don't think it has anything to do with the MySQL version, or the DBD
> > driver for that matter. Instead, it looks like an issue with dynamic
> > loading of perl modules on your particular platform.
> >
> >         -hilmar
> >
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > >
> > --
> > -------------------------------------------------------------
> > Hilmar Lapp                            email: lapp at gnf.org
> > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > -------------------------------------------------------------
> >
> >
> >
> >
>
>
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
>
>

From jason at portal.open-bio.org  Fri Jan 13 08:02:38 2006
From: jason at portal.open-bio.org (Jason Stajich)
Date: Fri Jan 13 08:05:48 2006
Subject: [Bioperl-l] Re: problem in
	/usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm
In-Reply-To: <1137156850.7510.106.camel@sb289.gbf-braunschweig.de>
References: <1137149266.7510.102.camel@sb289.gbf-braunschweig.de>
	<1137156850.7510.106.camel@sb289.gbf-braunschweig.de>
Message-ID: <746D3A29-366C-4F83-A5BB-61EF8BC15D5C@bioperl.org>

NCBI reserves the right to make the HTML or Text unparseable from the  
CGI, I guess they've now done that.

See these posts:
http://bioperl.org/pipermail/bioperl-l/2005-September/019760.html
http://bioperl.org/pipermail/bioperl-l/2005-September/019724.html

-jason
On Jan 13, 2006, at 7:54 AM, Guido Dieterich wrote:

> Hi, Jason
>
>
> RemoteBlast.pm
> I printed out the NCBI report that was requested!
>
> Guido
>
>
> Am Freitag, den 13.01.2006, 07:44 -0500 schrieb Jason Stajich:
>
>> This is from RemoteBlast or from blast run on the command line?
>>
>> On Jan 13, 2006, at 5:47 AM, Guido Dieterich wrote:
>>
>>> Hi Jason, hi all,
>>>
>>> it seems that NCBI has changed its BLAST output format again:
>>> as example:
>>> they added a/more Feature line(s)
>>>>>>>
>>> Features in this part of subject sequence:
>>>    oxidoreductase, pyridine nucleotide-disulfide family
>>>>>>>
>>>
>>>
>>> Bio/SearchIO/blast.pm
>>>
>>> will cause a problem. Error message is:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: no data for midline  Features flanking this part of subject
>>> sequence:
>>> STACK: Error::throw
>>> STACK:
>>> Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.6/Bio/Root/
>>> Root.pm:328
>>> STACK:
>>> Bio::SearchIO::blast::next_result /usr/lib/perl5/site_perl/5.8.6/
>>> Bio/SearchIO/blast.pm:1166
>>> STACK: main::remoteBLAST /home/gdi/Perl-skripte/longestORF.pl:95
>>> STACK: /home/gdi/Perl-skripte/longestORF.pl:34
>>> -----------------------------------------------------------
>>>
>>>
>>>
>>> longer example of the blast ncbi report:
>>>
>>>> gb|AE016830.1| Enterococcus faecalis V583, complete genome
>>>           Length=3218031
>>>
>>>  Features in this part of subject sequence:
>>>    oxidoreductase, pyridine nucleotide-disulfide family
>>>
>>>  Score = 79.8 bits (40),  Expect = 3e-11
>>>  Identities = 82/96 (85%), Gaps = 0/96 (0%)
>>>  Strand=Plus/Plus
>>>
>>> Query  1129
>>> ATGAAACATATGGTTAACTTGTACTACTTCTTCGGTATCCGTAGTGGTTACTACATGTGG  1188
>>>                 ||||||||||| ||||||||| | |||||||| | ||| |||
>>> ||||||||||||||
>>> Sbjct  3136253
>>> ATGAAACATATCGTTAACTTGAAATACTTCTTTGATATTCGTTCTGGTTACTACATGTTC   
>>> 3136312
>>>
>>> Query  1189     CAATATATTATGCATGAATTCTTCCACATTAAAGAT  1224
>>>                 ||||| |||||||| ||| ||||||| |||||||||
>>> Sbjct  3136313  CAATACATTATGCACGAAATCTTCCATATTAAAGAT  3136348
>>>
>>>
>>>  Features in this part of subject sequence:oxidoreductase, pyridine
>>> nucleotide-disulfide family
>>>
>>>  Score = 77.8 bits (39),  Expect = 1e-10
>>>  Identities = 90/107 (84%), Gaps = 0/107 (0%)
>>>  Strand=Plus/Plus
>>>
>>> Query  469
>>> AAAGCGATGTTAACATTCGTTGTTTGTGGATCTGGATTTACTGGTATCGAAATGGTTGGG  528
>>>                 ||||| ||||||||||||||||| ||||| ||||| ||||||||
>>> |||||||||||
>>> ||
>>> Sbjct  3135587
>>> AAAGCAATGTTAACATTCGTTGTCTGTGGTTCTGGTTTTACTGGGATCGAAATGGTCGGC   
>>> 3135646
>>>
>>> Query  529      GAACTTTTAGAATGGAAAGATCGTCTTGCTAAAGATAACAAAATTGA  575
>>>                 ||| |  | || |||||||||||| | || ||||||  |||||||||
>>> Sbjct  3135647  GAATTAATCGACTGGAAAGATCGTTTAGCGAAAGATGCCAAAATTGA
>>> 3135693
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason@bioperl.org
>> http://jason.open-bio.org/
>>

--
Jason Stajich
jason@bioperl.org
http://jason.open-bio.org/
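
The report quoted above shows where the new "Features ..." lines sit, between the Length line and the Score line of a hit. As a stopgap, a minimal sketch (not the fix in Bio::SearchIO::blast, and untested against live NCBI output) is to drop those blocks from the text before parsing. The name strip_feature_blocks is hypothetical, and the code assumes each block runs from the "Features ..." line to the next blank line, as in the sample.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): remove the
# "Features in/flanking this part of subject sequence:" blocks from a
# plain-text BLAST report so that an older parser never sees them.
# Assumption: a block starts at the "Features ..." line and ends at the
# next blank line, as in the report quoted in this thread.
sub strip_feature_blocks {
    my ($report) = @_;
    my @kept;
    my $in_block = 0;
    for my $line ( split /\n/, $report, -1 ) {
        if ( $line =~ /^\s*Features (?:in|flanking) this part of subject sequence:/ ) {
            $in_block = 1;
            next;
        }
        if ($in_block) {
            next if $line =~ /\S/;    # still inside the feature block
            $in_block = 0;            # the blank line ends the block
            next;                     # drop that blank line as well
        }
        push @kept, $line;
    }
    return join "\n", @kept;
}
```

The cleaned text could then be handed to the parser through an in-memory filehandle (open my $fh, '<', \$clean), assuming Bio::SearchIO is given that handle via its -fh argument.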

From jason at portal.open-bio.org  Fri Jan 13 07:44:43 2006
From: jason at portal.open-bio.org (Jason Stajich)
Date: Fri Jan 13 09:21:06 2006
Subject: [Bioperl-l] Re: problem in
	/usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm
In-Reply-To: <1137149266.7510.102.camel@sb289.gbf-braunschweig.de>
References: <1137149266.7510.102.camel@sb289.gbf-braunschweig.de>
Message-ID: 

This is from RemoteBlast or from blast run on the command line?

On Jan 13, 2006, at 5:47 AM, Guido Dieterich wrote:

> Hi Jason, hi all,
>
> it seems that NCBI has changed its BLAST output format again:
> as example:
> they added a/more Feature line(s)
>>>>>
> Features in this part of subject sequence:
>    oxidoreductase, pyridine nucleotide-disulfide family
>>>>>
>
>
> Bio/SearchIO/blast.pm
>
> will cause a problem. Error message is:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: no data for midline  Features flanking this part of subject
> sequence:
> STACK: Error::throw
> STACK:
> Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.6/Bio/Root/ 
> Root.pm:328
> STACK:
> Bio::SearchIO::blast::next_result /usr/lib/perl5/site_perl/5.8.6/ 
> Bio/SearchIO/blast.pm:1166
> STACK: main::remoteBLAST /home/gdi/Perl-skripte/longestORF.pl:95
> STACK: /home/gdi/Perl-skripte/longestORF.pl:34
> -----------------------------------------------------------
>
>
>
> longer example of the blast ncbi report:
>
>> gb|AE016830.1| Enterococcus faecalis V583, complete genome
>           Length=3218031
>
>  Features in this part of subject sequence:
>    oxidoreductase, pyridine nucleotide-disulfide family
>
>  Score = 79.8 bits (40),  Expect = 3e-11
>  Identities = 82/96 (85%), Gaps = 0/96 (0%)
>  Strand=Plus/Plus
>
> Query  1129
> ATGAAACATATGGTTAACTTGTACTACTTCTTCGGTATCCGTAGTGGTTACTACATGTGG  1188
>                 ||||||||||| ||||||||| | |||||||| | ||| |||
> ||||||||||||||
> Sbjct  3136253
> ATGAAACATATCGTTAACTTGAAATACTTCTTTGATATTCGTTCTGGTTACTACATGTTC  3136312
>
> Query  1189     CAATATATTATGCATGAATTCTTCCACATTAAAGAT  1224
>                 ||||| |||||||| ||| ||||||| |||||||||
> Sbjct  3136313  CAATACATTATGCACGAAATCTTCCATATTAAAGAT  3136348
>
>
>  Features in this part of subject sequence:oxidoreductase, pyridine
> nucleotide-disulfide family
>
>  Score = 77.8 bits (39),  Expect = 1e-10
>  Identities = 90/107 (84%), Gaps = 0/107 (0%)
>  Strand=Plus/Plus
>
> Query  469
> AAAGCGATGTTAACATTCGTTGTTTGTGGATCTGGATTTACTGGTATCGAAATGGTTGGG  528
>                 ||||| ||||||||||||||||| ||||| ||||| ||||||||  
> |||||||||||
> ||
> Sbjct  3135587
> AAAGCAATGTTAACATTCGTTGTCTGTGGTTCTGGTTTTACTGGGATCGAAATGGTCGGC  3135646
>
> Query  529      GAACTTTTAGAATGGAAAGATCGTCTTGCTAAAGATAACAAAATTGA  575
>                 ||| |  | || |||||||||||| | || ||||||  |||||||||
> Sbjct  3135647  GAATTAATCGACTGGAAAGATCGTTTAGCGAAAGATGCCAAAATTGA   
> 3135693
>
>

--
Jason Stajich
jason@bioperl.org
http://jason.open-bio.org/

From hlapp at gmx.net  Fri Jan 13 11:41:23 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Jan 13 11:38:08 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: <000301c6185b$5b61bb60$15327e82@pyrimidine>
References: <000301c6185b$5b61bb60$15327e82@pyrimidine>
Message-ID: <79bce52e7530892a6a6819897d18a7a0@gmx.net>

On Jan 13, 2006, at 8:06 AM, Chris Fields wrote:

> [...]
> I think we really should probably give credit to Baohua Wang for 
> noting the
> change in throw.

Yes, absolutely, smart guy. You get credit for persistence and digging 
it up again 9 months later :-)

>   If it pans out, this may be what is responsible for error
> messages popping up every once in a while with bioperl scripts.  There 
> is
> one thing of note:  Steve mentions that Error.pm should be present:

'Could', not 'should'. The toolkit needs to work in the absence of 
Error.pm too (and does on most platforms; e.g., I don't have it 
installed).

It may turn out though that the missing comma is silent on most (all?) 
non-Windows platforms or if Error.pm is installed, and therefore wasn't 
noticed by most people.

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------
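
The 'throw $class (...)' form discussed above is Perl's indirect object syntax, which perlobj describes as being resolved by parser heuristics. A minimal sketch of the explicit arrow form, which leaves nothing to those heuristics, follows. It is not the Root.pm code: My::Exception is a hypothetical stand-in for Error.pm, raise() is a hypothetical caller, and whether this spelling would cure the Windows problem in this thread is untested.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an exception class; not the real Error.pm.
package My::Exception;

sub new {
    my ( $class, %args ) = @_;
    return bless { text => $args{-text} }, $class;
}

# Class method: build the exception object and die with it.
sub throw {
    my ( $class, %args ) = @_;
    die $class->new(%args);
}

sub text { return $_[0]{text} }

package main;

# Explicit arrow call: always a method call on the class named in $class.
sub raise {
    my ( $class, $msg ) = @_;
    $class->throw( -text => $msg );
}

my $caught;
eval { raise( 'My::Exception', 'no data for midline' ); 1 } or $caught = $@;
print ref($caught), ': ', $caught->text, "\n";
```

The arrow form works the same whether or not a sub named throw is already visible in the calling package, which is the case the indirect form has to guess about.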

From cjfields at uiuc.edu  Fri Jan 13 11:06:43 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Jan 13 11:50:06 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: 
Message-ID: <000301c6185b$5b61bb60$15327e82@pyrimidine>

Sorry, I should have clarified; NP_249092.gpt is a file carrying all protein
sequences with significant BLASTP score hits to NP_249092 in GenPept
format (which also includes the sequence NP_249092).  Only a number of these
had problems, all of which seem to be Uniprot.  I had problems using my
script to download the sequences b/c of NCBI's limit for batch sequence
extraction, so I used the Batch Entrez interface to download them (i.e. they
are directly from the protein  database at NCBI).  NP_252217.gpt is the same
as above (a file with sig. hits to NP_252217) but had fewer hits, so batch
extraction through Bio::DB::GenPept worked (they were then passed as
Bio::SeqIO objects and saved in GenBank format).  As reported before, there
were no errors with that file. 

The other issue, with taxonomy, was fixed when I loaded the database using
load_ncbi_taxonomy.pl.  I dropped the old database, reinstalled the schema,
but forgot to add in the taxonomic info.  

I think we really should probably give credit to Baohua Wang for noting the
change in throw.  If it pans out, this may be what is responsible for error
messages popping up every once in a while with bioperl scripts.  There is
one thing of note:  Steve mentions that Error.pm should be present:

> -----Original Message-----
> From: Steve Chervitz [mailto:Steve_Chervitz@affymetrix.com]
> Sent: Friday, January 13, 2006 4:26 AM
> To: Hilmar Lapp
> Cc: Chris Fields; Steve Chervitz; bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> 
> looks like the trouble is when Bio::Root::Root::throw() tries to call
> Error::throw(). Perhaps there is some windows-specific problem with
> Error.pm? Can't say I've seen this before since I don't use perl on
> windows.
> 
> Some things to try, in this order:
> 
> * Verify that Error.pm is installed for perl on your system.
> * Try running t/Exception.t and
> the examples/root/exceptions[1-4].pl scripts and see if they
> produce the expected behavior.
> * Try changing the 'throw $class ...' statements in Root.pm to
> 'Error::throw $class ...'
> * If Error.pm seems to be installed but isn't working right, either
> uninstall it or get in the habit of putting this line in your main
> scripts: INIT { $DONT_USE_ERROR=1; }
> 
> Steve

The requirement didn't pop up when creating the PPM distro.  It also isn't
included in ActivePerl but is available.  I've installed it and will go
through the above to see if it changes anything using unmodified Root.pm.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: drycafe@gmail.com [mailto:drycafe@gmail.com] On Behalf Of Hilmar
> Lapp
> Sent: Thursday, January 12, 2006 10:28 PM
> To: Chris Fields
> Cc: Steve Chervitz; bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> 
> On 1/12/06, Chris Fields  wrote:
> > Looks like the below modification Baohua Wang made to Root.pm works.  I
> did
> > run into another weird issue, but I think it is a sequence formatting
> > problem.  I try loading in a file with protein sequences in GenPept
> format
> > (pulled from BLASTP output using Bio::DB::GenPept and saved in a file
> using
> > SeqIO) after changing Root.pm:
> > ______________________________________________________________________
> >
> > C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
> > ****** -format genbank -safe NP_252217.gpt
> > Loading NP_252217.gpt ...
> >
> > C:\Perl\Scripts>
> > ______________________________________________________________________
> >
> > Good!
> 
> Great! So we'll have to test that the effect of adding that comma
> isn't negative on Unix platforms but I suspect it's in fact required
> by syntax and maybe on Windows perl is less lenient? Odd at any rate.
> 
> >
> > The strangeness comes in when using Genpept seqs NOT passed through
> SeqIO
> > (pulled directly from NCBI, saved in a similar file).  Most sequences
> will
> > load, but a number of them will not:
> >
> > ______________________________________________________________________
> > C:\Perl\Scripts>load_seqdatabase.pl -dbname biosql -dbuser root -dbpass
> > ****** -format genbank -safe NP_249092.gpt
> > Loading NP_249092.gpt ...
> >
> > -------------------- WARNING ---------------------
> > MSG: insert in Bio::DB::BioSQL::DBLinkAdaptor (driver) failed, values
> were
> > ("","HAMAPMF_00220","0") FKs ()
> > Column 'dbname' cannot be null
> > ---------------------------------------------------
> > Could not store Q59712:
> 
> Are you sure you pulled this from NCBI using NP_249092 as the
> accession? I'm asking because NP_249092 is a perfectly sane looking
> RefSeq record and in fact does not contain the string HAMAPMF, whereas
> Q59712 in reality is a Uniprot record moulded into GenPept format;
> some of the db_xrefs come out odd and in fact for the one above
> (HAMAPMF_00220) there is no dbname, most likely because dbname and
> accession are concatenated like for the following InterPro db_xref.
> 
> So I don't think this is worrisome unless you insist you used the
> NP_249092 entry ...
> 
> I would generally advise against taking Uniprot/Swissprot entries from
> their GenPept reincarnation. The formats are incompatible in some
> aspects (e.g., Swissprot, like EMBL, has first-level db_xrefs, whereas
> GenBank format doesn't; instead it puts db_xrefs into the feature
> table).
> 
> > [...]
> > at C:\Perl\Scripts\load_seqdatabase.pl line 633
> > Could not store AAU82296:
> > ------------- EXCEPTION  -------------
> > MSG: create: object (Bio::Species) failed to insert or to be found by
> unique
> > key
> 
> "uncultured archaeon GZfos13E1" is not something Bioperl will parse
> correctly into the appropriate Bio::Species structure (not that I
> would even know what that would have to look like ;).
> 
> However, if you preload your Biosql instance with the NCBI taxonomy
> database then this is not a problem because the species will be looked
> up correctly by its NCBI taxon ID (which the genbank SeqIO parser
> extracts from the feature table if it's there - and it is in this
> case).
> 
> > [...]
> > I'll check them out to try and derive what the differences are.  I will
> also
> > pass the above file through SeqIO to see what happens.
> 
> Note that everything you pull down through Bio::DB::GenPept does get
> parsed by Bio::SeqIO::genbank - if there is any difference it must be
> because the input files aren't identical.
> 
> > I think it could be some of the GenPept formatted stuff is clogging up
> the works since I saved
> > everything in Genbank format through SeqIO.
> 
> Ah - meaning you got the file by calling $seqio->write_seq($seq) ?
> That could cause its own problems (even though theoretically it
> shouldn't and therefore if it does it counts as a bug).
> 
> >  For now, though, bioperl-db on
> > Windows works!  Any idea why the 'throw' change works?
> 
> No, no idea - but great that you found out.
> 
>    -hilmar
> 
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> > -----Original Message-----
> > From: drycafe@gmail.com [mailto:drycafe@gmail.com] On Behalf Of Hilmar
> Lapp
> > Sent: Wednesday, January 11, 2006 5:13 PM
> > To: Chris Fields; Steve Chervitz
> > Cc: bioperl-l@portal.open-bio.org
> > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> >
> > Interesting. That posting didn't receive much attention did it. So he
> > states:
> >
> > 
> > The script failed on throw() in loading Bio/Root/Root.pm on Windows.
> > The problem lines are those "throw $class (...".   After I put comma
> > after $class as "throw $class, (...", the BioSQL tests and load scripts
> > are succeeded
> > 
> >
> > Can anyone of those who wrote the Root exception and warning code
> > comment? Maybe Steve?
> >
> >    -hilmar
> >
> > On 1/11/06, Chris Fields  wrote:
> > > Hilmar,
> > >
> > > As an update on what's going on:
> > >
> > > I've run into a few problems with load_seqdatabase.pl and bioperl-db
> on
> > > cygwin which I'll try to hash through this week; I'll post if I can't
> > figure
> > > it out soon.  It's not as buggy as trying to run it using the latest
> > > ActivePerl on WinXP, but it still has issues.
> > >
> > > I'm also looking through the ActiveState documentation for the latest
> > > version of perl they have (5.8.7), which I am running.  AFAIK, they
> enable
> > > dynamic loading when building.  I'll send them an email directly to
> see
> > what
> > > they say.  There may be some Win32-specific way of configuring a
> script
> > for
> > > dynamic loading of perl modules which isn't needed in other
> environments.
> > >
> > > There was also this previous email on bioperl-l:
> > >
> > > http://portal.open-bio.org/pipermail/bioperl-l/2005-May/018937.html
> > >
> > > Baohua Wang seemed to narrow it down somewhat, but I'm not sure if
> > changing
> > > the modules is a solution until I figure out why he made the changes.
> > They
> > > seem mainly geared towards getting load_seqdatabase to work with
> MsSQL,
> > but
> > > if he got it to work on Windows, then he may be onto something.  The
> > > modified Bio* modules can be found at:
> > >
> > > ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows
> > >
> > > I'll check them out to see if they work out and see what specific
> > > modifications he made (they're not detailed).
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > > -----Original Message-----
> > > From: bioperl-l-bounces@portal.open-bio.org
> > > [mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris
> Fields
> > > Sent: Friday, January 06, 2006 1:28 PM
> > > To: 'Hilmar Lapp'
> > > Cc: bioperl-l@portal.open-bio.org
> > > Subject: RE: [Bioperl-l] error running load_seqdatabase.pl
> > >
> > > I'll try installing bioperl-db using Cygwin.  I know that I can
> connect to
> > > the native Windows mysql database from inside cygwin, so perhaps this
> will
> > > do as a short term workaround.  I'll also try using a different native
> > win32
> > > Perl version (maybe 5.6) and look into the dynamic loading issue.  I
> know
> > > that the AS Perl has given errors like this before and not had
> problems (I
> > > think it was also cranky with older versions of bioperl), but this one is
> > > pretty serious.
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > > -----Original Message-----
> > > From: Hilmar Lapp [mailto:hlapp@gmx.net]
> > > Sent: Friday, January 06, 2006 12:02 PM
> > > To: Chris Fields
> > > Cc: bioperl-l@portal.open-bio.org
> > > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> > >
> > >
> > > On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:
> > >
> > > > Hilmar,
> > > >
> > > > Did this ever get resolved?  I tried to reinstall a biosql database
> > > > using
> > > > bioperl-db and got the same problems.  I'll list out everything I
> ran
> > > > into
> > > > and what I plan on trying, as it's been a long time since I've tried
> > > > this.
> > > >
> > > > Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and MySQL
> > > > 4.1.14.
> > > > Using nmake and installing worked fine.  Loading the biosql schema
> and
> > > > loading taxonomy info also worked fine, although I had to manually
> > > > untar the
> > > > taxonomy archive so load_ncbi_taxonomy.pl could find the files
> (stupid
> > > > windows).  However, this is what happens when using
> > > > load_seqdatabase.pl:
> > > >
> > > > C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -dbuser
> root
> > > > NP_249092.gpt
> > > > Loading NP_249092.gpt ...
> > > > Undefined subroutine &Bio::Root::Root::debug called at
> > > > C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line 1537,
> > > > 
> > > > line 65.
> > > >
> > > > If I removed all args except the sequence file, it gives the same
> > > > response,
> > > > which means it happens before the connection is made to the
> database:
> > > >
> > >
> > > This happens indeed before a connection is made because it happens at
> > > the point it tries to dynamically load the BioSQL driver for the
> > > adaptor:
> > >
> > >         $self->debug("attempting to load driver for adaptor class
> > > $class\n");
> > >
> > > The BioSQL driver is loaded before the DBD driver is loaded.
> > >
> > > The module in which this happens (i.e., the persistence adaptor) has
> > > been loaded dynamically as well.
> > >
> > > Bio::Root::Root is in the 'use' statements, and the debug() method
> > > clearly exists. I'm at a loss as to why perl complains on certain
> > > Windows platforms. If somebody can tell me what, if anything, can be
> > > done to make this work on those platforms too I'll be glad to
> implement
> > > it.
> > >
> > > > [...]
> > > > Here's the error messages from that first test (warning it's very
> > > > messy):
> > > >
> > > > C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0,
> > > > 'bl
> > > > ib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> > > > t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t
> t\08genbank.t
> > > > t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t
> t\13remove.t
> > > > t\14query.t t\15cluster.t
> > > > t\01dbadaptor.....ok 1/19Subroutine new redefined at
> > > > [...]
> > > > Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line
> > > > 356.
> > >
> > > So obviously it is there, right? So why doesn't perl see it a minute
> > > later?
> > >
> > > > [...]
> > > > I'll end with that.  At this moment, I can't see it working with the
> > > > current
> > > > setup.  I was using perl 5.8 with the old setup but I upgraded mysql
> > > > at some
> > > > point when working with gbrowse (I can't remember what the old
> version
> > > > was);
> > > > I'll try upgrading to the newest ActiveState version to see what
> > > > happens.
> > > > Could it be the MySQL version?
> > >
> > > I don't think it has anything to do with the MySQL version, or the DBD
> > > driver for that matter. Instead, it looks like an issue with dynamic
> > > loading of perl modules on your particular platform.
> > >
> > >         -hilmar
> > >
> > > >
> > > > Christopher Fields
> > > > Postdoctoral Researcher - Switzer Lab
> > > > Dept. of Biochemistry
> > > > University of Illinois Urbana-Champaign
> > > >
> > > >
> > > >
> > > --
> > > -------------------------------------------------------------
> > > Hilmar Lapp                            email: lapp at gnf.org
> > > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > > -------------------------------------------------------------
> > >
> > >
> > >
> > >
> >
> >
> > --
> > ----------------------------------------------------------
> > : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> > ----------------------------------------------------------
> >
> >
> 
> 
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------

From cjfields at uiuc.edu  Fri Jan 13 12:53:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Jan 13 12:50:22 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: 
Message-ID: <000501c6186a$5294eed0$15327e82@pyrimidine>

So here's what I found:

*  Running t\Exception.t with or without Error.pm installed didn't reveal
any errors.  

C:\Documents and Settings\Administrator\My Documents\CVS\bioperl-live>perl
-I -w t\Exception.t
1..7
ok 1
ok 2
Setting test data (Eeny meeny miney moe.)
ok 3

Executing method bar() in TestObject
Throwing a Bio::TestException
ok 4
ok 5
ok 6
ok 7

*  Running the example scripts (exceptions[1-4].pl) with or w/o Error.pm
showed no difference (I checked with diff).  
*  Changing "throw $class" to "Error::throw $class" in Root.pm didn't do
anything, which is strange (I did this with and w/o Error.pm installed).  I
thought at this point, that Activestate may have Error.pm as part of their
core modules, but it isn't included anywhere in the Perl directory tree or
under PERL5LIB.  It also isn't listed as CORE in their modules list
(http://ppm.activestate.com/BuildStatus/5.8-E.html); the core modules are
usually under '/lib' instead of '/site/lib'.  So why would "Error::throw"
even work?  I also tried 'perl -e "require Error"' and didn't get errors, so
it has to be around somewhere.
*  Even stranger, when changing "throw $class" to "Error::throw $class" in
Root.pm, load_seqdatabase.pl works fine, just like when "throw $class" is
changed to "throw $class,".  Oi!!
*  Changing load_seqdatabase.pl to include the line "INIT {
$DONT_USE_ERROR=1; }" also didn't do anything; only changes to Root.pm made
a difference.
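
On the question of where Error.pm is coming from: a minimal sketch, assuming nothing beyond core Perl, is to load the module and ask %INC which file was actually read. The helper name module_path is hypothetical, and File::Spec is included only as a module that should always be present.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name and return the file
# perl actually loaded it from, or undef if it cannot be loaded.
sub module_path {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(Error File::Spec)) {
    my $path = module_path($module);
    print "$module: ", ( defined $path ? $path : 'not found' ), "\n";
}
```

If a path is printed for Error, it shows which directory in @INC supplied the copy that 'require Error' picks up.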

Lesson: Windows is flaky.  I think that much of this behavior is just
ActivePerl-specific, which may be why it hasn't been seen elsewhere.  I
don't know much about ActivePerl and exception handling, so I may delve into
it a bit more to see if there is something else there.  I also dropped
Activestate an email asking about Error.pm and their core distribution.  

So, the question is, should Root.pm be changed in bioperl-live?  Obviously
this would need to be well tested out before committing any changes.  I
could try it out on Mac OS X.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Steve Chervitz
> Sent: Friday, January 13, 2006 4:26 AM
> To: Hilmar Lapp
> Cc: Chris Fields; bioperl-l@portal.open-bio.org; Steve Chervitz
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> 
> looks like the trouble is when Bio::Root::Root::throw() tries to call
> Error::throw(). Perhaps there is some windows-specific problem with
> Error.pm? Can't say I've seen this before since I don't use perl on
> windows.
> 
> Some things to try, in this order:
> 
> * Verify that Error.pm is installed for perl on your system.
> * Try running t/Exception.t and
> the examples/root/exceptions[1-4].pl scripts and see if they
> produce the expected behavior.
> * Try changing the 'throw $class ...' statements in Root.pm to
> 'Error::throw $class ...'
> * If Error.pm seems to be installed but isn't working right, either
> uninstall it or get in the habit of putting this line in your main
> scripts: INIT { $DONT_USE_ERROR=1; }
> 
> Steve
> 
> On Wed, 11 Jan 2006, Hilmar Lapp wrote:
> 
> > Date: Wed, 11 Jan 2006 15:12:45 -0800
> > From: Hilmar Lapp 
> > To: Chris Fields ,
> >      Steve Chervitz 
> > Cc: bioperl-l@portal.open-bio.org
> > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> >
> > Interesting. That posting didn't receive much attention did it. So he
> states:
> >
> > 
> > The script failed on throw() in loading Bio/Root/Root.pm on Windows.
> > The problem lines are those "throw $class (...".   After I put comma
> > after $class as "throw $class, (...", the BioSQL tests and load scripts
> > are succeeded
> > 
> >
> > Can anyone of those who wrote the Root exception and warning code
> > comment? Maybe Steve?
> >
> >    -hilmar
> >
> > On 1/11/06, Chris Fields  wrote:
> > > Hilmar,
> > >
> > > As an update on what's going on:
> > >
> > > I've run into a few problems with load_seqdatabase.pl and bioperl-db
> on
> > > cygwin which I'll try to hash through this week; I'll post if I can't
> figure
> > > it out soon.  It's not as buggy as trying to run it using the latest
> > > ActivePerl on WinXP, but it still has issues.
> > >
> > > I'm also looking through the ActiveState documentation for the latest
> > > version of perl they have (5.8.7), which I am running.  AFAIK, they
> enable
> > > dynamic loading when building.  I'll send them an email directly to
> see what
> > > they say.  There may be some Win32-specific way of configuring a
> script for
> > > dynamic loading of perl modules which isn't needed in other
> environments.
> > >
> > > There was also this previous email on bioperl-l:
> > >
> > > http://portal.open-bio.org/pipermail/bioperl-l/2005-May/018937.html
> > >
> > > Baohua Wang seemed to narrow it down somewhat, but I'm not sure if
> changing
> > > the modules is a solution until I figure out why he made the changes.
> They
> > > seem mainly geared towards getting load_seqdatabase to work with
> MsSQL, but
> > > if he got it to work on Windows, then he may be onto something.  The
> > > modified Bio* modules can be found at:
> > >
> > > ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows
> > >
> > > I'll check them out to see if they work out and see what specific
> > > modifications he made (they're not detailed).
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > > -----Original Message-----
> > > From: bioperl-l-bounces@portal.open-bio.org
> > > [mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris
> Fields
> > > Sent: Friday, January 06, 2006 1:28 PM
> > > To: 'Hilmar Lapp'
> > > Cc: bioperl-l@portal.open-bio.org
> > > Subject: RE: [Bioperl-l] error running load_seqdatabase.pl
> > >
> > > I'll try installing bioperl-db using Cygwin.  I know that I can
> connect to
> > > the native Windows mysql database from inside cygwin, so perhaps this
> will
> > > do as a short term workaround.  I'll also try using a different native
> win32
> > > Perl version (maybe 5.6) and look into the dynamic loading issue.  I
> know
> > > that the AS Perl has given errors like this before and not had
> problems (I
> > > > think it was also cranky with older versions of bioperl), but this one is
> > > pretty serious.
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > > -----Original Message-----
> > > From: Hilmar Lapp [mailto:hlapp@gmx.net]
> > > Sent: Friday, January 06, 2006 12:02 PM
> > > To: Chris Fields
> > > Cc: bioperl-l@portal.open-bio.org
> > > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> > >
> > >
> > > On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:
> > >
> > > > Hilmar,
> > > >
> > > > Did this ever get resolved?  I tried to reinstall a biosql database
> > > > using
> > > > bioperl-db and got the same problems.  I'll list out everything I
> ran
> > > > into
> > > > > and what I plan on trying, as it's been a long time since I've tried
> > > > this.
> > > >
> > > > Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and MySQL
> > > > 4.1.14.
> > > > Using nmake and installing worked fine.  Loading the biosql schema
> and
> > > > loading taxonomy info also worked fine, although I had to manually
> > > > untar the
> > > > taxonomy archive so load_ncbi_taxonomy.pl could find the files
> (stupid
> > > > windows).  However, this is what happens when using
> > > > load_seqdatabase.pl:
> > > >
> > > > C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -dbuser
> root
> > > > NP_249092.gpt
> > > > Loading NP_249092.gpt ...
> > > > Undefined subroutine &Bio::Root::Root::debug called at
> > > > C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line 1537,
> > > > 
> > > > line 65.
> > > >
> > > > If I removed all args except the sequence file, it gives the same
> > > > response,
> > > > which means it happens before the connection is made to the
> database:
> > > >
> > >
> > > This happens indeed before a connection is made because it happens at
> > > the point it tries to dynamically load the BioSQL driver for the
> > > adaptor:
> > >
> > >         $self->debug("attempting to load driver for adaptor class
> > > $class\n");
> > >
> > > The BioSQL driver is loaded before the DBD driver is loaded.
> > >
> > > The module in which this happens (i.e., the persistence adaptor) has
> > > been loaded dynamically as well.
> > >
> > > Bio::Root::Root is in the 'use' statements, and the debug() method
> > > clearly exists. I'm at a loss as to why perl complains on certain
> > > Windows platforms. If somebody can tell me what, if anything, can be
> > > done to make this work on those platforms too I'll be glad to
> implement
> > > it.
> > >
> > > > [...]
> > > > Here's the error messages from that first test (warning it's very
> > > > messy):
> > > >
> > > > C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0,
> > > > 'bl
> > > > ib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> > > > t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t
> t\08genbank.t
> > > > t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t
> t\13remove.t
> > > > t\14query.t t\15cluster.t
> > > > t\01dbadaptor.....ok 1/19Subroutine new redefined at
> > > > [...]
> > > > Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line
> > > > 356.
> > >
> > > So obviously it is there, right? So why doesn't perl see it a minute
> > > later?
> > >
> > > > [...]
> > > > I'll end with that.  At this moment, I can't see it working with the
> > > > current
> > > > setup.  I was using perl 5.8 with the old setup but I upgraded mysql
> > > > at some
> > > > point when working with gbrowse (I can't remember what the old
> version
> > > > was);
> > > > I'll try upgrading to the newest ActiveState version to see what
> > > > happens.
> > > > Could it be the MySQL version?
> > >
> > > I don't think it has anything to do with the MySQL version, or the DBD
> > > > driver for that matter. Instead, it looks like an issue with dynamic
> > > loading of perl modules on your particular platform.
> > >
> > >         -hilmar
> > >
> > > >
> > > > Christopher Fields
> > > > Postdoctoral Researcher - Switzer Lab
> > > > Dept. of Biochemistry
> > > > University of Illinois Urbana-Champaign
> > > >
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l@portal.open-bio.org
> > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > >
> > > >
> > > --
> > > -------------------------------------------------------------
> > > Hilmar Lapp                            email: lapp at gnf.org
> > > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > > -------------------------------------------------------------
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > >
> >
> >
> > --
> > ----------------------------------------------------------
> > : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> > ----------------------------------------------------------
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hlapp at gmx.net  Fri Jan 13 14:52:11 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Jan 13 15:17:01 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: <000501c6186a$5294eed0$15327e82@pyrimidine>
References: 
	<000501c6186a$5294eed0$15327e82@pyrimidine>
Message-ID: 

On 1/13/06, Chris Fields  wrote:
> [...]
> *  Running the example scripts (exceptions[1-4].pl) with or w/o Error.pm
> showed no difference (I checked with diff).

Given the below, either you did not actually uninstall Error.pm, or
installing it makes no difference because it is already there.

> *  Changing "throw $class" to "Error::throw $class" in Root.pm didn't do
> anything, which is strange (I did this with and w/o Error.pm installed).  I
> thought at this point, that Activestate may have Error.pm as part of their
> core modules, but it isn't included anywhere in the Perl directory tree or
> under PERL5LIB.  It also isn't listed as CORE in their modules list
> (http://ppm.activestate.com/BuildStatus/5.8-E.html); the core modules are
> usually under '/lib' instead of '/site/lib'.  So why would "Error::throw"
> even work?  I also tried 'perl -e "require Error" and didn't get errors, so
> it has to be around somewhere.

Right. I usually do

     $ perl -MYet::Another::Module

to convince myself that Yet::Another::Module really is not accessible
to the interpreter. And if I do that with Error on my OSX box I do
receive an error about perl not finding the Error module anywhere.
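The same check can be scripted so that it also reports where a module was loaded from, which may help when a module "has to be around somewhere" but cannot be found in the directory tree. This is only a minimal sketch; module_path is a hypothetical helper name, not part of BioPerl or Error.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file a module was loaded from,
# or undef if perl cannot load it from @INC.
sub module_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(Error File::Spec)) {
    my $path = module_path($module);
    print "$module: ",
        ( defined $path ? "loaded from $path" : "not found in \@INC" ), "\n";
}
```

The path printed comes from %INC, so it shows which copy of the module the interpreter actually picked up.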

> *  Even stranger, when changing "throw $class" to "Error::throw $class" in
> Root.pm, load_seqdatabase.pl works fine, just like when "throw $class" is
> changed to "throw $class,".  Oi!!

Now, I can't imagine that using Error::throw $class  would
not die immediately if Error.pm is not installed. You can check that
quickly by mistyping the module name (like Errror::throw). So unless
there's some deep magic going on then using Error::throw instead of
just throw() is not an option I'm afraid.

I guess the solution needs to be adding the comma. I can't imagine why
this would break on non-Windows systems, but obviously some testing is
in order.
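For anyone testing this, it may help to recall what "throw $class (...)" means to the parser. It is indirect object syntax, which perl normally reads as a method call on $class. The sketch below uses a hypothetical My::Exception class standing in for an Error.pm-style exception; it does not reproduce the ActivePerl failure, and the arrow form shown next to it is only an illustration of the equivalent call, not something proposed in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an Error.pm-style exception class.
package My::Exception;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub throw { my ($class, %args) = @_; die $class->new(%args) }

package main;

my $class = 'My::Exception';

# Indirect object syntax, as in the "throw $class (...)" lines in Root.pm.
eval { throw $class ( -text => 'indirect' ); };
my $indirect = $@;

# The same call written with the arrow, which leaves the parser nothing to guess.
eval { $class->throw( -text => 'arrow' ); };
my $arrow = $@;

print ref($indirect), ": ", $indirect->{'-text'}, "\n";
print ref($arrow),    ": ", $arrow->{'-text'},    "\n";
```

Both calls end up in My::Exception::throw here because package main has no sub named throw of its own. How perl resolves the indirect form inside a package that does define throw is exactly the part that would need testing.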

   -hilmar

> *  Changing load_seqdatabase.pl to include the line "INIT {
> $DONT_USE_ERROR=1; }" also didn't do anything; only changes to Root.pm made
> a difference.
>
> Lesson: Windows is flaky.  I think that much of this behavior is just
> ActivePerl-specific, which may be why it hasn't been seen elsewhere.  I
> don't know much about ActivePerl and exception handling, so I may delve into
> it a bit more to see if there is something else there.  I also dropped
> Activestate an email asking about Error.pm and their core distribution.
>
> So, the question is, should Root.pm be changed in bioperl-live?  Obviously
> this would need to be well tested out before committing any changes.  I
> could try it out on Mac OS X.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From cjfields at uiuc.edu  Fri Jan 13 15:31:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Jan 13 17:09:35 2006
Subject: [Bioperl-l] error running load_seqdatabase.pl
In-Reply-To: 
Message-ID: <000001c61880$57814d10$15327e82@pyrimidine>

Sorry about that.  I retried 'perl -e "require Error;"' and various
incarnations of it.  Without Error.pm:

C:\Perl\test\bioperl-db>perl -e "require Error;"
Can't locate Error.pm in @INC (@INC contains: C:\Perl C:/Perl/lib
C:/Perl/site/lib .) at -e line 1.

This is the interesting bit; I then installed Error.pm.  I tried out the
following:

C:\Perl\test\bioperl-db>perl -e "require Error; Error::throw;"

C:\Perl\test\bioperl-db>perl -e "require Error; Error::throw();"
Can't call method "new" on an undefined value at C:/Perl/site/lib/Error.pm
line 148.

It ignored the first run (without parentheses).  Then I tried this:

C:\Perl\test\bioperl-db>perl -e "require Error; Errror::throw"

and got no errors.  Maybe it doesn't recognize Error::throw (or
Errror::throw) as a subroutine for some reason unless it has parentheses.
This makes me think something else is going on.
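One possible explanation, which I have not confirmed against the ActivePerl case: without "use strict", a bareword that perl does not know as a subroutine at compile time is simply treated as a string, so the statement does nothing. Since "require" only runs at run time, Error::throw would not be known yet when the one-liner is compiled. The sketch below shows the general behaviour using string evals; the misspelled Errror::throw is deliberate.

```perl
use strict;
use warnings;

# 1. No strict, no parentheses: an unknown bareword is just a string constant.
my $bare_ok = eval q{ no strict; no warnings; Errror::throw; 1 } ? 1 : 0;

# 2. The same statement under strict is rejected at compile time.
my $strict_ok  = eval q{ use strict; Errror::throw; 1 } ? 1 : 0;
my $strict_err = $@;

# 3. With parentheses it is a real subroutine call, which fails at run time.
my $call_ok  = eval q{ Errror::throw(); 1 } ? 1 : 0;
my $call_err = $@;

print "bare, no strict: ", ( $bare_ok   ? "no error" : "error" ), "\n";
print "bare, strict:    ", ( $strict_ok ? "no error" : $strict_err );
print "with parens:     ", ( $call_ok   ? "no error" : $call_err );
```

If that is what is happening, running the one-liners with -Mstrict should make the form without parentheses complain as well.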

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: drycafe@gmail.com [mailto:drycafe@gmail.com] On Behalf Of Hilmar
> Lapp
> Sent: Friday, January 13, 2006 1:52 PM
> To: Chris Fields
> Cc: Steve Chervitz; bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> 
> On 1/13/06, Chris Fields  wrote:
> > [...]
> > *  Running the example scripts (exceptions[1-4].pl) with or w/o Error.pm
> > showed no difference (I checked with diff).
> 
> Given the below you probably failed to uninstall Error.pm, or not
> installing it in the first place doesn't matter because it's there
> already.
> 
> > *  Changing "throw $class" to "Error::throw $class" in Root.pm didn't do
> > anything, which is strange (I did this with and w/o Error.pm installed).
> I
> > thought at this point, that Activestate may have Error.pm as part of
> their
> > core modules, but it isn't included anywhere in the Perl directory tree
> or
> > under PERL5LIB.  It also isn't listed as CORE in their modules list
> > (http://ppm.activestate.com/BuildStatus/5.8-E.html); the core modules
> are
> > usually under '/lib' instead of '/site/lib'.  So why would
> "Error::throw"
> > even work?  I also tried 'perl -e "require Error" and didn't get errors,
> so
> > it has to be around somewhere.
> 
> Right. I usually do
> 
>      $ perl -MYet::Another::Module
> 
> to convince myself that Yet::Another::Module really is not accessible
> to the interpreter. And if I do that with Error on my OSX box I do
> receive an error about perl not finding the Error module anywhere.
> 
> > *  Even stranger, when changing "throw $class" to "Error::throw $class"
> in
> > Root.pm, load_seqdatabase.pl works fine, just like when "throw $class"
> is
> > changed to "throw $class,".  Oi!!
> 
> 
> Now, I can't imagine that using Error::throw $class  would
> not die immediately if Error.pm is not installed. You can check that
> quickly by mistyping the module name (like Errror::throw). So unless
> there's some deep magic going on then using Error::throw instead of
> just throw() is not an option I'm afraid.
> 
> I guess the solution needs to be adding the comma. I can't imagine why
> this would break on non-Windows systems, but obviously some testing is
> in order.
> 
>    -hilmar
> 
> > *  Changing load_seqdatabase.pl to include the line "INIT {
> > $DONT_USE_ERROR=1; }" also didn't do anything; only changes to Root.pm
> made
> > a difference.
> >
> > Lesson: Windows is flaky.  I think that much of this behavior is just
> > ActivePerl-specific, which may be why it hasn't been seen elsewhere.  I
> > don't know much about ActivePerl and exception handling, so I may delve
> into
> > it a bit more to see if there is something else there.  I also dropped
> > Activestate an email asking about Error.pm and their core distribution.
> >
> > So, the question is, should Root.pm be changed in bioperl-live?
> Obviously
> > this would need to be well tested out before committing any changes.  I
> > could try it out on Mac OS X.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> > > bounces@portal.open-bio.org] On Behalf Of Steve Chervitz
> > > Sent: Friday, January 13, 2006 4:26 AM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; bioperl-l@portal.open-bio.org; Steve Chervitz
> > > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> > >
> > > looks like the trouble is when Bio::Root::Root::throw() tries to call
> > > Error::throw(). Perhaps there is some windows-specific problem with
> > > Error.pm? Can't say I've seen this before since I don't use perl on
> > > windows.
> > >
> > > Some things to try, in this order:
> > >
> > > * Verify that Error.pm is installed for perl on your system.
> > > * Try running t/Exception.t and
> > > the examples/root/exceptions[1-4].pl scripts and see if they
> > > produce the expected behavior.
> > > * Try changing the 'throw $class ...' statements in Root.pm to
> > > 'Error::throw $class ...'
> > > * If Error.pm seems to be installed but isn't working right, either
> > > uninstall it or get in the habit of putting this line in your main
> > > scripts: INIT { $DONT_USE_ERROR=1; }
> > >
> > > Steve
> > >
> > > On Wed, 11 Jan 2006, Hilmar Lapp wrote:
> > >
> > > > Date: Wed, 11 Jan 2006 15:12:45 -0800
> > > > From: Hilmar Lapp 
> > > > To: Chris Fields ,
> > > >      Steve Chervitz 
> > > > Cc: bioperl-l@portal.open-bio.org
> > > > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> > > >
> > > > Interesting. That posting didn't receive much attention did it. So
> he
> > > states:
> > > >
> > > > 
> > > > The script failed on throw() in loading Bio/Root/Root.pm on Windows.
> > > > The problem lines are those "throw $class (...".   After I put comma
> > > > after $class as "throw $class, (...", the BioSQL tests and load
> scripts
> > > > are succeeded
> > > > 
> > > >
> > > > Can anyone of those who wrote the Root exception and warning code
> > > > comment? Maybe Steve?
> > > >
> > > >    -hilmar
> > > >
> > > > On 1/11/06, Chris Fields  wrote:
> > > > > Hilmar,
> > > > >
> > > > > As an update on what's going on:
> > > > >
> > > > > I've run into a few problems with load_seqdatabase.pl and bioperl-
> db
> > > on
> > > > > cygwin which I'll try to hash through this week; I'll post if I
> can't
> > > figure
> > > > > it out soon.  It's not as buggy as trying to run it using the
> latest
> > > > > ActivePerl on WinXP, but it still has issues.
> > > > >
> > > > > I'm also looking through the ActiveState documentation for the
> latest
> > > > > version of perl they have (5.8.7), which I am running.  AFAIK,
> they
> > > enable
> > > > > dynamic loading when building.  I'll send them an email directly
> to
> > > see what
> > > > > they say.  There may be some Win32-specific way of configuring a
> > > script for
> > > > > dynamic loading of perl modules which isn't needed in other
> > > environments.
> > > > >
> > > > > There was also this previous email on bioperl-l:
> > > > >
> > > > > http://portal.open-bio.org/pipermail/bioperl-l/2005-
> May/018937.html
> > > > >
> > > > > Baohua Wang seemed to narrow it down somewhat, but I'm not sure if
> > > changing
> > > > > the modules is a solution until I figure out why he made the
> changes.
> > > They
> > > > > seem mainly geared towards getting load_seqdatabase to work with
> > > MsSQL, but
> > > > > if he got it to work on Windows, then he may be onto something.
> The
> > > > > modified Bio* modules can be found at:
> > > > >
> > > > > ftp://ftp.tc.cornell.edu/Outgoing/bwang/BioSQL-On-Windows
> > > > >
> > > > > I'll check them out to see if they work out and see what specific
> > > > > modifications he made (they're not detailed).
> > > > >
> > > > > Christopher Fields
> > > > > Postdoctoral Researcher - Switzer Lab
> > > > > Dept. of Biochemistry
> > > > > University of Illinois Urbana-Champaign
> > > > > -----Original Message-----
> > > > > From: bioperl-l-bounces@portal.open-bio.org
> > > > > [mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Chris
> > > Fields
> > > > > Sent: Friday, January 06, 2006 1:28 PM
> > > > > To: 'Hilmar Lapp'
> > > > > Cc: bioperl-l@portal.open-bio.org
> > > > > Subject: RE: [Bioperl-l] error running load_seqdatabase.pl
> > > > >
> > > > > I'll try installing bioperl-db using Cygwin.  I know that I can
> > > connect to
> > > > > the native Windows mysql database from inside cygwin, so perhaps
> this
> > > will
> > > > > do as a short term workaround.  I'll also try using a different
> native
> > > win32
> > > > > Perl version (maybe 5.6) and look into the dynamic loading issue.
> I
> > > know
> > > > > that the AS Perl has given errors like this before and not had
> > > problems (I
> > > > > think it was also cranky with older versions bioperl), but this
> one is
> > > > > pretty serious.
> > > > >
> > > > > Christopher Fields
> > > > > Postdoctoral Researcher - Switzer Lab
> > > > > Dept. of Biochemistry
> > > > > University of Illinois Urbana-Champaign
> > > > > -----Original Message-----
> > > > > From: Hilmar Lapp [mailto:hlapp@gmx.net]
> > > > > Sent: Friday, January 06, 2006 12:02 PM
> > > > > To: Chris Fields
> > > > > Cc: bioperl-l@portal.open-bio.org
> > > > > Subject: Re: [Bioperl-l] error running load_seqdatabase.pl
> > > > >
> > > > >
> > > > > On Jan 6, 2006, at 9:20 AM, Chris Fields wrote:
> > > > >
> > > > > > Hilmar,
> > > > > >
> > > > > > Did this ever get resolved?  I tried to reinstall a biosql
> database
> > > > > > using
> > > > > > bioperl-db and got the same problems.  I'll list out everything
> I
> > > ran
> > > > > > into
> > > > > > > and what I plan on trying, as it's been a long time since I've
> tried
> > > > > > this.
> > > > > >
> > > > > > Currently, I'm using ActiveState Perl 5.8.7.813 on WinXP and
> MySQL
> > > > > > 4.1.14.
> > > > > > Using nmake and installing worked fine.  Loading the biosql
> schema
> > > and
> > > > > > loading taxonomy info also worked fine, although I had to
> manually
> > > > > > untar the
> > > > > > taxonomy archive so load_ncbi_taxonomy.pl could find the files
> > > (stupid
> > > > > > windows).  However, this is what happens when using
> > > > > > load_seqdatabase.pl:
> > > > > >
> > > > > > C:\Perl\Scripts>load_seqdatabase.pl -dbname dihydroorotase -
> dbuser
> > > root
> > > > > > NP_249092.gpt
> > > > > > Loading NP_249092.gpt ...
> > > > > > Undefined subroutine &Bio::Root::Root::debug called at
> > > > > > C:/Perl/site/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm line
> 1537,
> > > > > > 
> > > > > > line 65.
> > > > > >
> > > > > > > If I removed all args except the sequence file, it gives the same
> > > > > > > response, which means it happens before the connection is made to the
> > > > > > > database:
> > > > > >
> > > > >
> > > > > > This happens indeed before a connection is made because it happens at
> > > > > > the point it tries to dynamically load the BioSQL driver for the
> > > > > > adaptor:
> > > > > >
> > > > > >         $self->debug("attempting to load driver for adaptor class $class\n");
> > > > > >
> > > > > > The BioSQL driver is loaded before the DBD driver is loaded.
> > > > > >
> > > > > > The module in which this happens (i.e., the persistence adaptor) has
> > > > > > been loaded dynamically as well.
> > > > > >
> > > > > > Bio::Root::Root is in the 'use' statements, and the debug() method
> > > > > > clearly exists. I'm at a loss as to why perl complains on certain
> > > > > > Windows platforms. If somebody can tell me what, if anything, can be
> > > > > > done to make this work on those platforms too I'll be glad to implement
> > > > > > it.
> > > > >
> > > > > > [...]
> > > > > > > Here's the error messages from that first test (warning it's very
> > > > > > > messy):
> > > > > > >
> > > > > > > C:\Perl\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0,
> > > > > > > 'blib\lib', 'blib\arch')" t\01dbadaptor.t t\02species.t t\03simpleseq.t
> > > > > > > t\04swiss.t t\05seqfeature.t t\06comment.t t\07dblink.t t\08genbank.t
> > > > > > > t\09fuzzy2.t t\10ensembl.t t\11locuslink.t t\12ontology.t t\13remove.t
> > > > > > > t\14query.t t\15cluster.t
> > > > > > > t\01dbadaptor.....ok 1/19Subroutine new redefined at
> > > > > > > [...]
> > > > > > > Subroutine debug redefined at C:/Perl/site/lib/Bio\Root\Root.pm line
> > > > > > > 356.
> > > > >
> > > > > > So obviously it is there, right? So why doesn't perl see it a minute
> > > > > > later?
> > > > >
> > > > > > [...]
> > > > > > > I'll end with that.  At this moment, I can't see it working with the
> > > > > > > current setup.  I was using perl 5.8 with the old setup but I upgraded
> > > > > > > mysql at some point when working with gbrowse (I can't remember what the
> > > > > > > old version was); I'll try upgrading to the newest ActiveState version
> > > > > > > to see what happens.  Could it be the MySQL version?
> > > > > >
> > > > > > I don't think it has anything to do with the MySQL version, or the DBD
> > > > > > driver for that matter. Instead, it looks like an issue with dynamic
> > > > > > loading of perl modules on your particular platform.
> > > > >
> > > > >         -hilmar
> > > > >
> > > > > >
> > > > > > Christopher Fields
> > > > > > Postdoctoral Researcher - Switzer Lab
> > > > > > Dept. of Biochemistry
> > > > > > University of Illinois Urbana-Champaign
> > > > > >
> > > > > > _______________________________________________
> > > > > > Bioperl-l mailing list
> > > > > > Bioperl-l@portal.open-bio.org
> > > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > > > >
> > > > > >
> > > > > --
> > > > > -------------------------------------------------------------
> > > > > Hilmar Lapp                            email: lapp at gnf.org
> > > > > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > > > > -------------------------------------------------------------
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Bioperl-l mailing list
> > > > > Bioperl-l@portal.open-bio.org
> > > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > ----------------------------------------------------------
> > > > : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> > > > ----------------------------------------------------------
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------

From anst at kvl.dk  Sat Jan 14 11:50:02 2006
From: anst at kvl.dk (Anders Stegmann)
Date: Sat Jan 14 12:05:57 2006
Subject: [Bioperl-l] BIO::SearchIO HOWTO mistake?
Message-ID: <43C939CA0200009B00000429@gwia.kvl.dk>

Hi! 

According to the Bio::SearchIO HOWTO this, 

$hsp->seq_inds('hit', 'conserved'); 

should fetch ONLY the conserved residues in an alignment (not those
identical). 

When I use: 

sub subject_seq_alignment_conserved_residues { 

my ($hsp_obj) = @_; 
my %subject_conserved_hash = (); 

my @subject_string = split //, $$hsp_obj->hit_string; 

foreach ($$hsp_obj->seq_inds('hit', 'conserved')) { 

$subject_conserved_hash{$_} = $subject_string[$_ -1]; 

} 

return %subject_conserved_hash; 

} 

I get all residues in the alignment, including those that are identical! 

What's wrong? 

If the Bio::SearchIO HOWTO is wrong about this, is there an easy way to
fetch only the conserved residues? 

Anders. 

From u4075723 at anu.edu.au  Sat Jan 14 02:37:27 2006
From: u4075723 at anu.edu.au (Nagesh Chakka)
Date: Sat Jan 14 17:13:28 2006
Subject: [Bioperl-l] Problem with Webblast.pm
Message-ID: <200601141837.27204.u4075723@anu.edu.au>

Hi,
I am having a problem using the remote blast module of bioperl. I have 
installed the latest version of Bioperl (1.5.1), and when I run 
run_remote_blast.pl I get the following error saying that it cannot locate 
the Webblast.pm module. 
Can't locate Bio/Tools/Blast/Run/Webblast.pm in @INC (@INC 
contains: . .. /home/nagesh/progs/lib/perl5/i686-linux /home/nagesh/progs/lib/perl5 /home/nagesh/progs/lib/perl5//i686-linux /home/nagesh/progs/lib/perl5/ /usr/local/lib/perl5/5.8.6/i686-linux /usr/local/lib/perl5/5.8.6 /usr/local/lib/perl5/site_perl/5.8.6/i686-linux /usr/local/lib/perl5/site_perl/5.8.6 /usr/local/lib/perl5/site_perl) 
at /home/nagesh/progs/lib/perl5/Bio/Tools/Blast.pm line 1303,  line 1.

When I looked in the directory (Bio/Tools/Blast/Run/), I could not find 
this module. The following webpage says that it is available with the core 
package. 
http://annocpan.org/~BIRNEY/bioperl-0.04.3/Bio/Tools/Blast/Run/Webblast.pm
Can anyone please advise what might have gone wrong and how I can get it 
going? The bioperl installation tests passed.
Hoping for some answer.
Thanks
Nagesh

From chen_li3 at yahoo.com  Sat Jan 14 23:57:33 2006
From: chen_li3 at yahoo.com (chen li)
Date: Sun Jan 15 00:00:57 2006
Subject: [Bioperl-l] parser for primer3 output
Message-ID: <20060115045733.96498.qmail@web36810.mail.mud.yahoo.com>

Hi all,

After batch-designing PCR primers with
Bio::Tools::Run::Primer3 I get the results in a file
called "temp.out". I want to pull out each pair of
primers and put them into an Excel-format file. I just
want to know whether such a module/parser is already
available or whether I need to write some code to parse the
output.

Thanks,

Li  

From jason.stajich at duke.edu  Sun Jan 15 11:08:36 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun Jan 15 11:04:58 2006
Subject: [Bioperl-l] parser for primer3 output
In-Reply-To: <20060115045733.96498.qmail@web36810.mail.mud.yahoo.com>
References: <20060115045733.96498.qmail@web36810.mail.mud.yahoo.com>
Message-ID: <52DFE7E1-9826-46E2-9B97-62F735264FFC@duke.edu>

Bio::Tools::Primer3 ?

You get one of these objects back from run() method in  
Bio::Tools::Run::Primer3 without having to re-open the result file.
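
If you would rather work from temp.out directly, a minimal sketch in plain Perl could look like the one below. It assumes the file is ordinary primer3 Boulder-IO output (TAG=VALUE lines, with a lone "=" closing each record) and that the primer tags look like PRIMER_LEFT_SEQUENCE / PRIMER_LEFT_1_SEQUENCE (or PRIMER_LEFT_0_SEQUENCE in newer primer3 releases). primer_pairs is a name made up for this example, not a BioPerl function. The output is tab-delimited text, which Excel can open.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read primer3 Boulder-IO text
# from a filehandle and return one [id, pair number, left, right] array
# ref per primer pair.
sub primer_pairs {
    my ($fh) = @_;
    my (@rows, %pair);
    my $id = '';
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line eq '=') {    # a lone "=" closes the current record
            for my $n (sort { $a <=> $b } keys %pair) {
                my ($left, $right) = @{ $pair{$n} }{qw(LEFT RIGHT)};
                push @rows, [ $id, $n, $left, $right ]
                    if defined $left && defined $right;
            }
            %pair = ();
            $id   = '';
            next;
        }
        my ($tag, $value) = split /=/, $line, 2;
        next unless defined $value;
        if ($tag eq 'PRIMER_SEQUENCE_ID' || $tag eq 'SEQUENCE_ID') {
            $id = $value;
        }
        elsif ($tag =~ /^PRIMER_(LEFT|RIGHT)(?:_(\d+))?_SEQUENCE$/) {
            my $side = $1;
            my $n    = defined $2 ? $2 : 0;   # an unnumbered tag is pair 0
            $pair{$n}{$side} = $value;
        }
    }
    return @rows;
}

# Usage: perl primer_pairs.pl temp.out > primers.txt
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print join("\t", qw(sequence pair left_primer right_primer)), "\n";
    print join("\t", @$_), "\n" for primer_pairs($in);
    close $in;
}
```

Pairs that lack either a left or a right primer are skipped rather than written with an empty column.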

-jason
On Jan 14, 2006, at 11:57 PM, chen li wrote:

> Hi all,
>
> After batch-design of PCR primers with
> Bio::Tools::Run::Primer3 I get the results in a file
> called "temp.out". I want to pull out each pair of
> primers and put them into excel format file. I just
> want to know if such a module/parser  is already
> available or I need to write some codes to parse the
> output.
>
> Thanks,
>
> Li

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Sun Jan 15 10:59:36 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun Jan 15 11:51:53 2006
Subject: [Bioperl-l] BIO::SearchIO HOWTO mistake?
In-Reply-To: <43C939CA0200009B00000429@gwia.kvl.dk>
References: <43C939CA0200009B00000429@gwia.kvl.dk>
Message-ID: <8710E96A-DE7C-45DD-A8A8-79568D4D65F2@duke.edu>

If you read the documentation for the seq_inds method you'll see the  
following options. I think you want conserved-not-identical.

Title     : seq_inds
Purpose   : Get a list of residue positions (indices) for all identical
          : or conserved residues in the query or sbjct sequence.
Example   : @s_ind = $hsp->seq_inds('query', 'identical');
          : @h_ind = $hsp->seq_inds('hit', 'conserved');
          : @h_ind = $hsp->seq_inds('hit', 'conserved-not-identical');
          : @h_ind = $hsp->seq_inds('hit', 'conserved', 1);
Returns   : List of integers
          : May include ranges if collapse is true.
Argument  : seq_type  = 'query' or 'hit' or 'sbjct'  (default = query)
          :  ('sbjct' is synonymous with 'hit')
          : class     = 'identical' or 'conserved' or 'nomatch' or 'gap'
          :              (default = identical)
          :              (can be shortened to 'id' or 'cons')
          :             or 'conserved-not-identical'
          : collapse  = boolean, if true, consecutive positions are merged
          :             using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
          :             collapses to "1-5 7 9-11". This is useful for
          :             consolidating long lists. Default = no collapse.
Throws    : n/a.
Comments  :
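
To make the difference between the classes concrete, here is a small sketch in plain Perl. It is not the BioPerl implementation; hit_positions is a made-up helper that only mimics the idea, assuming the usual BLAST midline convention (a letter or '|' marks an identical pair, '+' marks a conservative substitution). The sample strings are the short KKRRRWW / KRRRRWW alignment from another thread on this list.

```perl
use strict;
use warnings;

# Hypothetical helper (not the BioPerl implementation): list hit
# positions of a given class from the hit and midline strings of an HSP.
sub hit_positions {
    my ($hit_string, $midline, $hit_start, $class) = @_;
    my @hit = split //, $hit_string;
    my @mid = split //, $midline;
    my $pos = $hit_start - 1;
    my @positions;
    for my $i (0 .. $#hit) {
        next if $hit[$i] eq '-';    # a gap in the hit has no residue number
        $pos++;
        my $m = defined $mid[$i] ? $mid[$i] : ' ';
        my $identical = ($m =~ /[A-Za-z|]/) ? 1 : 0;
        my $similar   = ($m eq '+')          ? 1 : 0;
        my $keep =
              $class eq 'identical'               ? $identical
            : $class eq 'conserved'               ? ($identical || $similar)
            : $class eq 'conserved-not-identical' ? $similar
            :   die "unknown class '$class'\n";
        push @positions, $pos if $keep;
    }
    return @positions;
}

my ($hit, $mid, $start) = ('KRRRRWW', 'K+RRRWW', 87);
for my $class ('identical', 'conserved', 'conserved-not-identical') {
    printf "%-24s %s\n", $class,
        join(' ', hit_positions($hit, $mid, $start, $class));
}
```

In this sketch 'conserved' includes the identical positions as well, which matches what Anders observed; only 'conserved-not-identical' narrows the list to the '+' positions.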

On Jan 14, 2006, at 11:50 AM, Anders Stegmann wrote:

> Hi!
>
> According to the Bio::SearchIO HOWTO this,
>
> $hsp->seq_inds('hit', 'conserved');
>
> should fetch ONLY the conserved residues in an alignment (not those
> identical).
>
> When I use:
>
> sub subject_seq_alignment_conserved_residues {
>
> my ($hsp_obj) = @_;
> my %subject_conserved_hash = ();
>
> my @subject_string = split //, $$hsp_obj->hit_string;
>
> foreach ($$hsp_obj->seq_inds('hit', 'conserved')) {
>
> $subject_conserved_hash{$_} = $subject_string[$_ -1];
>
> }
>
> return %subject_conserved_hash;
>
> }
>
>
>
> I get all residues in the alignment inclusive those that are  
> identical!
>
> What's wrong?
>
> If the Bio::SearchIO HOWTO is wrong about this, is there an easy  
> way to
> fetch only the conserved residues?
>
> Anders.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From bmoore at genetics.utah.edu  Sun Jan 15 15:36:15 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sun Jan 15 15:32:04 2006
Subject: [Bioperl-l] Problem with Webblast.pm
Message-ID: 

Nagesh,

Where did you get run_remote_blast.pl from?  I'm not sure, but I think
that is an older script, and I don't think Webblast.pm is part of the
bioperl distribution anymore.  I don't find it on my computer either.
Try using bp_remote_blast.pl, which you should find in the scripts
directory of your bioperl installation.  It uses RemoteBlast.pm, which is
the current package for doing Blast over the web.
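
If you want to check which of the two modules your Perl can actually see, a short sketch like this walks @INC the same way the "Can't locate ... in @INC" message does. find_module is a name made up for this example, not a BioPerl or core function.

```perl
use strict;
use warnings;
use File::Spec;

# Return the first file in @INC that would satisfy "use $module",
# or undef if no directory in @INC contains it.
sub find_module {
    my ($module) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    for my $dir (@INC) {
        next if ref $dir;    # skip code-ref hooks in @INC
        my $path = File::Spec->catfile($dir, @parts);
        return $path if -f $path;
    }
    return;
}

for my $module ('Bio::Tools::Run::RemoteBlast',
                'Bio::Tools::Blast::Run::Webblast') {
    my $path = find_module($module);
    printf "%s: %s\n", $module, defined $path ? $path : 'not found in @INC';
}
```

Run it with the same perl and the same PERL5LIB settings that you use for the BLAST script, since the result depends on @INC.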

Barry

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Nagesh Chakka
> Sent: Saturday, January 14, 2006 12:37 AM
> To: bioperl-l@bioperl.org
> Cc: Nagesh Chakka; babu.kannappan@anu.edu.au
> Subject: [Bioperl-l] Problem with Webblast.pm
> 
> Hi,
> I am having problem in using the remote blast module of bioperl. I have
> installed the latest version of Bioperl (1.5.1) and when I am running the
> run_remote_blast.pl I am getting the following error that it can not
> locate Webblast.pm module.
> Can't locate Bio/Tools/Blast/Run/Webblast.pm in @INC (@INC
> contains: . .. /home/nagesh/progs/lib/perl5/i686-linux
> /home/nagesh/progs/lib/perl5 /home/nagesh/progs/lib/perl5//i686-linux
> /home/nagesh/progs/lib/perl5/ /usr/local/lib/perl5/5.8.6/i686-linux
> /usr/local/lib/perl5/5.8.6 /usr/local/lib/perl5/site_perl/5.8.6/i686-linux
> /usr/local/lib/perl5/site_perl/5.8.6 /usr/local/lib/perl5/site_perl)
> at /home/nagesh/progs/lib/perl5/Bio/Tools/Blast.pm line 1303,  line 1.
>
>
> When I had looked at the directory (Bio/Tools/Blast/Run/), I could not
> find this module. The following webpage says that it is available with the
> core package.
> http://annocpan.org/~BIRNEY/bioperl-0.04.3/Bio/Tools/Blast/Run/Webblast.pm
> Can anyone please advice what would have gone wrong and how can I get it
> going? The testing of bioperl installation was ok.
> Hoping for some answer.
> Thanks
> Nagesh

From nagesh.chakka at anu.edu.au  Mon Jan 16 16:56:45 2006
From: nagesh.chakka at anu.edu.au (Nagesh Chakka)
Date: Mon Jan 16 17:13:58 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
Message-ID: <200601170856.45627.nagesh.chakka@anu.edu.au>

Hi All,
I am trying to set up a system to perform a remote blast on a regular basis. I 
thought this could best be achieved with a BioPerl module and came across 
RemoteBlast.pm.
I modified the sample script "bp_remote_blast.pl", which takes a file
containing a single FASTA sequence as input. I also wanted the blast report 
to be saved in a file for later use, and
modified the code as follows.
I am using the latest version of Bioperl (1.5) on a Fedora platform.
#######################################################################
print "$Bio::Root::Version::VERSION\n";
use Bio::Tools::Run::RemoteBlast;
use strict;
my $prog = 'blastp';
my $db   = 'swissprot';
my $e_val= '1e-10';

my @params = ( '-prog' => $prog,
       '-data' => $db,
       '-expect' => $e_val,
       '-readmethod' => 'SearchIO' );

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

#change a parameter
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';

#remove a parameter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

my $v = 1;
#$v is just to turn on and off the messages

my $r = $factory->submit_blast('blastInput.txt');

print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid ) 
{
        foreach my $rid ( @rids ) 
        {
                my $rc = $factory->retrieve_blast($rid);
                if( !ref($rc) ) 
                {
                        if( $rc < 0 ) 
                        {
                                $factory->remove_rid($rid);
                        }
                        print STDERR "." if ( $v > 0 );
                        sleep 5;
                } 
                else 
                {
                        print "RID $rid\n";
                        $factory->save_output('temp.out');
                        $factory->remove_rid($rid);
                }
        }
}

#################################################################################

This script prints the RID and terminates immediately. Obviously the
output file created is empty, as the program did not wait to get the
blast results for the RID. 
Is there something I am doing wrong, and what can I do to make the program wait 
until the results are ready to be written to the output file? I could not get 
much information from the documentation and have no prior experience with 
Bioperl.
Thanks very much for your attention.
Regards
Nageshbi

From hubert.prielinger at gmx.at  Mon Jan 16 16:44:09 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 16 17:37:38 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
In-Reply-To: <43CC05E1.5070503@gmx.at>
References: <43C6ECDC.7050308@gmx.at>
	<2CF48095-DF0E-4BB5-AAB8-3B8DBC813E76@duke.edu>
	<43CC05E1.5070503@gmx.at>
Message-ID: <43CC13A9.3010209@gmx.at>

Hubert Prielinger wrote:

> Jason Stajich wrote:
>
>> (please don't try and post to bioperl-announce, it is not for  
>> questions.)
>>
>> On Jan 12, 2006, at 6:57 PM, Hubert Prielinger wrote:
>>
>>> Hello,
>>> I want to know, if there is a possibility to get from a Blast  
>>> Outputfile the whole Sequence of a protein not only the best local  
>>> alignment...
>>> for example:
>>>
>> No. The parser can only return to you what is in the report file...
>> use Bio::DB::GenPept to retrieve the sequence via the web or  
>> (recommended) use a locally indexed sequence database like  
>> Bio::DB::Fasta
>>
>>> >ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica  
>>> cultivar-group)]
>>> dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica  
>>> cultivar-group)]
>>>         Length=95
>>>
>>> Score = 24.1 bits (47),  Expect =   493
>>> Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)
>>>
>>> Query  2   KKRRRWW  8
>>>                K+RRRWW
>>> Sbjct  87  KRRRRWW  93
>>>
>>> and now, if I parse the file, I want to get the whole Sequence of  
>>> this hypothetical protein....is that possible with hsp for example,  
>>> or any other way....
>>>
>>> my second question is:
>>> I do my blast search with bioperl and the remoteblast  
>>> module.....each parameter is working very well, except the  
>>> composition based statistics parameter....
>>> it looks like that:
>>>
>>> my $factory = $Bio::Tools::Run::RemoteBlast::HEADER 
>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>>
>> uh no that is not how you would do it.
>> You can make it the default for any factories you use in the script  
>> by doing this
>>
>>> $Bio::Tools::Run::RemoteBlast::HEADER 
>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>
>>
>> then
>> $factory = Bio::Tools::Run::RemoteBlast->new();
>>
>>
>>  =OR=
>> Once you have a factory object you can set the parameter explicitly:
>> $factory->submit_parameter('COMPOSITION_BASED_STATISTICS', 'yes');
>>
>>> it should work like that, but it doesn't....
>>>
>>> Thanks for your help in advance......
>>>
>>> regards
>>> Hubert
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l@portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> -- 
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
Hi Jason,
I have tried everything that you suggested, but the Composition Based 
Statistics parameter still isn't working. Every
other parameter works, e.g.

$Bio::Tools::Run::RemoteBlast::HEADER{'DESCRIPTIONS'} = '1000';

thanks in advance
Hubert

From hubert.prielinger at gmx.at  Mon Jan 16 16:54:03 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 16 17:47:31 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: <200601170856.45627.nagesh.chakka@anu.edu.au>
References: <200601170856.45627.nagesh.chakka@anu.edu.au>
Message-ID: <43CC15FB.4000008@gmx.at>

Nagesh Chakka wrote:

>Hi All,
>I was trying to setup a system to perform a remote blast on regular basis. I 
>thought this could be best achieved by using BioPerl module and came across 
>RemoteBlast.pm
>I had modified the sample script "bp_remote_blast.pl" which takes a file
>containing single FASTA sequence as an input. Also I wanted the blast report 
>to be saved in a file for latter use and
>modified the code as follows
>I am using the latest version of Bioperl (1.5) on a Fedora platform.
>#######################################################################
>print "$Bio::Root::Version::VERSION\n";
>use Bio::Tools::Run::RemoteBlast;
>use strict;
>my $prog = 'blastp';
>my $db   = 'swissprot';
>my $e_val= '1e-10';
>
>my @params = ( '-prog' => $prog,
>       '-data' => $db,
>       '-expect' => $e_val,
>       '-readmethod' => 'SearchIO' );
>
>my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>
>#change a paramter
>$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens
>[ORGN]';
>
>#remove a parameter
>delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>
>my $v = 1;
>#$v is just to turn on and off the messages
>
>my $r = $factory->submit_blast('blastInput.txt');
>
>print STDERR "waiting..." if( $v > 0 );
>while ( my @rids = $factory->each_rid ) 
>{
>        foreach my $rid ( @rids ) 
>        {
>                my $rc = $factory->retrieve_blast($rid);
>                if( !ref($rc) ) 
>                {
>                        if( $rc < 0 ) 
>                        {
>                                $factory->remove_rid($rid);
>                        }
>                        print STDERR "." if ( $v > 0 );
>                        sleep 5;
>                } 
>                else 
>                {
>                        print "RID $rid\n";
>                        $factory->save_output('temp.out');
>                        $factory->remove_rid($rid);
>                }
>        }
>}
>
>#################################################################################
>
>This script prints the RID and terminates immediately. Obviously the
>output file created is empty as the program did not wait for getting the
>blast results from the RID. 
>Is there something I am doing wrong and what can I do for the program to wait 
>until the results are ready to be printed to the output file. I could not get 
>much information from the documentation and have no prior experience with 
>Bioperl.
>Thanks very much for  your attention.
>Regards
>Nageshbi
>
hi nagesh,
try this, should work, I had the same problem:

.......................
.......................

                else 
                {
                        print "RID $rid\n";
                        $factory->save_output('temp.out');

                        # echo the factory's current report file to STDOUT
                        my $checkinput = $factory->file;
                        open(my $fh, '<', $checkinput) or die $!;
                        while (<$fh>) {
                                print;
                        }
                        close $fh;

                        $factory->remove_rid($rid);
                }
        }
}
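
In case it helps to see the control flow on its own, here is a sketch of the polling loop that runs offline. It assumes the convention the script above relies on: retrieve_blast gives back a reference when the results are ready, a negative number on failure, and a non-negative plain value while the job is still running. MockFactory and poll_all are made-up names for this example; only each_rid, retrieve_blast and remove_rid mirror the calls used in the script.

```perl
use strict;
use warnings;

# MockFactory is a hypothetical stand-in for the RemoteBlast factory so the
# loop can run offline. Each RID maps to the number of polls that still
# report "not ready"; a negative number simulates a failed request.
package MockFactory;

sub new {
    my ($class, %polls) = @_;
    return bless { pending => {%polls} }, $class;
}
sub each_rid   { my ($self) = @_; return sort keys %{ $self->{pending} }; }
sub remove_rid { my ($self, $rid) = @_; delete $self->{pending}{$rid}; return; }
sub retrieve_blast {
    my ($self, $rid) = @_;
    my $left = $self->{pending}{$rid};
    return -1 if $left < 0;                 # simulated failure
    if ($left > 0) {                        # still running
        $self->{pending}{$rid} = $left - 1;
        return 0;
    }
    return { rid => $rid };                 # a reference means "ready"
}

package main;

# Poll until every RID has either produced a result or failed.
sub poll_all {
    my ($factory, $delay) = @_;
    my (@done, @failed);
    while (my @rids = $factory->each_rid) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (ref $rc) {               # results ready: save them here
                push @done, $rid;
                $factory->remove_rid($rid);
            }
            elsif ($rc < 0) {            # request failed: give up on it
                push @failed, $rid;
                $factory->remove_rid($rid);
            }
            else {                       # not ready yet: wait, then retry
                sleep $delay if $delay;
            }
        }
    }
    return (\@done, \@failed);
}

my ($done, $failed) = poll_all(MockFactory->new(A => 2, B => 0, C => -1), 0);
print "done: @$done\nfailed: @$failed\n";
```

The point of the sketch is only that an RID is removed in exactly two cases, success or failure, so the outer while loop keeps running as long as any job is still pending.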

regards
Hubert

PS: are you using the composition based statistics parameter with your 
blast search?
if yes, is it working?

From hubert.prielinger at gmx.at  Mon Jan 16 16:57:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 16 17:50:34 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
In-Reply-To: 
References: 
Message-ID: <43CC16B3.70908@gmx.at>

Hi Brian,
yes, I have tried pasting the sequence manually at the NCBI Homepage and 
it is working fine.

regards
Hubert

Brian Osborne wrote:

>Hubert,
>
>If all the other parameters are passed correctly then I suspect this is not
>a BioPerl problem. Did you try manually pasting these URLs into the browser
>to confirm that NCBI is processing the parameters correctly?
>
>Brian O.
>
>
>On 1/16/06 4:44 PM, "Hubert Prielinger"  wrote:
>
>  
>
>>Hubert Prielinger wrote:
>>
>>    
>>
>>>Jason Stajich wrote:
>>>
>>>      
>>>
>>>>(please don't try and post to bioperl-announce, it is not for
>>>>questions.)
>>>>
>>>>On Jan 12, 2006, at 6:57 PM, Hubert Prielinger wrote:
>>>>
>>>>        
>>>>
>>>>>Hello,
>>>>>I want to know, if there is a possibility to get from a Blast
>>>>>Outputfile the whole Sequence of a protein not only the best local
>>>>>alignment...
>>>>>for example:
>>>>>
>>>>>          
>>>>>
>>>>No. The parser can only return to you what is in the report file...
>>>>use Bio::DB::GenPept to retrieve the sequence via the web or
>>>>(recommended) use a locally indexed sequence database like
>>>>Bio::DB::Fasta
>>>>
>>>>        
>>>>
>>>>>>ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica
>>>>>>            
>>>>>>
>>>>>cultivar-group)]
>>>>>dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica
>>>>>cultivar-group)]
>>>>>        Length=95
>>>>>
>>>>>Score = 24.1 bits (47),  Expect =   493
>>>>>Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)
>>>>>
>>>>>Query  2   KKRRRWW  8
>>>>>               K+RRRWW
>>>>>Sbjct  87  KRRRRWW  93
>>>>>
>>>>>and now, if I parse the file, I want to get the whole Sequence of
>>>>>this hypothetical protein....is that possible with hsp for example,
>>>>>or any other way....
>>>>>
>>>>>my second question is:
>>>>>I do my blast search with bioperl and the remoteblast
>>>>>module.....each parameter is working very well, except the
>>>>>composition based statistics parameter....
>>>>>it looks like that:
>>>>>
>>>>>my $factory = $Bio::Tools::Run::RemoteBlast::HEADER
>>>>>{'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>>>>
>>>>>          
>>>>>
>>>>uh no that is not how you would do it.
>>>>You can make it the default for any factories you use in the script
>>>>by doing this
>>>>
>>>>        
>>>>
>>>>>$Bio::Tools::Run::RemoteBlast::HEADER
>>>>>{'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>>>>          
>>>>>
>>>>then
>>>>$factory = Bio::Tools::Run::RemoteBlast->new();
>>>>
>>>>
>>>> =OR=
>>>>Once you have a factory object you can set the parameter explicitly:
>>>>$factory->submit_parameter('COMPOSITION_BASED_STATISTICS', 'yes');
>>>>
>>>>        
>>>>
>>>>>it should work like that, but it doesn't....
>>>>>
>>>>>Thanks for your help in advance......
>>>>>
>>>>>regards
>>>>>Hubert
>>>>>_______________________________________________
>>>>>Bioperl-l mailing list
>>>>>Bioperl-l@portal.open-bio.org
>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>          
>>>>>
>>>>
>>>>-- 
>>>>Jason Stajich
>>>>Duke University
>>>>http://www.duke.edu/~jes12
>>>>
>>>>
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l@portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>        
>>>>
>>Hi Jason,
>>I have tried everything that you suggested, but the Composition Based
>>Statistic parameter isn't still working, every
>>other parameter works using e.g
>>
>>$Bio::Tools::Run::RemoteBlast::HEADER{'DESCRIPTIONS'} = '1000';
>>
>>thanks in advance
>>Hubert
>>
>>
>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l@portal.open-bio.org
>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>    
>>
>
>
>
>  
>

From osborne1 at optonline.net  Mon Jan 16 17:49:54 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon Jan 16 18:01:54 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
In-Reply-To: <43CC13A9.3010209@gmx.at>
Message-ID: 

Hubert,

If all the other parameters are passed correctly then I suspect this is not
a BioPerl problem. Did you try manually pasting these URLs into the browser
to confirm that NCBI is processing the parameters correctly?

Brian O.

On 1/16/06 4:44 PM, "Hubert Prielinger"  wrote:

> Hubert Prielinger wrote:
> 
>> Jason Stajich wrote:
>> 
>>> (please don't try and post to bioperl-announce, it is not for
>>> questions.)
>>> 
>>> On Jan 12, 2006, at 6:57 PM, Hubert Prielinger wrote:
>>> 
>>>> Hello,
>>>> I want to know, if there is a possibility to get from a Blast
>>>> Outputfile the whole Sequence of a protein not only the best local
>>>> alignment...
>>>> for example:
>>>> 
>>> No. The parser can only return to you what is in the report file...
>>> use Bio::DB::GenPept to retrieve the sequence via the web or
>>> (recommended) use a locally indexed sequence database like
>>> Bio::DB::Fasta
>>> 
>>>>> ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica
>>>> cultivar-group)]
>>>> dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica
>>>> cultivar-group)]
>>>>         Length=95
>>>> 
>>>> Score = 24.1 bits (47),  Expect =   493
>>>> Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)
>>>> 
>>>> Query  2   KKRRRWW  8
>>>>                K+RRRWW
>>>> Sbjct  87  KRRRRWW  93
>>>> 
>>>> and now, if I parse the file, I want to get the whole Sequence of
>>>> this hypothetical protein....is that possible with hsp for example,
>>>> or any other way....
>>>> 
>>>> my second question is:
>>>> I do my blast search with bioperl and the remoteblast
>>>> module.....each parameter is working very well, except the
>>>> composition based statistics parameter....
>>>> it looks like that:
>>>> 
>>>> my $factory = $Bio::Tools::Run::RemoteBlast::HEADER
>>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>>> 
>>> uh no that is not how you would do it.
>>> You can make it the default for any factories you use in the script
>>> by doing this
>>> 
>>>> $Bio::Tools::Run::RemoteBlast::HEADER
>>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>> 
>>> 
>>> then
>>> $factory = Bio::Tools::Run::RemoteBlast->new();
>>> 
>>> 
>>>  =OR=
>>> Once you have a factory object you can set the parameter explicitly:
>>> $factory->submit_parameter('COMPOSITION_BASED_STATISTICS', 'yes');
>>> 
>>>> it should work like that, but it doesn't....
>>>> 
>>>> Thanks for your help in advance......
>>>> 
>>>> regards
>>>> Hubert
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l@portal.open-bio.org
>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>>> 
>>> 
>>> -- 
>>> Jason Stajich
>>> Duke University
>>> http://www.duke.edu/~jes12
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l@portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>>> 
>> 
> Hi Jason,
> I have tried everything that you suggested, but the Composition Based
> Statistic parameter isn't still working, every
> other parameter works using e.g
> 
> $Bio::Tools::Run::RemoteBlast::HEADER{'DESCRIPTIONS'} = '1000';
> 
> thanks in advance
> Hubert
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From jason.stajich at duke.edu  Mon Jan 16 20:11:40 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Mon Jan 16 20:07:58 2006
Subject: Fwd: [Bioperl-l] parse Blast Output and Composition Based Statistics
	parameter
References: <43CC05E1.5070503@gmx.at>
Message-ID: <9EC4AC10-CDDB-49AC-B9F9-5A7621F53F0F@duke.edu>

sorry - i don't really have the time to support this module - lots of  
people on the list use it so they can hopefully help.

Begin forwarded message:

> From: Hubert Prielinger 
> Date: January 16, 2006 3:45:21 PM EST
> To: Jason Stajich 
> Subject: Re: [Bioperl-l] parse Blast Output and Composition Based  
> Statistics parameter
>
> Jason Stajich wrote:
>
>> (please don't try and post to bioperl-announce, it is not for   
>> questions.)
>>
>> On Jan 12, 2006, at 6:57 PM, Hubert Prielinger wrote:
>>
>>> Hello,
>>> I want to know, if there is a possibility to get from a Blast   
>>> Outputfile the whole Sequence of a protein not only the best  
>>> local  alignment...
>>> for example:
>>>
>> No. The parser can only return to you what is in the report file...
>> use Bio::DB::GenPept to retrieve the sequence via the web or   
>> (recommended) use a locally indexed sequence database like   
>> Bio::DB::Fasta
>>
>>> >ref|XP_480077.1| hypothetical protein [Oryza sativa (japonica   
>>> cultivar-group)]
>>> dbj|BAD33542.1| hypothetical protein [Oryza sativa (japonica   
>>> cultivar-group)]
>>>         Length=95
>>>
>>> Score = 24.1 bits (47),  Expect =   493
>>> Identities = 6/7 (85%), Positives = 7/7 (100%), Gaps = 0/7 (0%)
>>>
>>> Query  2   KKRRRWW  8
>>>                K+RRRWW
>>> Sbjct  87  KRRRRWW  93
>>>
>>> and now, if I parse the file, I want to get the whole Sequence  
>>> of  this hypothetical protein....is that possible with hsp for  
>>> example,  or any other way....
>>>
>>> my second question is:
>>> I do my blast search with bioperl and the remoteblast   
>>> module.....each parameter is working very well, except the   
>>> composition based statistics parameter....
>>> it looks like that:
>>>
>>> my $factory = $Bio::Tools::Run::RemoteBlast::HEADER  
>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>>
>> uh no that is not how you would do it.
>> You can make it the default for any factories you use in the  
>> script  by doing this
>>
>>> $Bio::Tools::Run::RemoteBlast::HEADER  
>>> {'COMPOSITION_BASED_STATISTICS'} = 'yes';
>>
>> then
>> $factory = Bio::Tools::Run::RemoteBlast->new();
>>
>>
>>  =OR=
>> Once you have a factory object you can set the parameter explicitly:
>> $factory->submit_parameter('COMPOSITION_BASED_STATISTICS', 'yes');
>>
>>> it should work like that, but it doesn't....
>>>
>>> Thanks for your help in advance......
>>>
>>> regards
>>> Hubert
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l@portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> -- 
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> Hi Jason,
> I have tried everything that you suggested, but the Composition  
> Based Statistic parameter isn't still working, every
> other parameter works using e.g
>
> $Bio::Tools::Run::RemoteBlast::HEADER{'DESCRIPTIONS'} = '1000';
>
> thanks in advance
> Hubert
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12
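
On the first question in the forwarded message (getting the whole subject sequence rather than only the aligned region): the report holds just the alignment, so the full sequence has to come from a sequence source keyed by the hit id. Below is a minimal sketch of that lookup idea in plain Perl, standing in for an indexed database such as Bio::DB::Fasta. The helper index_fasta and the two demo records are invented for illustration and are not BioPerl API.

```perl
use strict;
use warnings;

# Parse FASTA text into { id => sequence }; the id is the first
# whitespace-delimited token after '>'.
sub index_fasta {
    my ($fh) = @_;
    my (%seq, $id);
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^>(\S+)/ ) { $id = $1; $seq{$id} = ''; }
        elsif ( defined $id )     { $line =~ s/\s+//g; $seq{$id} .= $line; }
    }
    return \%seq;
}

# Made-up two-record database held in memory for the demonstration.
my $fasta = ">hypA demo protein\nMKKRRRWWAA\nGGHH\n>hypB other\nMSTN\n";
open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";
my $db = index_fasta($fh);
close $fh;

# A BLAST hit gives only the aligned slice; the id leads to the full record.
my $hit_id = 'hypA';
print "$hit_id full length ", length( $db->{$hit_id} ), ": $db->{$hit_id}\n";
```

In a real script the id would come from the parsed hit (for example its accession), and a properly indexed database avoids reading the whole file into memory.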

From christoph.gille at charite.de  Wed Jan 11 06:22:25 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue Jan 17 01:36:21 2006
Subject: [Bioperl-l] internet proxy
In-Reply-To: <43C46B52.7060600@infotech.monash.edu.au>
References: <47893.192.168.220.203.1136927017.squirrel@webmail.charite.de>
	<43C46B52.7060600@infotech.monash.edu.au>
Message-ID: <37130.192.168.220.203.1136978545.squirrel@webmail.charite.de>

Hi Torsten,
Sorry, it does not work yet. I have not been working with Perl long enough to sort
this out.

In Sopma.pm is
    my $request = POST 'http://npsa-pbil.ibcp.fr/cgi-bin/secpred_sopma.pl',
        Content_Type => 'form-data',
            Content  => [title     => "",
                         notice    => $self->seq->seq,
                         ali_width => 70,
                         states    => $self->states,
                         threshold => $self->similarity_threshold ,
                         width     => $self->window_width,
                        ];

Is POST a static method or is it an instance method ?

If I call $sopma->env_proxy; does the POST method know this ?
Does this method get a "self" reference of Sopma.pm ?

I would have expected that I need to set a static field in the
module that provides the POST method.

Thanks for your help

Christoph
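
A sketch that may help with the question above, assuming the POST used in Sopma.pm is the function exported by HTTP::Request::Common and that env_proxy is the LWP::UserAgent method (both are assumptions about the module, not checked against its source). Under those assumptions POST is neither a static nor an instance method. It is a plain function that only builds a request object, and the proxy is a property of the user agent that later sends that request. The proxy URL and the short sequence are placeholders.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

# POST only builds an HTTP::Request object; it knows nothing about
# proxies or about the object that will eventually send the request.
my $request = POST 'http://npsa-pbil.ibcp.fr/cgi-bin/secpred_sopma.pl',
    Content_Type => 'form-data',
    Content      => [ title => '', notice => 'MKNRLGTWWVAILCML', ali_width => 70 ];
print ref($request), "\n";

# The proxy belongs to the user agent. The URL below is a made-up placeholder.
local $ENV{http_proxy} = 'http://proxy.example.org:8080/';
my $ua = LWP::UserAgent->new;
$ua->env_proxy;                  # pick up *_proxy environment variables
print $ua->proxy('http'), "\n";  # with one argument, proxy() reads the setting

# Sending would then be: my $response = $ua->request($request);
```

So if the analysis object is itself the user agent, calling env_proxy on it before running the analysis should be enough, and nothing needs to be set on POST.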

From bmoore at genetics.utah.edu  Tue Jan 17 13:34:08 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Tue Jan 17 13:41:09 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
Message-ID: 

Nagesh-

Did you get this figured out?  Your script works as is on my system.
You say temp.out is empty?  What does your input sequence
(blastInput.txt) look like?

Barry

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Hubert Prielinger
> Sent: Monday, January 16, 2006 2:54 PM
> To: Nagesh Chakka; bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> 
> Nagesh Chakka wrote:
> 
> >Hi All,
> >I was trying to set up a system to perform a remote blast on a regular basis. I
> >thought this could be best achieved by using a BioPerl module and came across
> >RemoteBlast.pm
> >I had modified the sample script "bp_remote_blast.pl" which takes a file
> >containing a single FASTA sequence as input. Also I wanted the blast report
> >to be saved in a file for later use and
> >modified the code as follows
> >I am using the latest version of Bioperl (1.5) on a Fedora platform.
> >
> >#######################################################################
> >print "$Bio::Root::Version::VERSION\n";
> >use Bio::Tools::Run::RemoteBlast;
> >use strict;
> >my $prog = 'blastp';
> >my $db   = 'swissprot';
> >my $e_val= '1e-10';
> >
> >my @params = ( '-prog' => $prog,
> >       '-data' => $db,
> >       '-expect' => $e_val,
> >       '-readmethod' => 'SearchIO' );
> >
> >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> >
> >#change a parameter
> >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> >
> >#remove a parameter
> >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> >
> >my $v = 1;
> >#$v is just to turn on and off the messages
> >
> >my $r = $factory->submit_blast('blastInput.txt');
> >
> >print STDERR "waiting..." if( $v > 0 );
> >while ( my @rids = $factory->each_rid )
> >{
> >        foreach my $rid ( @rids )
> >        {
> >                my $rc = $factory->retrieve_blast($rid);
> >                if( !ref($rc) )
> >                {
> >                        if( $rc < 0 )
> >                        {
> >                                $factory->remove_rid($rid);
> >                        }
> >                        print STDERR "." if ( $v > 0 );
> >                        sleep 5;
> >                }
> >                else
> >                {
> >                        print "RID $rid\n";
> >                        $factory->save_output('temp.out');
> >                        $factory->remove_rid($rid);
> >                }
> >        }
> >}
> >
> >
> >#######################################################################
> >
> >This script prints the RID and terminates immediately. Obviously the
> >output file created is empty as the program did not wait for getting the
> >blast results from the RID.
> >Is there something I am doing wrong and what can I do for the program to wait
> >until the results are ready to be printed to the output file. I could not get
> >much information from the documentation and have no prior experience with
> >Bioperl.
> >Thanks very much for  your attention.
> >Regards
> >Nageshbi
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l@portal.open-bio.org
> >http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> >
> >
> hi nagesh,
> try this, should work, I had the same problem:
> 
> .......................
> .......................
> 
> else
>                 {
>                         print "RID $rid\n";
>                         $factory->save_output('temp.out');
>
>                         # dump the file reported by $factory->file to STDOUT
>                         my $checkinput = $factory->file;
>                         open(my $fh, '<', $checkinput) or die $!;
>                         while (<$fh>) {
>                                 print;
>                         }
>                         close $fh;
>
>                         $factory->remove_rid($rid);
>                 }
>         }
> }
> 
> regards
> Hubert
> 
> PS: are you using the composition based statistics parameter with your
> blast search?
> if yes, is it working?
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hubert.prielinger at gmx.at  Tue Jan 17 14:37:20 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue Jan 17 15:31:30 2006
Subject: Fwd: [Bioperl-l] parse Blast Output and Composition Based
	Statistics parameter
In-Reply-To: <9EC4AC10-CDDB-49AC-B9F9-5A7621F53F0F@duke.edu>
References: <43CC05E1.5070503@gmx.at>
	<9EC4AC10-CDDB-49AC-B9F9-5A7621F53F0F@duke.edu>
Message-ID: <43CD4770.1020206@gmx.at>

Hi Jason,
I have written to NCBI helpdesk if they can help me with further 
information...
that's the response:

Hello,

I'm sorry, this has recently changed. Instead of "Yes", try using either
'0' '1' or '2', where:

'0' = No Composition-based statistics
'1' = Conditional compositional score matrix adjustment (apply only to
'biased' sequences)
'2' = Universal compositional score matrix adjustment (apply to all).

This works with the URLAPI; I've not tested with the perl module.

Best regards,
Wayne

regards
Hubert
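
The helpdesk reply above can be turned into a small guard so that the old 'yes'/'no' spellings are caught before a search is submitted. This is only a sketch: %HEADER here is a local stand-in for %Bio::Tools::Run::RemoteBlast::HEADER so that it runs without BioPerl, and set_cbs is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Stand-in for %Bio::Tools::Run::RemoteBlast::HEADER; in a real script
# the value would be assigned to that hash instead.
my %HEADER;

# Accepted values, as listed in the NCBI helpdesk reply.
my %CBS_MODE = (
    0 => 'no composition-based statistics',
    1 => 'conditional compositional score matrix adjustment',
    2 => 'universal compositional score matrix adjustment',
);

# Hypothetical helper: set the parameter, rejecting anything but 0, 1 or 2.
sub set_cbs {
    my ($header, $mode) = @_;
    die "COMPOSITION_BASED_STATISTICS must be 0, 1 or 2, got '$mode'\n"
        unless defined $mode && exists $CBS_MODE{$mode};
    $header->{COMPOSITION_BASED_STATISTICS} = $mode;
    return $CBS_MODE{$mode};
}

print "mode 1: ", set_cbs( \%HEADER, 1 ), "\n";
eval { set_cbs( \%HEADER, 'yes' ) };
print "rejected: $@" if $@;
```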


From nagesh.chakka at anu.edu.au  Tue Jan 17 15:57:14 2006
From: nagesh.chakka at anu.edu.au (Nagesh Chakka)
Date: Tue Jan 17 16:14:48 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: 
References: 
Message-ID: <200601180757.14592.nagesh.chakka@anu.edu.au>

Hi Barry,
With the help of Hubert, I further modified the script but still have the same
problem. The problem is that, from the point of submitting the blast query,
the script does not wait until the blast results are ready for retrieval, and
the submission is immediately followed by retrieving and saving the
output. Since the results will not be ready this fast (within about a second), the
output created is blank. I am able to retrieve the results online using the
RID which I make the script print.
So my main problem is making the program wait after submitting the query.
My input file has a single FASTA sequence which I have pasted below.
It's interesting to note that the script works on your system. Is it creating
an output file with the blast report?
Thanks very much for your attention. 
Regards
Nagesh

blastInput.txt
>MusDpl
MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
RVAVDQPAMVCLLGFVWFIVK
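
The waiting behaviour described above can be checked offline. This is a minimal sketch, assuming the convention the script relies on: retrieve_blast returns 0 while the search is still running, a negative number on error, and an object once the report is ready. MockFactory and wait_for_reports are names invented for this illustration and are not part of BioPerl.

```perl
use strict;
use warnings;

# Mock standing in for Bio::Tools::Run::RemoteBlast so the loop runs offline.
# Each RID is created with the number of polls that should report "not ready".
package MockFactory;
sub new { my ($class, %polls) = @_; return bless { polls => {%polls} }, $class }
sub each_rid   { my $self = shift; return sort keys %{ $self->{polls} } }
sub remove_rid { my ($self, $rid) = @_; delete $self->{polls}{$rid}; return }
sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{polls}{$rid}-- > 0;   # not ready yet
    return { rid => $rid };                   # a reference means "report ready"
}

package main;

# Poll every RID until each has a report; returns the number of waits per RID.
sub wait_for_reports {
    my ($factory, $delay) = @_;
    my %waits;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                $factory->remove_rid($rid) if $rc < 0;   # give up on errors
                $waits{$rid}++;
                sleep $delay if $delay;
            }
            else {
                $waits{$rid} //= 0;
                $factory->remove_rid($rid);              # done with this RID
            }
        }
    }
    return \%waits;
}

my $waits = wait_for_reports( MockFactory->new( RID1 => 2, RID2 => 0 ), 0 );
print "$_ waited $waits->{$_} time(s)\n" for sort keys %$waits;
```

If the real run prints no dots after "waiting...", the loop never saw a "not ready" return. That would point at what retrieve_blast handed back rather than at the loop itself.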

From hubert.prielinger at gmx.at  Tue Jan 17 16:27:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue Jan 17 17:20:34 2006
Subject: Fwd: [Bioperl-l] parse Blast Output and Composition Based
	Statistics parameter
In-Reply-To: <9EC4AC10-CDDB-49AC-B9F9-5A7621F53F0F@duke.edu>
References: <43CC05E1.5070503@gmx.at>
	<9EC4AC10-CDDB-49AC-B9F9-5A7621F53F0F@duke.edu>
Message-ID: <43CD612B.5010608@gmx.at>

Hi Jason,
It works the following way, I have just tried it:

$Bio::Tools::Run::RemoteBlast::HEADER{'COMPOSITION_BASED_STATISTICS'} = '1';

regards
Hubert


From bmoore at genetics.utah.edu  Tue Jan 17 17:33:23 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Tue Jan 17 17:28:03 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based
	Statisticsparameter
Message-ID: 

Hubert,

What exactly isn't working for you with the composition based
statistics?  Are you getting different e-values from Bioperl vs. the NCBI
website?  It seems to be working OK for me (at least on one quick test).

Barry


From hubert.prielinger at gmx.at  Tue Jan 17 17:09:46 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue Jan 17 18:03:27 2006
Subject: [Bioperl-l] parse Blast Output and Composition Based
	Statisticsparameter
In-Reply-To: 
References: 
Message-ID: <43CD6B2A.2080308@gmx.at>

Hello Barry,
Thanks, but I have already solved it. As I responded to Jason, the
parameter no longer works with 'yes' or 'no': after I contacted the NCBI
helpdesk, they told me that you have to use '0', '1' or '2', because they
have recently changed it,
like:
$Bio::Tools::Run::RemoteBlast::HEADER{'COMPOSITION_BASED_STATISTICS'} = '1';

For me, it didn't work with 'yes' or 'no'.

regards
Hubert

PS: original response mail from the NCBI helpdesk:

Hello,

I'm sorry, this has recently changed. Instead of "Yes", try using either
'0' '1' or '2', where:

'0' = No Composition-based statistics
'1' = Conditional compositional score matrix adjustment (apply only to
'biased' sequences)
'2' = Universal compositional score matrix adjustment (apply to all).

This works with the URLAPI; I've not tested with the perl module.

Best regards,
Wayne


From bmoore at genetics.utah.edu  Tue Jan 17 18:03:55 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Tue Jan 17 20:22:44 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
Message-ID: 

Nagesh,

Attached is an input file, script and output.  These work for me, and I
think they are the same that you are using.  Have a look and see if you
can find any differences that might be causing your problem.  Other than
that I don't know what to tell you.  If you are familiar with the perl
debugger (and if you're not, now's probably a good time to become
familiar with it) you should step through your script and be sure that
all of your objects are getting defined when they are supposed to be.
That can often help narrow down the problem.
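
For reference, the wait loop in the script quoted below rests on the
return convention of retrieve_blast() as that loop uses it: a reference
once the report is available, a negative number on failure, and
otherwise a non-reference value meaning "not ready yet".  A minimal,
self-contained sketch of that polling pattern follows.  It assumes
nothing about BioPerl itself: poll_until_ready and the stub job are
hypothetical names for illustration, and the stub stands in for a call
such as $factory->retrieve_blast($rid).

```perl
use strict;
use warnings;

# Poll a job until it finishes, mirroring the convention used by the loop
# in the quoted script: a reference means "done", a negative number means
# "failed", anything else means "not ready yet".
# $check is any code ref (hypothetical stand-in for the real retrieval call).
sub poll_until_ready {
    my ($check, %opt) = @_;
    my $delay     = defined $opt{delay}     ? $opt{delay}     : 5;
    my $max_tries = defined $opt{max_tries} ? $opt{max_tries} : 60;

    for my $try (1 .. $max_tries) {
        my $rc = $check->();
        return $rc if ref $rc;                    # results are ready
        die "job failed (rc=$rc)\n" if $rc < 0;   # error reported by the job
        sleep $delay if $delay > 0;               # not ready: wait, then retry
    }
    die "gave up after $max_tries tries\n";
}

# Stub job: reports "not ready" twice, then returns a result.
my $calls = 0;
my $stub  = sub { return ++$calls < 3 ? 0 : { report => 'blast report text' } };

my $result = poll_until_ready($stub, delay => 0, max_tries => 10);
print "ready after $calls calls\n";
```

Stepping through a loop of this shape in the debugger (perl -d) shows
which branch is taken on the very first pass, which is the open question
in this thread.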

Barry

> -----Original Message-----
> From: Nagesh Chakka [mailto:nagesh.chakka@anu.edu.au]
> Sent: Tuesday, January 17, 2006 1:57 PM
> To: Barry Moore
> Cc: Hubert Prielinger; bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> 
> Hi Barry,
> With the help of Hubert, I further modified the script but still have
> the same problem. The problem is that after submitting the BLAST query,
> the script does not wait until the results are ready for retrieval;
> submission is immediately followed by retrieving and saving the output.
> Since the results cannot be ready that fast (about a second), the
> output file is blank. I am able to retrieve the results online using
> the RID, which the script prints.
> So my main problem is making the program wait after submitting the
> query.
> My input file has a single FASTA sequence, which I have pasted below.
> It's interesting to note that the script works on your system. Is it
> creating an output file with the BLAST report?
> Thanks very much for your attention.
> Regards
> Nagesh
> 
> blastInput.txt
> >MusDpl
> MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
> ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
> RVAVDQPAMVCLLGFVWFIVK
> 
> On Wednesday 18 January 2006 05:34, Barry Moore wrote:
> > Nagesh-
> >
> > Did you get this figured out?  Your script works as is on my system.
> > You say temp.out is empty?  What does your input sequence
> > (blastInput.txt) look like?
> >
> > Barry
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> > > bounces@portal.open-bio.org] On Behalf Of Hubert Prielinger
> > > Sent: Monday, January 16, 2006 2:54 PM
> > > To: Nagesh Chakka; bioperl-l@portal.open-bio.org
> > > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > >
> > > Nagesh Chakka wrote:
> > > >Hi All,
> > > >I was trying to set up a system to perform a remote BLAST on a regular
> > > >basis. I thought this could be best achieved by using a BioPerl module
> > > >and came across RemoteBlast.pm.
> > > >I modified the sample script "bp_remote_blast.pl", which takes a file
> > > >containing a single FASTA sequence as input. I also wanted the BLAST
> > > >report to be saved in a file for later use, and modified the code as
> > > >follows.
> > > >I am using the latest version of BioPerl (1.5) on a Fedora platform.
> > > >
> > > >#######################################################################
> > > >print "$Bio::Root::Version::VERSION\n";
> > > >use Bio::Tools::Run::RemoteBlast;
> > > >use strict;
> > > >my $prog = 'blastp';
> > > >my $db   = 'swissprot';
> > > >my $e_val= '1e-10';
> > > >
> > > >my @params = ( '-prog' => $prog,
> > > >       '-data' => $db,
> > > >       '-expect' => $e_val,
> > > >       '-readmethod' => 'SearchIO' );
> > > >
> > > >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > > >
> > > >#change a parameter
> > > >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> > > >
> > > >#remove a parameter
> > > >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> > > >
> > > >my $v = 1;
> > > >#$v is just to turn on and off the messages
> > > >
> > > >my $r = $factory->submit_blast('blastInput.txt');
> > > >
> > > >print STDERR "waiting..." if( $v > 0 );
> > > >while ( my @rids = $factory->each_rid )
> > > >{
> > > >        foreach my $rid ( @rids )
> > > >        {
> > > >                my $rc = $factory->retrieve_blast($rid);
> > > >                if( !ref($rc) )
> > > >                {
> > > >                        if( $rc < 0 )
> > > >                        {
> > > >                                $factory->remove_rid($rid);
> > > >                        }
> > > >                        print STDERR "." if ( $v > 0 );
> > > >                        sleep 5;
> > > >                }
> > > >                else
> > > >                {
> > > >                        print "RID $rid\n";
> > > >                        $factory->save_output('temp.out');
> > > >                        $factory->remove_rid($rid);
> > > >                }
> > > >        }
> > > >}
> > > >#######################################################################
> > > >
> > > >This script prints the RID and terminates immediately. Obviously the
> > > >output file created is empty, as the program did not wait to get the
> > > >BLAST results for the RID.
> > > >Is there something I am doing wrong, and what can I do to make the
> > > >program wait until the results are ready to be written to the output
> > > >file? I could not get much information from the documentation and have
> > > >no prior experience with BioPerl.
> > > >Thanks very much for your attention.
> > > >Regards
> > > >Nageshbi
> > >
> > > hi nagesh,
> > > try this, should work, I had the same problem:
> > >
> > > .......................
> > > .......................
> > >
> > > else
> > >                 {
> > >                         print "RID $rid\n";
> > >                         $factory->save_output('temp.out');
> > >
> > >                         # echo the saved report to confirm it is not empty
> > >                         my $checkinput = $factory->file;
> > >                         open(my $fh, '<', $checkinput) or die $!;
> > >                         while (<$fh>) {
> > >                                 print;
> > >                         }
> > >                         close $fh;
> > >
> > >                         $factory->remove_rid($rid);
> > >                 }
> > >         }
> > > }
> > >
> > > regards
> > > Hubert
> > >
> > > PS: are you using the composition based statistics parameter with
> > > your blast search?
> > > if yes, is it working?
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bp_test.pl
Type: application/octet-stream
Size: 1281 bytes
Desc: bp_test.pl
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060117/04a14cd4/bp_test-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: temp.out
Type: application/octet-stream
Size: 2615 bytes
Desc: temp.out
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060117/04a14cd4/temp-0001.obj
-------------- next part --------------
>MusDpl
MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
RVAVDQPAMVCLLGFVWFIVK
From jan.aerts at bbsrc.ac.uk  Tue Jan 17 04:54:29 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue Jan 17 20:22:53 2006
Subject: [Bioperl-l] concatenate two embl sequence files
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DAB19@rie2ksrv1.ri.bbsrc.ac.uk>

Hi all,

Does anyone know of an easy way to concatenate two sequences, including
recalculation of the feature positions of the second one? E.g.
  seq 1 = 100 bp
    feature A: 5..15
  seq 2 = 200 bp
    feature B: 20..30
  => concatenated sequence 3 = 300 bp
       feature A: 5..15
       feature B: 120..130  <<<<<<<<<<<

Annotations (features without range) should be transferred as well.

Of course, it must be possible to create a blank sequence and work my
way through all features, adding them to a new collection of features
and stuff. But I was wondering if a simpler technique is possible.

Many thanks,
Jan Aerts
Bioinformatics Department
Roslin Institute
Roslin, Scotland, UK

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only.   The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute. 

From jaymoore at plantkind.com  Tue Jan 17 10:09:44 2006
From: jaymoore at plantkind.com (Jay Moore)
Date: Tue Jan 17 20:23:25 2006
Subject: [Bioperl-l] Context-sensitive alignment parameters
Message-ID: <200601171512.k0HFCj8V012227@portal.open-bio.org>

Not strictly bioperl, but if anyone has any ideas, I would appreciate the feedback.

I am doing some comparative work between partially-sequenced plant genomic DNA and the fully-sequenced Arabidopsis genome.

When I am aligning sequences from other plants to Arabidopsis, the introns are much less well-conserved than the exons, and this ought to be the case for animals and other organisms too.  Does anyone make any allowance for this, by setting gap and gap-extension or substitution matrix parameters in a context-sensitive way?  Is there an alignment method that can take this kind of thing into account?  Is it worth trying to take it into account anyway?

Just wondered if anyone has a take, or any information, on this.

Jay Moore
Warwick HRI  http://www.warwickhri.ac.uk

From cjfields at uiuc.edu  Tue Jan 17 20:44:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue Jan 17 20:41:14 2006
Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
Message-ID: <676216A3-1A01-46C1-9873-C52DE6F01994@uiuc.edu>

Hilmar,

Just wanted to drop a line saying bioperl-db seems to be up and  
running on Windows (at least for the moment!). All tests pass using  
ActivePerl and cygwin-perl.  I am trying to sort out the issue with  
throw in Bio::Root::Root (specifically, why it doesn't work without  
the added comma; I'm trying the modifications to Root.pm on Mac OS X  
now) and am trying to also figure out why bioperl and bioperl-db give  
tons of warnings using ActivePerl (most just state that x subroutine  
was redefined in y.pm line z, so aren't serious).  This is an  
ActivePerl or nmake issue and not a bioperl problem as there are no  
warnings using 'make test' in cygwin.  I am in the midst of writing  
up the steps for installing bioperl and bioperl-db using MySQL as the  
relational DB with either ActivePerl or cygwin; I really don't have  
much experience with postgreSQL, oracle, MsSQL (B. Wang's added  
modules), etc., but I can't see any reason why they wouldn't work.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bmoore at genetics.utah.edu  Tue Jan 17 21:02:33 2006
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Tue Jan 17 20:57:12 2006
Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
Message-ID: 

This is very helpful Chris.  Thank you.

Barry

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, January 17, 2006 6:45 PM
> To: bioperl-l@portal.open-bio.org
> Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
> 
> Hilmar,
> 
> Just wanted to drop a line saying bioperl-db seems to be up and
> running on Windows (at least for the moment!). All tests pass using
> ActivePerl and cygwin-perl.  I am trying to sort out the issue with
> throw in Bio::Root::Root (specifically, why it doesn't work without
> the added comma; I'm trying the modifications to Root.pm on Mac OS X
> now) and am trying to also figure out why bioperl and bioperl-db give
> tons of warnings using ActivePerl (most just state that x subroutine
> was redefined in y.pm line z, so aren't serious).  This is an
> ActivePerl or nmake issue and not a bioperl problem as there are no
> warnings using 'make test' in cygwin.  I am in the midst of writing
> up the steps for installing bioperl and bioperl-db using MySQL as the
> relational DB with either ActivePerl or cygwin; I really don't have
> much experience with postgreSQL, oracle, MsSQL (B. Wang's added
> modules), etc., but I can't see any reason why they wouldn't work.
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign

From hlapp at gmx.net  Wed Jan 18 02:07:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed Jan 18 02:03:42 2006
Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
In-Reply-To: 
References: 
Message-ID: 

Same here, thanks. We'll include your write-up in CVS. -hilmar

On Jan 17, 2006, at 6:02 PM, Barry Moore wrote:

> This is very helpful Chris.  Thank you.
>
> Barry
>
>> -----Original Message-----
>> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
>> bounces@portal.open-bio.org] On Behalf Of Chris Fields
>> Sent: Tuesday, January 17, 2006 6:45 PM
>> To: bioperl-l@portal.open-bio.org
>> Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
>>
>> Hilmar,
>>
>> Just wanted to drop a line saying bioperl-db seems to be up and
>> running on Windows (at least for the moment!). All tests pass using
>> ActivePerl and cygwin-perl.  I am trying to sort out the issue with
>> throw in Bio::Root::Root (specifically, why it doesn't work without
>> the added comma; I'm trying the modifications to Root.pm on Mac OS X
>> now) and am trying to also figure out why bioperl and bioperl-db give
>> tons of warnings using ActivePerl (most just state that x subroutine
>> was redefined in y.pm line z, so aren't serious).  This is an
>> ActivePerl or nmake issue and not a bioperl problem as there are no
>> warnings using 'make test' in cygwin.  I am in the midst of writing
>> up the steps for installing bioperl and bioperl-db using MySQL as the
>> relational DB with either ActivePerl or cygwin; I really don't have
>> much experience with postgreSQL, oracle, MsSQL (B. Wang's added
>> modules), etc., but I can't see any reason why they wouldn't work.
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From heikki at sanbi.ac.za  Wed Jan 18 02:11:20 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed Jan 18 02:35:39 2006
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DAB19@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAB19@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <200601180911.20454.heikki@sanbi.ac.za>

Jan, 

It would be easy if someone had written a function to do it. Even writing the 
function is not hard.  I do not think there is any other way than to go through 
all features, though.

In my opinion this would be an excellent addition to Bio::SeqUtils.

E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
     return a new seq, species and other info based on the first seq in array 
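
A rough sketch of the coordinate bookkeeping such a cat function would
need, using the numbers from Jan's example.  To stay self-contained it
models each sequence as a plain hash rather than a BioPerl object, so
cat_records and the hash layout are hypothetical and only illustrate the
offset arithmetic.  It assumes simple 1-based start/end features and
ignores strands, split locations and annotations.

```perl
use strict;
use warnings;

# Plain-hash stand-ins for sequence records (not BioPerl objects):
#   { seq => 'ACGT...', features => [ { name => 'A', start => 5, end => 15 } ] }
sub cat_records {
    my @records = @_;
    my %merged  = (seq => '', features => []);

    for my $rec (@records) {
        # Number of bases already placed in front of this record.
        my $offset = length $merged{seq};

        for my $feat (@{ $rec->{features} }) {
            # Copy the feature and shift its coordinates by the offset.
            my %shifted = (
                %$feat,
                start => $feat->{start} + $offset,
                end   => $feat->{end}   + $offset,
            );
            push @{ $merged{features} }, \%shifted;
        }
        $merged{seq} .= $rec->{seq};
    }
    return \%merged;
}

my $seq1 = { seq => 'A' x 100, features => [ { name => 'A', start => 5,  end => 15 } ] };
my $seq2 = { seq => 'C' x 200, features => [ { name => 'B', start => 20, end => 30 } ] };

my $seq3 = cat_records($seq1, $seq2);
printf "%s: %d..%d\n", @{$_}{qw(name start end)} for @{ $seq3->{features} };
```

With the two records above, feature A keeps 5..15 and feature B moves to
120..130 on a 300 bp sequence, matching the example in the question.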

Could you  write it and post to bugzilla?

	-Heikki

On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
> Hi all,
>
> Does anyone know of an easy way to concatenate two sequences, including
> recalculation of features positions of the second one? E.g.
>   seq 1 = 100 bp
>     feature A: 5..15
>   seq 2 = 200 bp
>     feature B: 20..30
>   => concatenated sequence 3 = 300 bp
>        feature A: 5..15
>        feature B: 120..130  <<<<<<<<<<<
>
> Annotations (features without range) should be transferred as well.
>
> Of course, it must be possible to create a blank sequence and work my
> way through all features, adding them to a new collection of features
> and stuff. But I was wondering if a simpler technique is possible.
>
> Many thanks,
> Jan Aerts
> Bioinformatics Department
> Roslin Institute
> Roslin, Scotland, UK

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________
From cjfields at uiuc.edu  Wed Jan 18 11:51:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Jan 18 11:48:13 2006
Subject: [Bioperl-l] GMOD PPM repository not working
Message-ID: <000001c61c4f$7d835170$15327e82@pyrimidine>

Scott,

I am trying to find the newest bioperl developer release (1.5.1) from PPM for a
quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
repository:

ppm> rep add gmod http://www.gmod.org/ggb/ppm
Repositories:
[1] gmod
[ ] ActiveState Package Repository
[ ] ActiveState PPM2 Repository
[ ] Bioperl
[ ] Bribes
[ ] Kobes
[ ] local
ppm> search bioperl
Searching in Active Repositories
No matches for 'bioperl'; see 'help search'.
ppm> search *
Searching in Active Repositories
No matches for '*'; see 'help search'.
ppm>

Any idea what's going on?  All other repositories work fine.  I can download
it and install locally w/o a problem.  I am running the newest ActivePerl
(5.8.7.815), WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From cjfields at uiuc.edu  Wed Jan 18 12:16:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Jan 18 12:12:52 2006
Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
In-Reply-To: 
Message-ID: <000101c61c52$f20e3070$15327e82@pyrimidine>

Should I get a PPM for the CVS version of bioperl-db ready, or should we
just go with 'nmake', 'nmake test', 'nmake install'?  If I can get a PPM
build to the same repository as the bioperl PPM (http://bioperl.org/DIST/),
it will probably cut down on questions from new users.  I'm using a PPM
build for both bioperl-live and bioperl-db at the moment, which can be
easily modified for the repository.

Also, what version of bioperl should be used with bioperl-db (I'm adding it
as a dependency)?   Will bioperl-1.4 do, or do we need 1.5.1 (available at
the GMOD repository, http://www.gmod.org/ggb/ppm/)? 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Wednesday, January 18, 2006 1:07 AM
> To: Barry Moore
> Cc: Chris Fields; bioperl-l@portal.open-bio.org
> Subject: Re: [Bioperl-l] bioperl-db working (for the moment) on Win32
> 
> Same here, thanks. We'll include your write-up in CVS. -hilmar
> 
> On Jan 17, 2006, at 6:02 PM, Barry Moore wrote:
> 
> > This is very helpful Chris.  Thank you.
> >
> > Barry
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> >> bounces@portal.open-bio.org] On Behalf Of Chris Fields
> >> Sent: Tuesday, January 17, 2006 6:45 PM
> >> To: bioperl-l@portal.open-bio.org
> >> Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
> >>
> >> Hilmar,
> >>
> >> Just wanted to drop a line saying bioperl-db seems to be up and
> >> running on Windows (at least for the moment!). All tests pass using
> >> ActivePerl and cygwin-perl.  I am trying to sort out the issue with
> >> throw in Bio::Root::Root (specifically, why it doesn't work without
> >> the added comma; I'm trying the modifications to Root.pm on Mac OS X
> >> now) and am trying to also figure out why bioperl and bioperl-db give
> >> tons of warnings using ActivePerl (most just state that x subroutine
> >> was redefined in y.pm line z, so aren't serious).  This is an
> >> ActivePerl or nmake issue and not a bioperl problem as there are no
> >> warnings using 'make test' in cygwin.  I am in the midst of writing
> >> up the steps for installing bioperl and bioperl-db using MySQL as the
> >> relational DB with either ActivePerl or cygwin; I really don't have
> >> much experience with postgreSQL, oracle, MsSQL (B. Wang's added
> >> modules), etc., but I can't see any reason why they wouldn't work.
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher
> >> Lab of Dr. Robert Switzer
> >> Dept of Biochemistry
> >> University of Illinois Urbana-Champaign
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------

From hlapp at gmx.net  Wed Jan 18 12:30:36 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed Jan 18 12:26:53 2006
Subject: [Bioperl-l] bioperl-db working (for the moment) on Win32
In-Reply-To: <000101c61c52$f20e3070$15327e82@pyrimidine>
References: <000101c61c52$f20e3070$15327e82@pyrimidine>
Message-ID: 

On Jan 18, 2006, at 9:16 AM, Chris Fields wrote:

> Also, what version of bioperl should be used with bioperl-db (I'm 
> adding it
> as a dependency)?   Will bioperl-1.4 do, or do we need 1.5.1 
> (available at
> the GMOD repository, http://www.gmod.org/ggb/ppm/)?
>

The recommendation is 1.5.1. v1.4 will largely work too, except that if 
you work with ontologies you might run into problems because there were 
fixes to the Ontology modules in Bioperl post-1.4.

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From dnm_a at swbell.net  Wed Jan 18 12:57:40 2006
From: dnm_a at swbell.net (David Messina)
Date: Wed Jan 18 13:00:39 2006
Subject: [Bioperl-l] Context-sensitive alignment parameters
In-Reply-To: <200601171512.k0HFCj8V012227@portal.open-bio.org>
References: <200601171512.k0HFCj8V012227@portal.open-bio.org>
Message-ID: 

Hi Jay,

> When I am aligning sequences from other plants to Arabidopsis, the  
> introns are much less well-conserved than the exons, and this ought  
> to be the case for animals and other organisms too.

> Does anyone make any allowance for this, by setting gap and
> gap-extension, or substitution matrix parameters in a context-sensitive
> way? Is there an alignment method that can take this kind of thing
> into account?  Is it worth trying to take it into account anyway?

I would do a local alignment  (with e.g. Blast) first to find the  
segments of the genome that match. Then, I would realign each of the  
matching segments using a global alignment algorithm (e.g. needle  
from the EMBOSS package) to force the best alignment within each  
matching region.
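
To make the second step concrete, here is a minimal sketch of a global
(Needleman-Wunsch) alignment of one matched segment pair.  It is only an
illustration of the technique: it uses a single linear gap penalty,
whereas needle takes separate gap-open and gap-extension penalties, and
the function name global_align and the default scores are assumptions
chosen for the example, not tuned values.

```perl
use strict;
use warnings;

# Needleman-Wunsch global alignment with a linear gap penalty.
# Returns (score, aligned first sequence, aligned second sequence).
sub global_align {
    my ($s, $t, %p) = @_;
    my $match    = defined $p{match}    ? $p{match}    :  1;
    my $mismatch = defined $p{mismatch} ? $p{mismatch} : -1;
    my $gap      = defined $p{gap}      ? $p{gap}      : -2;

    my @a = split //, $s;
    my @b = split //, $t;
    my ($n, $m) = (scalar @a, scalar @b);
    my @score;

    # First column and row: a prefix aligned against nothing costs only gaps.
    $score[$_][0] = $_ * $gap for 0 .. $n;
    $score[0][$_] = $_ * $gap for 0 .. $m;

    # Fill the table: best of diagonal (match/mismatch), up and left (gaps).
    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            my $sub  = $a[$i-1] eq $b[$j-1] ? $match : $mismatch;
            my $best = $score[$i-1][$j-1] + $sub;
            my $up   = $score[$i-1][$j] + $gap;
            my $left = $score[$i][$j-1] + $gap;
            $best = $up   if $up   > $best;
            $best = $left if $left > $best;
            $score[$i][$j] = $best;
        }
    }

    # Trace back from the bottom-right corner to recover one optimal alignment.
    my ($i, $j) = ($n, $m);
    my ($top, $bottom) = ('', '');
    while ($i > 0 || $j > 0) {
        if ($i > 0 && $j > 0) {
            my $sub = $a[$i-1] eq $b[$j-1] ? $match : $mismatch;
            if ($score[$i][$j] == $score[$i-1][$j-1] + $sub) {
                $top    = $a[--$i] . $top;
                $bottom = $b[--$j] . $bottom;
                next;
            }
        }
        if ($i > 0 && $score[$i][$j] == $score[$i-1][$j] + $gap) {
            $top    = $a[--$i] . $top;      # gap in the second sequence
            $bottom = '-' . $bottom;
        } else {
            $top    = '-' . $top;           # gap in the first sequence
            $bottom = $b[--$j] . $bottom;
        }
    }
    return ($score[$n][$m], $top, $bottom);
}

my ($score, $top, $bottom) = global_align('ACGT', 'AGT');
print "score $score\n$top\n$bottom\n";
```

Because every position of both segments must be aligned, the poorly
conserved stretches are forced into the alignment instead of being
dropped, which is the point of realigning globally after the local
search.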

It's worth it if you're interested in looking at the overall  
conservation between the genomes or something like that.

If however you're just interested in the exons, then it's easier to  
do the alignments with cDNA representations of the sequences from the  
other plants and align those to the Arabidopsis genomic sequence  
(using Blast).

Hope this helps,
Dave

-- 
Dave Messina
Informatics Analyst
WashU Genome Sequencing Center
dmessina@watson.wustl.edu
314-286-1825

From jason.stajich at duke.edu  Wed Jan 18 13:05:49 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Wed Jan 18 13:38:21 2006
Subject: [Bioperl-l] Context-sensitive alignment parameters
In-Reply-To: <200601171512.k0HFCj8V012227@portal.open-bio.org>
References: <200601171512.k0HFCj8V012227@portal.open-bio.org>
Message-ID: <3082CF90-556E-4560-B501-8C7C2A8C8663@duke.edu>

WABA kind of does this with three different match states.

-jason
On Jan 17, 2006, at 10:09 AM, Jay Moore wrote:

> Not strictly bioperl, but if anyone has any ideas, I would  
> appreciate the feedback.
>
> I am doing some comparative work between partially-sequenced plant  
> genomic DNA, and fully-sequenced Arabidopsis genome.
>
> When I am aligning sequences from other plants to Arabidopsis, the
> introns are much less well-conserved than the exons, and this ought
> to be the case for animals and other organisms too.  Does anyone make
> any allowance for this, by setting gap and gap-extension, or
> substitution matrix parameters in a context-sensitive way?  Is there an
> alignment method that can take this kind of thing into account?  Is it
> worth trying to take it into account anyway?
>
> Just wondered if anyone has a take, or any information, on this.
>
> Jay Moore
> Warwick HRI  http://www.warwickhri.ac.uk

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From kaboroev at sfu.ca  Wed Jan 18 12:55:09 2006
From: kaboroev at sfu.ca (Keith Boroevich)
Date: Wed Jan 18 13:56:11 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: 
References: 
Message-ID: <1137606909.18560.17.camel@gotenks.zfighters>

I'm not sure if this is related, but in the last 3 days my remote BLAST
scripts have stopped working.  I have not modified the code in any way.
retrieve_blast() returns successfully, and next_result() does return a
"Bio::Search::Result::BlastResult=HASH(0x15ad8d0)" object, but takes a
long time to do so.  However, next_hit() returns undef.  I'm not really
sure how to approach this problem.  Until 3 days ago the scripts
worked perfectly, returning a list of hits with their accessions and
significance.

Keith

On Tue, 2006-01-17 at 11:34 -0700, Barry Moore wrote:
> Nagesh-
> 
> Did you get this figured out?  Your script works as is on my system.
> You say temp.out is empty?  What does your input sequence
> (blastInput.txt) look like?
> 
> Barry
> 
> > -----Original Message-----
> > From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> > bounces@portal.open-bio.org] On Behalf Of Hubert Prielinger
> > Sent: Monday, January 16, 2006 2:54 PM
> > To: Nagesh Chakka; bioperl-l@portal.open-bio.org
> > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > 
> > Nagesh Chakka wrote:
> > 
> > >Hi All,
> > >I was trying to set up a system to perform a remote blast on a regular
> > >basis. I thought this could be best achieved by using a BioPerl module
> > >and came across RemoteBlast.pm.
> > >I modified the sample script "bp_remote_blast.pl", which takes a file
> > >containing a single FASTA sequence as input. I also wanted the blast
> > >report to be saved in a file for later use, and modified the code as
> > >follows.
> > >I am using the latest version of Bioperl (1.5) on a Fedora platform.
> >
> >#######################################################################
> > >print "$Bio::Root::Version::VERSION\n";
> > >use Bio::Tools::Run::RemoteBlast;
> > >use strict;
> > >my $prog = 'blastp';
> > >my $db   = 'swissprot';
> > >my $e_val= '1e-10';
> > >
> > >my @params = ( '-prog' => $prog,
> > >       '-data' => $db,
> > >       '-expect' => $e_val,
> > >       '-readmethod' => 'SearchIO' );
> > >
> > >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > >
> > >#change a parameter
> > >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> > >
> > >#remove a parameter
> > >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> > >
> > >my $v = 1;
> > >#$v is just to turn on and off the messages
> > >
> > >my $r = $factory->submit_blast('blastInput.txt');
> > >
> > >print STDERR "waiting..." if( $v > 0 );
> > >while ( my @rids = $factory->each_rid )
> > >{
> > >        foreach my $rid ( @rids )
> > >        {
> > >                my $rc = $factory->retrieve_blast($rid);
> > >                if( !ref($rc) )
> > >                {
> > >                        if( $rc < 0 )
> > >                        {
> > >                                $factory->remove_rid($rid);
> > >                        }
> > >                        print STDERR "." if ( $v > 0 );
> > >                        sleep 5;
> > >                }
> > >                else
> > >                {
> > >                        print "RID $rid\n";
> > >                        $factory->save_output('temp.out');
> > >                        $factory->remove_rid($rid);
> > >                }
> > >        }
> > >}
> > >
> > >
> > >#######################################################################
> > >
> > >This script prints the RID and terminates immediately. Obviously the
> > >output file created is empty, as the program did not wait for the
> > >blast results for the RID.
> > >Is there something I am doing wrong, and what can I do to make the
> > >program wait until the results are ready to be printed to the output
> > >file? I could not get much information from the documentation and have
> > >no prior experience with Bioperl.
> > >Thanks very much for your attention.
> > >Regards
> > >Nageshbi
> > >
> > hi nagesh,
> > try this, should work, I had the same problem:
> > 
> > .......................
> > .......................
> > 
> >                 else
> >                 {
> >                         print "RID $rid\n";
> >                         $factory->save_output('temp.out');
> >
> >                         # print the saved report to STDOUT
> >                         my $checkinput = $factory->file;
> >                         open(my $fh, '<', $checkinput) or die "Cannot open $checkinput: $!";
> >                         while (<$fh>) {
> >                                 print;
> >                         }
> >                         close $fh;
> >
> >                         $factory->remove_rid($rid);
> >                 }
> >         }
> > }
> > 
> > regards
> > Hubert
> > 
> > PS: are you using the composition based statistics parameter with your
> > blast search?
> > if yes, is it working?

From cjfields at uiuc.edu  Wed Jan 18 16:17:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Jan 18 16:14:03 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: <1137606909.18560.17.camel@gotenks.zfighters>
Message-ID: <001401c61c74$a274b760$15327e82@pyrimidine>

I have had the same problem using a script I wrote.  It worked until ~4 days
ago.  Luckily, I had saved a copy of some of my old searches in a temp
folder so I can compare them.

I noticed that if I just save the output using:

$factory->save_output('temp.out');

it works (just like Barry's script), but if I have the following in a loop
(like in RemoteBlast POD), it craps out:

while ( my @rids = $factory->each_rid ) {
	foreach my $rid ( @rids ) {
		my $rc = $factory->retrieve_blast($rid);	
		# $rc is a plain status code, not a report object: results are not ready
		if( !ref($rc) ) {
			# remove if RID is bad (error)
			if( $rc < 0 ) {
				$factory->remove_rid($rid);
			}
			print STDERR "." if ( $v > 0 ); 
			sleep 5; 
		} else { # RID is returned
			my $result = $rc->next_result();
			# save the output
			my $filename = $result->query_name()."\.blastp";
			$factory->save_output($filename);
			# remove RID from list
			$factory->remove_rid($rid);
			...

When I change the following:

my $filename = $result->query_name()."\.blastp";

to 

my $filename = "temp.blastp";

and comment out the 'my $result = $rc->next_result()' line, it works again,
so possibly SearchIO?

The only difference I noticed is that older output has this:
_______________________________________________________________________

BLASTP 2.2.12 [Aug-07-2005]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

RID: 1131470802-26518-118666159798.BLASTQ3

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples 
           3,023,944 sequences; 1,040,428,944 total letters
Query=  NP_249094 transcriptional regulator PyrR [Pseudomonas aeruginosa
PAO1].
          (170 letters)
....

_______________________________________________________________________

And new output has this:
_______________________________________________________________________
BLASTP 2.2.13 [Nov-27-2005]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

RID: 1137614458-7828-16730336973.BLASTQ4

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           3,228,386 sequences; 1,108,137,318 total letters
Query=  NP_249094 pyrimidine regulatory protein PyrR [Pseudomonas aeruginosa
PAO1].
Length=170
....
_______________________________________________________________________

There is a change in the line for the length.  Is this enough to break
SearchIO::Blast?  
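
To make the difference concrete: a parser that only looks for the old
"(170 letters)" line would not find a query length in the new "Length=170"
form. The sketch below is not the SearchIO::blast code; query_length is a
hypothetical helper, written only to show a pattern that accepts both of the
layouts quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: pull the query length out of
# a BLAST text report header in either the old or the new layout.
sub query_length {
    my ($header) = @_;

    # Old layout (BLASTP 2.2.12):  "          (170 letters)"
    if ( $header =~ /^\s*\(([\d,]+)\s+letters\)/m ) {
        ( my $len = $1 ) =~ tr/,//d;    # drop thousands separators
        return $len;
    }

    # New layout (BLASTP 2.2.13):  "Length=170"
    if ( $header =~ /^Length=(\d+)/m ) {
        return $1;
    }

    return;    # length line not recognised
}

my $old = "Query=  NP_249094 transcriptional regulator PyrR\n          (170 letters)\n";
my $new = "Query=  NP_249094 pyrimidine regulatory protein PyrR\nLength=170\n";

print "old: ", query_length($old), "\n";
print "new: ", query_length($new), "\n";
```

Whether this particular line is what trips up the real parser is still only a
guess; the sketch just shows that the two headers need different patterns.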

I think Jason is right; maybe NCBI has messed with text output and it's now
breaking the BLAST parser:

http://portal.open-bio.org/pipermail/bioperl-l/2005-November/020067.html

I may try switching over to XML output to see what happens.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From jason.stajich at duke.edu  Wed Jan 18 16:30:02 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Wed Jan 18 17:01:09 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: <001401c61c74$a274b760$15327e82@pyrimidine>
References: <001401c61c74$a274b760$15327e82@pyrimidine>
Message-ID: 

You may need to start requesting XML instead of plain text - NCBI may
have finally done what they warned about
(http://bioperl.org/pipermail/bioperl-l/2005-September/019687.html).

You can see information here about getting XML.

http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
http://bioperl.open-bio.org/wiki/Module:Bio::Tools::Run::RemoteBlast
http://bioperl.open-bio.org/wiki/NCBI_Blast_email

We'll officially announce the new news and wiki site at the end
of the month, when we switch to a permanent URL, but I suspect this
question needs a pointer now.  Feel free to add this question and answer
to the FAQ as well: http://bioperl.open-bio.org/wiki/FAQ

-jason

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From nagesh.chakka at anu.edu.au  Wed Jan 18 20:37:28 2006
From: nagesh.chakka at anu.edu.au (Nagesh)
Date: Wed Jan 18 20:34:08 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: 
References: 
Message-ID: <1137634648.5305.36.camel@vogon>

Thanks very much to all, especially Barry and Hubert, for their time in
answering my query. Here are some updates on my problem.

I have performed some diagnostic tests and describe my observations
below.

First of all, the problem in the code was that it was not waiting for
the results to be ready before writing them to the output file. So I wanted
to check whether the condition "if( !ref($rc) )" is ever satisfied, and I
printed out the $rc value, which was something like
"Bio::SearchIO::blast=HASH(0x9010370)". When I looked at the Bioperl
documentation for RemoteBlast.pm, my understanding was that
"$rc = $factory->retrieve_blast($rid);" should return either 0 or 1. I am
not able to tell whether what I am getting is right.
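
For reference, the polling loop used in the scripts in this thread branches
on the return value of retrieve_blast() in three ways: a negative number is
treated as a failed RID, any other non-reference as "still running", and a
reference as the finished report object. A value such as
"Bio::SearchIO::blast=HASH(...)" is therefore the "ready" case as far as the
loop is concerned, whether or not NCBI has really finished. Below is a
minimal sketch of that branching. MockFactory, its canned replies, and
poll_all are made up for illustration so that the loop runs without
contacting NCBI; they are not part of RemoteBlast.

```perl
use strict;
use warnings;

# Made-up stand-in for the real factory: hands back a canned sequence of
# replies per RID so the branching can be exercised offline.
package MockFactory;

sub new {
    my ( $class, %replies ) = @_;
    return bless { replies => {%replies} }, $class;
}
sub each_rid   { my ($self) = @_; return sort keys %{ $self->{replies} }; }
sub remove_rid { my ( $self, $rid ) = @_; delete $self->{replies}{$rid}; return; }

sub retrieve_blast {
    my ( $self, $rid ) = @_;
    my $queue = $self->{replies}{$rid};
    # hand out replies in order; keep repeating the last one
    return @$queue > 1 ? shift @$queue : $queue->[0];
}

package main;

# Poll every RID until it either fails or yields a report object.
sub poll_all {
    my ( $factory, $delay ) = @_;
    my %outcome;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref($rc) ) {
                if ( $rc < 0 ) {    # error code: give up on this RID
                    $outcome{$rid} = 'failed';
                    $factory->remove_rid($rid);
                }
                sleep $delay;       # not ready (or just failed): wait
            }
            else {                  # a reference: treated as the report
                $outcome{$rid} = ref($rc);
                $factory->remove_rid($rid);
            }
        }
    }
    return \%outcome;
}

my $factory = MockFactory->new(
    RID_A => [ 0, 0, bless( {}, 'FakeReport' ) ],    # two waits, then ready
    RID_B => [-1],                                    # immediate error
);
my $outcome = poll_all( $factory, 0 );
print "$_: $outcome->{$_}\n" for sort keys %$outcome;
```

The loop only ends for a RID once it is removed, so a factory that returns an
object too early will make the script stop waiting too early as well.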

Secondly, I manually forced the script to wait between submit_blast,
retrieve_blast and save_output by using sleep with values ranging from
30 to 600. None of them were successful in saving the output.

When sleep(600) is between submit_blast and retrieve_blast, the
following is printed to standard output (shown below is part of the output),
with the output file still empty.

Request ID  1137626804-16566-100302560340.BLASTQ4
Status Searching
Submitted at Wed Jan 18 18:26:44 2006
Current time Wed Jan 18 18:36:46 2006
Time since submission
00:10:01

This page will be automatically updated in 10 seconds
until search is done

When sleep(600) is between retrieve_blast and save_output, the
following is printed, with nothing written to the output file.

Request ID  1137632221-28820-85178967709.BLASTQ1
Status Searching
Submitted at Wed Jan 18 19:57:01 2006
Current time Wed Jan 18 19:57:03 2006
Time since submission
00:00:01

This page will be automatically updated in 10 seconds
until search is done

Please note the difference in time since submission.
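
One possible guard is to check what was actually retrieved before treating it
as a report. The sketch below looks for the phrases that appear in the status
pages quoted above. looks_like_status_page is a hypothetical helper, and the
phrases it matches are taken from that output, not from any NCBI
specification.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the text looks like the "still searching"
# status page shown above rather than a finished BLAST report.
sub looks_like_status_page {
    my ($text) = @_;
    return 1 if $text =~ /^\s*Status\s+Searching\b/mi;
    return 1 if $text =~ /This page will be automatically updated/i;
    return 0;
}

my $status_page = <<'END';
Request ID  1137626804-16566-100302560340.BLASTQ4
Status Searching
Submitted at Wed Jan 18 18:26:44 2006
END

my $report = <<'END';
BLASTP 2.2.13 [Nov-27-2005]
Query=  NP_249094 pyrimidine regulatory protein PyrR
Length=170
END

for my $sample ( $status_page, $report ) {
    print looks_like_status_page($sample) ? "status page\n" : "report\n";
}
```

A script could run such a check on the saved file and keep the RID in the
queue, sleeping and retrying, whenever a status page comes back.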

Lastly, I printed out the request ID and manually paused the script
by using <STDIN> between submit_blast and retrieve_blast. The idea was
to check the status of the job online through the NCBI website. When the
results were ready, I let the script proceed and was able
to save the desired results to the file. I am puzzled by this
observation, as I do not understand why manually formatting the results
online helps in getting the results.
I am basically a molecular biologist trying hard to solve this
computational stuff, so there might be some issues that seem trivial to
you computer whizzes :)

Barry suggested that I use the perl debugger, which I will try.

Thanks for your attention.

Below is the code which was being tested. 

########################################################################

use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

print "$Bio::Root::Version::VERSION\n";
my $prog  = 'blastp';
my $db    = 'swissprot';
my $e_val = '1e-10';

my @params = ( '-prog'       => $prog,
               '-data'       => $db,
               '-expect'     => $e_val,
               '-readmethod' => 'SearchIO' );

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

# change a parameter
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';

# remove a parameter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

# $v just turns the progress messages on and off
my $v = 1;

my $r = $factory->submit_blast('blastInput.txt');

print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid )
{
        foreach my $rid ( @rids )
        {
                print "RID $rid\n";

                #<STDIN>;
                #sleep 600;
                my $rc = $factory->retrieve_blast($rid);

                print "RC $rc\n";
                if( !ref($rc) )
                {
                        if( $rc < 0 )
                        {
                                $factory->remove_rid($rid);
                        }
                        print STDERR "." if ( $v > 0 );
                        sleep 5;
                }
                else
                {
                        sleep 600;
                        $factory->save_output('temp.out');
                        my $checkinput = $factory->file;
                        open(my $fh, '<', $checkinput) or die $!;
                        while(<$fh>)
                        {
                                print;
                        }
                        close $fh;
                        $factory->remove_rid($rid);
                }
        }
}

########################################################################

On Tue, 2006-01-17 at 16:03 -0700, Barry Moore wrote:
> Nagesh,
> 
> Attached is an input file, script and output.  These work for me, and I
> think they are the same that you are using.  Have a look and see if you
> can find any differences that might be causing your problem.  Other than
> that I don't know what to tell you.  If you are familiar with the perl
> debugger (and if you're not, now's probably a good time to become
> familiar with it), you should step through your script and be sure that
> all of your objects are getting defined when they are supposed to be.
> That can often help narrow down the problem.
> 
> Barry
> 
> > -----Original Message-----
> > From: Nagesh Chakka [mailto:nagesh.chakka@anu.edu.au]
> > Sent: Tuesday, January 17, 2006 1:57 PM
> > To: Barry Moore
> > Cc: Hubert Prielinger; bioperl-l@bioperl.org
> > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > 
> > Hi Barry,
> > With the help of Hubert, I further modified the script but still have
> > the same problem. From the point of submitting the blast query, the
> > script does not wait until the blast results are ready for retrieval;
> > submission is immediately followed by retrieving and saving the
> > output. Since the results will not be ready this fast (about a second),
> > the output created is blank. I am able to retrieve the results online
> > using the RID, which I am making the script print.
> > So my main problem is making the program wait after submitting the
> > query.
> > My input file has a single FASTA sequence, which I have pasted below.
> > It's interesting to note that the script works on your system. Is it
> > creating an output file with the blast report?
> > Thanks very much for your attention.
> > Regards
> > Nagesh
> > 
> > blastInput.txt
> > >MusDpl
> > MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
> > ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
> > RVAVDQPAMVCLLGFVWFIVK
> > 
> > On Wednesday 18 January 2006 05:34, Barry Moore wrote:
> > > Nagesh-
> > >
> > > Did you get this figured out?  Your script works as is on my system.
> > > You say temp.out is empty?  What does your input sequence
> > > (blastInput.txt) look like?
> > >
> > > Barry
> > >
> > > > -----Original Message-----
> > > > From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> > > > bounces@portal.open-bio.org] On Behalf Of Hubert Prielinger
> > > > Sent: Monday, January 16, 2006 2:54 PM
> > > > To: Nagesh Chakka; bioperl-l@portal.open-bio.org
> > > > Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> > > >
> > > > Nagesh Chakka wrote:
> > > > >Hi All,
> > > > >I was trying to set up a system to perform a remote blast on a regular
> > > > >basis. I thought this could be best achieved by using a BioPerl module
> > > > >and came across RemoteBlast.pm.
> > > > >I modified the sample script "bp_remote_blast.pl", which takes a file
> > > > >containing a single FASTA sequence as input. I also wanted the blast
> > > > >report to be saved in a file for later use, and modified the code as
> > > > >follows.
> > > > >I am using the latest version of Bioperl (1.5) on a Fedora platform.
> > > > >
> > > > >#######################################################################
> > > > >
> > > > >print "$Bio::Root::Version::VERSION\n";
> > > > >use Bio::Tools::Run::RemoteBlast;
> > > > >use strict;
> > > > >my $prog = 'blastp';
> > > > >my $db   = 'swissprot';
> > > > >my $e_val= '1e-10';
> > > > >
> > > > >my @params = ( '-prog' => $prog,
> > > > >       '-data' => $db,
> > > > >       '-expect' => $e_val,
> > > > >       '-readmethod' => 'SearchIO' );
> > > > >
> > > > >my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > > > >
> > > > >#change a parameter
> > > > >$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> > > > >
> > > > >#remove a parameter
> > > > >delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> > > > >
> > > > >my $v = 1;
> > > > >#$v is just to turn on and off the messages
> > > > >
> > > > >my $r = $factory->submit_blast('blastInput.txt');
> > > > >
> > > > >print STDERR "waiting..." if( $v > 0 );
> > > > >while ( my @rids = $factory->each_rid )
> > > > >{
> > > > >        foreach my $rid ( @rids )
> > > > >        {
> > > > >                my $rc = $factory->retrieve_blast($rid);
> > > > >                if( !ref($rc) )
> > > > >                {
> > > > >                        if( $rc < 0 )
> > > > >                        {
> > > > >                                $factory->remove_rid($rid);
> > > > >                        }
> > > > >                        print STDERR "." if ( $v > 0 );
> > > > >                        sleep 5;
> > > > >                }
> > > > >                else
> > > > >                {
> > > > >                        print "RID $rid\n";
> > > > >                        $factory->save_output('temp.out');
> > > > >                        $factory->remove_rid($rid);
> > > > >                }
> > > > >        }
> > > > >}
> > > > >#######################################################################
> > > > >
> > > > >This script prints the RID and terminates immediately. Obviously the
> > > > >output file created is empty, as the program did not wait to get the
> > > > >blast results for the RID.
> > > > >Is there something I am doing wrong, and what can I do to make the
> > > > >program wait until the results are ready to be printed to the output
> > > > >file? I could not get much information from the documentation and have
> > > > >no prior experience with Bioperl.
> > > > >Thanks very much for your attention.
> > > > >Regards
> > > > >Nageshbi
> > > > >_______________________________________________
> > > > >Bioperl-l mailing list
> > > > >Bioperl-l@portal.open-bio.org
> > > > >http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > > >
> > > > hi nagesh,
> > > > try this, should work, I had the same problem:
> > > >
> > > > .......................
> > > > .......................
> > > >
> > > >                 else
> > > >                 {
> > > >                         print "RID $rid\n";
> > > >                         $factory->save_output('temp.out');
> > > >
> > > >                         my $checkinput = $factory->file;
> > > >                         open(my $fh,"<$checkinput") or die $!;
> > > >                         while(<$fh>){
> > > >                                 print;
> > > >                         }
> > > >                         close $fh;
> > > >
> > > >                         $factory->remove_rid($rid);
> > > >                 }
> > > >         }
> > > > }
> > > >
> > > > regards
> > > > Hubert
> > > >
> > > > PS: are you using the composition-based statistics parameter with
> > > > your blast search? If yes, is it working?
> > > >
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l@portal.open-bio.org
> > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu  Wed Jan 18 23:04:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Jan 18 23:00:59 2006
Subject: [Bioperl-l] XML output from RemoteBlast
Message-ID: 

Is there any known way to save XML-formatted BLAST queries from  
RemoteBlast?  Changing the FORMAT_TYPE in the retrieval header to  
anything other than 'Text' gives a blank output file.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From barry.moore at genetics.utah.edu  Thu Jan 19 00:15:06 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu Jan 19 00:11:02 2006
Subject: [Bioperl-l] Trouble using RemoteBlast.pm
In-Reply-To: <1137634648.5305.36.camel@vogon>
References: 
	<1137634648.5305.36.camel@vogon>
Message-ID: <88489B2C-0C4B-46B2-ACB6-247990E30AB6@genetics.utah.edu>

Nagesh,

That does sound odd.  What version of bioperl are you using?  I'm
guessing 1.4?  If the answer is anything but 1.5 something, then I
suggest you upgrade before going any further.  You will also
want to follow the current thread about parsing XML formatted
blast reports.  I don't think this is your problem right now, but
eventually you'll have a problem if you aren't parsing XML format as
discussed in that post.  If you are having the problem with 1.5, try
some debugging; I've added some more detail below.

Here's what's going on (or should be going on) in your script, and  
some suggestions for using the debugger.

#This next line hits the NCBI server, and if it gets a blast report
#in return, parses it and returns a Bio::SearchIO::blast object.  If
#there was no report you get 0, and if there was an error you get -1.

     my $rc = $factory->retrieve_blast($rid);

     print "RC $rc\n";

#This if statement is checking to see if the server has NOT returned
#a report yet.  If it did, then $rc should be an object and ref $rc
#will return 'Bio::SearchIO::blast'.  If $rc is not an object (i.e.
#you got no report) then ref $rc returns an empty string, which is false.
     if( !ref($rc) )
     {
#If you got here then you got no report from the NCBI server yet, so
#the next if checks whether you got -1, meaning there was an error.  On
#error, delete this RID because it's no good.
             if( $rc < 0 )
             {
                     $factory->remove_rid($rid);
             }
#Print a dot on the screen in lieu of music to keep the user
#entertained while they wait.
             print STDERR "." if ( $v > 0 );
#Take a nap so you don't piss off the NCBI sys admin!
             sleep 5;
     }
#Getting here means that $rc was an object, so we've got a report.
#Go ahead and save it.
     else
     {
             sleep 600;
#Obviously writing your output file.
             $factory->save_output('temp.out');
             my $checkinput = $factory->file;
             open(my $fh,"<$checkinput") or die $!;
             while(<$fh>)
             {
                     print;
             }
             close $fh;
             $factory->remove_rid($rid);
     }

Run your script in the debugger like this:

perl -d your_script.pl

Step forward one line at a time by typing 'n'.
When you get just past my $rc = $factory->retrieve_blast($rid); type
'x $rc'.
You should get 0, -1 or a 'Bio::SearchIO::blast' object.
Keep stepping forward with 'n'.
If you get 0 you should loop back to retrieve_blast after a sleep.
If you get -1 you should end your script - you got an error (what was
it?).
If you get a Bio::SearchIO::blast object then you should be writing
a temp.out.

Barry
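
The three outcomes described above (0, -1, or an object) can be told apart with a small helper. What follows is only a minimal sketch: classify_rc and poll_until_done are hypothetical names, not part of RemoteBlast.pm, and the code reference stands in for a call such as $factory->retrieve_blast($rid) so the logic can be tried without contacting NCBI.

```perl
use strict;
use warnings;

# Hypothetical helper: map a retrieve_blast-style return value to a state.
sub classify_rc {
    my ($rc) = @_;
    return 'ready' if ref $rc;    # an object: a report came back
    return 'error' if $rc < 0;    # -1: the request failed
    return 'waiting';             # 0: the search has not finished yet
}

# Hypothetical polling loop. $fetch is a code ref standing in for
# sub { $factory->retrieve_blast($rid) }; $max_tries caps the wait and
# $delay is the number of seconds to sleep between attempts.
sub poll_until_done {
    my ($fetch, $max_tries, $delay) = @_;
    for ( 1 .. $max_tries ) {
        my $rc    = $fetch->();
        my $state = classify_rc($rc);
        return ($state, $rc) if $state ne 'waiting';
        sleep $delay if $delay;
    }
    return ('timeout', undef);
}
```

Capping the number of attempts is a design choice of this sketch; the loop in the script above keeps polling for as long as each_rid returns something.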

On Jan 18, 2006, at 6:37 PM, Nagesh wrote:

> Thanks very much to all, especially to Barry and Hubert, for their
> time in answering my query. Some updates on my problem.
>
> I have performed some diagnostic tests and am writing my observations
> below.
>
> First of all, the problem in the code was that it was not waiting for
> the results to be ready before writing them to the output file. So I
> wanted to check whether the condition "if( !ref($rc) )" is ever
> satisfied, and I printed out the $rc value, which was something like
> "Bio::SearchIO::blast=HASH(0x9010370)". When I looked at the Bioperl
> documentation for RemoteBlast.pm, the value for $rc in
> "$rc = $factory->retrieve_blast($rid);" should be either 0 or 1. I am
> not able to understand whether what I am getting is right.
>
> Secondly, I manually forced the script to wait between submit_blast,
> retrieve_blast and save_output by using sleep with values ranging from
> 30 to 600. None of them were successful in saving the output.
>
> When sleep (600) is between submit_blast and retrieve_blast, the
> following is printed to standard output (shown below is part of the
> output) with the output file still empty.
>
> Request ID: 1137626804-16566-100302560340.BLASTQ4
> Status: Searching
> Submitted at: Wed Jan 18 18:26:44 2006
> Current time: Wed Jan 18 18:36:46 2006
> Time since submission: 00:10:01
> This page will be automatically updated in 10 seconds until search is done
>
> When sleep (600) is between retrieve_blast and save_output, the
> following is printed with nothing written to output file.
>
> Request ID: 1137632221-28820-85178967709.BLASTQ1
> Status: Searching
> Submitted at: Wed Jan 18 19:57:01 2006
> Current time: Wed Jan 18 19:57:03 2006
> Time since submission: 00:00:01
> This page will be automatically updated in 10 seconds until search is done
>
> Please note the difference in time since submission.
>
> Lastly, I printed out the request ID and manually paused the script
> by using <STDIN> between submit_blast and retrieve_blast. The idea was
> to check the status of the job online through the NCBI website. When
> the results were ready, I made the script proceed further and was able
> to save the desired results to the file. I am puzzled by this
> observation, as I do not understand why manually formatting the
> results online helps in getting the results.
> I am basically a molecular biologist trying hard to solve this
> computational stuff, so these might be trivial issues to you computer
> wizards :)
>
> Barry suggested that I use the perl debugger, which I will try.
>
> Thanks for your attention.
>
> Below is the code which was being tested.
>
> ########################################################################
>
> use strict;
> use warnings;
> use Bio::Tools::Run::RemoteBlast;
>
> print "$Bio::Root::Version::VERSION\n";
> my $prog = 'blastp';
> my $db   = 'swissprot';
> my $e_val= '1e-10';
>
> my @params = ( '-prog' => $prog,
>        '-data' => $db,
>        '-expect' => $e_val,
>        '-readmethod' => 'SearchIO' );
>
> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>
> #change a parameter
> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
>
> #remove a parameter
> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>
> my $v = 1;
> #$v is just to turn on and off the messages
>
> my $r = $factory->submit_blast('blastInput.txt');
>
> print STDERR "waiting..." if( $v > 0 );
> while ( my @rids = $factory->each_rid )
> {
>         foreach my $rid ( @rids )
>         {
>                 print "RID $rid\n";
>
>                 #<STDIN>;
>                 #sleep 600;
>                 my $rc = $factory->retrieve_blast($rid);
>
>                 print "RC $rc\n";
>                 if( !ref($rc) )
>                 {
>                         if( $rc < 0 )
>                         {
>                                 $factory->remove_rid($rid);
>                         }
>                         print STDERR "." if ( $v > 0 );
>                         sleep 5;
>                 }
>                 else
>                 {
>                         sleep 600;
>                         $factory->save_output('temp.out');
>                         my $checkinput = $factory->file;
>                         open(my $fh,"<$checkinput") or die $!;
>                         while(<$fh>)
>                         {
>                                 print;
>                         }
>                         close $fh;
>                         $factory->remove_rid($rid);
>                 }
>         }
> }
>
> ########################################################################
>
>

From heikki at sanbi.ac.za  Thu Jan 19 01:18:17 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu Jan 19 01:30:29 2006
Subject: [Bioperl-l] Bio::Taxonomy::{Tree&Node} testing
Message-ID: <200601190818.17885.heikki@sanbi.ac.za>

Dan,

I've committed a preliminary test file called TaxonTree.t to the bioperl main
trunk. Could you check it and correct it where needed?

I got quite confused about the Node methods. What they are supposed to do and 
return was not quite clear. I did fix one mistake where description was stored 
in the same place as descendants.

I hope you are still interested in working on these modules.

	-Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________
From nagesh.chakka at anu.edu.au  Thu Jan 19 02:22:17 2006
From: nagesh.chakka at anu.edu.au (Nagesh)
Date: Thu Jan 19 02:18:42 2006
Subject: [Bioperl-l] RemoteBlast.pm problem resolved!!!!!
In-Reply-To: <88489B2C-0C4B-46B2-ACB6-247990E30AB6@genetics.utah.edu>
References: 
	<1137634648.5305.36.camel@vogon>
	<88489B2C-0C4B-46B2-ACB6-247990E30AB6@genetics.utah.edu>
Message-ID: <1137655337.5305.73.camel@vogon>

Hi Barry,
Thanks once again for the detailed mail and explanation. I am using the
latest version of BioPerl, 1.5. I also tested this problem on 1.4 with no
difference. The problem is with "$rc = $factory->retrieve_blast($rid);",
where $rc was always getting an object back from retrieve_blast, so the
script never entered the sleep 5 branch (the condition
"if( !ref($rc) )" is never satisfied).

I thought I would have a look at the RemoteBlast.pm code before
trying anything more. I looked at the method retrieve_blast, which was
the main culprit, and found a possible answer to my problem. I
looked at the condition that decides whether 0, -1 or an object is
returned, which is below.

Code from Bio/Tools/Run/RemoteBlast.pm version 1.5 line 569-560
#########################################################
		my $size = -s $tempfile;
		if( $size > 1000 ) {
#########################################################

So I made it print the file size and ran my perl script several
times:

#########################################################
		my $size = -s $tempfile;
		print "Size of temporary file from RemoteBlast.pm $size\n";
		if( $size > 1000 ) {
#########################################################

Each time I did so, I got a file size of 2014 to 2017 bytes, so it is
no wonder that the condition ($size > 1000) is satisfied even when the
results are not ready.

So I modified the condition to the following
#########################################################
		my $size = -s $tempfile;
		if( $size > 2017 ) {
#########################################################

and there it goes: the code behaved itself and waited until the results
were ready before proceeding to save the output.
This may be the result of some change the NCBI admins made to
the results status page, which increased its size so that it
satisfies the condition for returning an object, something that should
happen only when the results are ready.
I am not sure whether this is the right answer to the problem, but it
definitely works.
Any comments from people having a similar problem would be useful. I will
see how long this solution works and knock on your doors again
if I need further help.
Thanks for your help.
Regards
Nagesh
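
Rather than tuning the byte threshold, the check could look at what the server actually sent back. The following is a minimal sketch under a stated assumption: that the "still searching" page always contains the sentence "This page will be automatically updated", as in the status output quoted in this thread. The name looks_like_status_page is hypothetical and is not part of RemoteBlast.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: decide from the text itself whether NCBI returned
# its "still searching" page instead of a finished report.
# Assumption: the waiting page contains the sentence matched below.
sub looks_like_status_page {
    my ($text) = @_;
    return $text =~ /This page will be automatically updated/ ? 1 : 0;
}
```

A size test such as -s $tempfile > 1000 stops working whenever the waiting page grows, which is what the 2014 to 2017 byte measurements suggest happened; a content test only depends on the wording of the page.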

On Wed, 2006-01-18 at 22:15 -0700, Barry Moore wrote:
> Nagesh,
> 
> That does sound odd.  What version of bioperl are you using?  I'm  
> guessing 1.4?  If the answer is anything but 1.5 something, then I  
> suggest you should upgrade before going any further.  You will also  
> want to follow the current thread by about parsing XML formatted  
> blast reports.  I don't think this is your problem right now, but  
> eventually you'll have a problem if you aren't parsing XML format as  
> discussed in that post.  I've added some more detail below if you are  
> having the problem with 1.5 try some debugging.
> 
> Here's what's going on (or should be going on) in your script, and  
> some suggestions for using the debugger.
> 
> #This next line hits the NCBI server, and if it gets a blast report  
> in return parses it, and returns a Bio::Tools::Blast object.  If  
> there was no report you get 0, and if there was an error you get -1.
> 
>      my $rc = $factory->retrieve_blast($rid);
> 
>      print "RC $rc\n";
> 
> #This if statement is checking to see if the server has NOT returned  
> a report yet.  If it did then $rc should be an object and ref $rc  
> will return 'Bio::SearchIIO::blast'.  If $rc is not an object (i.e.  
> you got no report) then ref $rc returns undef.
>                  if( !ref($rc) )
>                  {
> #If you got here then you got no report from NCBI server yet, and so  
> the next if check is you got -1 meaning there was an error.  On error  
> delete this RID cause it's no good.
>                          if( $rc < 0 )
>                          {
>      				$factory->remove_rid($rid);
>                          }
> #Print a dot on the screen in leu of music to keep the user  
> entertained while they wait.
>                          print STDERR "." if ( $v > 0 );
> #Take a nap so you don't piss off NCBI sys admin!
>                          sleep 5;
>      }
> #Getting here means that $rc was an object, so we've got a report.   
> Go ahead and save it.
>                  else
>                  {
>      sleep 600;
> #Obviously writing your output file.
>      $factory->save_output('temp.out');
>      my $checkinput = $factory->file;
>                      open(my $fh,"<$checkinput") or die $!;
>                      while(<$fh>)
> {
>                               print;
>                          }
>                           close $fh;
>      $factory->remove_rid($rid);
> 
> 
> run your script in the debugger like this:
> 
> perl -d your_script.pl
> 
> Step forward one line at a time by typing 'n'.
> When you get just past my $rc = $factory->retrieve_blast($rid); type  
> 'x $rc'
> You should get 0, -1 or 'Bio::SearchIO::blast'
> Keep stepping forward with 'n'.
> If you get 0 you should loop back to retrieve_blast after a sleep.
> If you get -1 you should end your script - you got an error (What was  
> it?)
> If you get an Bio::SearchIO::blast object then you should be writing  
> a temp.out
> 
> Barry
> 
> 
> On Jan 18, 2006, at 6:37 PM, Nagesh wrote:
> 
> > Thanks very much to all specially to Barry and Hubert for their  
> > time in
> > answering my query. Some updates into my problem.
> >
> > I have performed some diagnostics tests and writing below my
> > observations.
> >
> > First of all, the problem in the code was that it was not waiting for
> > the results to be ready for writing it to the output file. So I wanted
> > to check whether the condition "if( !ref($rc) )" is ever satisfied  
> > and I
> > printed out the $rc value which was some thing like "Bio::SearchIO::
> > blast=HASH(0x9010370)". When I had looked at the Bioperl documentation
> > for RemoteBlast.pm, the value for $rc in "$rc = $factory- 
> > >retrieve_blast
> > ($rid);" should either return 0 or 1. I am not able to understand
> > whether what I am getting is right.
> >
> > Secondly, I had manually forced the script to wait between  
> > submit_blast,
> > retrieve_blast and save_output by using sleep with values ranging from
> > 30 to 600. None of them where successful in saving the output.
> >
> > When sleep (600) is between submit_blast and retrieve_blast, the
> > following is printed onto std output (shown below is part of the  
> > output)
> > with output file still empty.
> >
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
Request ID  1137626804-16566-100302560340.BLASTQ4 > b>
Status Searching
Submitted at Wed Jan 18 18:26:44 2006
Current time Wed Jan 18 18:36:46 2006
Time since submission 00:10:01
> > 
This page will be automatically updated in 10 seconds
> > until search is done

> >
> > When sleep (600) is between retrieve_blast and save_output, the
> > following is printed with nothing written to output file.
> >
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
Request ID  1137632221-28820-85178967709.BLASTQ1 > b>
Status Searching
Submitted at Wed Jan 18 19:57:01 2006
Current time Wed Jan 18 19:57:03 2006
Time since submission 00:00:01
> > 
This page will be automatically updated in 10 seconds
> > until search is done

> >
> > Please note the difference in time since submission.
> >
> > Lastly, I had printed out the request ID and manually paused the  
> > script
> > by using  between submit_blast and retrieve_blast. The idea was
> > to check the status of the job online through the NCBI website.  
> > When the
> > results where ready, I made the script to proceed further and was able
> > to save the desired results to the file. I am puzzled with this
> > observation as I am not understanding why manually formating the  
> > results
> > online helps in getting the results.
> > I am basically a molecular biologist and trying hard to solve this
> > computational stuff, so there might be some trivial issues  
> > according to
> > you computer wiz :)
> >
> > Barry suggested me to use perl debugger which I will try to use.
> >
> > Thanks for your attention.
> >
> > Below is the code which was being tested.
> >
> > ###################################################################### 
> > ##
> >
> > use strict;
> > use warnings;
> > use Bio::Tools::Run::RemoteBlast;
> >
> > print "$Bio::Root::Version::VERSION\n";
> > my $prog = 'blastp';
> > my $db   = 'swissprot';
> > my $e_val= '1e-10';
> >
> > my @params = ( '-prog' => $prog,
> >        '-data' => $db,
> >        '-expect' => $e_val,
> >        '-readmethod' => 'SearchIO' );
> >
> > my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> >
> > #change a parameter
> > $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> >
> > #remove a parameter
> > delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> >
> > my $v = 1;
> > #$v is just to turn on and off the messages
> >
> > my $r = $factory->submit_blast('blastInput.txt');
> >
> > print STDERR "waiting..." if( $v > 0 );
> > while ( my @rids = $factory->each_rid )
> > {
> >     foreach my $rid ( @rids )
> >     {
> >         print "RID $rid\n";
> >
> >         #<STDIN>;
> >         #sleep 600;
> >         my $rc = $factory->retrieve_blast($rid);
> >
> >         print "RC $rc\n";
> >         if( !ref($rc) )
> >         {
> >             if( $rc < 0 )
> >             {
> >                 $factory->remove_rid($rid);
> >             }
> >             print STDERR "." if ( $v > 0 );
> >             sleep 5;
> >         }
> >         else
> >         {
> >             sleep 600;
> >             $factory->save_output('temp.out');
> >             my $checkinput = $factory->file;
> >             open(my $fh,"<$checkinput") or die $!;
> >             while(<$fh>)
> >             {
> >                 print;
> >             }
> >             close $fh;
> >             $factory->remove_rid($rid);
> >         }
> >     }
> > }
> >
> > ########################################################################
> >
> >
> > On Tue, 2006-01-17 at 16:03 -0700, Barry Moore wrote:
> >> Nagesh,
> >>
> >> Attached is an input file, script and output.  These work for me, and I
> >> think they are the same as the ones you are using.  Have a look and see
> >> if you can find any differences that might be causing your problem.
> >> Other than that I don't know what to tell you.  If you are familiar
> >> with the perl debugger (and if you're not, now's probably a good time
> >> to become familiar with it) you should step through your script and be
> >> sure that all of your objects are getting defined when they are
> >> supposed to be.  That can often help narrow down the problem.
> >>
> >> Barry
> >>
> >>> -----Original Message-----
> >>> From: Nagesh Chakka [mailto:nagesh.chakka@anu.edu.au]
> >>> Sent: Tuesday, January 17, 2006 1:57 PM
> >>> To: Barry Moore
> >>> Cc: Hubert Prielinger; bioperl-l@bioperl.org
> >>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> >>>
> >>> Hi Barry,
> >>> With the help of Hubert, I further modified the script but still have
> >>> the same problem. The problem is that after submitting the blast
> >>> query, the script does not wait until the blast results are ready for
> >>> retrieval; submission is immediately followed by retrieving and saving
> >>> the output. Since the results will not be ready this fast (about a
> >>> sec), the output created is blank. I am able to retrieve the results
> >>> online using the RID, which I am making the script print.
> >>> So my main problem is making the program wait after submitting the
> >>> query.
> >>> My input file has a single FASTA sequence, which I have pasted below.
> >>> It's interesting to note that the script works on your system. Is it
> >>> creating an output file with the blast report?
> >>> Thanks very much for your attention.
> >>> Regards
> >>> Nagesh
> >>>
> >>> blastInput.txt
> >>> >MusDpl
> >>> MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLDIDFGAEGNRYYA
> >>> ANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLWRLIKEICSAKHCDFWLERGAAL
> >>> RVAVDQPAMVCLLGFVWFIVK
> >>>
> >>> On Wednesday 18 January 2006 05:34, Barry Moore wrote:
> >>>> Nagesh-
> >>>>
> >>>> Did you get this figured out?  Your script works as is on my system.
> >>>> You say temp.out is empty?  What does your input sequence
> >>>> (blastInput.txt) look like?
> >>>>
> >>>> Barry
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> >>>>> bounces@portal.open-bio.org] On Behalf Of Hubert Prielinger
> >>>>> Sent: Monday, January 16, 2006 2:54 PM
> >>>>> To: Nagesh Chakka; bioperl-l@portal.open-bio.org
> >>>>> Subject: Re: [Bioperl-l] Trouble using RemoteBlast.pm
> >>>>>
> >>>>> Nagesh Chakka wrote:
> >>>>>> Hi All,
> >>>>>> I was trying to set up a system to perform a remote blast on a
> >>>>>> regular basis. I thought this could be best achieved by using a
> >>>>>> BioPerl module and came across RemoteBlast.pm.
> >>>>>> I modified the sample script "bp_remote_blast.pl", which takes a
> >>>>>> file containing a single FASTA sequence as input. I also wanted the
> >>>>>> blast report to be saved in a file for later use, and modified the
> >>>>>> code as follows.
> >>>>>> I am using the latest version of Bioperl (1.5) on a Fedora platform.
> >>>>>>
> >>>>>> #######################################################################
> >>>>>>
> >>>>>> print "$Bio::Root::Version::VERSION\n";
> >>>>>> use Bio::Tools::Run::RemoteBlast;
> >>>>>> use strict;
> >>>>>> my $prog = 'blastp';
> >>>>>> my $db   = 'swissprot';
> >>>>>> my $e_val= '1e-10';
> >>>>>>
> >>>>>> my @params = ( '-prog' => $prog,
> >>>>>>       '-data' => $db,
> >>>>>>       '-expect' => $e_val,
> >>>>>>       '-readmethod' => 'SearchIO' );
> >>>>>>
> >>>>>> my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> >>>>>>
> >>>>>> #change a parameter
> >>>>>> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
> >>>>>>
> >>>>>> #remove a parameter
> >>>>>> delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> >>>>>>
> >>>>>> my $v = 1;
> >>>>>> #$v is just to turn on and off the messages
> >>>>>>
> >>>>>> my $r = $factory->submit_blast('blastInput.txt');
> >>>>>>
> >>>>>> print STDERR "waiting..." if( $v > 0 );
> >>>>>> while ( my @rids = $factory->each_rid )
> >>>>>> {
> >>>>>>        foreach my $rid ( @rids )
> >>>>>>        {
> >>>>>>                my $rc = $factory->retrieve_blast($rid);
> >>>>>>                if( !ref($rc) )
> >>>>>>                {
> >>>>>>                        if( $rc < 0 )
> >>>>>>                        {
> >>>>>>                                $factory->remove_rid($rid);
> >>>>>>                        }
> >>>>>>                        print STDERR "." if ( $v > 0 );
> >>>>>>                        sleep 5;
> >>>>>>                }
> >>>>>>                else
> >>>>>>                {
> >>>>>>                        print "RID $rid\n";
> >>>>>>                        $factory->save_output('temp.out');
> >>>>>>                        $factory->remove_rid($rid);
> >>>>>>                }
> >>>>>>        }
> >>>>>> }
> >>>>>>
> >>>>>> #######################################################################
> >>>>>>
> >>>>>> This script prints the RID and terminates immediately. Obviously the
> >>>>>> output file created is empty, as the program did not wait to get the
> >>>>>> blast results for the RID.
> >>>>>> Is there something I am doing wrong, and what can I do to make the
> >>>>>> program wait until the results are ready to be printed to the output
> >>>>>> file? I could not get much information from the documentation and
> >>>>>> have no prior experience with Bioperl.
> >>>>>> Thanks very much for your attention.
> >>>>>> Regards
> >>>>>> Nagesh
> >>>>>> _______________________________________________
> >>>>>> Bioperl-l mailing list
> >>>>>> Bioperl-l@portal.open-bio.org
> >>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >>>>>
> >>>>> hi nagesh,
> >>>>> try this, should work, I had the same problem:
> >>>>>
> >>>>> .......................
> >>>>> .......................
> >>>>>
> >>>>> else
> >>>>>                 {
> >>>>>                         print "RID $rid\n";
> >>>>>                         $factory->save_output('temp.out');
> >>>>>
> >>>>> 			my $checkinput = $factory->file;
> >>>>>               		open(my $fh,"<$checkinput") or die $!;
> >>>>>               		while(<$fh>){
> >>>>>                 		print;
> >>>>>               		}
> >>>>>               		close $fh;
> >>>>>
> >>>>>
> >>>>> 			$factory->remove_rid($rid);
> >>>>>                 }
> >>>>>         }
> >>>>> }
> >>>>>
> >>>>> regards
> >>>>> Hubert
> >>>>>
> >>>>> PS: are you using the composition based statistics parameter with
> >>>>> your blast search?
> >>>>> if yes, is it working?
> >>>>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
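
The wait loop in the script quoted above retries until retrieve_blast
returns a reference, and has no upper bound on how long it waits. Below is
a minimal sketch of the same polling pattern with an attempt limit. It does
not call BioPerl at all: poll_until_ready and the canned @replies are
hypothetical names used only for illustration, and the convention (0 = not
ready, negative = error, reference = result) is taken from the thread.

```perl
use strict;
use warnings;

# Poll $fetch->() until it returns a reference (a result), a negative
# number (an error), or the attempt budget runs out.
# poll_until_ready is a hypothetical helper, not part of BioPerl.
sub poll_until_ready {
    my (%arg)    = @_;
    my $fetch    = $arg{fetch};
    my $max      = $arg{max_tries} // 60;
    my $interval = $arg{interval}  // 5;

    for my $try (1 .. $max) {
        my $rc = $fetch->();
        return ('ready', $rc) if ref $rc;     # got a result object
        return ('error', $rc) if $rc < 0;     # server reported an error
        sleep $interval if $interval > 0 && $try < $max;
    }
    return ('timeout', undef);
}

# Stand-in for retrieve_blast: "not ready" twice, then a result.
my @replies = (0, 0, { report => 'placeholder report' });
my ($status, $result) = poll_until_ready(
    fetch     => sub { shift @replies },
    max_tries => 10,
    interval  => 0,       # no real waiting in this demonstration
);
print "$status\n";
```

In a real script the fetch callback would wrap
$factory->retrieve_blast($rid), and the interval would stay at several
seconds so the server is not polled too often.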

From ajo11 at mole.bio.cam.ac.uk  Thu Jan 19 07:02:47 2006
From: ajo11 at mole.bio.cam.ac.uk (Amanda O'Reilly)
Date: Thu Jan 19 07:14:41 2006
Subject: [Bioperl-l] SVG treefile problem
Message-ID: <43CF7FE7.2080309@mole.bio.cam.ac.uk>

I am trying to draw SVG format phylogenetic trees - the output tree is 
distorted.

Using checked code & tree from here (code reproduced below also):
http://portal.open-bio.org/pipermail/bioperl-l/2004-April/015581.html
This gives a distorted tree if I leave out 'warn $tree'.
If I leave 'warn $tree' in the code, I get the following error.
Bio::Tree::Tree=HASH(0x606b68) at tree_play.pl line 17,  line 1.

Have tried running with UNIX (BioPerl 1.5) & Linux installations & tried 
viewing tree with different applications- output tree always looks wrong.

Thanks,
Amanda.

#!/usr/local/bin/perl -w
use strict;

use lib '.';
use Bio::TreeIO;
use Data::Dumper;
use SVG::Graph;

my $infile = "/scratch/ajo11/exp/aln/000ms/11.ph";
my $outfile = ">/scratch/ajo11/exp/aln/000ms/11.svg";
my $in = Bio::TreeIO->new(-file => $infile,
                          -format => 'newick');
my $out = Bio::TreeIO->new(-file => $outfile,
                           -format => 'svggraph');

while( my $tree = $in->next_tree ) {
      #warn $tree;
      my $svg_xml = $out->write_tree($tree);
}
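
A note on the 'warn $tree' message quoted above: text of the form
"Class=HASH(0x...)" is what Perl prints when an object reference is used
as a string, so that line is most likely a diagnostic rather than an error
raised by the tree code. The sketch below shows the behaviour with a
hypothetical stand-in class (My::Tree); Bio::TreeIO is not loaded here.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in class; it only exists to produce a blessed hash.
package My::Tree;
sub new { my ($class) = @_; return bless { nodes => [] }, $class; }

package main;

my $tree = My::Tree->new;

# Using a blessed reference as a string yields "Class=HASH(0x...)".
my $as_string = "$tree";
warn "$as_string\n";

# To inspect the contents instead, dump the data structure.
$Data::Dumper::Sortkeys = 1;
my $dump = Dumper($tree);
print $dump;
```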

From supramuk at yahoo.com  Thu Jan 19 00:47:57 2006
From: supramuk at yahoo.com (supratim mukherjee)
Date: Thu Jan 19 08:40:12 2006
Subject: [Bioperl-l] Re: volunteers needed
Message-ID: <20060119054758.91641.qmail@web32406.mail.mud.yahoo.com>

Respected Sir/Madam,

I am a student from Bangalore University, India and
have just finished my masters in Biotechnology. I have
applied for PhD in a few universities in USA and am
awaiting the admission decision.

I would like to contribute to bioperl.org. Please let
me know if any of my contributions would help.

Regards
Supratim

From cjfields at uiuc.edu  Thu Jan 19 11:26:13 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu Jan 19 11:23:02 2006
Subject: [Bioperl-l] RemoteBlast.pm problem resolved!!!!!
In-Reply-To: <1137655337.5305.73.camel@vogon>
Message-ID: <001801c61d15$10fd9210$15327e82@pyrimidine>

This resolves the problem only if you use bioperl 1.5.1.  RemoteBlast.pm was
changed ~fall 2005 and removed the $size variable (as reported here:
http://bugzilla.bioperl.org/show_bug.cgi?id=1864).

The text output will save if you use SearchIO.  However, parsing text
output seems to be broken using SearchIO at the moment, likely due to
modifications in the output that broke SearchIO::blast.

Jason addresses this in the last few emails in this thread.  If you plan on
parsing out data (like accessions or HSPs) from BLAST output, then you may
have to switch to XML, as text or HTML parsing can break at any time.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org [mailto:bioperl-l-
> bounces@portal.open-bio.org] On Behalf Of Nagesh
> Sent: Thursday, January 19, 2006 1:22 AM
> To: Barry Moore; bioperl-l@bioperl.org
> Cc: ganesh.b.chakka@jpmorgan.com
> Subject: Re: [Bioperl-l] RemoteBlast.pm problem resolved!!!!!
> 
> Hi Barry,
> Thanks once again for an elaborate mail and explanation. I am using the
> latest version of BioPerl 1.5. I also tested this problem on 1.4 with no
> difference. The problem is with the "$rc = $factory->retrieve_blast
> ($rid);" where $rc was always getting an object as a return from
> retrieve_blast and is never entering into sleep 5 mode (the condition
> "if( !ref($rc) )" is never satisfied).
> 
> I thought I will have a look at the RemoteBlast.pm code once before
> trying anything more. I looked at the method retrieve_blast which was
> the main culprit and then found a possible answer for my problem. I
> looked at the condition which returns 0, -1 or an object which is below
> 
> Code from Bio/Tools/Run/RemoteBlast.pm version 1.5 line 569-560
> #########################################################
> 		my $size = -s $tempfile;
> 		if( $size > 1000 ) {
> #########################################################
> 
> So I made it to print the file size and had run my perl script several
> times
> 
> #########################################################
> 		my $size = -s $tempfile;
> 		print "Size of temporary file from RemoteBlast.pm $size\n";
> 		if( $size > 1000 ) {
> #########################################################
> 
> Each time I did so, I was getting the file size value of 2014 to 2017
> and no wonder it satisfies the condition ($size > 1000) even when the
> results were not ready.
> 
> So I modified the condition to the following
> #########################################################
> 		my $size = -s $tempfile;
> 		if( $size > 2017 ) {
> #########################################################
> 
> and there it goes, the code behaved itself and waited until the results
> were ready before proceeding with saving the output.
> This may be the result of some change the NCBI admins made to the
> results status page, which increased the file size so that it satisfies
> the condition and returns an object, which should happen only when the
> results are ready.
> I am not sure whether this is the right answer to the problem, but it
> definitely works.
> Any comments from people having a similar problem will be useful. I will
> see how long this solution works and knock on your doors again
> if I need further help.
> Thanks for your help.
> Regards
> Nagesh
> 
> 
> On Wed, 2006-01-18 at 22:15 -0700, Barry Moore wrote:
> > Nagesh,
> >
> > That does sound odd.  What version of bioperl are you using?  I'm
> > guessing 1.4?  If the answer is anything but 1.5 something, then I
> > suggest you upgrade before going any further.  You will also
> > want to follow the current thread about parsing XML formatted
> > blast reports.  I don't think this is your problem right now, but
> > eventually you'll have a problem if you aren't parsing XML format as
> > discussed in that thread.  I've added some more detail below; if you are
> > having the problem with 1.5, try some debugging.
> >
> > Here's what's going on (or should be going on) in your script, and
> > some suggestions for using the debugger.
> >
> > #This next line hits the NCBI server, and if it gets a blast report
> > in return parses it, and returns a Bio::Tools::Blast object.  If
> > there was no report you get 0, and if there was an error you get -1.
> >
> >      my $rc = $factory->retrieve_blast($rid);
> >
> >      print "RC $rc\n";
> >
> > #This if statement is checking to see if the server has NOT returned
> > a report yet.  If it did then $rc should be an object and ref $rc
> > will return 'Bio::SearchIO::blast'.  If $rc is not an object (i.e.
> > you got no report) then ref $rc returns an empty string.
> >                  if( !ref($rc) )
> >                  {
> > #If you got here then you got no report from NCBI server yet, and so
> > the next check is whether you got -1, meaning there was an error.  On
> > error delete this RID because it's no good.
> >                          if( $rc < 0 )
> >                          {
> >                                  $factory->remove_rid($rid);
> >                          }
> > #Print a dot on the screen in lieu of music to keep the user
> > entertained while they wait.
> >                          print STDERR "." if ( $v > 0 );
> > #Take a nap so you don't piss off NCBI sys admin!
> >                          sleep 5;
> >      }
> > #Getting here means that $rc was an object, so we've got a report.
> > Go ahead and save it.
> >                  else
> >                  {
> >      sleep 600;
> > #Obviously writing your output file.
> >      $factory->save_output('temp.out');
> >      my $checkinput = $factory->file;
> >                      open(my $fh,"<$checkinput") or die $!;
> >                      while(<$fh>)
> > {
> >                               print;
> >                          }
> >                           close $fh;
> >      $factory->remove_rid($rid);
> >
> >
> > run your script in the debugger like this:
> >
> > perl -d your_script.pl
> >
> > Step forward one line at a time by typing 'n'.
> > When you get just past my $rc = $factory->retrieve_blast($rid); type
> > 'x $rc'
> > You should get 0, -1 or 'Bio::SearchIO::blast'
> > Keep stepping forward with 'n'.
> > If you get 0 you should loop back to retrieve_blast after a sleep.
> > If you get -1 you should end your script - you got an error (What was
> > it?)
> > If you get an Bio::SearchIO::blast object then you should be writing
> > a temp.out
> >
> > Barry
> >
> >
> > On Jan 18, 2006, at 6:37 PM, Nagesh wrote:
> >
> > > Thanks very much to all specially to Barry and Hubert for their
> > > time in
> > > answering my query. Some updates into my problem.
> > >
> > > I have performed some diagnostics tests and writing below my
> > > observations.
> > >
> > > First of all, the problem in the code was that it was not waiting for
> > > the results to be ready before writing them to the output file. So I
> > > wanted to check whether the condition "if( !ref($rc) )" is ever
> > > satisfied, and I printed out the $rc value, which was something like
> > > "Bio::SearchIO::blast=HASH(0x9010370)". According to the Bioperl
> > > documentation for RemoteBlast.pm, "$rc = $factory->retrieve_blast($rid);"
> > > should return either 0 or 1. I am not able to understand whether what
> > > I am getting is right.
> > >
> > > Secondly, I manually forced the script to wait between submit_blast,
> > > retrieve_blast and save_output by using sleep with values ranging from
> > > 30 to 600. None of them were successful in saving the output.
> > >
> > > When sleep (600) is between submit_blast and retrieve_blast, the
> > > following is printed to standard output (shown below is part of the
> > > output), with the output file still empty.
> > >
> > > Request ID: 1137626804-16566-100302560340.BLASTQ4
> > > Status: Searching
> > > Submitted at: Wed Jan 18 18:26:44 2006
> > > Current time: Wed Jan 18 18:36:46 2006
> > > Time since submission: 00:10:01
> > > This page will be automatically updated in 10 seconds until search is done
> > >
> > > When sleep (600) is between retrieve_blast and save_output, the
> > > following is printed with nothing written to output file.
> > >
> > > Request ID: 1137632221-28820-85178967709.BLASTQ1
> > > Status: Searching
> > > Submitted at: Wed Jan 18 19:57:01 2006
> > > Current time: Wed Jan 18 19:57:03 2006
> > > Time since submission: 00:00:01
> > > This page will be automatically updated in 10 seconds until search is done
> > >
> > > Please note the difference in time since submission.
> > >
> > > Lastly, I printed out the request ID and manually paused the script
> > > by using <STDIN> between submit_blast and retrieve_blast. The idea was
> > > to check the status of the job online through the NCBI website. When
> > > the results were ready, I let the script proceed and was able to save
> > > the desired results to the file. I am puzzled by this observation, as
> > > I do not understand why manually formatting the results online helps
> > > in getting the results.
> > > I am basically a molecular biologist trying hard to solve this
> > > computational stuff, so these might be trivial issues to you
> > > computer wizards :)
> > >
> > > Barry suggested that I use the perl debugger, which I will try.
> > >
> > > Thanks for your attention.
> > >
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
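
The size threshold discussed in this message (1000 versus 2017 bytes) is
fragile because the status page itself can grow. One possible alternative
is to look at what the downloaded file says rather than how big it is. The
sketch below is only an illustration under stated assumptions: the helper
looks_like_status_page is hypothetical, it is not how RemoteBlast.pm
decides, and the "Status Searching" marker is taken from the page text
pasted earlier in this thread, so it may not match what the server sends
today.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return 1 if the file looks like a "still searching" status page.
# The marker text is an assumption based on the output quoted in the thread.
sub looks_like_status_page {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                      # slurp the whole file
    my $content = <$fh>;
    close $fh;
    return $content =~ /Status\W+Searching/i ? 1 : 0;
}

# Build a fake status page that is larger than 1000 bytes.
my ($out, $tmp) = tempfile(UNLINK => 1);
print {$out} "Request ID 123\nStatus Searching\n", 'x' x 2000, "\n";
close $out;

my $size    = -s $tmp;
my $waiting = looks_like_status_page($tmp);
print "size=$size waiting=$waiting\n";
```

The fake page passes a "size > 1000" test even though it is only a status
page, which is the situation described above; the content check still
reports it as waiting.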

From akarger at CGR.Harvard.edu  Thu Jan 19 12:15:41 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Thu Jan 19 12:26:46 2006
Subject: [Bioperl-l] Calculating a bunch of SNPs
Message-ID: <339D68B133EAD311971E009027DC479703DB438B@montecarlo.cgr.harvard.edu>

I have 96 files. The first is a reference sequence. The other 95 are
sequences from different genotypes, with minor SNPs compared to the first
one. I want to generate a list of all the SNPs for each sequence compared to
the reference sequence. Output format doesn't really matter.

I was told I could run EMBOSS diffseq on each of the 95 pairs, and parse the
output to get my list. I'm wondering if there's a Bioperl tool that will do
what diffseq does, though - presumably outputting Bio::Align objects of some
kind, or is it Bio::Variation? - rather than parsing 95*N output files.

Thanks,

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
617-496-0626
From cjfields at uiuc.edu  Thu Jan 19 12:53:01 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu Jan 19 12:49:48 2006
Subject: [Bioperl-l] XML output from RemoteBlast
In-Reply-To: 
Message-ID: <001f01c61d21$31086d80$15327e82@pyrimidine>

Jason, 

Nope.  No go.  I thought Nagesh may have found the problem with the $size
parameter (maybe the XML-formatted output was > 1000), but there is no $size
variable now.  RemoteBlast.pm was changed ~fall 2005 (by you, I believe) to
fix bug 1864 (http://bugzilla.bioperl.org/show_bug.cgi?id=1864), so is
post-1.5.1.  I'm using a recent PPM build of bioperl-live.  As reported
before, it worked up until very recently (within the last week), but I was
parsing text output and using '-readmethod'=>'SearchIO' or 'blast' in the
parameters list.  

My script uses a local sequence file (FASTA) in a BLASTP search against
'nr'.  When FORMAT_TYPE was set to 'Text' format using SearchIO for
readmethod, everything works fine and I get saved output; switching to
'readmethod'=>'xml' and FORMAT_TYPE to XML, gives a blank file.  The
-verbose switch is on, so I can switch FORMAT_TYPE to any of the accepted
parameter settings (HTML, Text, ASN.1, XML) and I see the corresponding
output style sent to stdout along with the warnings from the NCBI queue.
However, nothing besides text output will save, suggesting something with
retrieve_blast() in RemoteBlast.pm.  Strangely, the file name, derived from
query_name, does not pick up the query name sent, but a chunk of the RID!
BTW, it only does this with XML output; the query_name from text output is
as expected.  Changing $filename to temp.blastp (commented out below)
doesn't do the trick; it's still an empty file.  I have also tried an older
version of this script on Mac OS X and had similar problems with XML output,
but text output saves fine, so I don't think this is the OS.

Here's the saved file names (using XML output) and their RID's (no point in
sending the file contents, they were all blank). These were all using the
same query sequence; I noticed that the file names were different each time
and thought of the RID.

1_20910.blastp
  ^^^^^
1137691949-20910-102543092805.BLASTQ4
           ^^^^^

1_25245.blastp
  ^^^^^
1137692051-25245-128580015999.BLASTQ1		
           ^^^^^

1_21057.blastp
  ^^^^^
1137692263-21057-148127371984.BLASTQ4
           ^^^^^

Is the RID jamming up the works somehow?
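
The correspondence between the file names and the RIDs listed above can be
checked mechanically. This is only a small sketch over the three pairs
reported here; the three-field RID layout assumed by the regex is read off
these examples, not taken from any NCBI documentation, and rid_middle is a
made-up helper name.

```perl
use strict;
use warnings;

# The three file name / RID pairs reported above.
my %rid_for = (
    '1_20910.blastp' => '1137691949-20910-102543092805.BLASTQ4',
    '1_25245.blastp' => '1137692051-25245-128580015999.BLASTQ1',
    '1_21057.blastp' => '1137692263-21057-148127371984.BLASTQ4',
);

# Hypothetical helper: return the middle dash-separated field of an RID,
# or undef if the RID does not have the layout seen in these examples.
sub rid_middle {
    my ($rid) = @_;
    return $rid =~ /^\d+-(\d+)-\d+\./ ? $1 : undef;
}

for my $file (sort keys %rid_for) {
    # Digits between "1_" and ".blastp" in the saved file name.
    my ($chunk) = $file =~ /^1_(\d+)\.blastp$/;
    my $middle  = rid_middle($rid_for{$file});
    printf "%s: file chunk %s, RID field %s -> %s\n",
        $file, $chunk, $middle, ($chunk eq $middle ? 'match' : 'differ');
}
```

If all three pairs match, query_name is most likely being filled from the
RID line of the report rather than from the FASTA header.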

Following is the script (sorry if it's a bit clunky)
____________________________________________________________________________
#!perl

use strict;
use Bio::Tools::Run::RemoteBlast;

# $v is just to turn on and off the messages
my $v = 1;

# changing or modifying parameters for blast search
my $prog = 'blastp';
my $db = 'nr';
my $e_val = '0.1';
my @params = (
		'-verbose' => $v,
		'-prog' => $prog,
		'-data' => $db,
		'-expect' => $e_val,
		'-readmethod' => 'xml'
		); 

# remove filter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

# change cgi parameters for blast results
# DESCRIPTIONS and ALIGNMENTS need to be changed in both the HEADER 
# and RETRIEVALHEADER hashes
$Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';

# init new BLAST factory
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

print "Starting blast search ...\n";
# submit blast query
my $r = $factory->submit_blast('m_smeg_pyrR.txt');
print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid ) {
	foreach my $rid ( @rids ) {
		my $rc = $factory->retrieve_blast($rid);
		# if RID is not present
		if( !ref($rc) ) {
			# remove if RID is bad (error)
			if( $rc < 0 ) {
				$factory->remove_rid($rid);
			}
			# otherwise, query is still in progress; continue the
			# loop, printing output if requested
			print STDERR "." if ( $v > 0 ); 
			sleep 2; 
		} else { # RID is returned
			# save the output
			print $rid;
			my $result = $rc->next_result();
			my $filename= $result->query_name.".blastp";
			#my $filename= "temp.blastp";
			$factory->save_output($filename);
			# remove RID from list
			$factory->remove_rid($rid);
		}
	}
}
____________________________________________________________________________
I may switch to the blast client from NCBI for now, but I would like to keep
RemoteBlast.pm going somehow unless it's completely unfeasible.  I'm still
a bit green when it comes to object-oriented programming (I am primarily a
molecular biologist with programming experience) and I'm still trying to
wrap my head around some bioperl objects and their methods (though I'm
catching on slowly).  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich@duke.edu]
> Sent: Wednesday, January 18, 2006 10:23 PM
> To: Chris Fields
> Subject: Re: [Bioperl-l] XML output from RemoteBlast
> 
> This doesn't work for you?
> http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
> On Jan 18, 2006, at 11:04 PM, Chris Fields wrote:
> 
> > Is there any known way to save XML-formatted BLAST queries from
> > RemoteBlast?  Changing the FORMAT_TYPE in the retrieval header to
> > anything other than 'Text' gives a blank output file.
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12

From osborne1 at optonline.net  Thu Jan 19 12:57:17 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Thu Jan 19 12:59:54 2006
Subject: [Bioperl-l] Re: volunteers needed
In-Reply-To: <20060119054758.91641.qmail@web32406.mail.mud.yahoo.com>
Message-ID: 

Supratim,

In a week or so Jason Stajich will be releasing a new Bioperl documentation
site, it will have a nice page detailing a number of different projects that
need volunteers. While you wait think about what areas in bioinformatics
_you_ want to work on, it's probably the case that you'll do the best work
on those topics that interest you personally.

Brian O.

On 1/19/06 12:47 AM, "supratim mukherjee"  wrote:

> Respected Sir/Madam,
> 
> I am a student from Bangalore University, India and
> have just finished my masters in Biotechnology. I have
> applied for PhD in a few universities in USA and am
> awaiting the admission decision.
> 
> I would like to contribute to bioperl.org. Please let
> me know if any of my contributions would help.
> 
> Regards
> Supratim
> 

From andyn108 at gmail.com  Thu Jan 19 13:28:36 2006
From: andyn108 at gmail.com (Andy Nunberg)
Date: Thu Jan 19 13:51:50 2006
Subject: [Bioperl-l] problem with Primer3: too many files open
Message-ID: <2ae1a0fe0601191028i718efe15yc76185b5a64cda99@mail.gmail.com>

Hi, I am using bioperl-1.4 running Primer3 to select a bunch of primers.
While running the script, I get an exception at the same point with
the following error:

------------- EXCEPTION  -------------
MSG: Can't open RESULTS:Too many open files
STACK Bio::Tools::Run::Primer3::run
/compbio/pkg/bio-perl/bioperl-run/Bio/Tools/Run/Primer3.pm:361
STACK (eval) find_primers_first_pass.pl:183
STACK main::_primer3 find_primers_first_pass.pl:182
STACK main::get_primer find_primers_first_pass.pl:141
STACK toplevel find_primers_first_pass.pl:86

--------------------------------------

Now if I take this sequence out of the list and run the script, it
runs just fine.

here is the subroutine calling primer3:
sub _primer3{
    my($seq,$qual_region)=@_;
    my $primer3=Bio::Tools::Run::Primer3->new(-seq=>$seq,-verbose=>0,-flush=>1);
    my @qual = @{$seq->qual};
    #set the start of the search window for primer3
    my $primer3_start=1;
    if($seq->length > ($window+100)){
    $primer3_start=$qual_region->end-($window+100);
    }
    #set up primer3
    $primer3->add_targets('INCLUDED_REGION'=>"$primer3_start,$window");
    $primer3->add_targets('PRIMER_FIRST_BASE_INDEX'=>1,
    'PRIMER_TASK'=>'pick_left_only');
    $primer3->add_targets('PRIMER_SEQUENCE_QUALITY'=>"@qual");
    $primer3->add_targets('PRIMER_MIN_QUALITY'=>$minqual,
    'PRIMER_NUM_RETURN'=>1,
    'PRIMER_MAX_POLY_X'=>3);
    $primer3->add_targets('PRIMER_GC_CLAMP'=>1) unless($no_gc_clamp);

    #run primer3
    my $prim3_results;
    eval {
    $prim3_results=$primer3->run;
    };
    die $seq->id." :$@" if ($@);

    #fetch result for the first primer
    my $hash_ref=$prim3_results->primer_results(0);
    return $hash_ref;

}

any suggestions? any thoughts on why I am getting the error to begin with?
thanks

From avilella at ub.edu  Thu Jan 19 13:31:01 2006
From: avilella at ub.edu (Albert Vilella)
Date: Thu Jan 19 13:54:52 2006
Subject: [Bioperl-l] Calculating a bunch of SNPs
In-Reply-To: <339D68B133EAD311971E009027DC479703DB438B@montecarlo.cgr.harvard.edu>
References: <339D68B133EAD311971E009027DC479703DB438B@montecarlo.cgr.harvard.edu>
Message-ID: <1137695462.9170.6.camel@localhost.localdomain>

On Thu, 19 Jan 2006 at 12:15 -0500, Amir Karger wrote:
> I have 96 files. The first is a reference sequence. The other 95 are
> sequences from different genotypes, with minor SNPs compared to the first
> one. I want to generate a list of all the SNPs for each sequence compared to
> the reference sequence. Output format doesn't really matter.

Dear Amir,

If the sequences are simply instances of genotypes/haplotypes, so that
each position already correlates in all 96 sequences, then one
possibility would be to simply create a Bio::SimpleAlign object by adding
each of them with add_seq().

Once you have your alignment, you can get the marker information with
the aln_to_population method of Bio::PopGen::Utilities.

 Usage   : my $pop = Bio::PopGen::Utilities->aln_to_population($aln);
 Function: Turn an alignment into a set of Bio::PopGen::Individual
           objects grouped in a Bio::PopGen::Population object

You will see some example output files in t/data/.
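
For a rough idea of the per-position comparison involved, here is a minimal
plain-Perl sketch. It is not how aln_to_population works internally; it
assumes the sequences are already aligned to the same length, and the
sequences and the helper name list_snps are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compare one aligned sequence against the reference
# and return [position, ref_base, alt_base] for every mismatch.
# Positions are 1-based; both strings must have the same length.
sub list_snps {
    my ($ref, $seq) = @_;
    die "sequences differ in length\n" unless length($ref) == length($seq);
    my @snps;
    for my $i (0 .. length($ref) - 1) {
        my ($r, $s) = (substr($ref, $i, 1), substr($seq, $i, 1));
        push @snps, [ $i + 1, $r, $s ] if uc($r) ne uc($s);
    }
    return @snps;
}

# Made-up data standing in for the reference and the genotype files.
my $reference = 'ACGTACGTAC';
my %genotypes = (
    sample1 => 'ACGTACGTAC',
    sample2 => 'ACGAACGTAC',
    sample3 => 'TCGTACGTAG',
);

# One line per SNP: sample name, position, reference>alternative.
for my $name (sort keys %genotypes) {
    for my $snp (list_snps($reference, $genotypes{$name})) {
        printf "%s\t%d\t%s>%s\n", $name, @$snp;
    }
}
```

If the sequences contain indels relative to the reference, they need a real
alignment first, and this positional comparison no longer applies as is.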

There may be other (better or different) ways to do what you need with
Bioperl,

    Albert.

> I was told I could run EMBOSS diffseq on each of the 95 pairs, and parse the
> output to get my list. I'm wondering if there's a Bioperl tool that will do
> what diffseq does, though - presumably outputting Bio::Align objects of some
> kind, or is it Bio::Variation? - rather than parsing 95*N output files.

From cjfields at uiuc.edu  Thu Jan 19 15:55:23 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu Jan 19 15:52:37 2006
Subject: [Bioperl-l] XML output from RemoteBlast
In-Reply-To: 
Message-ID: <000101c61d3a$aae30070$15327e82@pyrimidine>

...and I tried an XML-formatted BLASTP file (from blastcl3 output) to test
SearchIO directly; it's not SearchIO or blastxml.  They parsed accessions,
hits, etc very well.  So at least I can use a system call to blastcl3 with
parameters as a workaround for now.

I'm pretty sure it is the retrieve_blast() or save_output() method in
RemoteBlast.pm.  I'm busy trying to finish up a write-up for bioperl-db
(among the experiments going on in the lab), but I'll try to figure it out.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich@duke.edu]
> Sent: Wednesday, January 18, 2006 10:23 PM
> To: Chris Fields
> Subject: Re: [Bioperl-l] XML output from RemoteBlast
> 
> This doesn't work for you?
> http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
> On Jan 18, 2006, at 11:04 PM, Chris Fields wrote:
> 
> > Is there any known way to save XML-formatted BLAST queries from
> > RemoteBlast?  Changing the FORMAT_TYPE in the retrieval header to
> > anything other than 'Text' gives a blank output file.
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12

From osborne1 at optonline.net  Thu Jan 19 16:06:29 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Thu Jan 19 16:08:05 2006
Subject: [Bioperl-l] problem with Primer3: too many files open
In-Reply-To: <2ae1a0fe0601191028i718efe15yc76185b5a64cda99@mail.gmail.com>
Message-ID: 

Andy,

I believe this is fixed in 1.5.1. The newer version should look like this
around line 370 (bioperl-run/Bio/Tools/Run/Primer3.pm):

my ($temphandle, $tempfile)=$self->io->tempfile;
print $temphandle join "\n", @{$self->{'primer3_input'}}, "=\n";
$temphandle->close;
open (RESULTS, "$executable < $tempfile|")
    || $self->throw("Can't open RESULTS");

Do you see the line with close in your file? If not, either add this line or
upgrade to 1.5.1.
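
To see why the missing close matters: without it, every run() leaves one
temp-file handle open, and after enough sequences the process reaches its
open-file limit. The following is a minimal sketch of the write-then-close
pattern using core File::Temp. It is not the Primer3.pm code itself, and
write_input is a made-up name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write the input records to a temp file and close the
# handle right away, so repeated calls do not accumulate open descriptors.
sub write_input {
    my (@lines) = @_;
    my ($fh, $filename) = tempfile();
    print {$fh} join("\n", @lines, "=\n");
    close($fh) or die "Can't close $filename: $!";
    return $filename;
}

# Each handle is closed before the next one is opened, so the loop can run
# far more times than the per-process open-file limit would otherwise allow.
my $count = 0;
for my $n (1 .. 2000) {
    my $file = write_input("PRIMER_SEQUENCE_ID=seq$n", "SEQUENCE=ACGT");
    $count++ if -s $file;       # file exists and is non-empty
    unlink $file;               # clean up the temp file
}
print "$count temp files written and removed\n";
```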

Brian O.

On 1/19/06 1:28 PM, "Andy Nunberg"  wrote:

> Hi, I am using bioperl-1.4 running Primer3 to select a bunch of primers.
> While running the script, I get an exception at the same point with
> the following error:
> 
> ------------- EXCEPTION  -------------
> MSG: Can't open RESULTS:Too many open files
> STACK Bio::Tools::Run::Primer3::run
> /compbio/pkg/bio-perl/bioperl-run/Bio/Tools/Run/Primer3.pm:361
> STACK (eval) find_primers_first_pass.pl:183
> STACK main::_primer3 find_primers_first_pass.pl:182
> STACK main::get_primer find_primers_first_pass.pl:141
> STACK toplevel find_primers_first_pass.pl:86
> 
> --------------------------------------
> 
> Now if I take this sequence out of the list and run the script, it
> runs just fine.
> 
> here is the subroutine calling primer3:
> sub _primer3{
>     my($seq,$qual_region)=@_;
>     my $primer3=Bio::Tools::Run::Primer3->new(-seq=>$seq,-verbose=>0,-flush=>1);
>     my @qual = @{$seq->qual};
>     #set the start of the search window for primer3
>     my $primer3_start=1;
>     if($seq->length > ($window+100)){
>     $primer3_start=$qual_region->end-($window+100);
>     }
>     #set up primer3
>     $primer3->add_targets('INCLUDED_REGION'=>"$primer3_start,$window");
>     $primer3->add_targets('PRIMER_FIRST_BASE_INDEX'=>1,
>     'PRIMER_TASK'=>'pick_left_only');
>     $primer3->add_targets('PRIMER_SEQUENCE_QUALITY'=>"@qual");
>     $primer3->add_targets('PRIMER_MIN_QUALITY'=>$minqual,
>     'PRIMER_NUM_RETURN'=>1,
>     'PRIMER_MAX_POLY_X'=>3);
>     $primer3->add_targets('PRIMER_GC_CLAMP'=>1) unless($no_gc_clamp);
> 
>     #run primer3
>     my $prim3_results;
>     eval {
>     $prim3_results=$primer3->run;
>     };
>     die $seq->id." :$@" if ($@);
> 
>     #fetch result for the first primer
>     my $hash_ref=$prim3_results->primer_results(0);
>     return $hash_ref;
> 
> }
> 
> any suggestions? any thoughts on why I am getting the error to begin with?
> thanks
> 

From hlapp at gmx.net  Thu Jan 19 18:11:22 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu Jan 19 18:14:33 2006
Subject: [Bioperl-l] search2gff
In-Reply-To: 
References: 
Message-ID: 

I added a couple of capabilities to the scripts/utilities/search2gff
script written by Jason. In a nutshell, there are now options for
controlling the score, location, and method of the HSP-representing
feature, as well as options for printing of parent, which parent, and
whether to skip all except the first HSP for each hit.

As for possible applications, for example using these options you can
blast SNP assay primers and use the options to create SNP features for
a single basepair at the end of the primer, ready to be piped to a
GBrowse GFF3 loader.

I tried to preserve the original functionality in its entirety, i.e.,
if you don't use any of the new options the script should work as
before. If not please let me know.

POD is attached.

   -hilmar
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

SYNOPSIS
    Usage: search2gff [-o outputfile] [-f reportformat] [-i inputfilename]
    OR file1 file2 ..

DESCRIPTION
    This script will turn a protein Search report (BLASTP, FASTP, SSEARCH,
    AXT, WABA) into a GFF File.

    The options are:

       -i infilename      - (optional) inputfilename, will read
                            either ARGV files or from STDIN
       -o filename        - the output filename [default STDOUT]
       -f format          - search result format (blast, fasta,waba,axt)
                            (ssearch is fasta format). default is blast.
       -t/--type seqtype  - if you want to see query or hit information
                            in the GFF report
       -s/--source        - specify the source (will be algorithm name
                            otherwise like BLASTN)
       --method           - the method tag (primary_tag) of the features
                            (default is similarity)
       --scorefunc        - a string or a file that when parsed evaluates
                            to a closure which will be passed a feature
                            object and that returns the score to be printed
       --locfunc          - a string or a file that when parsed evaluates
                            to a closure which will be passed two
                            features, query and hit, and returns the
                            location (Bio::LocationI compliant) for the
                            GFF3 feature created for each HSP; the closure
                            may use the clone_loc() and create_loc()
                            functions for convenience, see their PODs
       --onehsp           - only print the first HSP feature for each hit
       -p/--parent        - the parent to which HSP features should refer
                            if not the name of the hit or query (depending
                            on --type)
       --target/--notarget - whether to always add the Target tag or not
       -h                 - this help menu
       --version          - GFF version to use (put a 3 here to use gff 3)
       --component        - generate GFF component fields (chromosome)
       -m/--match         - generate a 'match' line which is a container
                            of all the similarity HSPs
       --addid            - add ID tag in the absence of --match
       -c/--cutoff        - specify an evalue cutoff

    Additionally specify the filenames you want to process on the
    command-line. If no files are specified then STDIN input is assumed. You
    specify this by doing: search2gff < file1 file2 file3

AUTHOR
    Jason Stajich, jason-at-bioperl-dot-org

Contributors
    Hilmar Lapp, hlapp-at-gmx-dot-net

  clone_loc
     Title   : clone_loc
     Usage   : my $l = clone_loc($feature->location);
     Function: Helper function to simplify the task of cloning locations
               for --locfunc closures.

               Presently simply implemented using Storable::dclone().
     Example :
     Returns : A Bio::LocationI object of the same type and with the
               same properties as the argument, but physically different.
               All structured properties will be cloned as well.
     Args    : A Bio::LocationI compliant object

  create_loc
     Title   : create_loc
     Usage   : my $l = create_loc("10..12");
     Function: Helper function to simplify the task of creating locations
               for --locfunc closures. Creates a location from a feature-
               table formatted string.

     Example :
     Returns : A Bio::LocationI object representing the location given
               as formatted string.
     Args    : A GenBank feature-table formatted string.
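
For readers unfamiliar with "a string that when parsed evaluates to a
closure", here is a minimal plain-Perl sketch of the idea behind
--scorefunc. It assumes the string is compiled with a string eval, which
has not been checked against the search2gff source, and the feature is a
plain hash standing in for a real feature object.

```perl
use strict;
use warnings;

# The text a user might supply: a string of Perl that evaluates to an
# anonymous sub taking one feature and returning a score.
my $scorefunc_src = 'sub { my $feat = shift; return $feat->{bits} / 10; }';

# Compile the string; eval returns the code reference, or undef on error.
my $scorefunc = eval $scorefunc_src;
die "could not compile score function: $@" unless ref($scorefunc) eq 'CODE';

# Stand-in for a feature object (hypothetical; a real one would be a
# BioPerl feature accessed through method calls, not hash keys).
my %feature = (bits => 250, evalue => 1e-30);

print "score: ", $scorefunc->(\%feature), "\n";
```

The same pattern would apply to --locfunc, with two arguments (query and
hit feature) and a location object as the return value.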

From hlapp at gmx.net  Thu Jan 19 18:06:57 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu Jan 19 20:24:31 2006
Subject: [Bioperl-l] search2gff
Message-ID: 

I added a couple of capabilities to the scripts/utilities/search2gff
script written by Jason. In a nutshell, there are now options for
controlling the score, location, and method of the HSP-representing
feature, as well as options for printing of parent, which parent, and
whether to skip all except the first HSP for each hit.

As for possible applications, for example using these options you can
blast SNP assay primers and use the options to create SNP features for
a single basepair at the end of the primer, ready to be piped to a
GBrowse GFF3 loader.

I tried to preserve the original functionality in its entirety, i.e.,
if you don't use any of the new options the script should work as
before. If not please let me know.

POD is attached.

   -hilmar
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------
-------------- next part --------------
SYNOPSIS
    Usage: search2gff [-o outputfile] [-f reportformat] [-i inputfilename]
    OR file1 file2 ..

DESCRIPTION
    This script will turn a protein Search report (BLASTP, FASTP, SSEARCH,
    AXT, WABA) into a GFF File.

    The options are:

       -i infilename      - (optional) inputfilename, will read
                            either ARGV files or from STDIN
       -o filename        - the output filename [default STDOUT]
       -f format          - search result format (blast, fasta,waba,axt)
                            (ssearch is fasta format). default is blast.
       -t/--type seqtype  - if you want to see query or hit information
                            in the GFF report
       -s/--source        - specify the source (will be algorithm name
                            otherwise like BLASTN)
       --method           - the method tag (primary_tag) of the features
                            (default is similarity)
       --scorefunc        - a string or a file that when parsed evaluates
                            to a closure which will be passed a feature
                            object and that returns the score to be printed
       --locfunc          - a string or a file that when parsed evaluates
                            to a closure which will be passed two
                            features, query and hit, and returns the
                            location (Bio::LocationI compliant) for the
                            GFF3 feature created for each HSP; the closure
                            may use the clone_loc() and create_loc()
                            functions for convenience, see their PODs
       --onehsp           - only print the first HSP feature for each hit
       -p/--parent        - the parent to which HSP features should refer
                            if not the name of the hit or query (depending
                            on --type)
       --target/--notarget - whether to always add the Target tag or not
       -h                 - this help menu
       --version          - GFF version to use (put a 3 here to use gff 3)
       --component        - generate GFF component fields (chromosome)
       -m/--match         - generate a 'match' line which is a container
                            of all the similarity HSPs
       --addid            - add ID tag in the absence of --match
       -c/--cutoff        - specify an evalue cutoff

    Additionally specify the filenames you want to process on the
    command-line. If no files are specified then STDIN input is assumed. You
    specify this by doing: search2gff < file1 file2 file3

AUTHOR
    Jason Stajich, jason-at-bioperl-dot-org

Contributors
    Hilmar Lapp, hlapp-at-gmx-dot-net

  clone_loc
     Title   : clone_loc
     Usage   : my $l = clone_loc($feature->location);
     Function: Helper function to simplify the task of cloning locations
               for --locfunc closures.

               Presently simply implemented using Storable::dclone().
     Example :
     Returns : A Bio::LocationI object of the same type and with the
               same properties as the argument, but physically different.
               All structured properties will be cloned as well.
     Args    : A Bio::LocationI compliant object

  create_loc
     Title   : create_loc
     Usage   : my $l = create_loc("10..12");
     Function: Helper function to simplify the task of creating locations
               for --locfunc closures. Creates a location from a feature-
               table formatted string.

     Example :
     Returns : A Bio::LocationI object representing the location given
               as formatted string.
     Args    : A GenBank feature-table formatted string.

From christoph.gille at charite.de  Thu Jan 19 18:36:53 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Thu Jan 19 22:12:06 2006
Subject: [Bioperl-l] bioperl
Message-ID: <64335.84.190.29.176.1137713813.squirrel@webmail.charite.de>

Hi Torsten,

perhaps Sopma is not the best choice as a test case for bringing Perl and
Java together. It is not a convincing example, because people would ask why
not contact the server directly from Java, and why take the risk of a Perl
installation.

I want to demonstrate that BioPerl programs can well work together with
STRAP/Biojava with the wrapper I am  just developing but I need a suitable
example program.

What I have in mind is a sophisticated non-interactive Bioperl program that
performs some kind of useful computation on a protein sequence, or an
alignment or a protein 3D structure.

Do you know of something appropriate?

It does not matter if the program is complex or contains C/C++ as long as
it can be automatically installed without user interaction.

Many thanks

Christoph

From hlapp at gmx.net  Thu Jan 19 17:53:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu Jan 19 23:20:02 2006
Subject: [Bioperl-l] OntologyTerm::as_text
Message-ID: 

I changed the as_text() method of Bio::Annotation::OntologyTerm to not
use the identifier of the term anymore but instead append the
is_obsolete property.

The reason is that a term's identifier is optional (though strongly
recommended), and two terms (and therefore annotations) with the same
name and ontology but one with identifier and the other without are
considered equal annotations nonetheless.

Please let me know if this creates a problem for anybody. I also had
to fix a test in t/Annotation.t that assumed the identifier to be
included in as_text.

   -hilmar

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From angshu96 at gmail.com  Fri Jan 20 13:57:05 2006
From: angshu96 at gmail.com (Angshu Kar)
Date: Fri Jan 20 15:34:33 2006
Subject: [Bioperl-l] 1 small help with WU-BLAST -postsw
Message-ID: 

Hi,

I'm using WU-BLASTP in yeast data (all vs all).

Then I'm using :

$hit_object->frac_aligned_query() > 0.5
and
$hit_object->frac_aligned_hit() > 0.5

as filter conditions.

In that I'm getting asymmetric results! I mean I've sequences A,B in
my o/p and not B,A. Has it got something to do with the asymmetry of
BLAST (but I thought -postsw takes care of that)?

Please help.

Thanks,
Angshu

--
Ignore the impossible but honor it ...
The only enviable second position is success, since failure always
comes first...

From jason.stajich at duke.edu  Fri Jan 20 16:02:22 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri Jan 20 17:16:55 2006
Subject: [Bioperl-l] 1 small help with WU-BLAST -postsw
In-Reply-To: 
References: 
Message-ID: 

Well the hit and query probably are not the same length which is the  
denominator in that fraction ....

On Jan 20, 2006, at 1:57 PM, Angshu Kar wrote:

> Hi,
>
> I'm using WU-BLASTP in yeast data (all vs all).
>
> Then I'm using :
>
> $hit_object->frac_aligned_query() > 0.5
> and
> $hit_object->frac_aligned_hit() > 0.5
>
> as filter conditions.
>
> In that I'm getting asymmetric results! I mean I've sequences A,B in
> my o/p and not B,A. Has it got something to do with the asymmetry of
> BLAST (but I thought -postsw takes care of that)?
>
> Please help.
>
> Thanks,
> Angshu
>
>
>
>
>
>
>
> --
> Ignore the impossible but honor it ...
> The only enviable second position is success, since failure always
> comes first...
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From angshu96 at gmail.com  Fri Jan 20 16:34:07 2006
From: angshu96 at gmail.com (Angshu Kar)
Date: Fri Jan 20 20:18:04 2006
Subject: [Bioperl-l] 1 small help with WU-BLAST -postsw
In-Reply-To: 
References: 

Message-ID: 

But I'm doing an AND. So if I use alignment/hit > 0.5 and
alignment/query > 0.5 as a filter, will the length of the denominator
matter?

On 1/20/06, Jason Stajich  wrote:
> Well the hit and query probably are not the same length which is the
> denominator in that fraction ....
>
> On Jan 20, 2006, at 1:57 PM, Angshu Kar wrote:
>
> > Hi,
> >
> > I'm using WU-BLASTP in yeast data (all vs all).
> >
> > Then I'm using :
> >
> > $hit_object->frac_aligned_query() > 0.5
> > and
> > $hit_object->frac_aligned_hit() > 0.5
> >
> > as filter conditions.
> >
> > In that I'm getting asymmetric results! I mean I've sequences A,B in
> > my o/p and not B,A. Has it got something to do with the asymmetry of
> > BLAST (but I thought -postsw takes care of that)?
> >
> > Please help.
> >
> > Thanks,
> > Angshu
> >
> >
> >
> >
> >
> >
> >
> > --
> > Ignore the impossible but honor it ...
> > The only enviable second position is success, since failure always
> > comes first...
> >
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
>
>

--
Ignore the impossible but honor it ...
The only enviable second position is success, since failure always
comes first...

From jason.stajich at duke.edu  Sat Jan 21 10:38:23 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat Jan 21 10:36:11 2006
Subject: [Bioperl-l] 1 small help with WU-BLAST -postsw
In-Reply-To: 
References: 

Message-ID: <1F723DFB-C817-411C-9E19-7183C8E3DF91@duke.edu>

If you are really using Hit->frac_aligned, I wouldn't rely on the Hit
object's frac_aligned; deal with the HSPs and calculate what you want.
If there are multiple HSPs after a postsw, some of those could be
overlapping, alternative sub-optimal alignments, and I don't know what
the Hit object algorithm does when it tries to merge these examples.

My best advice is
  - find concrete examples of A,B and B,A pairs that doesn't meet  
your filter
  - print out the HSP information and LOOK at the HSPs and calculate  
the expected numbers by hand to figure out what is going on
  - there are going to be comparisons where it is ambiguous what  
would you do BY HAND for these cases, ignore them because they are  
the 5th best hit? take the longest HSP?   try and
    calculate overall coverage?
  - calculate frac aligned per HSP, decide what you want to do when  
there are multiple HSPs, take the longest, attempt to figure out some  
overall coverage, it all depends on your
  question if you are trying to find a single number
  - $hit_HSP_frac_aligned = $hsp->hit->length / $hit->length
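
The per-HSP calculation and one possible "overall coverage" can be sketched in plain Perl. This is only an illustration under stated assumptions: the coordinates and the hit length below are made up, and the helper merged_coverage is a hypothetical name, not part of BioPerl. In a real script the numbers would come from $hsp->start('hit'), $hsp->end('hit') and $hit->length.

```perl
use strict;
use warnings;

# Hypothetical HSP coordinates on the hit: [start, end], 1-based, inclusive.
my @hsp_ranges = ( [ 10, 120 ], [ 100, 180 ], [ 300, 350 ] );
my $hit_length = 400;    # assumed full length of the hit sequence

# Fraction of the hit covered by each HSP on its own.
my @per_hsp = map { ( $_->[1] - $_->[0] + 1 ) / $hit_length } @hsp_ranges;

# Merge overlapping ranges so that no residue is counted twice,
# then return the covered fraction of the sequence.
sub merged_coverage {
    my ( $length, @ranges ) = @_;
    my ( $covered, $cur_start, $cur_end ) = (0);
    for my $r ( sort { $a->[0] <=> $b->[0] } @ranges ) {
        if ( defined $cur_end && $r->[0] <= $cur_end + 1 ) {
            $cur_end = $r->[1] if $r->[1] > $cur_end;    # extend current block
        }
        else {
            $covered += $cur_end - $cur_start + 1 if defined $cur_end;
            ( $cur_start, $cur_end ) = @$r;              # start a new block
        }
    }
    $covered += $cur_end - $cur_start + 1 if defined $cur_end;
    return $covered / $length;
}

my $overall = merged_coverage( $hit_length, @hsp_ranges );

printf "HSP %d: %.4f\n", $_ + 1, $per_hsp[$_] for 0 .. $#per_hsp;
printf "merged coverage: %.4f\n", $overall;
```

With these made-up numbers the first two HSPs overlap, so simply adding the three per-HSP fractions would overstate the coverage. Running the same calculation for both A,B and B,A on the real HSPs is one way to see where an asymmetry comes from.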

Good luck.
-jason


--
Jason Stajich
Duke University
http://www.duke.edu/~jes12/

From anst at kvl.dk  Sat Jan 21 13:40:42 2006
From: anst at kvl.dk (Anders Stegmann)
Date: Sat Jan 21 14:05:56 2006
Subject: [Bioperl-l] wrong nomatch position from protein with singly deleted
	Aa
Message-ID: <43D28E3A0200009B0000069C@gwia.kvl.dk>

Hi BioPerl! 

I have an original protein sequence which I blastp (standalone) against the same sequence with amino acid no. 61 deleted manually.

The result is that the subject nomatch is amino acid E at position 60, which is definitely not a mismatch!
This also happens if I delete two amino acids, at positions 61 and 62, in the subject sequence.
Strangely enough, it does not happen if I delete a whole line (60 amino acids) in the subject sequence.

The result for the query nomatch is amino acid V at position 61, which is correct (the subroutine code is similar to the subject code shown below).

The code I use is the following:

sub subject_seq_alignment_nomatch_residues {

    my ($hsp_obj) = @_;    # a reference to the HSP object
    my %subject_nomatch_hash = ();
    my @new_subject_string   = ();

    # split the aligned hit string into single characters
    my @subject_string = split //, $$hsp_obj->hit_string;

    # keep only residues; gap characters ('-') are skipped
    foreach (@subject_string) {
        if ($_ ne '-') { push @new_subject_string, $_ }
    }

    my $start_subject_number = $$hsp_obj->start('hit');

    $start_subject_number = $start_subject_number - 1;

    foreach ($$hsp_obj->seq_inds('hit', 'nomatch')) {
        # take the position, then pull the corresponding amino acid
        # out of the subject sequence
        $subject_nomatch_hash{$_} = $new_subject_string[$_ - 1 - $start_subject_number];
    }

    return %subject_nomatch_hash;

}

It has nothing to do with the foreach (@subject_string) { code or the $start_subject_number (because it is 0 in this example). I checked!

How can this be? 

Regards Anders. 

From jason.stajich at duke.edu  Sat Jan 21 14:29:21 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat Jan 21 14:25:31 2006
Subject: [Bioperl-l] wrong nomatch position from protein with singly
	deleted Aa
In-Reply-To: <43D28E3A0200009B0000069C@gwia.kvl.dk>
References: <43D28E3A0200009B0000069C@gwia.kvl.dk>
Message-ID: 

I know you are trying to give an example, but this isn't really enough
for someone to help, as you are referring to a sequence alignment we
can't see.  I can't tell if this is a problem with translated blast
coordinates, the seq_inds code alone, or what.

Why don't you gather a sample report, the code you are using, and  
your expected result together in something that someone can run and  
submit it as a bug to bugzilla.  Then hopefully someone from the  
community will download and reproduce your problem for you and can  
tell whether the problem is in the module or elsewhere.

-jason


--
Jason Stajich
Duke University
http://www.duke.edu/~jes12/

From heikki at sanbi.ac.za  Mon Jan 23 02:54:02 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Mon Jan 23 02:50:10 2006
Subject: [Bioperl-l] BioPerl Physical Map modules, tests needed
Message-ID: <200601230954.02776.heikki@sanbi.ac.za>

Dear Gaurav,

I am going slowly through BioPerl modules that do not have any tests written 
for them. Unless there are tests in place, there is no way anyone can keep 
track of changes needed for the modules, and useful modules will become 
obsolete as BioPerl moves ahead.

You have contributed the following modules:

Bio::MapIO::fpc
Bio::Map::Clone
Bio::Map::Contig
Bio::Map::FPCMarker
Bio::Map::OrderedPositionWithDistance
Bio::Map::Physical

Would it be possible for you to write tests and add them, together with a 
small FPC sample file, to the repository? I gave it a shot 
(t/PhysicalMap.t), but had to stop short when I realised that 
Bio::MapIO::fpc reads in an FPC file and creates a Bio::Map::Physical by 
holding everything in an internal hash. None of the objects included in a 
physical map are instantiated before they are required. It is therefore very 
difficult for someone else to write meaningful tests without a good example 
FPC file.

Yours,
	-Heikki
-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________
From phil511 at 21cn.com  Mon Jan 23 11:41:23 2006
From: phil511 at 21cn.com (Phil-)
Date: Mon Jan 23 11:47:17 2006
Subject: [Bioperl-l] 2 questions about Bio::Tools::WebBlat
Message-ID: <200601231639.k0NGdKc29777@taurus.zsu.edu.cn>

Greetings, everyone!

	I'm an undergraduate at Sun Yat-Sen University in China. I am currently using BioPerl to map some of my cDNA sequences onto human chromosomes. I used Bio::Tools::WebBlat, but something goes wrong. Here is my code:

      BEGIN{$ENV{HTTP_PROXY}='http://202.116.64.1:8001/';}

      use Bio::Tools::WebBlat;
      use Bio::Seq;

      my $webblat = Bio::Tools::WebBlat->new();
      my $seq = Bio::Seq->new(-id => 'foo' , -seq => 'aataataat' );
      my $searchio = $webblat->create_searchio(sequence=>$seq);

      while(my $result = $searchio->next_result){
        sleep 1;
      }

	As you can see, I have to use a proxy, but I can't get any results with this code. I checked the result of the LWP::UserAgent->request call inside WebBlat.pm, and I think I found a mistake in the last line of code:

      $self->throw($ua->status_line);

   I think status_line is a method of HTTP::Response, not of LWP::UserAgent. I changed the code, and what I got was a '500 Internal Server Error' with the following text:

----------------------------
The page you requested resulted in a server problem on our systems.

We hate this type of error immensely and we're sure that you do as well. While we have logged it and rapidly pursue any problems on our systems,sometimes extra information from the user can pinpoint the cause of the problem for us and help us prevent it in the future. If you have information that you would like to provide about what led to the error, please email us at genome-www@soe.ucsc.edu.

If you are unable to access commonly-used features on our website, it is possible that you may need to reset your Genome Browser with the following URL: http://genome.ucsc.edu/cgi-bin/cartReset. This will replace your stored settings with the default configuration and will return your Browser to the state it was in when you first accessed it.

We apologize for the inconvenience. 
----------------------------

It seems that my script makes contact with the UCSC server but cannot get through to the result. What should I do? Thank you all!
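
On the status_line point above: in libwww-perl the status text belongs to the response object, so the error message has to be built from the value the user agent returns. The sketch below is not the WebBlat.pm code; it constructs an HTTP::Response by hand (an assumption made so that the example needs no network access) and shows the call on the response object. In a real script the response would be the return value of $ua->request(...).

```perl
use strict;
use warnings;
use HTTP::Response;

# Stand-in for the object that $ua->request(...) would return.
my $response = HTTP::Response->new( 500, 'Internal Server Error' );

# status_line is called on the response, not on the user agent.
my $message = $response->is_success
    ? 'request succeeded'
    : 'request failed: ' . $response->status_line;

print "$message\n";
```

Inside a BioPerl module the failure branch would presumably become $self->throw($response->status_line).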

Phil-
phil511@21cn.com
2006-01-24

From anst at kvl.dk  Sun Jan 22 08:06:29 2006
From: anst at kvl.dk (Anders Stegmann)
Date: Mon Jan 23 15:53:44 2006
Subject: [Bioperl-l] wrong nomatch position from protein with
	singly deleted Aa
In-Reply-To: 
References: <43D28E3A0200009B0000069C@gwia.kvl.dk>

Message-ID: <43D391650200009B000006B7@gwia.kvl.dk>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blastp.pl
Type: application/octet-stream
Size: 22241 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060122/5a2eb143/blastp-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: YAL001C
Type: application/octet-stream
Size: 1529 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060122/5a2eb143/YAL001C-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: YAL001CDB
Type: application/octet-stream
Size: 1528 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060122/5a2eb143/YAL001CDB-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: YAL001Chit1hsp1
Type: application/octet-stream
Size: 296 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060122/5a2eb143/YAL001Chit1hsp1-0001.obj
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/bioperl-l/attachments/20060122/5a2eb143/YAL001Chit1hsp1-0001.html
From hubert.prielinger at gmx.at  Mon Jan 23 16:18:48 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 23 17:18:27 2006
Subject: [Bioperl-l] formatdb with the nr database
Message-ID: <43D54838.5050301@gmx.at>

Hi,
I have downloaded the nr database to run BLAST searches locally. Now 
I'm supposed to index the database with formatdb, but it doesn't work.
The online help says that you need a FASTA file, which is then indexed for 
searching the database, but when I uncompressed the zip file, there 
were only .phr, .pnd, .pin, .pni and .ppd files.
Can anybody tell me how to use formatdb with the nr 
database?

Help is much appreciated.
Thank you very much in advance

Hubert

From smarkel at scitegic.com  Mon Jan 23 17:53:43 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Mon Jan 23 18:02:09 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D54838.5050301@gmx.at>
References: <43D54838.5050301@gmx.at>
Message-ID: <43D55E77.9040501@scitegic.com>

Hubert,

The .phr et al files are the result of already having run
formatdb.  By running NCBI's fastacmd (comes with blastall
and formatdb) with the -D option, you can get back to a
FASTA file.

Scott


-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel@scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From hubert.prielinger at gmx.at  Mon Jan 23 18:08:51 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 23 19:01:50 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D5693C.1020805@anu.edu.au>
References: <43D54838.5050301@gmx.at> <43D5693C.1020805@anu.edu.au>
Message-ID: <43D56203.2060806@gmx.at>

Hi,
thank you very much for the help. Another question comes up: do I 
have to give the path to the database files as well? I guess so, but 
how do I do that? The same way I give the path to the BLAST bin files?
Does anybody know how to set the composition-based statistics parameter?
Here is my code:

#!/usr/bin/perl -w

use Bio::Tools::Run::StandAloneBlast;
use Bio::Seq;
use Bio::SeqIO;
use strict;

BEGIN
{
    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
}

# parameters
my $expect_value = 20000;
#my $filter_query_sequence = 'F';
my $one_line_description = 1000;
my $alignments = 1000;
# my $strands = 1;
my $count = 1;

my @params = ('program' => 'blastp', 'database' => 'nr');
#my $progress_interval = 100;

my $seqio_obj = Bio::SeqIO->new(
  -file   => "Perm.txt",
  -format => "raw",
);

# create factory object and set parameters
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);

$factory->e($expect_value);
#$factory->F($filter_query_sequence);
$factory->v($one_line_description);
$factory->b($alignments);
#$factory->S($strands);

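# run one search per query sequence

while ( my $query = $seqio_obj->next_seq ) {
      # name the output file before the search runs, so that each
      # report goes to its own file
      my $filename = "comp_$count.txt";
      $factory->outfile($filename);
      my $blast_report = $factory->blastall($query);
      print $query->seq;
      print "\n";

  $count++;
}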

thank you very much in advance
Hubert

Nagesh Chakka wrote:

> Hi Hubert,
> I downloaded the nr.00.tar.gz file a week ago. I was able to get the 
> following files
> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal files. I 
> have no trouble in running standalone blast. You are not required to 
> run formatdb on the downloaded blast databases and that may be the 
> reason why the sequences are not included as it will also reduce the 
> size of the file.
> Did you try to run a blast search, if so is it giving you any errors?
> Nagesh

From hubert.prielinger at gmx.at  Mon Jan 23 19:15:45 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 23 20:08:44 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <1138062266.2534.2.camel@vogon>
References: <43D54838.5050301@gmx.at> <43D5693C.1020805@anu.edu.au>	
	<43D56203.2060806@gmx.at> <1138062266.2534.2.camel@vogon>
Message-ID: <43D571B1.3020008@gmx.at>

Hi Nagesh,
thank you very much. I put my database into the data folder, ran the 
program and got the following error message:

submit Sequence...just do it....
sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute binary file

------------- EXCEPTION  -------------
MSG: blastall call crashed: 32256 
/home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  -i  
/tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  1000

STACK Bio::Tools::Run::StandAloneBlast::_runblast 
/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
STACK Bio::Tools::Run::StandAloneBlast::blastall 
/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
STACK toplevel 
/home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46

--------------------------------------

Why did it not find my binary file? It is there.

regards
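
The number in the exception can be decoded. Assuming 32256 is Perl's raw wait status ($?) from the system() call, the exit code sits in the high byte. The sketch below only does that arithmetic; it does not run blastall.

```perl
use strict;
use warnings;

# Value reported in the exception above, assumed to be Perl's $?.
my $wait_status = 32256;

my $exit_code   = $wait_status >> 8;             # exit code of the child
my $signal      = $wait_status & 127;            # signal number, 0 if none
my $core_dumped = $wait_status & 128 ? 1 : 0;    # core dump flag

printf "exit code %d, signal %d, core dumped %d\n",
    $exit_code, $signal, $core_dumped;
```

An exit code of 126 is the shell's convention for a command that was found but could not be executed, so the file was located. Together with "cannot execute binary file", this usually points to a binary that does not fit the machine (for example one built for another operating system or CPU) rather than to a wrong path; running the file utility on blastall is one way to check.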

Nagesh Chakka wrote:

>Hi,
>The following is from the StandAloneBlast.pm documentation
>" If the databases which will be searched by BLAST are located in the
>data subdirectory of the blast program directory (the default
>installation location), StandAloneBlast will find them; however, if the
>database files are located in any other location, environmental variable
>$BLASTDATADIR will need to be set to point to that directory."
>Please note that I have not used this module before.
>Nagesh
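
A sketch of how the environment could be set up along the lines of the quoted documentation. The bin path is the one from the script in this thread; the data path is an assumption (a data directory next to bin), so adjust both. As far as I know the module reads these variables when it is loaded, which is why the assignments sit in a BEGIN block placed before the use line.

```perl
use strict;
use warnings;

BEGIN {
    # Install locations: bin is taken from the thread, data is assumed.
    my $blast_bin  = '/home/Hubert/blast/blast-2.2.13/bin';
    my $blast_data = '/home/Hubert/blast/blast-2.2.13/data';

    # Prepend the BLAST bin directory to PATH instead of replacing PATH,
    # so that the shell and other tools are still found.
    my @path = grep { length } split /:/, ( $ENV{PATH} // '' );
    unshift @path, $blast_bin unless grep { $_ eq $blast_bin } @path;
    $ENV{PATH} = join ':', @path;

    # Directory that holds the formatted database files (nr.*).
    $ENV{BLASTDATADIR} = $blast_data;
}

# use Bio::Tools::Run::StandAloneBlast;    # load only after the BEGIN block

print "PATH=$ENV{PATH}\n";
print "BLASTDATADIR=$ENV{BLASTDATADIR}\n";
```

Prepending keeps the rest of PATH intact, which the assignment in the original script does not.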

From smarkel at scitegic.com  Mon Jan 23 20:47:38 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Mon Jan 23 20:45:56 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D571B1.3020008@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>		<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon> <43D571B1.3020008@gmx.at>
Message-ID: <43D5873A.8050501@scitegic.com>

Hubert,

Does your blastall file have execute permission turned on?

Scott


-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel@scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From hubert.prielinger at gmx.at  Mon Jan 23 20:02:10 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 23 20:55:07 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D5873A.8050501@scitegic.com>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>		<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon> <43D571B1.3020008@gmx.at>
	<43D5873A.8050501@scitegic.com>
Message-ID: <43D57C92.5040905@gmx.at>

Hi,
yes all permissions are turned on

Hubert

Scott Markel wrote:

> Hubert,
>
> Does your blastall file have execute permission turned on?
>
> Scott
>
> Hubert Prielinger wrote:
>
>> Hi Nagesh,
>> thank you very much, I put my database into the data folder, run the 
>> program and got the following error message:
>>
>> submit Sequence...just do it....
>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>> binary file
>>
>> ------------- EXCEPTION  -------------
>> MSG: blastall call crashed: 32256 
>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  1000
>>
>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>> STACK toplevel 
>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>
>>
>> --------------------------------------
>>
>> Why it did not find my binary file, but it is there
>>
>> regards
>>
>> Nagesh Chakka wrote:
>>
>>> Hi,
>>> The following is from the StandAloneBlast.pm documentation
>>> " If the databases which will be searched by BLAST are located in the
>>> data subdirectory of the blast program directory (the default
>>> installation location), StandAloneBlast will find them; however, if the
>>> database files are located in any other location, environmental 
>>> variable
>>> $BLASTDATADIR will need to be set to point to that directory."
>>> Please note that I have not used this module before.
>>> Nagesh
>>>
>>>
>>>
>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>  
>>>
>>>> Hi,
>>>> thank you very much for the help, another questions that raises up, 
>>>> do I have to write the path to the database files as well, I guess 
>>>> so, but how I do that, the same way I write the path to teh blast 
>>>> bin files?
>>>> Does anybody know how to set the Composition based statistics 
>>>> parameter?
>>>> there is my code:
>>>>
>>>> #!/usr/bin/perl -w
>>>>
>>>> use Bio::Tools::Run::StandAloneBlast;
>>>> use Bio::Seq;
>>>> use Bio::SeqIO;
>>>> use strict;
>>>>
>>>> BEGIN
>>>> {
>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>> }
>>>>
>>>>
>>>> # parameters
>>>> my $expect_value = 20000;
>>>> #my $filter_query_sequence = 'F';
>>>> my $one_line_description = 1000;
>>>> my $alignments = 1000;
>>>> # my $strands = 1;
>>>> my $count = 1;
>>>>
>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>> #my $progress_interval = 100;
>>>>
>>>>
>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>  -file   => "Perm.txt",
>>>>  -format => "raw",
>>>> );
>>>>
>>>> # create factory object and set parameters
>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>
>>>> $factory->e($expect_value);
>>>> #$factory->F($filter_query_sequence);
>>>> $factory->v($one_line_description);
>>>> $factory->b($alignments);
>>>> #$factory->S($strands);
>>>>
>>>>
>>>> # get query
>>>>
>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>      my $blast_report = $factory->blastall($query);
>>>>      my $filename = "comp_$count.txt";
>>>>      my $factory->outfile($filename);
>>>>      print $query->seq;
>>>>      print "\n";
>>>>
>>>>  $count++;
>>>> }
>>>>
>>>> thank you very much in advance
>>>> Hubert
>>>>
>>>>
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>  
>>>>
>>>>> Hi Hubert,
>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>> the following files
>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal files. 
>>>>> I have no trouble running standalone blast. You are not 
>>>>> required to run formatdb on the downloaded blast databases, and 
>>>>> that may be the reason why the sequences are not included, as it 
>>>>> also reduces the size of the file.
>>>>> Did you try to run a blast search, if so is it giving you any errors?
>>>>> Nagesh
>>>>>
>>>>>
>>>>>
>>>>> Hubert Prielinger wrote:
>>>>>
>>>>>    
>>>>>
>>>>>> Hi,
>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>> locally, now I'm supposed to index the database with formatdb, 
>>>>>> but it doesn't work...
>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>> to use for searching the database, but when I uncompressed the 
>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>> Is there anybody who can tell me, how to use formatdb with the nr 
>>>>>> database...
>>>>>>
>>>>>> Help is much appreciated
>>>>>> Thank you very much in advance
>>>>>>
>>>>>> Hubert
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l@portal.open-bio.org
>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hubert.prielinger at gmx.at  Mon Jan 23 20:41:35 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Mon Jan 23 21:34:47 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D58D06.5080501@anu.edu.au>
References: <43D54838.5050301@gmx.at> <43D5693C.1020805@anu.edu.au>
	<43D56203.2060806@gmx.at> <1138062266.2534.2.camel@vogon>
	<43D571B1.3020008@gmx.at> <43D58D06.5080501@anu.edu.au>
Message-ID: <43D585CF.5070902@gmx.at>

hi,
sorry, but what do you mean by "is your blast database in /nr"?
My database is located in /home/Hubert/blast/blast-2.2.13/data

Nagesh Chakka wrote:

> Can you just run the blast from the command line.
> Is your blast database in "/nr".
>
> Hubert Prielinger wrote:
>
>> Hi Nagesh,
>> thank you very much, I put my database into the data folder, run the 
>> program and got the following error message:
>>
>> submit Sequence...just do it....
>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>> binary file
>>
>> ------------- EXCEPTION  -------------
>> MSG: blastall call crashed: 32256 
>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  1000
>>
>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>> STACK toplevel 
>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>
>>
>> --------------------------------------
>>
>> Why did it not find my binary file? It is there.
>>
>> regards

From taerwin at gmail.com  Mon Jan 23 18:21:03 2006
From: taerwin at gmail.com (Tim Erwin)
Date: Mon Jan 23 21:57:43 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D54838.5050301@gmx.at>
References: <43D54838.5050301@gmx.at>
Message-ID: 

The .phr, .pnd, .pin, .pni, .ppd files are the indexed database, you
don't need to run formatdb as this step has already been done. If you
run formatdb on a fasta file it will generate these .p* files for a
protein database and .n* files for a nucleotide database.
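
As a rough illustration of the naming convention described above, the
following Perl sketch guesses whether a list of file names looks like a
preformatted protein or nucleotide database. The function name
classify_blast_db is made up for this example, and restricting the check
to the .?hr, .?in, .?sq and .?al files is an assumption, not something
formatdb or BioPerl prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the database type from index file names.
# A leading "p" in the extension (.phr, .pin, .psq, .pal) suggests a
# protein database; a leading "n" (.nhr, .nin, .nsq, .nal) suggests a
# nucleotide database.
sub classify_blast_db {
    my @files = @_;
    my ( $protein, $nucleotide ) = ( 0, 0 );
    for my $file (@files) {
        next unless $file =~ /\.([pn])(?:hr|in|sq|al)$/;
        if   ( $1 eq 'p' ) { $protein++ }
        else               { $nucleotide++ }
    }
    return 'mixed'      if $protein && $nucleotide;
    return 'protein'    if $protein;
    return 'nucleotide' if $nucleotide;
    return 'unknown';
}

print classify_blast_db(qw(nr.00.phr nr.00.pin nr.00.psq)), "\n";
```

A plain FASTA file name would come back as 'unknown', which is the case
where running formatdb would still be needed.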

 Regards,

 Tim

On 1/24/06, Hubert Prielinger  wrote:
> Hi,
> I have downloaded the nr database for doing a blast search locally, now
> I'm supposed to index the database with formatdb, but it doesn't work...
> The online help says that you need a fasta file that is indexed to use
> for searching the database, but when I uncompressed the zip file, there
> were only .phr, .pnd, .pin, .pni, .ppd file....
> Is there anybody who can tell me, how to use formatdb with the nr
> database...
>
> Help is much appreciated
> Thank you very much in advance
>
> Hubert

From jason.stajich at duke.edu  Mon Jan 23 22:57:47 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Mon Jan 23 22:54:13 2006
Subject: [Bioperl-l] bioperl
In-Reply-To: <64335.84.190.29.176.1137713813.squirrel@webmail.charite.de>
References: <64335.84.190.29.176.1137713813.squirrel@webmail.charite.de>
Message-ID: 

On Jan 19, 2006, at 6:36 PM, Dr. Christoph Gille wrote:

> Hi Torsten,
>
> perhaps Sopma is not the best choice as a test case for bringing Perl
> and Java together. It is not a convincing example because people would
> ask why they should not contact the server directly from Java, and why
> they should take on the hazard of a Perl installation.
>
> I want to demonstrate that BioPerl programs can work well together with
> STRAP/Biojava through the wrapper I am just developing, but I need a
> suitable example program.
>
> What I have in mind is a sophisticated non-interactive BioPerl program
> that performs some kind of useful computation on a protein sequence, an
> alignment, or a protein 3D structure.
>
> Do you know of something appropriate?
>
> It does not matter if the program is complex or contains C/C++ as long
> as it can be automatically installed without user interaction.

PAML/Codeml and the PHYLIP programs Neighbor, Seqboot, ProtDist, and
ProtPars require some file formatting, but might still be criticized as
not sufficiently difficult to justify braving the hazards of installing
Perl modules.  Simpler things like MUSCLE, TCOFFEE, BLAST, FASTA/SSEARCH
might also be good choices.

>
> Many thanks
>
> Christoph

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From nagesh.chakka at anu.edu.au  Tue Jan 24 03:00:02 2006
From: nagesh.chakka at anu.edu.au (Nagesh Chakka)
Date: Tue Jan 24 03:22:15 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D585CF.5070902@gmx.at>
References: <43D54838.5050301@gmx.at> <43D5693C.1020805@anu.edu.au>
	<43D56203.2060806@gmx.at> <1138062266.2534.2.camel@vogon>
	<43D571B1.3020008@gmx.at> <43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>
Message-ID: <1138089602.3643.1.camel@vogon>

I got the following code working. The only problem I had was with the
outfile method, so I set the output file as a constructor parameter
instead.

#!/usr/local/bin/perl -w
use strict;

# Set the BLAST locations before StandAloneBlast is loaded.
BEGIN {
    $ENV{BLASTDIR}     = "/usr/local/blast/bin";
    $ENV{BLASTDATADIR} = "/home/nagesh/blast/nr.00";
}

use Bio::Tools::Run::StandAloneBlast;
use Bio::Seq;
use Bio::SeqIO;

# parameters
my $expect_value = 20000;
#my $filter_query_sequence = 'F';
my $one_line_description = 1000;
my $alignments = 1000;
# my $strands = 1;
my $count = 1;

# The output file is passed to the constructor here instead of being
# set later through $factory->outfile().
my @params = ('program' => 'blastp', 'database' => 'nr.00',
              'outfile' => 'temp.out');
#my $progress_interval = 100;

my $seqio_obj = Bio::SeqIO->new(
  -file   => "blastInput.txt",
  -format => "fasta",
);

# create factory object and set parameters
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);

$factory->e($expect_value);
#$factory->F($filter_query_sequence);
$factory->v($one_line_description);
$factory->b($alignments);
#$factory->S($strands);

# run one search per query sequence
while ( my $query = $seqio_obj->next_seq ) {
    my $blast_report = $factory->blastall($query);
    print "$blast_report\n";    # prints the report object reference
#   $factory->outfile("temp.out");
    print $query->seq;
    print "\n";
    $count++;
}
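
Before running a script like the one above, it may be worth checking
that the two directories contain what is expected. The following is only
a sketch: check_blast_setup is a made-up helper, and the rule that the
program must be an executable file named blastall inside the BLASTDIR
directory is an assumption based on the paths shown in this thread.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return a list of human-readable problems found
# with the given BLAST program directory and database directory.
sub check_blast_setup {
    my ( $blast_dir, $data_dir ) = @_;
    my @problems;

    my $binary = File::Spec->catfile( $blast_dir, 'blastall' );
    if ( !-e $binary ) {
        push @problems, "no blastall in $blast_dir";
    }
    elsif ( !-x $binary ) {
        push @problems, "$binary is not executable";
    }

    push @problems, "database directory $data_dir does not exist"
        unless -d $data_dir;

    return @problems;
}

# Fall back to the example paths from the script when the environment
# variables are not set.
my @problems = check_blast_setup(
    $ENV{BLASTDIR}     // '/usr/local/blast/bin',
    $ENV{BLASTDATADIR} // '/home/nagesh/blast/nr.00',
);
print @problems ? join( "\n", @problems ) . "\n" : "setup looks plausible\n";
```

Note that a file can pass the executable check and still fail to run,
for example if it was built for another platform, so this only rules
out the simplest mistakes.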

On Mon, 2006-01-23 at 19:41 -0600, Hubert Prielinger wrote:
> hi,
> sorry, but what do you mean by "is your blast database in /nr"?
> My database is located in /home/Hubert/blast/blast-2.2.13/data
> 
> 
> 
> Nagesh Chakka wrote:
> 
> > Can you just run the blast from the command line.
> > Is your blast database in "/nr".
> >
> > Hubert Prielinger wrote:
> >
> >> Hi Nagesh,
> >> thank you very much, I put my database into the data folder, run the 
> >> program and got the following error message:
> >>
> >> submit Sequence...just do it....
> >> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
> >> binary file
> >>
> >> ------------- EXCEPTION  -------------
> >> MSG: blastall call crashed: 32256 
> >> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
> >> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  1000
> >>
> >> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
> >> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
> >> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
> >> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
> >> STACK Bio::Tools::Run::StandAloneBlast::blastall 
> >> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
> >> STACK toplevel 
> >> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
> >>
> >>
> >> --------------------------------------
> >>
> >> Why did it not find my binary file? It is there.
> >>
> >> regards

From mith at ceh.ac.uk  Tue Jan 24 04:16:49 2006
From: mith at ceh.ac.uk (Milo Thurston)
Date: Tue Jan 24 04:27:54 2006
Subject: [Bioperl-l] .tab + FASTA -> EMBL
Message-ID: <200601240916.k0O9Gnnd011814@ivpcl10.nox.ac.uk>

Hello,
Would anyone be able to suggest a suitable method for the following,
please?
I have a load of FASTA sequences, and for each one several Artemis
feature tables and MSP Crunch files. I'd like to read in each sequence
plus the annotations, combine them and save as EMBL format. Converting
from FASTA to EMBL is, of course, trivial but I can't find any existing
Bioperl modules that might deal with the .tabs, and I'd rather use what's
available than duplicate code to read them.
Thanks.

--
Dr. Milo Thurston, CEH Oxford, Mansfield Road, Oxford, OX1 3SR.
'phone 01865 281975,  fax 01865 281696.
http://www.genomics.ceh.ac.uk/lab/

From smarkel at scitegic.com  Tue Jan 24 09:54:46 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Tue Jan 24 09:51:39 2006
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D585CF.5070902@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
Message-ID: <43D63FB6.4090505@scitegic.com>

Hubert,

If you look at the MSG line in the exception you can see
exactly what the command line was.  Nagesh is pointing out
that you used -d "/nr" and asking if that's what you want.
I suspect that the '/' shouldn't be there.
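
One way a stray leading slash like this can appear is when a directory
string is empty at the moment it is joined with the database name. The
sketch below only demonstrates that joining behaviour with
File::Spec::Unix. Whether StandAloneBlast builds its -d argument this
way is an assumption here, not something confirmed in this thread, and
the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Hypothetical values: an empty data directory versus the directory
# mentioned earlier in the thread.
my $empty_dir = '';
my $data_dir  = '/home/Hubert/blast/blast-2.2.13/data';

# Joining an empty directory with a name yields a path directly under "/".
my $bad  = File::Spec::Unix->catfile( $empty_dir, 'nr' );
my $good = File::Spec::Unix->catfile( $data_dir,  'nr' );

print "$bad\n";     # /nr
print "$good\n";    # /home/Hubert/blast/blast-2.2.13/data/nr
```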

Try invoking blastall directly from the command line.  All
BioPerl is doing is invoking BLAST on your behalf.  The
same command line that BioPerl uses should also work for
you on the command line.
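
When comparing the BioPerl run with a manual one, it can help to decode
the number in the "blastall call crashed" message. Assuming that number
is Perl's raw $? wait status (an assumption; the thread does not say
so), the sketch below splits it into exit code and signal. 32256 would
then be exit code 126, which a POSIX shell conventionally uses for a
command that was found but could not be executed. The helper name
decode_status is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status, as found in $? after
# system(), into the exit code and the terminating signal number.
sub decode_status {
    my ($status) = @_;
    return ( exit_code => $status >> 8, signal => $status & 127 );
}

my %info = decode_status(32256);
print "exit code $info{exit_code}, signal $info{signal}\n";
```

If the same binary also fails when started by hand, one possible cause
is an executable built for a different platform, but that is a guess
that the command-line test would confirm or rule out.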

Scott

Hubert Prielinger wrote:

> hi,
> sorry, but what do you mean by "is your blast database in /nr"?
> My database is located in /home/Hubert/blast/blast-2.2.13/data
> 
> 
> 
> Nagesh Chakka wrote:
> 
>> Can you just run the blast from the command line.
>> Is your blast database in "/nr".
>>
>> Hubert Prielinger wrote:
>>
>>> Hi Nagesh,
>>> thank you very much, I put my database into the data folder, run the 
>>> program and got the following error message:
>>>
>>> submit Sequence...just do it....
>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>> binary file
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: blastall call crashed: 32256 
>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  1000
>>>
>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>> STACK toplevel 
>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>
>>>
>>> --------------------------------------
>>>
>>> Why did it not find my binary file? It is there.
>>>
>>> regards

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel@scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From cain at cshl.edu  Tue Jan 24 11:16:23 2006
From: cain at cshl.edu (Scott Cain)
Date: Tue Jan 24 11:29:02 2006
Subject: [Bioperl-l] Re: [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <000001c61c4f$7d835170$15327e82@pyrimidine>
References: <000001c61c4f$7d835170$15327e82@pyrimidine>
Message-ID: <1138119383.3338.68.camel@localhost.localdomain>

Hi Chris,

Is it still misbehaving?  I'll do some testing today, but my ability to
do so is a little hampered as I am traveling this week.

Thanks,
Scott

On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> Scott,
> 
> I am trying to find the newest bioperl dev. Release (1.51) from PPM for a
> quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> repository:
> 
> ppm> rep add gmod http://www.gmod.org/ggb/ppm
> Repositories:
> [1] gmod
> [ ] ActiveState Package Repository
> [ ] ActiveState PPM2 Repository
> [ ] Bioperl
> [ ] Bribes
> [ ] Kobes
> [ ] local
> ppm> search bioperl
> Searching in Active Repositories
> No matches for 'bioperl'; see 'help search'.
> ppm> search *
> Searching in Active Repositories
> No matches for '*'; see 'help search'.
> ppm>
> 
> 
> Any idea what's going on?  All other repositories work fine.  I can download
> it and install locally w/o a problem.  I am running the newest ActivePerl
> (5.8.7.815), WinXP.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain at cshl.edu  Tue Jan 24 11:33:18 2006
From: cain at cshl.edu (Scott Cain)
Date: Tue Jan 24 11:29:13 2006
Subject: RE [Gmod-gbrowse] [Fwd: [Bioperl-l] search2gff]
In-Reply-To: 
References: 
Message-ID: <1138120399.3338.77.camel@localhost.localdomain>

Hello Dea,

If there were a BioPerl parser for GeneSeqer output, it probably
wouldn't be hard to write such a converter, but as far as I can tell
there isn't one (a quick grep through bioperl-live came up empty).

Sorry,
Scott

On Tue, 2006-01-24 at 17:19 +0100, dea.giardella@biogemma.com wrote:
> Hello,
> 
> In the same way, are there any scripts to convert GeneSeqer output to
> GFF3 format?
> Geneseqer : http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/PlantGDBgs.cgi
> 
> Thanks a lot !
> 
> Déa GIARDELLA
> dea.giardella@biogemma.com 
> 
> 
> 
> Scott Cain
> Sent by: gmod-gbrowse-admin@lists.sourceforge.net
> 24/01/2006 16:26
> 
> To: "Gbrowse (E-mail)"
> cc:
> Subject: [Gmod-gbrowse] [Fwd: [Bioperl-l] search2gff]
> 
> Hello all,
> 
> Hilmar Lapp posted the attached message to the bioperl mailing list
> about search2gff, a script for converting BLAST output to GFF3.  I
> thought it might be of interest to readers of this mailing list as well.
> 
> Scott
> 
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain@cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
> 
> ----- Message from Hilmar Lapp on Thu, 19 Jan 2006 15:11:22 -0800 -----
> To: bioperl-l
> Subject: [Bioperl-l] search2gff
> I added a couple of capabilities to the scripts/utilities/search2gff
> script written by Jason. In a nutshell, there are now options for
> controlling the score, location, and method of the HSP-representing
> feature, as well as options for printing of parent, which parent, and
> whether to skip all except the first HSP for each hit.
> 
> As for possible applications, for example using these options you can
> blast SNP assay primers and use the options to create SNP features for
> a single basepair at the end of the primer, ready to be piped to a
> GBrowse GFF3 loader.
> 
> I tried to preserve the original functionality in its entirety, i.e.,
> if you don't use any of the new options the script should work as
> before. If not please let me know.
> 
> POD is attached.
> 
>    -hilmar
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
> 
> SYNOPSIS
>     Usage: search2gff [-o outputfile] [-f reportformat] [-i inputfilename]
>     OR file1 file2 ..
> 
> DESCRIPTION
>     This script will turn a protein Search report (BLASTP, FASTP, SSEARCH,
>     AXT, WABA) into a GFF File.
> 
>     The options are:
> 
>        -i infilename      - (optional) inputfilename, will read
>                             either ARGV files or from STDIN
>        -o filename        - the output filename [default STDOUT]
>        -f format          - search result format (blast, fasta,waba,axt)
>                             (ssearch is fasta format). default is blast.
>        -t/--type seqtype  - if you want to see query or hit information
>                             in the GFF report
>        -s/--source        - specify the source (will be algorithm name
>                             otherwise like BLASTN)
>        --method           - the method tag (primary_tag) of the features
>                             (default is similarity)
>        --scorefunc        - a string or a file that when parsed evaluates
>                             to a closure which will be passed a feature
>                             object and that returns the score to be
>                             printed
>        --locfunc          - a string or a file that when parsed evaluates
>                             to a closure which will be passed two
>                             features, query and hit, and returns the
>                             location (Bio::LocationI compliant) for the
>                             GFF3 feature created for each HSP; the closure
>                             may use the clone_loc() and create_loc()
>                             functions for convenience, see their PODs
>        --onehsp           - only print the first HSP feature for each hit
>        -p/--parent        - the parent to which HSP features should refer
>                             if not the name of the hit or query (depending
>                             on --type)
>        --target/--notarget - whether to always add the Target tag or not
>        -h                 - this help menu
>        --version          - GFF version to use (put a 3 here to use gff 3)
>        --component        - generate GFF component fields (chromosome)
>        -m/--match         - generate a 'match' line which is a container
>                             of all the similarity HSPs
>        --addid            - add ID tag in the absence of --match
>        -c/--cutoff        - specify an evalue cutoff
> 
>     Additionally specify the filenames you want to process on the
>     Additionally specify the filenames you want to process on the
>     command-line. If no files are specified then STDIN input is assumed.
>     You specify this by doing: search2gff < file1 file2 file3
> 
> AUTHOR
>     Jason Stajich, jason-at-bioperl-dot-org
> 
> Contributors
>     Hilmar Lapp, hlapp-at-gmx-dot-net
> 
>   clone_loc
>      Title   : clone_loc
>      Usage   : my $l = clone_loc($feature->location);
>      Function: Helper function to simplify the task of cloning locations
>                for --locfunc closures.
> 
>                Presently simply implemented using Storable::dclone().
>      Example :
>      Returns : A Bio::LocationI object of the same type and with the
>                same properties as the argument, but physically different.
>                All structured properties will be cloned as well.
>      Args    : A Bio::LocationI compliant object
> 
>   create_loc
>      Title   : create_loc
>      Usage   : my $l = create_loc("10..12");
>      Function: Helper function to simplify the task of creating locations
>                for --locfunc closures. Creates a location from a feature-
>                table formatted string.
> 
>      Example :
>      Returns : A Bio::LocationI object representing the location given
>                as formatted string.
>      Args    : A GenBank feature-table formatted string.
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
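
To make the SNP-primer application mentioned in the quoted announcement more concrete, here is a minimal sketch of the kind of closure one might hand to --locfunc. It is not taken from search2gff itself. Demo::Location and Demo::Feature are hypothetical stand-ins for the BioPerl location and feature objects, clone_loc() is re-implemented locally with Storable::dclone() as the POD describes, and the rule "the SNP is the last base of the primer on the hit" is an assumption made only for illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical stand-in for a Bio::LocationI object: start/end/strand only.
package Demo::Location;
sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub start  { my $self = shift; $self->{start}  = shift if @_; return $self->{start} }
sub end    { my $self = shift; $self->{end}    = shift if @_; return $self->{end} }
sub strand { my $self = shift; $self->{strand} = shift if @_; return $self->{strand} }

# Hypothetical stand-in for a feature object that carries a location.
package Demo::Feature;
sub new      { my ($class, %args) = @_; return bless {%args}, $class }
sub location { return $_[0]{location} }

package main;

# Local copy of the helper named in the POD: a deep copy of the location.
sub clone_loc { return dclone($_[0]) }

# Assumed rule: the SNP is the single base at the 3' end of the primer
# as it aligns on the hit. The query feature is accepted but not used.
sub snp_site_loc {
    my ($query, $hit) = @_;
    my $loc    = clone_loc($hit->location);
    my $strand = $loc->strand || 1;
    if ($strand > 0) { $loc->start($loc->end) }   # plus strand: 3' end is 'end'
    else             { $loc->end($loc->start) }   # minus strand: 3' end is 'start'
    return $loc;
}

my $hit = Demo::Feature->new(
    location => Demo::Location->new(start => 100, end => 120, strand => 1),
);
my $snp = snp_site_loc(undef, $hit);
printf "SNP feature at %d..%d\n", $snp->start, $snp->end;
```

Inside the script the same logic would be supplied as an anonymous sub, either inline as a string or from a file, as the --locfunc description says. Cloning first keeps the HSP's own location untouched.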

From cjfields at uiuc.edu  Tue Jan 24 12:09:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue Jan 24 12:08:27 2006
Subject: [Bioperl-l] RemoteBlast.pm and Bio::SearchIO::blast.pm - partially
	resolved
Message-ID: <000901c62109$01814870$15327e82@pyrimidine>

I submitted two bugs on Bugzilla to describe recent problems with
RemoteBlast.pm and SearchIO::blast.pm

http://bugzilla.bioperl.org/show_bug.cgi?id=1934
http://bugzilla.bioperl.org/show_bug.cgi?id=1935

Today I submitted a patched version of Bio::SearchIO::blast.pm which should
fix the text parsing issue for old (2.2.12) and new (2.2.13) versions of
NCBI's BLAST; the bug link above describes the problem and the fix.  Problem
is, I know it will likely break again b/c NCBI will probably change text
output in a future BLAST version.  I also agree with Jason about changing
the default for SearchIO to XML.  So, does text output parsing through
blast.pm need to be deprecated in favor of XML, or should both be available?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
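
If both text and XML parsing stay available, a script can pick the parser from the report itself. The following is only a sketch under stated assumptions: guess_searchio_format() is a hypothetical helper name, and it assumes that an XML report begins with an XML declaration while anything else is treated as plain-text BLAST output. The Bio::SearchIO calls are the usual -file/-format constructor and the next_result/next_hit iterators.

```perl
use strict;
use warnings;

# Choose a Bio::SearchIO format name from the first line of a report.
# Assumption: XML reports start with "<?xml"; everything else is text.
sub guess_searchio_format {
    my ($head) = @_;
    return 'blastxml' if defined $head && $head =~ /^\s*<\?xml/;
    return 'blast';
}

if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $head = <$fh>;
    close $fh;

    require Bio::SearchIO;    # loaded only when a report file is given
    my $in = Bio::SearchIO->new(
        -file   => $file,
        -format => guess_searchio_format($head),
    );
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            print join("\t", $result->query_name, $hit->name,
                       $hit->significance), "\n";
        }
    }
}
```

Run with a report file name as the only argument; without one the script does nothing, so the helper can be tried on its own.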

From jason.stajich at duke.edu  Tue Jan 24 12:15:47 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue Jan 24 12:12:12 2006
Subject: [Bioperl-l] Re: RemoteBlast.pm and Bio::SearchIO::blast.pm -
	partially resolved
In-Reply-To: <000901c62109$01814870$15327e82@pyrimidine>
References: <000901c62109$01814870$15327e82@pyrimidine>
Message-ID: <18966F80-B780-4661-953E-613B05B56164@duke.edu>

Thanks Chris - I don't know when I'll have time to check in bugs so  
anyone else who has commit access feel free to give these a whirl and  
check in.

I would propose making the XML default but allowing the text version  
to still be supported in the event that someone has set up their own  
local NCBI BLAST Web interface which still supports the simple Text  
output.

-j

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Tue Jan 24 12:33:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue Jan 24 12:42:56 2006
Subject: [Bioperl-l] RE: RemoteBlast.pm and Bio::SearchIO::blast.pm -
	partially resolved
In-Reply-To: <18966F80-B780-4661-953E-613B05B56164@duke.edu>
Message-ID: <000d01c6210c$3e7c1040$15327e82@pyrimidine>

I wouldn't mind helping out in maintaining blast.pm or RemoteBlast.pm, but
I'm still a bit 'green' with Perl and Bioperl objects and methods.  This
last fix was somewhat easy to spot (simple regex); the problems with saving
XML output (bug #1935) are a stumbling block here, though.  One new wrinkle
limits that bug's severity: it does at least parse the XML output, since it
will pull out accession numbers, which is a bit of a relief (blastxml seems
to be working).  It just won't save it, and using $result->query_name still
gives part of the RID, suggesting a regex is messing up somewhere, maybe in
blastxml.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From hubert.prielinger at gmx.at  Tue Jan 24 15:49:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 14:49:07 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D63FB6.4090505@scitegic.com>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com>
Message-ID: <43D692C3.80306@gmx.at>

Hi,
thank you very much for the help. I have tried to run blastall on the 
command line, but I can't even execute the binary file, even though the 
blastall executable has every permission set...
I always get the error message: blastall: cannot execute the binary file
Does the executable need to be somewhere else, on another path? Right now 
it is located under /home/Hubert/blast/blast-2.2.13/bin

thanks
Hubert
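
The number 32256 in the quoted exception further down can be decoded, assuming it is Perl's raw wait status ($?) from the system() call that launched blastall. A small sketch follows; decode_wait_status() is a hypothetical helper name, and the two meanings listed are the usual shell conventions for exit codes 126 and 127.

```perl
use strict;
use warnings;

# Split a wait status (as found in $? after system()) into its parts:
# the high byte is the exit code, the low 7 bits are the signal number.
sub decode_wait_status {
    my ($status) = @_;
    return { exit => $status >> 8, signal => $status & 127 };
}

# Conventional meanings of two exit codes reported by the shell.
my %shell_meaning = (
    126 => 'command found but could not be executed',
    127 => 'command not found',
);

my $decoded = decode_wait_status(32256);    # number from the exception message
my $meaning = $shell_meaning{ $decoded->{exit} } || 'program-specific exit code';
print "exit code $decoded->{exit}, signal $decoded->{signal}: $meaning\n";
```

An exit code of 126 matches the "cannot execute binary file" line from sh: the file was found, so the path is fine, but the system refused to run it.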

Scott Markel wrote:

> Hubert,
>
> If you look at the MSG line in the exception you can see
> exactly what the command line was.  Nagesh is pointing out
> that you used -d "/nr" and asking if that's what you want.
> I suspect that the '/' shouldn't be there.
>
> Try invoking blastall directly from the command line.  All
> BioPerl is doing is invoking BLAST on your behalf.  The
> same command line that BioPerl uses should also work for
> you on the command line.
>
> Scott
>
> Hubert Prielinger wrote:
>
>> hi,
>> sorry, but what do you mean by asking whether my blast database is in /nr?
>> My database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>
>>
>>
>> Nagesh Chakka wrote:
>>
>>> Can you just run the blast from the command line?
>>> Is your blast database in "/nr"?
>>>
>>> Hubert Prielinger wrote:
>>>
>>>> Hi Nagesh,
>>>> thank you very much, I put my database into the data folder, ran 
>>>> the program and got the following error message:
>>>>
>>>> submit Sequence...just do it....
>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>> binary file
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: blastall call crashed: 32256 
>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>> 1000
>>>>
>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>> STACK toplevel 
>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>
>>>>
>>>> --------------------------------------
>>>>
>>>> Why did it not find my binary file, even though it is there?
>>>>
>>>> regards
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi,
>>>>> The following is from the StandAloneBlast.pm documentation
>>>>> " If the databases which will be searched by BLAST are located in the
>>>>> data subdirectory of the blast program directory (the default
>>>>> installation location), StandAloneBlast will find them; however, 
>>>>> if the
>>>>> database files are located in any other location, environmental 
>>>>> variable
>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>> Please note that I have not used this module before.
>>>>> Nagesh
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>  
>>>>>
>>>>>> Hi,
>>>>>> thank you very much for the help. Another question has come 
>>>>>> up: do I have to give the path to the database files as well? I 
>>>>>> guess so, but how do I do that? The same way I give the path to the 
>>>>>> blast bin files?
>>>>>> Does anybody know how to set the composition-based statistics 
>>>>>> parameter?
>>>>>> Here is my code:
>>>>>>
>>>>>> #!/usr/bin/perl -w
>>>>>>
>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>> use Bio::Seq;
>>>>>> use Bio::SeqIO;
>>>>>> use strict;
>>>>>>
>>>>>> BEGIN
>>>>>> {
>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>> }
>>>>>>
>>>>>>
>>>>>> # parameters
>>>>>> my $expect_value = 20000;
>>>>>> #my $filter_query_sequence = 'F';
>>>>>> my $one_line_description = 1000;
>>>>>> my $alignments = 1000;
>>>>>> # my $strands = 1;
>>>>>> my $count = 1;
>>>>>>
>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>> #my $progress_interval = 100;
>>>>>>
>>>>>>
>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>  -file   => "Perm.txt",
>>>>>>  -format => "raw",
>>>>>> );
>>>>>>
>>>>>> # create factory object and set parameters
>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>
>>>>>> $factory->e($expect_value);
>>>>>> #$factory->F($filter_query_sequence);
>>>>>> $factory->v($one_line_description);
>>>>>> $factory->b($alignments);
>>>>>> #$factory->S($strands);
>>>>>>
>>>>>>
>>>>>> # get query
>>>>>>
>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>      # name the report file before running the search so that
>>>>>>      # blastall writes to it
>>>>>>      my $filename = "comp_$count.txt";
>>>>>>      $factory->outfile($filename);
>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>      print $query->seq;
>>>>>>      print "\n";
>>>>>>
>>>>>>      $count++;
>>>>>> }
>>>>>>
>>>>>> thank you very much in advance
>>>>>> Hubert
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nagesh Chakka wrote:
>>>>>>
>>>>>>  
>>>>>>
>>>>>>> Hi Hubert,
>>>>>>> I downloaded the nr.00.tar.gz file a week ago. After unpacking I got 
>>>>>>> the following files:
>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal. 
>>>>>>> I have no trouble running standalone blast. You are 
>>>>>>> not required to run formatdb on the downloaded blast databases, 
>>>>>>> and that may be the reason why the sequences are not included, as 
>>>>>>> leaving them out also reduces the size of the file.
>>>>>>> Did you try to run a blast search? If so, is it giving you any 
>>>>>>> errors?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hubert Prielinger wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>> locally. Now I'm supposed to index the database with formatdb, 
>>>>>>>> but it doesn't work...
>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>> for searching the database, but when I uncompressed the 
>>>>>>>> archive, there were only .phr, .pnd, .pin, .pni and .ppd files....
>>>>>>>> Can anybody tell me how to use formatdb with the 
>>>>>>>> nr database?
>>>>>>>>
>>>>>>>> Help is much appreciated
>>>>>>>> Thank you very much in advance
>>>>>>>>
>>>>>>>> Hubert

From hubert.prielinger at gmx.at  Tue Jan 24 16:15:38 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:15:38 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D698FA.3090904@gmx.at>

hi alex,
I did as you recommended and got the following output:

[Hubert@ppc7 ~]$ file /home/Hubert/blast/blast-2.2.13/bin/blastall
/home/Hubert/blast/blast-2.2.13/bin/blastall: ELF 64-bit LSB executable, 
AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked 
(uses shared libs), for GNU/Linux 2.4.1, not stripped
[Hubert@ppc7 ~]$

Does this mean that it is compatible with the operating system?

thanks for help
Hubert
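
The "ELF 64-bit" part of the file output can also be checked from Perl. This is a minimal sketch, not part of BioPerl: elf_class() is a hypothetical helper that reads the class byte of the ELF identification header (1 means 32-bit, 2 means 64-bit) and the script prints it next to the machine name reported by uname. It only looks at word size, so a binary built for a different CPU family would still need the fuller report that file gives.

```perl
use strict;
use warnings;
use POSIX ();

# Return 32 or 64 for an ELF header, or undef if the bytes are not ELF.
sub elf_class {
    my ($header) = @_;
    return unless defined $header && length($header) >= 5;
    return unless substr($header, 0, 4) eq "\x7f" . "ELF";   # ELF magic number
    my $class = ord substr($header, 4, 1);                   # EI_CLASS byte
    return $class == 1 ? 32 : $class == 2 ? 64 : undef;
}

if (@ARGV) {
    my $path = shift @ARGV;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read($fh, my $header, 5);
    close $fh;

    my $bits    = elf_class($header);
    my $machine = (POSIX::uname())[4];    # for example x86_64 or i686
    my $what    = defined $bits ? "a $bits-bit ELF binary" : "not an ELF file";
    print "$path is $what; this machine reports $machine\n";
}
```

Called with the path to blastall, a "64-bit" answer on a machine that reports a 32-bit architecture would explain a "cannot execute binary file" error.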

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex

From hubert.prielinger at gmx.at  Tue Jan 24 16:24:51 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:24:51 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D69B23.9010100@gmx.at>

Hi,
I'm very sorry for wasting your time, but I just figured out what 
happened: I have installed the 64-bit version and not the 32-bit version....
Sorry for the inconvenience and thanks for the help....
I'm now trying to fix the problem with the database....

Sorry
Hubert
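
On the remaining database problem: the -d "/nr" in the quoted command line looks like what one gets when an empty data directory is joined with the database name. That reading is an assumption about how the path is built, not something confirmed from the module source. The sketch below uses hypothetical helper names (blast_db_path, protein_db_present) to show the effect and to check that the nr alias or index file is where BLASTDATADIR points.

```perl
use strict;
use warnings;

# Join a data directory and a database name with a plain "$dir/$db".
# An undefined or empty directory yields "/nr", as seen in the exception.
sub blast_db_path {
    my ($datadir, $db) = @_;
    $datadir = '' unless defined $datadir;
    return "$datadir/$db";
}

# True if an alias (.pal) or index (.pin) file exists for a protein database.
sub protein_db_present {
    my ($datadir, $db) = @_;
    my $base = blast_db_path($datadir, $db);
    return (-e "$base.pal" || -e "$base.pin") ? 1 : 0;
}

# Directory taken from the thread; adjust to the local installation.
$ENV{BLASTDATADIR} ||= '/home/Hubert/blast/blast-2.2.13/data';

print 'unset data dir gives: ', blast_db_path(undef, 'nr'), "\n";
print 'with BLASTDATADIR:    ', blast_db_path($ENV{BLASTDATADIR}, 'nr'), "\n";
print "nr database files found\n"
    if protein_db_present($ENV{BLASTDATADIR}, 'nr');
```

In a real script the variable would probably have to be set in a BEGIN block ahead of the "use Bio::Tools::Run::StandAloneBlast" line, in case the module reads it when it is loaded, or simply exported in the shell before the script starts.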

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex
>
> Hubert Prielinger wrote:
>
>>Hi,
>>thank you very much for the help, I have tried to run the blastall on 
>>commandline, but I can't even execute the binary file, nevertheless the 
>>blastall exe file have every permission...
>>I always get the error message: blastall: cannot execute the binary file
>>Need to be the exe file somewhere else, another path...now it is located 
>>under /home/Hubert/blast/blast-2.2.13/bin
>>
>>thanks
>>Hubert
>>
>>
>>
>>
>>
>>Scott Markel wrote:
>>
>>    
>>
>>>Hubert,
>>>
>>>If you look at the MSG line in the exception you can see
>>>exactly what the command line was.  Nagesh is pointing out
>>>that you used -d "/nr" and asking if that's what you want.
>>>I suspect that the '/' shouldn't be there.
>>>
>>>Try invoking blastall directly from the command line.  All
>>>BioPerl is doing is invoking BLAST on your behalf.  The
>>>same command line that BioPerl uses should also work for
>>>you on the command line.
>>>
>>>Scott
>>>
>>>Hubert Prielinger wrote:
>>>
>>>      
>>>
>>>>hi,
>>>>sorry, but what do you mean with is your blast database in /nr...
>>>>my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>>
>>>>
>>>>
>>>>Nagesh Chakka wrote:
>>>>
>>>>        
>>>>
>>>>>Can you just run the blast from the command line.
>>>>>Is your blast database in "/nr".
>>>>>
>>>>>Hubert Prielinger wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Hi Nagesh,
>>>>>>thank you very much, I put my database into the data folder, run 
>>>>>>the program and got the following error message:
>>>>>>
>>>>>>submit Sequence...just do it....
>>>>>>sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>>>binary file
>>>>>>
>>>>>>------------- EXCEPTION  -------------
>>>>>>MSG: blastall call crashed: 32256 
>>>>>>/home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>>>-i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>>>1000
>>>>>>
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>>>STACK toplevel 
>>>>>>/home/Hubert/installed/eclipse/workspace/Database_Search/standalo
>>>>>>ne_blast.pl:46 
>>>>>>
>>>>>>
>>>>>>--------------------------------------
>>>>>>
>>>>>>Why it did not find my binary file, but it is there
>>>>>>
>>>>>>regards
>>>>>>
>>>>>>Nagesh Chakka wrote:
>>>>>>
>>>>>>            
>>>>>>
>>>>>>>Hi,
>>>>>>>The following is from the StandAloneBlast.pm documentation
>>>>>>>" If the databases which will be searched by BLAST are located in the
>>>>>>>data subdirectory of the blast program directory (the default
>>>>>>>installation location), StandAloneBlast will find them; however, 
>>>>>>>if the
>>>>>>>database files are located in any other location, environmental 
>>>>>>>variable
>>>>>>>$BLASTDATADIR will need to be set to point to that directory."
>>>>>>>Please note that I have not used this module before.
>>>>>>>Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>> 
>>>>>>>
>>>>>>>              
>>>>>>>
>>>>>>>>Hi,
>>>>>>>>thank you very much for the help, another questions that raises 
>>>>>>>>up, do I have to write the path to the database files as well, I 
>>>>>>>>guess so, but how I do that, the same way I write the path to teh 
>>>>>>>>blast bin files?
>>>>>>>>Does anybody know how to set the Composition based statistics 
>>>>>>>>parameter?
>>>>>>>>there is my code:
>>>>>>>>
>>>>>>>>#!/usr/bin/perl -w
>>>>>>>>
>>>>>>>>use Bio::Tools::Run::StandAloneBlast;
>>>>>>>>use Bio::Seq;
>>>>>>>>use Bio::SeqIO;
>>>>>>>>use strict;
>>>>>>>>
>>>>>>>>BEGIN
>>>>>>>>{
>>>>>>>>   $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>>>}
>>>>>>>>
>>>>>>>>
>>>>>>>># parameters
>>>>>>>>my $expect_value = 20000;
>>>>>>>>#my $filter_query_sequence = 'F';
>>>>>>>>my $one_line_description = 1000;
>>>>>>>>my $alignments = 1000;
>>>>>>>># my $strands = 1;
>>>>>>>>my $count = 1;
>>>>>>>>
>>>>>>>>my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>>>#my $progress_interval = 100;
>>>>>>>>
>>>>>>>>
>>>>>>>>my $seqio_obj = Bio::SeqIO->new(
>>>>>>>> -file   => "Perm.txt",
>>>>>>>> -format => "raw",
>>>>>>>>);
>>>>>>>>
>>>>>>>># create factory
>>>>>>>> object and set parameters
>>>>>>>>my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>>
>>>>>>>>$factory->e($expect_value);
>>>>>>>>#$factory->F($filter_query_sequence);
>>>>>>>>$factory->v($one_line_description);
>>>>>>>>$factory->b($alignments);
>>>>>>>>#$factory->S($strands);
>>>>>>>>
>>>>>>>>
>>>>>>>># get query
>>>>>>>>
>>>>>>>>while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>>     my $blast_report = $factory->blastall($query);
>>>>>>>>     my $filename = "comp_$count.txt";
>>>>>>>>     my $factory->outfile($filename);
>>>>>>>>     print $query->seq;
>>>>>>>>     print "\n";
>>>>>>>>
>>>>>>>> $count++;
>>>>>>>>}
>>>>>>>>
>>>>>>>>thank you very much in advance
>>>>>>>>Hubert
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>Nagesh Chakka wrote:
>>>>>>>>
>>>>>>>> 
>>>>>>>>
>>>>>>>>                
>>>>>>>>
>>>>>>>>>Hi Hubert,
>>>>>>>>>I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>>>the following files
>>>>>>>>>.phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>>>>files. I have no trouble in running standalone blast. You are 
>>>>>>>>>not required to run formatdb on the downloaded blast databases 
>>>>>>>>>and that may be the reason why the sequences are not included as 
>>>>>>>>>it will also reduce the size of the file.
>>>>>>>>>Did you try to run a blast search, if so is it giving you any 
>>>>>>>>>errors?
>>>>>>>>>Nagesh
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Hubert Prielinger wrote:
>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>I have downloaded the nr database for doing a blast search 
>>>>>>>>>>locally, now I'm supposed to index the database with formatdb, 
>>>>>>>>>>but it doesn't work...
>>>>>>>>>>The online help says that you need a fasta file that is indexed 
>>>>>>>>>>to use for searching the database, but when I uncompressed the 
>>>>>>>>>>zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>>>>Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>>>>nr database...
>>>>>>>>>>
>>>>>>>>>>Help is very appreciated
>>>>>>>>>>Thank you very much in advance
>>>>>>>>>>
>>>>>>>>>>Hubert
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>Bioperl-l mailing list
>>>>>>>>>>Bioperl-l at portal.open-bio.org
>>>>>>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l

From smarkel at scitegic.com  Tue Jan 24 17:09:57 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Tue, 24 Jan 2006 14:09:57 -0800
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D692C3.80306@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com> <43D692C3.80306@gmx.at>
Message-ID: <43D6A5B5.8090106@scitegic.com>

Hubert,

Since you can't run blastall on the command line, your initial
problem has nothing to do with BioPerl.  Once you get blastall
working on the command line, you'll know what directories and
environment variable settings to use when running via BioPerl.
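
As a minimal sketch of the environment side of this (an illustration, not
something posted in the thread): the script quoted further down replaces PATH
outright, whereas a script would more commonly prepend the BLAST bin directory
and point BLASTDATADIR (the variable named in the StandAloneBlast documentation
quoted below) at the database directory. The helper name prepend_to_path is
hypothetical, and the two directories are simply the ones mentioned in this
thread. Whether this has to happen inside a BEGIN block, as in the quoted
script, depends on when the module looks up the executable, which is not
settled here.

```perl
use strict;
use warnings;

# Hypothetical helper: put $dir at the front of a colon-separated PATH string,
# dropping empty entries and any earlier copy of $dir.
sub prepend_to_path {
    my ($dir, $path) = @_;
    $dir =~ s{/+$}{} unless $dir eq '/';          # ignore trailing slashes
    my @rest = grep { length($_) && $_ ne $dir }
               map  { my $p = $_; $p =~ s{/+$}{} unless $p eq '/'; $p }
               split /:/, (defined $path ? $path : '');
    return join ':', $dir, @rest;
}

# Directories taken from the thread; adjust for your own installation.
my $blast_bin  = '/home/Hubert/blast/blast-2.2.13/bin';
my $blast_data = '/home/Hubert/blast/blast-2.2.13/data';

$ENV{PATH}         = prepend_to_path($blast_bin, $ENV{PATH});
$ENV{BLASTDATADIR} = $blast_data;

print "PATH now starts with: ", (split /:/, $ENV{PATH})[0], "\n";
```

Prepending keeps the rest of PATH intact, so other tools the script may call
are still found.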

What happens when you run the following?

   file /home/Hubert/blast/blast-2.2.13/bin/blastall

Is the executable the correct one for your operating system?
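
If file(1) is not at hand, a rough equivalent can be sketched in Perl by
looking at the first few bytes of the file. This is only an illustration under
stated assumptions: the helper name guess_binary_format is hypothetical, and it
recognises just a handful of well-known signatures (ELF, 32-bit Mach-O,
DOS/Windows "MZ", and "#!" scripts), far fewer than file(1) does.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a file by its first bytes, similar in spirit
# to (but far cruder than) what file(1) reports.
sub guess_binary_format {
    my ($path) = @_;
    open my $fh, '<', $path or return 'unreadable';
    binmode $fh;
    my $got = read($fh, my $magic, 4);
    close $fh;
    return 'empty' unless $got;
    return 'ELF'    if $magic eq "\x7fELF";
    return 'Mach-O' if $magic eq "\xfe\xed\xfa\xce" || $magic eq "\xce\xfa\xed\xfe";
    return 'PE/DOS' if substr($magic, 0, 2) eq 'MZ';
    return 'script' if substr($magic, 0, 2) eq '#!';
    return 'unknown';
}

# Usage: perl check_binary.pl /home/Hubert/blast/blast-2.2.13/bin/blastall
for my $path (@ARGV) {
    printf "%s: %s, %s\n", $path, guess_binary_format($path),
        (-x $path ? 'executable bit set' : 'executable bit not set');
}
```

A file with the executable bit set whose format does not match the operating
system is one common way to end up with "cannot execute binary file".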

Scott

Hubert Prielinger wrote:

> Hi,
> thank you very much for the help. I have tried to run blastall on the 
> command line, but I can't even execute the binary file, even though the 
> blastall executable has every permission set...
> I always get the error message: blastall: cannot execute the binary file
> Does the executable need to be somewhere else, on another path? It is 
> currently located under /home/Hubert/blast/blast-2.2.13/bin
> 
> thanks
> Hubert
> 
> 
> 
> 
> 
> Scott Markel wrote:
> 
>> Hubert,
>>
>> If you look at the MSG line in the exception you can see
>> exactly what the command line was.  Nagesh is pointing out
>> that you used -d "/nr" and asking if that's what you want.
>> I suspect that the '/' shouldn't be there.
>>
>> Try invoking blastall directly from the command line.  All
>> BioPerl is doing is invoking BLAST on your behalf.  The
>> same command line that BioPerl uses should also work for
>> you on the command line.
>>
>> Scott
>>
>> Hubert Prielinger wrote:
>>
>>> hi,
>>> sorry, but what do you mean with is your blast database in /nr...
>>> my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>
>>>
>>>
>>> Nagesh Chakka wrote:
>>>
>>>> Can you just run the blast from the command line.
>>>> Is your blast database in "/nr".
>>>>
>>>> Hubert Prielinger wrote:
>>>>
>>>>> Hi Nagesh,
>>>>> thank you very much, I put my database into the data folder, run 
>>>>> the program and got the following error message:
>>>>>
>>>>> submit Sequence...just do it....
>>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>> binary file
>>>>>
>>>>> ------------- EXCEPTION  -------------
>>>>> MSG: blastall call crashed: 32256 
>>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>> 1000
>>>>>
>>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>> STACK toplevel 
>>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>>
>>>>>
>>>>> --------------------------------------
>>>>>
>>>>> Why it did not find my binary file, but it is there
>>>>>
>>>>> regards
>>>>>
>>>>> Nagesh Chakka wrote:
>>>>>
>>>>>> Hi,
>>>>>> The following is from the StandAloneBlast.pm documentation
>>>>>> " If the databases which will be searched by BLAST are located in the
>>>>>> data subdirectory of the blast program directory (the default
>>>>>> installation location), StandAloneBlast will find them; however, 
>>>>>> if the
>>>>>> database files are located in any other location, environmental 
>>>>>> variable
>>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>>> Please note that I have not used this module before.
>>>>>> Nagesh
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>  
>>>>>>
>>>>>>> Hi,
>>>>>>> thank you very much for the help, another questions that raises 
>>>>>>> up, do I have to write the path to the database files as well, I 
>>>>>>> guess so, but how I do that, the same way I write the path to the 
>>>>>>> blast bin files?
>>>>>>> Does anybody know how to set the Composition based statistics 
>>>>>>> parameter?
>>>>>>> there is my code:
>>>>>>>
>>>>>>> #!/usr/bin/perl -w
>>>>>>>
>>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>>> use Bio::Seq;
>>>>>>> use Bio::SeqIO;
>>>>>>> use strict;
>>>>>>>
>>>>>>> BEGIN
>>>>>>> {
>>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> # parameters
>>>>>>> my $expect_value = 20000;
>>>>>>> #my $filter_query_sequence = 'F';
>>>>>>> my $one_line_description = 1000;
>>>>>>> my $alignments = 1000;
>>>>>>> # my $strands = 1;
>>>>>>> my $count = 1;
>>>>>>>
>>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>> #my $progress_interval = 100;
>>>>>>>
>>>>>>>
>>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>>  -file   => "Perm.txt",
>>>>>>>  -format => "raw",
>>>>>>> );
>>>>>>>
>>>>>>> # create factory object and set parameters
>>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>
>>>>>>> $factory->e($expect_value);
>>>>>>> #$factory->F($filter_query_sequence);
>>>>>>> $factory->v($one_line_description);
>>>>>>> $factory->b($alignments);
>>>>>>> #$factory->S($strands);
>>>>>>>
>>>>>>>
>>>>>>> # get query
>>>>>>>
>>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>      my $filename = "comp_$count.txt";
>>>>>>>      # set the output file before running the search
>>>>>>>      $factory->outfile($filename);
>>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>>      print $query->seq;
>>>>>>>      print "\n";
>>>>>>>
>>>>>>>  $count++;
>>>>>>> }
>>>>>>>
>>>>>>> thank you very much in advance
>>>>>>> Hubert
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Nagesh Chakka wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi Hubert,
>>>>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>> the following files
>>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>>> files. I have no trouble in running standalone blast. You are 
>>>>>>>> not required to run formatdb on the downloaded blast databases 
>>>>>>>> and that may be the reason why the sequences are not included as 
>>>>>>>> it will also reduce the size of the file.
>>>>>>>> Did you try to run a blast search, if so is it giving you any 
>>>>>>>> errors?
>>>>>>>> Nagesh
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hubert Prielinger wrote:
>>>>>>>>
>>>>>>>>  
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>>> locally, now I'm supposed to index the database with formatdb, 
>>>>>>>>> but it doesn't work...
>>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>>> to use for searching the database, but when I uncompressed the 
>>>>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>>> Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>>> nr database...
>>>>>>>>>
>>>>>>>>> Help is very appreciated
>>>>>>>>> Thank you very much in advance
>>>>>>>>>
>>>>>>>>> Hubert
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioperl-l mailing list
>>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From cjfields at uiuc.edu  Tue Jan 24 17:21:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Jan 2006 16:21:22 -0600
Subject: [Bioperl-l] RemoteBlast.pm and Bio::SearchIO::blast.pm
	-partially resolved
In-Reply-To: <18966F80-B780-4661-953E-613B05B56164@duke.edu>
Message-ID: <000301c62134$81cdc500$15327e82@pyrimidine>

Jason, 

I have worked out all the problems with RemoteBlast.pm and posted a patched
version to Bugzilla (http://bugzilla.bioperl.org/show_bug.cgi?id=1935).  The
main problem was that RemoteBlast::save_output was not looking for XML
output when dumping from the tempfile to the saved file (it only looked for
the text header).  That is fixed.  The other problems mentioned were due to
differences in mapping key=>value pairs between blast and blastxml and a
problem in my own script.  It passed all tests using 'perl t/RemoteBlast.t'
with debugging set.
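
For readers who have not looked at the patch, the kind of check described above
can be sketched as follows. This is not the code submitted to Bugzilla: the
helper name guess_report_format and the exact patterns are assumptions, chosen
only to illustrate accepting either an XML declaration or a text report header
before copying a tempfile to the saved file.

```perl
use strict;
use warnings;

# Hypothetical helper, not the patched RemoteBlast code: guess the report
# format from the start of the retrieved data.
sub guess_report_format {
    my ($data) = @_;
    return 'none' unless defined $data;
    return 'xml'  if $data =~ /\A\s*<\?xml\b/;
    # Text reports open with the program name and version, e.g. "BLASTN 2.2.13".
    return 'text' if $data =~ /^\s*T?BLAST[NPX]\s+\d/m;
    return 'none';
}

for my $sample (qq{<?xml version="1.0"?>\n<BlastOutput>},
                "BLASTP 2.2.13 [Nov-27-2005]\n") {
    print guess_report_format($sample), "\n";
}
```

Anything that matches neither pattern (an HTML status page, for example) is
reported as 'none', so the caller can decide whether to keep waiting.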

See if anybody else out there can test them out.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Jason Stajich
> Sent: Tuesday, January 24, 2006 11:16 AM
> To: Chris Fields
> Cc: bioperl-ml List
> Subject: [Bioperl-l] Re: RemoteBlast.pm and Bio::SearchIO::blast.pm -
> partially resolved
> 
> Thanks Chris - I don't know when I'll have time to check in bugs so
> anyone else who has commit access feel free to give these a whirl and
> check in.
> 
> I would propose making the XML default but allowing the text version
> to still be supported in the event that someone has setup their own
> local NCBI BLAST Web interface which still supports the simple Text
> output.
> 
> -j
> 
> On Jan 24, 2006, at 12:09 PM, Chris Fields wrote:
> 
> > I submitted two bugs on Bugzilla to describe recent problems with
> > RemoteBlast.pm and SearchIO::blast.pm
> >
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1935
> >
> > Today I submitted a patched version of Bio::SearchIO::blast.pm
> > which should
> > fix the text parsing issue for old (2.2.12) and new (2.2.13)
> > versions of
> > NCBI's BLAST; the bug link above describes the problem and the
> > fix.  Problem
> > is, I know it will likely break again b/c NCBI will probably change
> > text
> > output in a future BLAST version.  I also agree with Jason about
> > changing
> > the default for SearchIO to XML.  So, does text output parsing through
> > blast.pm need to be deprecated in favor of XML, or should both be
> > available?
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 

From jason.stajich at duke.edu  Tue Jan 24 16:44:34 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 16:44:34 -0500
Subject: [Bioperl-l] new mailing list server
Message-ID: <50E14815-266E-4ACB-8E6E-293C9EB33476@duke.edu>

Chris Dagdigian has switched our mailing lists over to a new server  
to upgrade us to newer hardware.  With the switch, the default mailing  
list server name is 'lists.open-bio.org' instead of  
'portal.open-bio.org'.  That should be the only change you notice, at  
the bottom of your mails.  All mail should still be delivered to any of  
those addresses (although @bioperl.org is preferred).

We hope this changeover will help improve the performance and  
scalability of our mail and webservices.

We also will aim to move the developer read-write CVS server to a new  
machine in the coming weeks.  We hope this will only be a minor  
inconvenience but will allow us to move to a more recent operating  
system and larger disk space.

If you have questions or concerns they can be directed to support AT  
open-bio.org
-jason
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Tue Jan 24 22:31:38 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 22:31:38 -0500
Subject: [Bioperl-l] new website launched
Message-ID: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>

I am pleased to announce the release of a new website for BioPerl.   
The site is based on the mediawiki software that was developed for  
the wikipedia project.  We intend the site to be a place for  
community input on documentation and design for the BioPerl project.   
There is also a fair amount of documentation started surrounding  
bioinformatics tools and techniques applicable to using BioPerl and  
some of the authors who created these resources.

The website continues to be at the URL http://www.bioperl.org.  The  
DNS updates may take up to 24 hours to reach everyone.

The initial content of the site is the result of the work of myself,  
Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We  
encourage you to contribute to the site's content by signing up for  
an account.

There are several guides, for example on the style of the site and on  
how to link to Modules, whose pages can contain additional information  
from the POD:
http://bioperl.org/wiki/Module:Bio::SeqIO

You'll notice that many of the paths have changed, but DIST and SRC  
continue to be available at http://bioperl.org/DIST and  
http://bioperl.org/SRC.  The HOWTOs are now available from  
http://bioperl.org/wiki/HOWTOs

The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage  
you to add your questions to it so they can be properly archived and  
addressed.

We also have initiated a News site for Bioperl for posting  
announcements regarding development and software.  I would like to  
see if there are volunteers to post weekly or monthly summaries of  
mailing list traffic and development.
http://www.bioperl.org/news/

Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,  
Torsten Seemann.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From roy at colibase.bham.ac.uk  Wed Jan 25 12:05:29 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Wed, 25 Jan 2006 17:05:29 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601182120.k0ILIl8X022324@portal.open-bio.org>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
Message-ID: <43D7AFD9.2020305@colibase.bham.ac.uk>

Hi all.

I also had need of a function to concatenate two Bio::Seq objects, so had a go
at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted below. I'm
not too sure about the concept of sub-SeqFeatures (I've never seen any sequence
that had more than one level of feature)- I worked on the assumption that little
sub-SeqFeatures can have littler sub-SeqFeatures and so ad infinitum, but as I
don't have an example file I haven't been able to test if this works. Likewise,
although I think the code should cope with Fuzzy and Split locations, I haven't
tested this with any particularly unusual examples.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

  Title   : cat
  Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
  Function: Concatenates an array of Bio::Seq objects, using the first sequence
            as a template for species etc. Adjusts the coordinates of features
            from any additional objects.
  Returns : A sequence object of the same class as the first argument.
  Args    : array of sequence objects

=cut

sub cat {
     my ($self, @seqs) = @_;
     my $seq=shift @seqs;
     # Double quotes so that $seq is interpolated into the message
     $self->throw("Object [$seq] ". 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
     unless $seq->isa('Bio::PrimarySeqI');
     for my $catseq (@seqs) {
        # Validate the sequence being appended, not the template sequence
        $self->throw("Object [$catseq] ". 'of class ['. ref($catseq).
                  '] should be a Bio::PrimarySeqI ')
        unless $catseq->isa('Bio::PrimarySeqI');
        # Length of the sequence so far is the offset for the new features
        my $length=$seq->length;
        $seq->seq($seq->seq.$catseq->seq);
        for my $feat ($catseq->get_SeqFeatures) {
            $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
        }
     }
     return $seq;
}

=head2 _coordAdjust

  Title   : _coordAdjust
  Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
  Function: Recursive subroutine to adjust the coordinates of a feature
            and all its subfeatures.
  Returns : A Bio::SeqFeatureI compliant object.
  Args    : A Bio::SeqFeatureI compliant object,
            the number of bases to add to the coordinates

=cut

sub _coordAdjust {
     my ($self, $feat, $add)=@_;
     # Double quotes so that $feat is interpolated into the message
     $self->throw("Object [$feat] ". 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
        unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
        # Recurse with the arguments in the same order as the signature above
        push @adjsubfeat, $self->_coordAdjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     # Shift the start and end of every (sub)location by $add
     for my $subloc (@loc) {
        my @coords=($subloc->start, $subloc->end);
        s/(\d+)/$add+$1/ge for @coords;
        $subloc->start(shift @coords);
        $subloc->end(shift @coords);
     }
     if (@loc==1) {
        $feat->location($loc[0])
     } else {
        # Needs 'use Bio::Location::Split;' at the top of the module
        my $loc=Bio::Location::Split->new;
        $loc->add_sub_Location(@loc);
        $feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
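
To make the coordinate arithmetic concrete, here is a minimal sketch that
replays the example from Jan's original question (quoted below) with plain
hashes instead of BioPerl objects. The name cat_plain and the hash layout are
made up for illustration; unlike the code above, it does not handle
subfeatures, split or fuzzy locations, or annotations.

```perl
use strict;
use warnings;

# Plain-hash stand-ins for sequences and features (not BioPerl objects).
sub cat_plain {
    my @seqs = @_;
    my %joined = (seq => '', features => []);
    for my $s (@seqs) {
        my $offset = length $joined{seq};     # bases already placed before $s
        for my $f (@{ $s->{features} }) {
            # Copy the feature so the input sequences are left untouched
            push @{ $joined{features} },
                { name  => $f->{name},
                  start => $f->{start} + $offset,
                  end   => $f->{end}   + $offset };
        }
        $joined{seq} .= $s->{seq};
    }
    return \%joined;
}

my $seq1 = { seq => 'A' x 100, features => [ { name => 'A', start => 5,  end => 15 } ] };
my $seq2 = { seq => 'C' x 200, features => [ { name => 'B', start => 20, end => 30 } ] };
my $cat  = cat_plain($seq1, $seq2);
printf "%s: %d..%d\n", @{$_}{qw(name start end)} for @{ $cat->{features} };
# prints "A: 5..15" then "B: 120..130"; the joined sequence is 300 bases long
```
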

> 
> 
> Jan, 
> 
> It would be easy if someone had written a function to do it. Even writing the 
> function is not hard.  I do not think there is any other way than to go 
> through all features, though.
> 
> In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> 
> E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
>      return a new seq, species and other info based on the first seq in array 
> 
> Could you  write it and post to bugzilla?
> 
> 	-Heikki
> 
> 
> On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
>> Hi all,
>>
>> Does anyone know of an easy way to concatenate two sequences, including
>> recalculation of features positions of the second one? E.g.
>>   seq 1 = 100 bp
>>     feature A: 5..15
>>   seq 2 = 200 bp
>>     feature B: 20..30
>>   => concatenated sequence 3 = 300 bp
>>        feature A: 5..15
>>        feature B: 120..130  <<<<<<<<<<<
>>
>> Annotations (features without range) should be transferred as well.
>>
>> Of course, it must be possible to create a blank sequence and work my
>> way through all features, adding them to a new collection of features
>> and stuff. But I was wondering if a simpler technique is possible.
>>
>> Many thanks,
>> Jan Aerts
>> Bioinformatics Department
>> Roslin Institute
>> Roslin, Scotland, UK
>>
>> ---------The obligatory disclaimer--------
>> The information contained in this e-mail (including any attachments) is
>> confidential and is intended for the use of the addressee only.   The
>> opinions expressed within this e-mail (including any attachments) are
>> the opinions of the sender and do not necessarily constitute those of
>> Roslin Institute (Edinburgh) ("the Institute") unless specifically
>> stated by a sender who is duly authorised to do so on behalf of the
>> Institute.
>>
> 
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>   _/  _/  _/  University of Western Cape, South Africa
>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> 

From heikki at sanbi.ac.za  Wed Jan 25 16:11:45 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 23:11:45 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D7AFD9.2020305@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
Message-ID: <200601252311.45582.heikki@sanbi.ac.za>

Thanks Roy!

I'll check the code in tomorrow when I am less sleepy and can go through the 
code in detail. In principle the code looks good. It definitely needs tests. 
If you have written any please do post them.

A few more checks to make sure seq->alphabet is the same in all sequences 
might be a good idea.
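
Such a check could look roughly like the sketch below. It is only an
illustration: check_same_alphabet is a hypothetical name, and the code assumes
nothing more than that each object has an alphabet() method. A tiny stand-in
class is included so the sketch runs without BioPerl installed.

```perl
use strict;
use warnings;

# Tiny stand-in class so the sketch runs without BioPerl installed.
package My::MockSeq;
sub new      { my ($class, $alphabet) = @_; return bless { alphabet => $alphabet }, $class }
sub alphabet { return $_[0]{alphabet} }

package main;

# Hypothetical helper: die unless every object reports the same alphabet.
sub check_same_alphabet {
    my @seqs = @_;
    return unless @seqs;
    my $expected = $seqs[0]->alphabet;
    for my $i (1 .. $#seqs) {
        my $got = $seqs[$i]->alphabet;
        die "sequence $i has alphabet '$got', expected '$expected'\n"
            if $got ne $expected;
    }
    return $expected;
}

my @seqs = map { My::MockSeq->new($_) } qw(dna dna protein);
my $ok = eval { check_same_alphabet(@seqs); 1 };
print $ok ? "alphabets agree\n" : "mismatch: $@";
```

In cat() a check like this would run before any sequence is modified, so a
mixed DNA/protein input is rejected up front.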

   -Heikki

On Wednesday 25 January 2006 19:05, Roy Chaudhuri wrote:
> Hi all.
>
> I also had need of a function to concatenate two Bio::Seq objects, so had a
> go at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted
> below. I'm not too sure about the concept of sub-SeqFeatures (I've never
> seen any sequence that had more than one level of feature)- I worked on the
> assumption that little sub-SeqFeatures can have littler sub-SeqFeatures and
> so ad infinitum, but as I don't have an example file I haven't been able to
> test if this works. Likewise, although I think the code should cope with
> Fuzzy and Split locations, I haven't tested this with any particularly
> unusual examples.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk
>
>
>
> =head2 cat
>
>   Title   : cat
>   Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
>   Function: Concatenates an array of Bio::Seq objects, using the first
> sequence as a template for species etc. Adjusts the coordinates of features
> from any additional objects.
>   Returns : A sequence object of the same class as the first argument.
>   Args    : array of sequence objects
>
>
> =cut
>
> sub cat {
>      my ($self, @seqs) = @_;
>      my $seq=shift @seqs;
>      # Double quotes so that $seq is interpolated into the message
>      $self->throw("Object [$seq] ". 'of class ['. ref($seq).
>                   '] should be a Bio::PrimarySeqI ')
>      unless $seq->isa('Bio::PrimarySeqI');
>      for my $catseq (@seqs) {
>         # Validate the sequence being appended, not the template sequence
>         $self->throw("Object [$catseq] ". 'of class ['. ref($catseq).
>                   '] should be a Bio::PrimarySeqI ')
>         unless $catseq->isa('Bio::PrimarySeqI');
>         my $length=$seq->length;
>         $seq->seq($seq->seq.$catseq->seq);
>         for my $feat ($catseq->get_SeqFeatures) {
>             $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
>         }
>      }
>      return $seq;
> }
>
> =head2 _coordAdjust
>
>   Title   : _coordAdjust
>   Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
>   Function: Recursive subroutine to adjust the coordinates of a feature
>             and all its subfeatures.
>   Returns : A Bio::SeqFeatureI compliant object.
>   Args    : A Bio::SeqFeatureI compliant object,
>             the number of bases to add to the coordinates
>
>
> =cut
>
> sub _coordAdjust {
>      my ($self, $feat, $add)=@_;
>      # Double quotes so that $feat is interpolated into the message
>      $self->throw("Object [$feat] ". 'of class ['. ref($feat).
>                   '] should be a Bio::SeqFeatureI ')
>         unless $feat->isa('Bio::SeqFeatureI');
>      my @adjsubfeat;
>      for my $subfeat ($feat->remove_SeqFeatures) {
>         # Recurse with the arguments in the same order as the signature above
>         push @adjsubfeat, $self->_coordAdjust($subfeat, $add);
>      }
>      my @loc=$feat->location->each_Location;
>      # Shift the start and end of every (sub)location by $add
>      for my $subloc (@loc) {
>         my @coords=($subloc->start, $subloc->end);
>         s/(\d+)/$add+$1/ge for @coords;
>         $subloc->start(shift @coords);
>         $subloc->end(shift @coords);
>      }
>      if (@loc==1) {
>         $feat->location($loc[0])
>      } else {
>         # Needs 'use Bio::Location::Split;' at the top of the module
>         my $loc=Bio::Location::Split->new;
>         $loc->add_sub_Location(@loc);
>         $feat->location($loc);
>      }
>      $feat->add_SeqFeature($_) for @adjsubfeat;
>      return $feat;
> }
>
> > Jan,
> >
> > It would be easy if someone had written a function to do it. Even writing
> > the function is not hard.  I do not think there is any other way than to
> > go through all features, though.
> >
> > In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> >
> > E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
> >      return a new seq, species and other info based on the first seq in
> > array
> >
> > Could you  write it and post to bugzilla?
> >
> > 	-Heikki
> >
> > On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
> >> Hi all,
> >>
> >> Does anyone know of an easy way to concatenate two sequences, including
> >> recalculation of features positions of the second one? E.g.
> >>   seq 1 = 100 bp
> >>     feature A: 5..15
> >>   seq 2 = 200 bp
> >>     feature B: 20..30
> >>   => concatenated sequence 3 = 300 bp
> >>        feature A: 5..15
> >>        feature B: 120..130  <<<<<<<<<<<
> >>
> >> Annotations (features without range) should be transferred as well.
> >>
> >> Of course, it must be possible to create a blank sequence and work my
> >> way through all features, adding them to a new collection of features
> >> and stuff. But I was wondering if a simpler technique is possible.
> >>
> >> Many thanks,
> >> Jan Aerts
> >> Bioinformatics Department
> >> Roslin Institute
> >> Roslin, Scotland, UK
> >>
> >> ---------The obligatory disclaimer--------
> >> The information contained in this e-mail (including any attachments) is
> >> confidential and is intended for the use of the addressee only.   The
> >> opinions expressed within this e-mail (including any attachments) are
> >> the opinions of the sender and do not necessarily constitute those of
> >> Roslin Institute (Edinburgh) ("the Institute") unless specifically
> >> stated by a sender who is duly authorised to do so on behalf of the
> >> Institute.
> >>
> >
> > --
> > ______ _/      _/_____________________________________________________
> >       _/      _/
> >      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
> >     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
> >    _/  _/  _/  SANBI, South African National Bioinformatics Institute
> >   _/  _/  _/  University of Western Cape, South Africa
> >      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> > ___ _/_/_/_/_/________________________________________________________
>

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From heikki at sanbi.ac.za  Wed Jan 25 15:52:42 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 22:52:42 +0200
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
References: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <200601252252.42786.heikki@sanbi.ac.za>

Congratulations and a huge thank you to the production team!

The new website is a big step ahead in readability and ease of editing the 
information.

I for my part have already corrected a few small typos and omissions on the 
new pages. I invite others to do the same.

    -Heikki

On Wednesday 25 January 2006 05:31, Jason Stajich wrote:
> I am pleased to announce the release of a new website for BioPerl.
> The site is based on the mediawiki software that was developed for
> the wikipedia project.  We intend the site to be a place for
> community input on documentation and design for the BioPerl project.
> There is also a fair amount of documentation started surrounding
> bioinformatics tools and techniques applicable to using BioPerl and
> some of the authors who created these resources.
>
> The website continues to be at the URL http://www.bioperl.org.  The
> DNS updates may take up to 24 hours to reach everyone.
>
> The initial content of the site is result of the work of myself,
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We
> encourage you to contribute to the site's content by signing up for
> an account.
>
> There are several guides for style of the site and how to link to
> Modules for example which can contain additional information from the
> POD
> http://bioperl.org/wiki/Module:Bio::SeqIO
>
> You'll notice that many of the paths have changed but the DIST and
> SRC continues to be available at http://bioperl.org/DIST and
> http://bioperl.org/SRC.  The HOWTOs are now available from
> http://bioperl.org/wiki/HOWTOs
>
> The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage
> you to add your questions to it so they can be properly archived and
> addressed.
>
> We also have initiated a News site for Bioperl for posting
> announcements regarding development and software.  I would like to
> see if there are volunteers to post weekly or monthly summaries of
> mailing list traffic and development.
> http://www.bioperl.org/news/
>
>
> Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,
> Torsten Seemann.
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>
>

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From cjfields at uiuc.edu  Wed Jan 25 22:34:01 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 21:34:01 -0600
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <1138119383.3338.68.camel@localhost.localdomain>
Message-ID: <000201c62229$59ed5f50$15327e82@pyrimidine>

Scott,

This popped up, for some reason, when I tried to install a perl module
(Error.pm); maybe it has something to do with the reason PPM can't 'see'
GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
for GMOD, so maybe Sourceforge is redirecting things and this messes with
PPM?  

_____________________________________________
C:\Perl\Scripts>ppm
PPM - Programmer's Package Manager version 3.3.
Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
ActiveState is a division of Sophos.

Entering interactive shell. Using Term::ReadLine::Perl as readline library.

Type 'help' to get started.

ppm> rep
Repositories:
[1] Bioperl
[2] gmod
[3] ActiveState PPM2 Repository
[4] ActiveState Package Repository
[ ] Bribes
[ ] Kobes
[ ] local
ppm> install Error
PPM::PPD::init: not a PPD and not a file:

  The Generic Model Organism Database Project | GMOD

      GMOD

      Generic Software Components for Model
Organism Databases

      Mailing lists |
Bug Reports |
Feature Requests |
Publications |
Meetings |

.... (lots of HTML removed)

This site is maintained by Scott
Cain | Powered by 
drupal

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: gmod-gbrowse-admin at lists.sourceforge.net [mailto:gmod-gbrowse-
> admin at lists.sourceforge.net] On Behalf Of Scott Cain
> Sent: Tuesday, January 24, 2006 10:16 AM
> To: Chris Fields
> Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> 
> Hi Chris,
> 
> Is it still misbehaving?  I'll do some testing today, but my ability to
> do so is a little hampered as I am traveling this week.
> 
> Thanks,
> Scott
> 
> 
> On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > Scott,
> >
> > I am trying to find the newest bioperl dev. release (1.51) from PPM for a
> > quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> > repository:
> >
> > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > Repositories:
> > [1] gmod
> > [ ] ActiveState Package Repository
> > [ ] ActiveState PPM2 Repository
> > [ ] Bioperl
> > [ ] Bribes
> > [ ] Kobes
> > [ ] local
> > ppm> search bioperl
> > Searching in Active Repositories
> > No matches for 'bioperl'; see 'help search'.
> > ppm> search *
> > Searching in Active Repositories
> > No matches for '*'; see 'help search'.
> > ppm>
> >
> >
> > Any idea what's going on?  All other repositories work fine.  I can download
> > it and install locally w/o a problem.  I am running the newest ActivePerl
> > (5.8.7.815), WinXP.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
> > -------------------------------------------------------
> > This SF.net email is sponsored by: Splunk Inc. Do you grep through log
> files
> > for problems?  Stop!  Download the new AJAX search engine that makes
> > searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
> > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
> > _______________________________________________
> > Gmod-gbrowse mailing list
> > Gmod-gbrowse at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
> 

From cjfields at uiuc.edu  Thu Jan 26 00:38:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 23:38:56 -0600
Subject: [Bioperl-l] bioperl-db on Windows (update)
Message-ID: <000001c6223a$cd5539c0$15327e82@pyrimidine>

Hilmar, 

I checked load_seqdatabase.pl against all variants of Root.pm and examined
the debugging output.  Basically, the only way I could find to get
load_seqdatabase.pl to work on native Windows is to change those Root.pm
lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
$class, ...').  I ran load_seqdatabase.pl with debugging on, using all
versions of Root.pm, with and without Error.pm.  Only those with a comma
present worked in both circumstances.  I don't know why this hasn't popped
up before now, but it seems to be specific to the combination of Windows,
load_seqdatabase.pl, and bioperl-db.  It doesn't happen with any other
Bioperl scripts I've run on Windows, and debugging other modules (for
instance, Bio::SearchIO::blast, which I recently worked on) doesn't cause
this problem.

Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
and without modifying Root.pm.

____________________________________________________________

Without Error.pm:
____________________________________________________________
C:\Perl\Scripts>perl -MError
Can't locate Error.pm in @INC (@INC contains:
C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib
C:/Perl/site/lib .).
BEGIN failed--compilation aborted.

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.
____________________________________________________________

With Error.pm:
____________________________________________________________

C:\Perl\Scripts>perl -MError -e ";"

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
  Calling Error::throw

  Calling Error::throw

attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
  Calling Error::throw

attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
  Calling Error::throw

  Calling Error::throw

Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.

____________________________________________________________

Error::throw is called w/o a problem when Error.pm is present (which is what
should happen).  For some reason, that extra comma makes all the difference
in the world.

The line in BasePersistenceAdaptor.pm named in the error above (line 1537) is:

$self->debug("attempting to load driver for adaptor class $class\n");

which is found in many modules.  I don't really know why it decides to hang
up here.  I'll try running a few of the Root.pm modifications under Mac OS X
in the next day or so to see what happens.

I also reran a few of Steve Chervitz's recommendations from a previous post;
everything ran fine except in circumstances in which Error.pm was required
with a 'use' statement, and only when Error.pm wasn't present, which is
expected.  Previously, when I ran them, there was a bit of confusion b/c it
seemed that Error.pm was present somewhere.  It was; Steve included it in
bioperl-live/examples/root/lib.  When I deleted it, I got the expected
results.

Anyway, I don't know what else I can do at this point besides check out
everything on Mac OS X.  Any additional checks of the modified Root.pm need
to be made on other systems.  Will filing this as a bug in Bugzilla help?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From s.rayner at att.net  Thu Jan 26 00:58:42 2006
From: s.rayner at att.net (s.rayner at att.net)
Date: Thu, 26 Jan 2006 05:58:42 +0000
Subject: [Bioperl-l] bioperl installation problems with External Modules -
	doesn't see installed modules
Message-ID: <012620060558.15437.43D865110008848F00003C4D21602806519D0A02970E9DD29C@att.net>

I am trying to install the bioperl bundle so that I can use some of the external Perl modules,
in particular the Bio::DB::GFF module for use with biodas.

I followed the instructions from the bioperl web site for installing the bioperl bundle, and also the specific instructions from the biodas web site for installing Bio::DB::GFF, namely:

   (1) Make sure that CVS is installed on your system.

    (2) Use the following command (all on one line) to login to the server

         % cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl login

          when prompted, the password is 'cvs'

    (3) Check out the bioperl package you are interested in, for most
    users this will be the bioperl-live source tree.  The following
    command should be executed as one line.

         % cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl checkout bioperl-live

    The login and checkout procedure should only have to be done
    once. To update the source directories in the future it should be
    possible just to enter the top level directory and issue the
    following command:

         % cvs update

This will create the directory ``bioperl-live''. Now build and install bioperl with the following recipe:

         % cd bioperl-live
         % perl Makefile.PL
         % make
         % make test
         % make install

The last step will probably need to be run as root.

When I perform either of these steps I get the message that the installation was successful, but bioperl and biodas return a message that the modules have not been installed.

They are physically present on the disk, but the programs don't seem to know where to find them.

Can anyone suggest how to fix this problem?

thanks

Simon
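
A report like the one above (modules present on disk but not found at run time) can often be narrowed down by asking Perl which directories it searches and whether the module file sits in any of them. The following is only a minimal sketch using core Perl; the helper name find_module_in_inc is made up for illustration and is not part of BioPerl or biodas.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not a BioPerl function): return the first path
# among the search directories that holds the file for $module, or
# nothing if no directory has it.  Defaults to searching @INC.
sub find_module_in_inc {
    my ($module, @dirs) = @_;
    @dirs = @INC unless @dirs;

    # Bio::DB::GFF -> ('Bio', 'DB', 'GFF.pm')
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';

    for my $dir (@dirs) {
        next if ref $dir;    # skip @INC hooks (code refs, objects)
        my $path = File::Spec->catfile($dir, @parts);
        return $path if -f $path;
    }
    return;
}

# Report where (if anywhere) each module would be loaded from.
for my $module (qw(Bio::Perl Bio::DB::GFF)) {
    my $path = find_module_in_inc($module);
    print "$module: ", (defined $path ? $path : 'not found in @INC'), "\n";
}
```

If a module is reported as not found although the files exist, the install location is probably outside @INC; adding that directory through the PERL5LIB environment variable or a "use lib" line is the usual remedy.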

From heikki at sanbi.ac.za  Thu Jan 26 02:53:22 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 26 Jan 2006 09:53:22 +0200
Subject: [Bioperl-l] Fwd: some doubts in bioperl
Message-ID: <200601260953.22923.heikki@sanbi.ac.za>

----------  Forwarded Message  ----------

Subject: some doubts in bioperl
Date: Monday 23 January 2006 10:16
From: apsara asok 
To: heikki at sanbi.ac.za

Dear Heikki,
I would like to clear up some doubts about bioperl. How can we do
pattern searching using a suffix tree in bioperl?
Do you have any idea? Please help me.
apsara

-------------------------------------------------------

From roy at colibase.bham.ac.uk  Thu Jan 26 08:18:03 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Thu, 26 Jan 2006 13:18:03 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601252311.45582.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
	<200601252311.45582.heikki@sanbi.ac.za>
Message-ID: <43D8CC0B.10403@colibase.bham.ac.uk>

Heikki Lehvaslaiho wrote:
> Thanks Roy!
> 
> I'll check the code in tomorrow when I am less sleepy and can go through the 
> code in detail. In principle the code looks good. It definitely needs tests. 
> If you have written any please do post them.
Not too sure about how to go about writing tests, any suggestions?

It did occur to me that my _coordAdjust method could be adapted to allow 
the Bio::Seq trunc method to retain sequence features (since there's no 
reason why the $add argument can't be negative). This would probably 
need a bit more work to cope with the situation where a feature overlaps 
the trunc coordinates, for example if we truncate to coordinates 1..400, 
but there's a feature 300..500. I guess the 'correct' behaviour might be 
to convert that feature to a fuzzy location of 300..>400? Or is it 
acceptable to have features with coordinates outside of a sequence?

If we did that then an obvious test would be to cat a sequence to 
itself, then trunc to retain just the second half of the new sequence 
and see if you got back what you started with.
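
The round-trip idea can be sketched without BioPerl at all. What follows is only an illustration of the coordinate arithmetic, using plain hashes as stand-ins for sequences and features; the names shift_features, cat_seqs and trunc_seq are invented for this sketch and are not the Bio::Seq API or the _coordAdjust method itself. Features that overlap the truncation window are simply dropped here, which leaves the fuzzy-location question open.

```perl
use strict;
use warnings;

# Shift every feature by $add, which may be negative.
sub shift_features {
    my ($add, @features) = @_;
    return map { +{ start => $_->{start} + $add, end => $_->{end} + $add } } @features;
}

# Concatenate two sequences; features of the second are shifted by
# the length of the first.
sub cat_seqs {
    my ($first, $second) = @_;
    return {
        seq      => $first->{seq} . $second->{seq},
        features => [
            @{ $first->{features} },
            shift_features(length($first->{seq}), @{ $second->{features} }),
        ],
    };
}

# Truncate to 1-based inclusive coordinates $from..$to, keeping only
# features that lie entirely inside the window.
sub trunc_seq {
    my ($s, $from, $to) = @_;
    my @kept = grep { $_->{start} >= $from && $_->{end} <= $to } @{ $s->{features} };
    return {
        seq      => substr($s->{seq}, $from - 1, $to - $from + 1),
        features => [ shift_features(-($from - 1), @kept) ],
    };
}

# Cat a sequence to itself, then keep only the second half.
my $orig   = { seq => 'ACGTACGTAC', features => [ { start => 2, end => 5 } ] };
my $double = cat_seqs($orig, $orig);
my $half   = trunc_seq($double, 11, 20);
print "$half->{seq} $half->{features}[0]{start}..$half->{features}[0]{end}\n";
```

If the second half comes back with the same sequence and the same feature coordinates as the input, the positive and negative shifts cancel as expected.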

> A few more checks to make sure seq->alphabet is the same in all sequences 
> might be a good idea.
That's easy to implement. Just put the line:
	$self->throw('Trying to concatenate sequences with different alphabets: '
	    . $seq->display_id . ' (' . $seq->alphabet . ') and '
	    . $_->display_id . ' (' . $_->alphabet . ')')
	    unless $_->alphabet eq $seq->alphabet;

at the start of the for(@seqs) loop of the cat subroutine.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

From hlapp at gmx.net  Thu Jan 26 01:31:43 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 25 Jan 2006 22:31:43 -0800
Subject: [Bioperl-l] bioperl-db on Windows (update)
In-Reply-To: <000001c6223a$cd5539c0$15327e82@pyrimidine>
References: <000001c6223a$cd5539c0$15327e82@pyrimidine>
Message-ID: 

This is a lot of work you did to investigate this, Chris, thanks. Yes,
filing as a bug report will help, and don't forget to attach this
report of yours with all the tests you did. Really all that's left to
do is test on a couple of Unix platforms, which will happen
semi-automatically by people once we commit the change.

   -hilmar


--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From cain at cshl.edu  Thu Jan 26 10:41:20 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu, 26 Jan 2006 10:41:20 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <000201c62229$59ed5f50$15327e82@pyrimidine>
References: <000201c62229$59ed5f50$15327e82@pyrimidine>
Message-ID: <1138290080.2894.25.camel@localhost.localdomain>

Hi Chris,

I still don't exactly know what the problem is, but this at least has
given me some insight on some messages in my error_log: I've been seeing
lots of messages about '/icon/somegif.gif' not found and haven't been
able to track down their source (not that I'd really tried, it was an
annoyance that hadn't risen to the level of serious debugging yet).  We
are using mod_rewrite, so that could be part of the problem.  I'll try
to fix it so that the icons display properly and that may have a side
effect of fixing ppm.

Scott

On Wed, 2006-01-25 at 21:34 -0600, Chris Fields wrote:
> Scott,
> 
> This popped up, for some reason, when I tried to install a perl module
> (Error.pm); maybe it has something to do with the reason PPM can't 'see'
> GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
> for GMOD, so maybe Sourceforge is redirecting things and this messes with
> PPM?  
> 
> _____________________________________________
> C:\Perl\Scripts>ppm
> PPM - Programmer's Package Manager version 3.3.
> Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
> ActiveState is a division of Sophos.
> 
> Entering interactive shell. Using Term::ReadLine::Perl as readline library.
> 
> Type 'help' to get started.
> 
> ppm> rep
> Repositories:
> [1] Bioperl
> [2] gmod
> [3] ActiveState PPM2 Repository
> [4] ActiveState Package Repository
> [ ] Bribes
> [ ] Kobes
> [ ] local
> ppm> install Error
> PPM::PPD::init: not a PPD and not a file:
> 
>   The Generic Model Organism Database Project | GMOD
> 
>       GMOD
> 
>       Generic Software Components for Model
> Organism Databases
> 
>       Mailing lists |
> Bug Reports |
> Feature Requests |
> Publications |
> Meetings |
> 
> .... (lots of HTML removed)
> 
> This site is maintained by Scott
> Cain | Powered by 
> drupal
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From rbalbi at gmail.com  Thu Jan 26 13:19:57 2006
From: rbalbi at gmail.com (Ricardo Balbi)
Date: Thu, 26 Jan 2006 16:19:57 -0200
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Hi all,

   Could anybody help me with this error?

thanks in advance,
Ricardo

2006/1/26, Aaron J. Mackey :
>
>
> This is a BioPerl "Unflattener" error; it's unable to automatically
> reconstruct the gene/mRNA/exon logic of some (or all) of the
> annotation in your genbank file.  To get help with this, you should
> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
> including a snippet of your genbank file.
>
> -Aaron
>
> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>
> > Hi all,
> >
> >   After making some changes in the gus mapping file to ignore some
> > features of the kinetoplastida database, I followed in the
> > execution of the ISF, however without success.
> >
> >   Somebody could help me with this error?
> >
> > thanks in advance,
> > Ricardo
> >
> > ERROR:
> >
> > ------------- EXCEPTION  -------------
> > MSG: structure_type 2 is currently unknown
> > STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm:353
> > STACK
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees /G
> > US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
> > 720
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
> > STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
> > 549
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
> > lib/perl/GUS/PluginMgr/GusApplication.pm:459
> > STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:357
> > STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:266
> > STACK toplevel /GUS/gus_home/bin/ga:11
> >
> > --------------------------------------
> >
> > STACK TRACE:
> >  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
> >         Bio::Root::Root::throw
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
> > 'structure_type 2 is currently unknown') called at /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
> >         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm line 353
> >         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
> > perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
> >
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
> > line 330
> >         GUS::Supported::Plugin::InsertSequenceFeatures::run
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 549
> >         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 541
> >         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
> >         GUS::PluginMgr::GusApplication::doMajorMode_Run
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
> >         GUS::PluginMgr::GusApplication::doMajorMode
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
> >         GUS::PluginMgr::GusApplication::parseAndRun
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
> > (0xa004738)') called at /GUS/gus_home/bin/ga line 11
> >
> >
> >
>
> --
> Dr. Aaron J. Mackey, Ph.D.
> Project Manager, ApiDB Bioinformatics Resource Center
> Penn Genomics Institute, University of Pennsylvania
> email:  amackey at pcbi.upenn.edu
> office: 215-898-1205 (Biology, 212 Goddard Labs)
>          215-746-7018 (PCBI, 1428 Blockley Hall)
> fax:    215-746-6697 (Penn Genomics Institute)
> postal: Penn Genomics Institute
>          Goddard Labs 212
>          415 S. University Avenue
>          Philadelphia, PA  19104-6017
>
>
>

From jason.stajich at duke.edu  Thu Jan 26 14:28:26 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu, 26 Jan 2006 14:28:26 -0500
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <2A475D24-5AC3-4AD5-80CB-0C40DB622283@duke.edu>

I would suggest following Aaron's instructions to

>> including a snippet of your genbank file.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjm at fruitfly.org  Thu Jan 26 14:33:46 2006
From: cjm at fruitfly.org (chris mungall)
Date: Thu, 26 Jan 2006 11:33:46 -0800
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Sorry for the uninformative error message.

The unflattener uses a collection of heuristics to infer a canonical 
gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style 
file. Due to the highly variable nature of some genbank records this 
isn't always possible, and some data massaging is required beforehand. 
I don't know what the context of this message is, but I presume you're 
aware of this from the docs.

The only time I've seen this before was with the genbank submission of 
the pombe genome, which has some very.. unusual features purportedly of 
type mRNA; the actual gene models are encoded using 'gene' and 'CDS' 
features. This confuses the heuristics a little. The only way I've been 
able to deal with this one was to manually remove the mRNA features 
(they appeared to be just fragments and not actual gene models) using 
$unf->remove_types(['mRNA']) beforehand.
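
A minimal sketch of that workaround, assuming a GenBank file named on the command line. To avoid guessing at the exact calling convention of remove_types, it filters features with the generic Bio::Seq methods remove_SeqFeatures and add_SeqFeature instead. It also wraps unflatten_seq in eval so that a failing record is reported by accession rather than ending the run. The helper names strip_feature_type and unflatten_file are hypothetical; the unflatten_seq arguments are the ones visible in the stack trace in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::SeqFeature::Tools::Unflattener;

# Hypothetical helper: drop every top-level feature whose primary tag
# equals $type and keep the rest in their original order.
sub strip_feature_type {
    my ($seq, $type) = @_;
    my @all  = $seq->remove_SeqFeatures;    # detaches and returns all features
    my @kept = grep { $_->primary_tag ne $type } @all;
    $seq->add_SeqFeature($_) for @kept;
    return scalar(@all) - scalar(@kept);    # number of features dropped
}

# Hypothetical driver: unflatten every record and report failures by accession.
sub unflatten_file {
    my ($file) = @_;
    my $in  = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    my $unf = Bio::SeqFeature::Tools::Unflattener->new;

    while (my $seq = $in->next_seq) {
        my $dropped = strip_feature_type($seq, 'mRNA');
        # Same arguments as the call shown in the stack trace.
        my $ok = eval { $unf->unflatten_seq(-seq => $seq, -use_magic => 1); 1 };
        if ($ok) {
            printf "%s: unflattened (%d mRNA features dropped)\n",
                $seq->accession_number, $dropped;
        }
        else {
            warn sprintf("%s: unflatten failed: %s\n", $seq->accession_number, $@);
        }
    }
}

unflatten_file($ARGV[0]) if @ARGV;
```

Dropping mRNA features is only appropriate when, as in the pombe case, they are fragments rather than real gene models.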

Can you send the accession of the record you're trying this on (or 
email me the file off-list if it isn't too large). I'll try and get a 
more informative error message in there.

On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:

> Hi all,
>
>    Anybody could help me with this error ?
>
> thanks in advance,
> Ricardo
>
> 2006/1/26, Aaron J. Mackey :
>>
>>
>> This is a BioPerl "Unflattener" error; it's unable to automatically
>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>> annotation in your genbank file.  To get help with this, you should
>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>> including a snippet of your genbank file.
>>
>> -Aaron
>>
>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>
>>> Hi all,
>>>
>>>   After making some changes in the gus mapping file to ignore some
>>> features of the kinetoplastida database, I followed in the
>>> execution of the ISF, however without success.
>>>
>>>   Somebody could help me with this error?
>>>
>>> thanks in advance,
>>> Ricardo
>>>
>>> ERROR:
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: structure_type 2 is currently unknown
>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>> InsertSequenceFeatures.pm:353
>>> STACK
>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees 
>>> /G
>>> US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
>>> 720
>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
>>> 549
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
>>> lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
>>> perl/GUS/PluginMgr/GusApplication.pm:357
>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
>>> perl/GUS/PluginMgr/GusApplication.pm:266
>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>
>>> --------------------------------------
>>>
>>> STACK TRACE:
>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>         Bio::Root::Root::throw
>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>> InsertSequenceFeatures.pm line 353
>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>
>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>> line 330
>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>> GusApplication.pm line 549
>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>> GusApplication.pm line 541
>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>
>>>
>>>
>>
>> --
>> Dr. Aaron J. Mackey, Ph.D.
>> Project Manager, ApiDB Bioinformatics Resource Center
>> Penn Genomics Institute, University of Pennsylvania
>> email:  amackey at pcbi.upenn.edu
>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>> fax:    215-746-6697 (Penn Genomics Institute)
>> postal: Penn Genomics Institute
>>          Goddard Labs 212
>>          415 S. University Avenue
>>          Philadelphia, PA  19104-6017
>>
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From heikki at sanbi.ac.za  Fri Jan 27 05:06:52 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Fri, 27 Jan 2006 12:06:52 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D8CC0B.10403@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
Message-ID: <200601271206.52875.heikki@sanbi.ac.za>

On Thursday 26 January 2006 15:18, Roy Chaudhuri wrote:
> Heikki Lehvaslaiho wrote:
> > Thanks Roy!
> >
> > I'll check the code in tomorrow when I am less sleepy and can go through
> > the code in detail. In principle the code looks good. It definitely needs
> > tests. If you have written any please do post them.
>
> Not too sure about how to go about writing tests, any suggestions?

I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
methods and a reasonable portion of all edge values to be sure that the 
method works as it should.

Note that the code does not create a new sequence object. It modifies the 
existing one. Therefore it is best not to return that object. The users would 
assign that to a variable that points to the same structure and get confused. 
The method now returns true upon completion.
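
To illustrate the confusion, here is a small sketch using a toy class (ToySeq and cat_in_place are hypothetical stand-ins, not BioPerl code). It shows that after an in-place concatenation any second variable holding the object is only another reference to the same data.

```perl
use strict;
use warnings;

# Toy stand-in for a sequence object; not a BioPerl class.
package ToySeq;
sub new { my ($class, $residues) = @_; return bless { seq => $residues }, $class }
sub seq { my $self = shift; $self->{seq} = shift if @_; return $self->{seq} }

package main;

# Appends to the first argument in place and returns true,
# mirroring the behaviour described in this message.
sub cat_in_place {
    my ($target, @others) = @_;
    $target->seq($target->seq . $_->seq) for @others;
    return 1;
}

my $first  = ToySeq->new('AAAA');
my $second = ToySeq->new('CCCC');

cat_in_place($first, $second);
print $first->seq, "\n";    # AAAACCCC: the target itself has grown

# Had the method returned $target, the caller's "new" variable would
# only be a second reference to the same object:
my $alias = $first;
$alias->seq('GG');
print $first->seq, "\n";    # GG: changed through the alias as well
```

Returning a plain true value makes it harder to mistake the result for an independent copy.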

Creating a new sequence object is problematic because one needs to add one 
more dependency (e.g. Clone) and will not work anyway if the sequence 
implementation is using a database back end. It is better the way you have 
written it.

I added code to move over the annotations from secondary sequences, but did 
not do anything to remove duplicates if the same sequence gets added twice. I 
wrote a note about this so that users know to be prepared if that affects 
them.

> It did occur to me that my _coordAdjust method could be adapted to allow
> the Bio::Seq trunc method to retain sequence features (since there's no
> reason why the $add argument can't be negative). This would probably
> need a bit more work to cope with the situation where a feature overlaps
> the trunc coordinates, for example if we truncate to coordinates 1..400,
> but there's a feature 300..500. I guess the 'correct' behaviour might be
> to convert that feature to a fuzzy location of 300..>400? Or is it
> acceptable to have features with coordinates outside of a sequence?

No, feature coordinates should always be within the sequence. Fuzzy is the 
correct solution to this.
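
A rough sketch of that clipping rule in plain Perl, assuming 1-based inclusive coordinates and GenBank-style '<' and '>' markers for the fuzzy ends. The function name clip_feature is hypothetical; a real implementation would presumably build BioPerl location objects rather than strings.

```perl
use strict;
use warnings;

# Clip a feature (start..end) to a truncation window (from..to) and express
# the result in the coordinates of the truncated sequence. Ends cut off by
# the window are marked fuzzy with '<' or '>'. Returns nothing (undef in
# scalar context) when the feature lies entirely outside the window.
sub clip_feature {
    my ($start, $end, $from, $to) = @_;
    return if $end < $from or $start > $to;

    my $offset    = $from - 1;    # bases removed on the left
    my $new_start = $start < $from ? '<1' : $start - $offset;
    my $new_end   = $end > $to ? '>' . ($to - $offset) : $end - $offset;
    return "$new_start..$new_end";
}

print clip_feature(300, 500, 1,   400), "\n";    # 300..>400
print clip_feature(50,  150, 101, 400), "\n";    # <1..50
print clip_feature(120, 180, 101, 400), "\n";    # 20..80
```

The first call is the case from the message: truncating to 1..400 with a feature at 300..500.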

> If we did that then an obvious test would be to cat a sequence to
> itself, then trunc to retain just the second half of the new sequence
> and see if you got back what you started with.

Go ahead and try it!
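
The round trip could be tested along these lines. This is only a sketch with Test::More on a toy data model (hashes holding a residue string and [start, end] pairs); toy_cat and toy_trunc are hypothetical stand-ins for the real cat and a feature-aware trunc.

```perl
use strict;
use warnings;
use Test::More;

# Toy model, not BioPerl: a sequence is a hash with a residue string and a
# list of [start, end] feature coordinates (1-based, inclusive).
sub toy_cat {
    my ($x, $y) = @_;
    my $shift = length $x->{seq};
    return {
        seq      => $x->{seq} . $y->{seq},
        features => [
            @{ $x->{features} },
            map { [ $_->[0] + $shift, $_->[1] + $shift ] } @{ $y->{features} },
        ],
    };
}

# Keep only features fully inside from..to and shift them back.
sub toy_trunc {
    my ($s, $from, $to) = @_;
    my $offset = $from - 1;
    return {
        seq      => substr($s->{seq}, $offset, $to - $offset),
        features => [
            map  { [ $_->[0] - $offset, $_->[1] - $offset ] }
            grep { $_->[0] >= $from and $_->[1] <= $to } @{ $s->{features} }
        ],
    };
}

my $orig    = { seq => 'ATGCATGCAT', features => [ [ 1, 3 ], [ 4, 10 ] ] };
my $doubled = toy_cat($orig, $orig);
my $len     = length $orig->{seq};
my $second  = toy_trunc($doubled, $len + 1, 2 * $len);

is(length($doubled->{seq}), 2 * $len, 'concatenation doubles the length');
is_deeply($second, $orig, 'second half equals the original');
done_testing();
```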

> > A few more checks to make sure seq->alphabet is the same in all sequences
> > might be a good idea.
>
> That's easy to implement. Just put the line:
> 	$self->throw('Trying to concatenate sequences with different alphabets:
> '.$seq->display_id.' ('.$seq->alphabet.') and ' .$_->display_id.'
> ('.$_->alphabet.')') unless $_->alphabet eq $seq->alphabet;
>
> at the start of the for(@seqs) loop of the cat subroutine.

Added.

Thanks,

	-Heikki
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From wlhsiao at yahoo.ca  Fri Jan 27 05:37:23 2006
From: wlhsiao at yahoo.ca (William Hsiao)
Date: Fri, 27 Jan 2006 05:37:23 -0500 (EST)
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>

Hi Jason,
  Nice new site!  I am wondering if I missed an
obvious link to the module documentations (e.g
http://doc.bioperl.org/releases/bioperl-1.4/) from the
homepage?  It seems that is the one thing missing from
the old website setup and I am not sure if it's
intentional.  I am developing a set of lecture notes
for a workshop and would like to know if there is a
stable way to navigate to the module documentations.

Thanks

Cheers,

Will

--- Jason Stajich  wrote:

> I am pleased to announce the release of a new
> website for BioPerl.   
> The site is based on the mediawiki software that was
> developed for  
> the wikipedia project.  We intend the site to be a
> place for  
> community input on documentation and design for the
> BioPerl project.   
> There is also a fair amount of documentation started
> surrounding  
> bioinformatics tools and techniques applicable to
> using BioPerl and  
> some of the authors who created these resources.
> 
> The website continues to be at the URL
> http://www.bioperl.org.  The  
> DNS updates may take up to 24 hours to reach
> everyone.
> 
> The initial content of the site is result of the
> work of myself,  
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
> Seemann.  We  
> encourage you to contribute to the site's content by
> signing up for  
> an account.
> 
> There are several guides for style of the site and
> how to link to  
> Modules for example which can contain additional
> information from the  
> POD
> http://bioperl.org/wiki/Module:Bio::SeqIO
> 
> You'll notice that many of the paths have changed
> but the DIST and  
> SRC continues to be available at
> http://bioperl.org/DIST and http:// 
> bioperl.org/SRC.  The HOWTOs are now available from
> http:// 
> bioperl.org/wiki/HOWTOs
> 
> The FAQ is available at http://bioperl.org/wiki/FAQ
> and I encourage  
> you to add your questions to it so they can be
> properly archived and  
> addressed.
> 
> We also have initiated a News site for Bioperl for
> posting  
> announcements regarding development and software.  I
> would like to  
> see if there are volunteers to post weekly or
> monthly summaries of  
> mailing list traffic and development.
> http://www.bioperl.org/news/
> 
> 
> Jason Stajich on behalf of Mauricio Herrera Cuadra,
> Brian Osborne,  
> Torsten Seemann.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 

From jason.stajich at duke.edu  Fri Jan 27 08:28:52 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 08:28:52 -0500
Subject: [Bioperl-l] new website launched
In-Reply-To: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
References: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
Message-ID: 

Each module is directly linked to that site in the module-level  
pages, see for example:
http://bioperl.org/wiki/Module:Bio::SearchIO

I've added a mention of the doc.bioperl site on the front page.

Note that as part of setting up the site I ensured that there is now  
a standardized URL for the nightly generated Pdoc pages (from CVS  
live) (thanks to Steve Chervitz for suggesting it).

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/
http://doc.bioperl.org/releases/bioperl-current/bioperl-run/
http://doc.bioperl.org/releases/bioperl-current/bioperl-ext/
....etc

The frozen release-based docs will continue to stay up - I never had  
time to make one for the bioperl-1.5.1 but hopefully will do it for  
bioperl 1.5.2 and obviously will make it for the next stable release  
(1.6).

We encourage people to add snippets of code using modules,  
complaints, workarounds, etc on the module pages on the wiki site.   
There is a "discussion" paired for each wiki page where we would  
suggest people put comments, while useful workarounds/example code  
should go on the main page.  I've just added some text about this to  
the "About this site" page.

-jason
On Jan 27, 2006, at 5:37 AM, William Hsiao wrote:

> Hi Jason,
>   Nice new site!  I am wondering if I missed an
> obvious link to the module documentations (e.g
> http://doc.bioperl.org/releases/bioperl-1.4/) from the
> homepage?  It seems that is the one thing missing from
> the old website setup and I am not sure if it's
> intentional.  I am developing a set of lecture notes
> for a workshop and would like to know if there is a
> stable way to navigate to the module documentations.
>
> Thanks
>
> Cheers,
>
> Will
>
>
> --- Jason Stajich  wrote:
>
>> I am pleased to announce the release of a new
>> website for BioPerl.
>> The site is based on the mediawiki software that was
>> developed for
>> the wikipedia project.  We intend the site to be a
>> place for
>> community input on documentation and design for the
>> BioPerl project.
>> There is also a fair amount of documentation started
>> surrounding
>> bioinformatics tools and techniques applicable to
>> using BioPerl and
>> some of the authors who created these resources.
>>
>> The website continues to be at the URL
>> http://www.bioperl.org.  The
>> DNS updates may take up to 24 hours to reach
>> everyone.
>>
>> The initial content of the site is result of the
>> work of myself,
>> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
>> Seemann.  We
>> encourage you to contribute to the site's content by
>> signing up for
>> an account.
>>
>> There are several guides for style of the site and
>> how to link to
>> Modules for example which can contain additional
>> information from the
>> POD
>> http://bioperl.org/wiki/Module:Bio::SeqIO
>>
>> You'll notice that many of the paths have changed
>> but the DIST and
>> SRC continues to be available at
>> http://bioperl.org/DIST and http://
>> bioperl.org/SRC.  The HOWTOs are now available from
>> http://
>> bioperl.org/wiki/HOWTOs
>>
>> The FAQ is available at http://bioperl.org/wiki/FAQ
>> and I encourage
>> you to add your questions to it so they can be
>> properly archived and
>> addressed.
>>
>> We also have initiated a News site for Bioperl for
>> posting
>> announcements regarding development and software.  I
>> would like to
>> see if there are volunteers to post weekly or
>> monthly summaries of
>> mailing list traffic and development.
>> http://www.bioperl.org/news/
>>
>>
>> Jason Stajich on behalf of Mauricio Herrera Cuadra,
>> Brian Osborne,
>> Torsten Seemann.
>>
>> --
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From davila at fiocruz.br  Thu Jan 26 19:05:35 2006
From: davila at fiocruz.br (Alberto M. R. =?iso-8859-1?Q?D=E1vila?=)
Date: Thu, 26 Jan 2006 22:05:35 -0200 (BRST)
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <3197.201.17.105.240.1138320335.squirrel@www.redefiocruz.fiocruz.br>

Dear Chris,

Happy 2006 !

I am not sure about the exact record the ISF plugin was trying to
read/parse, but I think it is the first one, anyway I am listing the first
5 GIs of our file for your testing:

85539529
56130985
54300415
54288810
50604596

The whole file is really big (1.4GB) as it contains all the nucleotide
sequences of "kinetoplastida [organism]" from genbank in genbank format.

Hope you can catch "the bug" ;-)

Kindest regards, Alberto

>
> Sorry for the uninformative error message.
>
> The unflattener uses a collection of heuristics to infer a canonical
> gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style
> file. Due to the highly variable nature of some genbank records this
> isn't always possible, and some data massaging is required beforehand.
> I don't know what the context of this message is, but I presume you're
> aware of this from the docs.
>
> The only time I've seen this before was with the genbank submission of
> the pombe genome, which has some very.. unusual features purportedly of
> type mRNA; the actual gene models are encoded using 'gene' and 'CDS'
> features. This confuses the heuristics a little. The only way I've been
> able to deal with this one was to manually remove the mRNA features
> (they appeared to be just fragments and not actual gene models) using
> $unf->remove_types(['mRNA']) beforehand.
>
> Can you send the accession of the record you're trying this on (or
> email me the file off-list if it isn't too large). I'll try and get a
> more informative error message in there.
>
> On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:
>
>> Hi all,
>>
>>    Anybody could help me with this error ?
>>
>> thanks in advance,
>> Ricardo
>>
>> 2006/1/26, Aaron J. Mackey :
>>>
>>>
>>> This is a BioPerl "Unflattener" error; it's unable to automatically
>>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>>> annotation in your genbank file.  To get help with this, you should
>>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>>> including a snippet of your genbank file.
>>>
>>> -Aaron
>>>
>>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>>
>>>> Hi all,
>>>>
>>>>   After making some changes in the gus mapping file to ignore some
>>>> features of the kinetoplastida database, I followed in the
>>>> execution of the ISF, however without success.
>>>>
>>>>   Somebody could help me with this error?
>>>>
>>>> thanks in advance,
>>>> Ricardo
>>>>
>>>> ERROR:
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: structure_type 2 is currently unknown
>>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm:353
>>>> STACK
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> /G
>>>> US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
>>>> 720
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
>>>> 549
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
>>>> lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:357
>>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:266
>>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>>
>>>> --------------------------------------
>>>>
>>>> STACK TRACE:
>>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>>         Bio::Root::Root::throw
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm line 353
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>>
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>>> line 330
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 549
>>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 541
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>>
>>>>
>>>>
>>>
>>> --
>>> Dr. Aaron J. Mackey, Ph.D.
>>> Project Manager, ApiDB Bioinformatics Resource Center
>>> Penn Genomics Institute, University of Pennsylvania
>>> email:  amackey at pcbi.upenn.edu
>>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>>> fax:    215-746-6697 (Penn Genomics Institute)
>>> postal: Penn Genomics Institute
>>>          Goddard Labs 212
>>>          415 S. University Avenue
>>>          Philadelphia, PA  19104-6017
>>>

From roy at colibase.bham.ac.uk  Fri Jan 27 10:31:50 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Fri, 27 Jan 2006 15:31:50 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601271206.52875.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
	<200601271206.52875.heikki@sanbi.ac.za>
Message-ID: <43DA3CE6.4020708@colibase.bham.ac.uk>

> I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
> methods and a reasonable portion of all edge values to be sure that the 
> method works as it should.
Cool, thanks for that. My first proper contribution to BioPerl 8^).
The tests look good- I'll know better for next time.

> Note that the code does not create a new sequence object. It modifies the 
> existing one. Therefore it is best not to return that object. The users would 
> assign that to a variable that points to the same structure and get confused. 
> The method now returns true upon completion.
> 
> Creating a new sequence object is problematic because one needs to add one 
> more dependency (e.g. Clone) and will not work anyway if the sequence 
> implementation is using a database back end. It is better the way you have 
> written it.
Yes, that makes sense. Although with that interface it might be more 
natural in Bio::Seq? If it is a method that will modify a sequence in 
place then it seems more intuitive to call $seq->cat(@seqs) [or even 
$seq->append(@seqs)] rather than Bio::SeqUtils->cat($seq, @seqs).

> I added code to move over the annotations from secondary sequences, but did 
> not do anything to remove duplicates if the same sequence gets added twice. I 
> wrote a note about this so that users know to be prepared if that affects 
> them.
I'm not convinced about this- perhaps it should be optional? In practice 
many of the annotations for each subsequence are only going to be 
applicable to that sequence, not the concatenated whole. Some of them 
may also be duplicated even between non-identical sequences. I think 
it'd be better by default to keep just the annotation from the first 
sequence (which probably would still need to be changed, but could at 
least act as a placeholder).

There were a couple of problems with renamed variables/subroutines that 
hadn't all been updated, I've fixed those and pasted the new version below.

> No, feature coordinates should always be within the sequence. Fuzzy is the 
> correct solution to this.
Okay, I'll have a go and let you know how I get on.

Cheers.
Roy.

--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

   Title   : cat
   Usage   : Bio::SeqUtils->cat(@seqs); my $catseq = $seqs[0];
   Function: Concatenates an array of Bio::Seq objects, using the first
             sequence as a target. Annotations and sequence features are
             copied over from any additional objects. Adjusts the
             coordinates of copied features.
   Returns : a boolean
   Args    : array of sequence objects

-
Note that annotations have no sequence region. If you concatenate the
same sequence more than once, you will have its annotations
duplicated.

=cut

sub cat {
     my ($self, $seq, @seqs) = @_;
     $self->throw('Object [$seq] '. 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
         unless $seq->isa('Bio::PrimarySeqI');

     for my $catseq (@seqs) {
         $self->throw('Object [$catseq] '. 'of class ['. ref($catseq).
                      '] should be a Bio::PrimarySeqI ')
             unless $catseq->isa('Bio::PrimarySeqI');

         $self->throw('Trying to concatenate sequences with different alphabets: '.
                      $seq->display_id. '('. $seq->alphabet. ') and '.
                      $catseq->display_id. '('. $catseq->alphabet. ')')
             unless $catseq->alphabet eq $seq->alphabet;

         my $length=$seq->length;
         $seq->seq($seq->seq.$catseq->seq);

         # move annotations
         if ($seq->isa("Bio::AnnotatableI") and $catseq->isa("Bio::AnnotatableI")) {
             foreach my $key ( $catseq->annotation->get_all_annotation_keys() ) {
                 foreach my $value ( $catseq->annotation->get_Annotations($key) ) {
                     $seq->annotation->add_Annotation($key, $value);
                 }
             }
         }

         # move SeqFeatures
         if ( $seq->isa('Bio::SeqI') and $catseq->isa('Bio::SeqI')) {
             for my $feat ($catseq->get_SeqFeatures) {
                 $seq->add_SeqFeature($self->_coord_adjust($feat, $length));
             }
         }

     }
     1;
}

=head2 _coord_adjust

   Title   : _coord_adjust
   Usage   : my $newfeat=Bio::SeqUtils->_coord_adjust($feature, 100);
   Function: Recursive subroutine to adjust the coordinates of a feature
             and all its subfeatures.
   Returns : A Bio::SeqFeatureI compliant object.
   Args    : A Bio::SeqFeatureI compliant object,
             the number of bases to add to the coordinates

=cut

sub _coord_adjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object [$feat] '. 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	push @adjsubfeat, $self->_coord_adjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     map {
	my @coords=($_->start, $_->end);
	map s/(\d+)/$add+$1/ge, @coords;
	$_->start(shift @coords);
	$_->end(shift @coords);
     } @loc;
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
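
A minimal sketch of the coordinate bookkeeping that cat relies on, written in plain Perl so that it runs without BioPerl. The hash-based sequences and features and the name concat_with_features are illustrative assumptions, not part of Bio::SeqUtils; the only point is that features from each appended sequence are shifted by the length of everything already in the target.

```perl
use strict;
use warnings;

# Each "sequence" is a hash with a residue string and a list of features,
# where a feature is { name, start, end } in 1-based coordinates.
# (Hypothetical stand-ins for Bio::Seq and Bio::SeqFeatureI objects.)
sub concat_with_features {
    my ($target, @others) = @_;
    for my $other (@others) {
        my $offset = length $target->{seq};   # target length before appending
        $target->{seq} .= $other->{seq};
        for my $feat (@{ $other->{features} }) {
            # Copy the feature and shift both ends by the offset.
            push @{ $target->{features} }, {
                name  => $feat->{name},
                start => $feat->{start} + $offset,
                end   => $feat->{end}   + $offset,
            };
        }
    }
    return $target;
}

my $first  = { seq => 'ATGCATGC', features => [ { name => 'a', start => 2, end => 4 } ] };
my $second = { seq => 'GGGTTT',   features => [ { name => 'b', start => 1, end => 3 } ] };

my $cat = concat_with_features($first, $second);
printf "%s %d..%d\n", @{$_}{qw(name start end)} for @{ $cat->{features} };
```

Feature b starts at 1 in the second sequence and ends up at 9 in the concatenation, because the target was 8 residues long when it was appended.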

From lupey+ at pitt.edu  Fri Jan 27 07:52:03 2006
From: lupey+ at pitt.edu (Paul G Cantalupo)
Date: Fri, 27 Jan 2006 07:52:03 -0500 (EST)
Subject: [Bioperl-l] How to search Bioperl-l archives
Message-ID: 

Hello,

Is there a better way to search the bioperl-l archives other than 
searching in each Archive listed on 
http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the 
best answer either.

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer

From jason.stajich at duke.edu  Fri Jan 27 15:48:00 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 15:48:00 -0500
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
References: 
Message-ID: <91EF5237-8A86-40FA-8126-D953DE28DD69@duke.edu>

Google is the best answer we've got...
site:bioperl.org +pipermail +bioperl-l YOUR TERM

We will try to set up the swish-indexed archive again on the new server
when there is time.  I don't think I'm going to have time for quite a
while; if someone volunteers to help ChrisD and me with the sysadmin
work, it can of course get done sooner.  The old site is
http://search.open-bio.org but I don't think the indexes have been
updated in a while.

-jason

On Jan 27, 2006, at 7:52 AM, Paul G Cantalupo wrote:

> Hello,
>
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is  
> not the
> best answer either.
>
> Thank you,
>
> Paul
>
>
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
>
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Fri Jan 27 15:57:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 14:57:50 -0600
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
Message-ID: <000001c62384$555ea8c0$15327e82@pyrimidine>

There's a link from this page:

http://www.bioperl.org/wiki/Mailing_lists

Two different searches are shown for bioperl-l : Google and Open-Bio.  I use
the Open-Bio b/c of its sorting capabilities (I haven't tried fooling around
with the Google interface yet).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From cjfields at uiuc.edu  Fri Jan 27 16:02:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 15:02:59 -0600
Subject: [Bioperl-l] RNAMotif parser
Message-ID: <000101c62385$0ddfc870$15327e82@pyrimidine>

Jason,

I have been fiddling with an RNAMotif parser and an ERPIN parser for a
number of years now; I plan on releasing them for inclusion in bioperl or
bioperl-run.  Right now, I think I may base them somewhat on your
Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From rahall2 at ualr.edu  Fri Jan 27 15:34:47 2006
From: rahall2 at ualr.edu (Roger Hall)
Date: Fri, 27 Jan 2006 14:34:47 -0600
Subject: [Bioperl-l] Requesting your issues with
	Module:Bio::Tools::Run::RemoteBlast
Message-ID: <008001c62381$1d844980$d416a790@LIBERAL>

All,

I have a fun little application written around this module to track new hits
for my favorite sequences, but it stopped working some time ago, so I have
finally adopted this orphaned module.

I have received very specific suggestions from Jason and Chris for
implementation, and plan to follow them in order to at least bring this
module into the wonderful world of XML. I would appreciate it if you would
send any additional features (and any known issues) my way.

Thanks!

Roger Hall

Technical Director

MidSouth Bioinformatics Center

University of Arkansas at Little Rock

(501) 569-8074

From cjfields at uiuc.edu  Fri Jan 27 20:03:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:03:15 -0600
Subject: [Bioperl-l] Requesting your issues
	withModule:Bio::Tools::Run::RemoteBlast
In-Reply-To: <008001c62381$1d844980$d416a790@LIBERAL>
Message-ID: <001101c623a6$9eb652d0$15327e82@pyrimidine>

The only real change made to RemoteBlast.pm was to the save_output method;
it wasn't saving XML output because the regex used to check the tempfile
output:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||
			 $l =~/^RPS-BLAST\s*.+$/i ) {
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

didn't check for XML.  I just added a check for XML that is the same as the
XML format check in the retrieve_blast method:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||  # NCBI BLAST
			$l =~/^RPS-BLAST\s*.+$/i || # RPS BLAST
			$l =~/<\?xml version=/) { # NCBI BLAST XML output
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

There is probably a better way to do this, but it works for now.  All other
fixes were made to SearchIO::blast.  That module is where most of the work
is done and which 'broke' recently from the BLAST version change at NCBI.
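
A self-contained sketch of the "seen top" filtering described above, in plain Perl so that it can be tried without RemoteBlast. The function names is_report_start and keep_report are made up for illustration and are not part of the module; the three patterns are the ones from the snippet above.

```perl
use strict;
use warnings;

# Return 1 if a line looks like the first line of a BLAST report:
# a text report header (BLASTN, TBLASTX, ...), an RPS-BLAST header,
# or the XML declaration that opens NCBI BLAST XML output.
sub is_report_start {
    my ($line) = @_;
    return 1 if $line =~ /^T?BLAST[NPX]\s*.+$/i;   # NCBI BLAST text
    return 1 if $line =~ /^RPS-BLAST\s*.+$/i;      # RPS BLAST text
    return 1 if $line =~ /<\?xml version=/;        # BLAST XML
    return 0;
}

# Drop everything before the first report line and keep the rest.
sub keep_report {
    my @lines = @_;
    my $seentop = 0;
    my @kept;
    for my $line (@lines) {
        $seentop = 1 if is_report_start($line);
        push @kept, $line if $seentop;
    }
    return @kept;
}

my @xml = keep_report(
    "leading junk\n",
    "<?xml version=\"1.0\"?>\n",
    "<BlastOutput>\n",
);
print @xml;
```

Keeping the three checks in one small function would also let save_output and retrieve_blast share the same test, though that is only a suggestion.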

The only things I can think of at the moment are things that Jason
mentioned: switching to XML as the default (which I agree with) and possibly
incorporating the netblast client (blastcl3).  It might be possible to
branch off a similar module specifically geared towards the blastcl3 client,
maybe acting as a wrapper to parse the returned data using SearchIO, but I
don't necessarily think it would be best to include in RemoteBlast. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From torsten.seemann at infotech.monash.edu.au  Fri Jan 27 20:30:34 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 12:30:34 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000101c62385$0ddfc870$15327e82@pyrimidine>
References: <000101c62385$0ddfc870$15327e82@pyrimidine>
Message-ID: <43DAC93A.1000208@infotech.monash.edu.au>

Chris,

> I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> number of years now; I plan on releasing it for inclusion in bioperl or
> bioperl-run.  Right now, I think I may base them somewhat on your
> Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
> or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

 From my understanding, a module to _parse the output_ of some TOOL goes 
in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in 
Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL 
takes the TOOL output and creates a Bio::Tools::TOOL object with the 
result in it as a convenience.
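
A toy sketch of that split, in plain Perl. The packages My::Tools::Demo and My::Tools::Run::Demo, the -text argument and the "name score" output format are all made up for illustration; they only mirror the Bio::Tools::TOOL (parser) and Bio::Tools::Run::TOOL (wrapper) roles described above.

```perl
use strict;
use warnings;

# Parser: knows only how to read TOOL output
# (the role of Bio::Tools::TOOL in bioperl core).
package My::Tools::Demo;

sub new {
    my ($class, %args) = @_;
    my @lines = split /\n/, $args{-text};
    return bless { lines => \@lines }, $class;
}

# Return the next "name score" record as a hash ref, or nothing when done.
sub next_hit {
    my ($self) = @_;
    my $line = shift @{ $self->{lines} };
    return unless defined $line;
    my ($name, $score) = split ' ', $line;
    return { name => $name, score => $score };
}

# Wrapper: knows how to run TOOL and hands the output to the parser
# (the role of Bio::Tools::Run::TOOL in bioperl-run).
package My::Tools::Run::Demo;

sub new { my ($class) = @_; return bless {}, $class }

sub run {
    my ($self, $input) = @_;
    # A real wrapper would execute an external program here; this stub
    # fakes its output so that the example stays self-contained.
    my @records;
    push @records, "$_ " . length($_) for @$input;
    return My::Tools::Demo->new(-text => join("\n", @records));
}

package main;

my $parser = My::Tools::Run::Demo->new->run([ 'GGAUCC', 'AUGC' ]);
while (my $hit = $parser->next_hit) {
    print "$hit->{name}\t$hit->{score}\n";
}
```

The parser can be used on its own with saved output, which is why it does not need to know anything about how the tool is run.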

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From cjfields at uiuc.edu  Fri Jan 27 20:47:48 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:47:48 -0600
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DAC93A.1000208@infotech.monash.edu.au>
Message-ID: <000001c623ac$d7d07db0$15327e82@pyrimidine>

Yeah, forgot about that.  I just remember a discussion at one point a while
back about splitting off sections of bioperl core b/c some thought
bioperl-core was getting too big; I didn't want to get too deep into writing
code w/o asking.  Okay, then, that's settled.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From torsten.seemann at infotech.monash.edu.au  Sat Jan 28 05:04:30 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 21:04:30 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000001c623ac$d7d07db0$15327e82@pyrimidine>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
Message-ID: <43DB41AE.30002@infotech.monash.edu.au>

>> From my understanding, a module to _parse the output_ of some TOOL goes
>>in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
>>takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>result in it as a convenience.

> Yeah, forgot about that.  I just remember a discussion at one point a while
> back about splitting off sections of bioperl core b/c some thought
> bioperl-core was getting too big; I didn't want to get too deep into writing
> code w/o asking.  Okay, then, that's settled.  

I think this is still true. Anything in Bio::Tools::Run namespace should 
be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast which 
are in bioperl-live core due to popularity). All the output parsers are 
in bioperl-live core.

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From jason.stajich at duke.edu  Sat Jan 28 11:06:06 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat, 28 Jan 2006 11:06:06 -0500
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DB41AE.30002@infotech.monash.edu.au>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
	<43DB41AE.30002@infotech.monash.edu.au>
Message-ID: 

exactly!
On Jan 28, 2006, at 5:04 AM, Torsten Seemann wrote:

>>> From my understanding, a module to _parse the output_ of some  
>>> TOOL goes
>>> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>> Bio::Tools::Run::TOOL. Usually the run() method in  
>>> Bio::Tools::Run::TOOL
>>> takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>> result in it as a convenience.
>
>> Yeah, forgot about that.  I just remember a discussion at one  
>> point a while
>> back about splitting off sections of bioperl core b/c some thought
>> bioperl-core was getting too big; I didn't want to get too deep  
>> into writing
>> code w/o asking.  Okay, then, that's settled.
>
> I think this is still true. Anything in Bio::Tools::Run namespace  
> should
> be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast  
> which
> are in bioperl-live core due to popularity). All the output parsers  
> are
> in bioperl-live core.
>
>
> -- 
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From golharam at umdnj.edu  Sun Jan 29 12:48:34 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sun, 29 Jan 2006 12:48:34 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
Message-ID: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>

I can't figure this out from the documentation.  In fact, I'm not sure
it's possible:

I have a bunch of clustalw alignments in GCG (MSF) format.  Each
alignment consists of three sequences.  I want to get the sequences
including the gaps from the alignment.  

I'm trying to use Bio::AlignIO to read the alignment file, then trying
to get each sequence from the alignment. I tried doing this:

$seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
"align$x.clustalw");
my $aln = $seqio->next_aln();
$seq1 = $aln->next_seq()->seq;

Getting the sequence from the alignment isn't working and I'm not sure
how to do it.  Does anyone have any ideas as to what I might try?

--
Ryan Golhar  -  golharam at umdnj.edu
The Informatics Institute of UMDNJ

From cjfields at uiuc.edu  Sun Jan 29 14:44:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Jan 2006 13:44:22 -0600
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <294C9886-277B-4C35-AF7F-D6ABB3B401A3@uiuc.edu>

Even though you used clustalw for aligning the sequences, the output  
format is GCG (msf) and not clustalw (aln) format, so you need to  
change the '-format' flag you have set:

> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");

to

> $seqio = Bio::AlignIO->new(-format => 'msf', -file =>
> "align$x.clustalw");

See if that works.


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason.stajich at duke.edu  Sun Jan 29 14:49:20 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun, 29 Jan 2006 14:49:20 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw', -file => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
    # get the sequences out from the alignment
    print "sequence as a string ", $seq->seq, "\n";
  }
}

next_seq is part of the API for sequence streams, not something we have
implemented for alignments, since you can get all the sequences out with
the each_seq method.

-jason

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From biophp at biophp.org  Fri Jan 27 08:20:31 2006
From: biophp at biophp.org (Joseba Bikandi)
Date: Fri, 27 Jan 2006 08:20:31 -0500
Subject: [Bioperl-l] BioPHP.org - open source repository of code and scripts
Message-ID: 

Dear Sir/Madam,

I would like to let you know about biophp.org, 
an open source project which may be interesting 
for you. It is a new project which includes 
PHP code (functions) and minitools (copy and
paste one page scripts). 

Sincerely,

......
Joseba Bikandi
biophp at biophp.org

From golharam at umdnj.edu  Mon Jan 30 12:40:58 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Jan 2006 12:40:58 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>
Message-ID: <003701c625c4$5527d790$2f01a8c0@GOLHARMOBILE1>

Thanks.  Here's what I ended up doing:

$seqio = Bio::AlignIO->new(-format => 'msf', -file =>
"alnfile.clustalw");
my $aln = $seqio->next_aln();
@_ = $aln->each_seq_with_id('org1');
$seq1 = $_[0]->seq;
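
A slightly tidier variant of the same idea, as a sketch: it takes the first element of the list returned by each_seq_with_id directly instead of overwriting @_. It assumes BioPerl is installed, and it builds a tiny alignment in memory (the sequences and the 'org2' id are made up) so that it does not depend on any file; with a real MSF file the alignment object would come from Bio::AlignIO as above.

```perl
use strict;
use warnings;
use Bio::SimpleAlign;
use Bio::LocatableSeq;

# Build a small alignment in memory; -end counts residues, not gaps.
my $aln = Bio::SimpleAlign->new;
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'org1', -seq => 'ATG-CA', -start => 1, -end => 5));
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'org2', -seq => 'ATGGCA', -start => 1, -end => 6));

# each_seq_with_id returns a list, so take its first element directly.
my ($org1) = $aln->each_seq_with_id('org1');
my $seq1   = $org1->seq;    # gapped string, as stored in the alignment

# Or walk over every row of the alignment.
for my $seq ($aln->each_seq) {
    print $seq->display_id, "\t", $seq->seq, "\n";
}
```

The string in $seq1 keeps the gap characters, which is what the original question asked for.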


From alindeman at gmail.com  Mon Jan 30 23:00:32 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Mon, 30 Jan 2006 22:00:32 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
References: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
Message-ID: <3f3ecb5a0601302000j7a3fbd4y1739a3c1696e30aa@mail.gmail.com>

Hi all--

Is it possible to use two different glyphs (or the same glyph with
different properties) on the same panel track?

Thanks

--A

From Marc.Logghe at DEVGEN.com  Tue Jan 31 03:08:09 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Tue, 31 Jan 2006 09:08:09 +0100
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>

Hi Andy
> Is it possible to use two different glyphs (or the same glyph 
> with different properties) on the same panel track?
Sure it is. This extract comes from the docs of Bio::Graphics::Panel

" There are a large number of glyph types.  By default, each track will
be homogeneous on a single glyph type, but you can mix several glyph
types on the same track by providing a code reference to the -glyph
argument.  Other options passed to add_track() control the color and
size of the glyphs, whether they are allowed to overlap, and other
formatting attributes.  The height of a track is determined from its
contents and cannot be directly influenced."
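
A sketch of what such a code reference might look like, in plain Perl. Demo::Feature is a made-up stand-in so that the callback can be tried without Bio::Graphics; the choice of 'arrow' for genes and 'generic' for everything else is only an example.

```perl
use strict;
use warnings;

# Minimal stand-in for a feature object; real tracks would hold
# Bio::SeqFeatureI objects, which also provide primary_tag().
package Demo::Feature;
sub new         { my ($class, %args) = @_; return bless {%args}, $class }
sub primary_tag { return $_[0]{primary_tag} }

package main;

# Callback of the kind that can be given as the -glyph argument:
# it receives the feature and returns the name of the glyph to draw.
my $pick_glyph = sub {
    my ($feature) = @_;
    return $feature->primary_tag eq 'gene' ? 'arrow' : 'generic';
};

for my $tag (qw(gene repeat)) {
    my $feature = Demo::Feature->new(primary_tag => $tag);
    print "$tag => ", $pick_glyph->($feature), "\n";
}
```

In a real script the same reference would presumably be passed along the lines of $panel->add_track($features, -glyph => $pick_glyph), with the other track options unchanged.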

HTH,
Marc

From alindeman at gmail.com  Tue Jan 31 14:59:00 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Tue, 31 Jan 2006 13:59:00 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
Message-ID: <3f3ecb5a0601311159k6d7f09d3j65732b5e72019e9d@mail.gmail.com>

Wonderful!

Thanks.

--A

On 1/31/06, Marc Logghe  wrote:
> Hi Andy
> > Is it possible to use two different glyphs (or the same glyph
> > with different properties) on the same panel track?
> Sure it is. This extract comes from the docs of Bio::Graphics::Panel
>
> " There are a large number of glyph types.  By default, each track will
> be homogeneous on a single glyph type, but you can mix several glyph
> types on the same track by providing a code reference to the -glyph
> argument.  Other options passed to add_track() control the color and
> size of the glyphs, whether they are allowed to overlap, and other
> formatting attributes.  The height of a track is determined from its
> contents and cannot be directly influenced."
>
> HTH,
> Marc
>

From hubert.prielinger at gmx.at  Tue Jan 24 15:49:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 14:49:07 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D63FB6.4090505@scitegic.com>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com>
Message-ID: <43D692C3.80306@gmx.at>

Hi,
thank you very much for the help. I have tried to run blastall on the
command line, but I can't even execute the binary, even though the
blastall executable has every permission set.
I always get the error message: blastall: cannot execute the binary file
Does the executable need to be somewhere else, in another path? At the
moment it is located under /home/Hubert/blast/blast-2.2.13/bin
thanks
Hubert
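
A small sketch for reading the number in the quoted "blastall call crashed: 32256" message, assuming it is the raw wait status that Perl puts in $? after running a command. The helper name decode_status is made up for illustration.

```perl
use strict;
use warnings;

# Split a raw wait status (as found in $? after system()) into the
# exit code and the number of the signal that ended the process, if any.
sub decode_status {
    my ($status) = @_;
    return (code => $status >> 8, signal => $status & 127);
}

my %s = decode_status(32256);
print "exit code $s{code}, signal $s{signal}\n";
```

Under that assumption the exit code is 126, which shells conventionally use when a command was found but could not be executed. That matches the "cannot execute binary file" line from sh. One common cause is a binary built for a different platform than the machine it is run on, but that is a guess from the symptoms rather than something confirmed in this thread.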

Scott Markel wrote:

> Hubert,
>
> If you look at the MSG line in the exception you can see
> exactly what the command line was.  Nagesh is pointing out
> that you used -d "/nr" and asking if that's what you want.
> I suspect that the '/' shouldn't be there.
>
> Try invoking blastall directly from the command line.  All
> BioPerl is doing is invoking BLAST on your behalf.  The
> same command line that BioPerl uses should also work for
> you on the command line.
>
> Scott
>
> Hubert Prielinger wrote:
>
>> hi,
>> sorry, but what do you mean with is your blast database in /nr...
>> my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>
>>
>>
>> Nagesh Chakka wrote:
>>
>>> Can you just run the blast from the command line.
>>> Is your blast database in "/nr".
>>>
>>> Hubert Prielinger wrote:
>>>
>>>> Hi Nagesh,
>>>> thank you very much, I put my database into the data folder, run 
>>>> the program and got the following error message:
>>>>
>>>> submit Sequence...just do it....
>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>> binary file
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: blastall call crashed: 32256 
>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>> 1000
>>>>
>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>> STACK toplevel 
>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>
>>>>
>>>> --------------------------------------
>>>>
>>>> Why it did not find my binary file, but it is there
>>>>
>>>> regards
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi,
>>>>> The following is from the StandAloneBlast.pm documentation
>>>>> " If the databases which will be searched by BLAST are located in the
>>>>> data subdirectory of the blast program directory (the default
>>>>> installation location), StandAloneBlast will find them; however, 
>>>>> if the
>>>>> database files are located in any other location, environmental 
>>>>> variable
>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>> Please note that I have not used this module before.
>>>>> Nagesh
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>  
>>>>>
>>>>>> Hi,
>>>>>> thank you very much for the help. Another question that comes 
>>>>>> up: do I have to write the path to the database files as well? I 
>>>>>> guess so, but how do I do that, the same way I write the path to the 
>>>>>> blast bin files?
>>>>>> Does anybody know how to set the Composition based statistics 
>>>>>> parameter?
>>>>>> there is my code:
>>>>>>
>>>>>> #!/usr/bin/perl -w
>>>>>>
>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>> use Bio::Seq;
>>>>>> use Bio::SeqIO;
>>>>>> use strict;
>>>>>>
>>>>>> BEGIN
>>>>>> {
>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>> }
>>>>>>
>>>>>>
>>>>>> # parameters
>>>>>> my $expect_value = 20000;
>>>>>> #my $filter_query_sequence = 'F';
>>>>>> my $one_line_description = 1000;
>>>>>> my $alignments = 1000;
>>>>>> # my $strands = 1;
>>>>>> my $count = 1;
>>>>>>
>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>> #my $progress_interval = 100;
>>>>>>
>>>>>>
>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>  -file   => "Perm.txt",
>>>>>>  -format => "raw",
>>>>>> );
>>>>>>
>>>>>> # create factory object and set parameters
>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>
>>>>>> $factory->e($expect_value);
>>>>>> #$factory->F($filter_query_sequence);
>>>>>> $factory->v($one_line_description);
>>>>>> $factory->b($alignments);
>>>>>> #$factory->S($strands);
>>>>>>
>>>>>>
>>>>>> # get query
>>>>>>
>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>      # name the report file before running the search
>>>>>>      my $filename = "comp_$count.txt";
>>>>>>      $factory->outfile($filename);
>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>      print $query->seq;
>>>>>>      print "\n";
>>>>>>
>>>>>>      $count++;
>>>>>> }
>>>>>>
>>>>>> thank you very much in advance
>>>>>> Hubert
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nagesh Chakka wrote:
>>>>>>
>>>>>>  
>>>>>>
>>>>>>> Hi Hubert,
>>>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>> the following files
>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>> files. I have no trouble in running standalone blast. You are 
>>>>>>> not required to run formatdb on the downloaded blast databases 
>>>>>>> and that may be the reason why the sequences are not included as 
>>>>>>> it will also reduce the size of the file.
>>>>>>> Did you try to run a blast search, if so is it giving you any 
>>>>>>> errors?
>>>>>>> Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hubert Prielinger wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>> locally, now I'm supposed to index the database with formatdb, 
>>>>>>>> but it doesn't work...
>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>> to use for searching the database, but when I uncompressed the 
>>>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>> Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>> nr database...
>>>>>>>>
>>>>>>>> Help is very appreciated
>>>>>>>> Thank you very much in advance
>>>>>>>>
>>>>>>>> Hubert
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hubert.prielinger at gmx.at  Tue Jan 24 16:15:38 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:15:38 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D698FA.3090904@gmx.at>

Hi Alex,
I did as you recommended and got the following output:

[Hubert at ppc7 ~]$ file /home/Hubert/blast/blast-2.2.13/bin/blastall
/home/Hubert/blast/blast-2.2.13/bin/blastall: ELF 64-bit LSB executable, 
AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked 
(uses shared libs), for GNU/Linux 2.4.1, not stripped
[Hubert at ppc7 ~]$

Does this mean that it is compatible with the operating system?

Thanks for the help
Hubert

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex
>
> Hubert Prielinger wrote:
>
>>Hi,
>>thank you very much for the help. I have tried to run blastall on the 
>>command line, but I can't even execute the binary file, even though the 
>>blastall executable has every permission set...
>>I always get the error message: blastall: cannot execute the binary file
>>Does the executable need to be somewhere else, on another path? It is 
>>currently located under /home/Hubert/blast/blast-2.2.13/bin
>>
>>thanks
>>Hubert
>>
>>
>>
>>
>>
>>Scott Markel wrote:
>>
>>    
>>
>>>Hubert,
>>>
>>>If you look at the MSG line in the exception you can see
>>>exactly what the command line was.  Nagesh is pointing out
>>>that you used -d "/nr" and asking if that's what you want.
>>>I suspect that the '/' shouldn't be there.
>>>
>>>Try invoking blastall directly from the command line.  All
>>>BioPerl is doing is invoking BLAST on your behalf.  The
>>>same command line that BioPerl uses should also work for
>>>you on the command line.
>>>
>>>Scott
>>>
>>>Hubert Prielinger wrote:
>>>
>>>      
>>>
>>>>hi,
>>>>sorry, but what do you mean with is your blast database in /nr...
>>>>my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>>
>>>>
>>>>
>>>>Nagesh Chakka wrote:
>>>>
>>>>        
>>>>
>>>>>Can you just run the blast from the command line.
>>>>>Is your blast database in "/nr".
>>>>>
>>>>>Hubert Prielinger wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Hi Nagesh,
>>>>>>thank you very much, I put my database into the data folder, run 
>>>>>>the program and got the following error message:
>>>>>>
>>>>>>submit Sequence...just do it....
>>>>>>sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>>>binary file
>>>>>>
>>>>>>------------- EXCEPTION  -------------
>>>>>>MSG: blastall call crashed: 32256 
>>>>>>/home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>>>-i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>>>1000
>>>>>>
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>>>STACK toplevel 
>>>>>>/home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>>>
>>>>>>
>>>>>>--------------------------------------
>>>>>>
>>>>>>Why did it not find my binary file? It is there.
>>>>>>
>>>>>>regards
>>>>>>
>>>>>>Nagesh Chakka wrote:
>>>>>>
>>>>>>            
>>>>>>
>>>>>>>Hi,
>>>>>>>The following is from the StandAloneBlast.pm documentation
>>>>>>>" If the databases which will be searched by BLAST are located in the
>>>>>>>data subdirectory of the blast program directory (the default
>>>>>>>installation location), StandAloneBlast will find them; however, 
>>>>>>>if the
>>>>>>>database files are located in any other location, environmental 
>>>>>>>variable
>>>>>>>$BLASTDATADIR will need to be set to point to that directory."
>>>>>>>Please note that I have not used this module before.
>>>>>>>Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>> 
>>>>>>>
>>>>>>>              
>>>>>>>
>>>>>>>>Hi,
>>>>>>>>thank you very much for the help. Another question has come 
>>>>>>>>up: do I have to write the path to the database files as well? I 
>>>>>>>>guess so, but how do I do that? The same way I write the path to the 
>>>>>>>>blast bin files?
>>>>>>>>Does anybody know how to set the Composition based statistics 
>>>>>>>>parameter?
>>>>>>>>Here is my code:
>>>>>>>>
>>>>>>>>#!/usr/bin/perl -w
>>>>>>>>
>>>>>>>>use Bio::Tools::Run::StandAloneBlast;
>>>>>>>>use Bio::Seq;
>>>>>>>>use Bio::SeqIO;
>>>>>>>>use strict;
>>>>>>>>
>>>>>>>>BEGIN
>>>>>>>>{
>>>>>>>>   $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>>>}
>>>>>>>>
>>>>>>>>
>>>>>>>># parameters
>>>>>>>>my $expect_value = 20000;
>>>>>>>>#my $filter_query_sequence = 'F';
>>>>>>>>my $one_line_description = 1000;
>>>>>>>>my $alignments = 1000;
>>>>>>>># my $strands = 1;
>>>>>>>>my $count = 1;
>>>>>>>>
>>>>>>>>my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>>>#my $progress_interval = 100;
>>>>>>>>
>>>>>>>>
>>>>>>>>my $seqio_obj = Bio::SeqIO->new(
>>>>>>>> -file   => "Perm.txt",
>>>>>>>> -format => "raw",
>>>>>>>>);
>>>>>>>>
>>>>>>>># create factory object and set parameters
>>>>>>>>my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>>
>>>>>>>>$factory->e($expect_value);
>>>>>>>>#$factory->F($filter_query_sequence);
>>>>>>>>$factory->v($one_line_description);
>>>>>>>>$factory->b($alignments);
>>>>>>>>#$factory->S($strands);
>>>>>>>>
>>>>>>>>
>>>>>>>># get query
>>>>>>>>
>>>>>>>>while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>>     # name the report file before running the search
>>>>>>>>     my $filename = "comp_$count.txt";
>>>>>>>>     $factory->outfile($filename);
>>>>>>>>     my $blast_report = $factory->blastall($query);
>>>>>>>>     print $query->seq;
>>>>>>>>     print "\n";
>>>>>>>>
>>>>>>>>     $count++;
>>>>>>>>}
>>>>>>>>
>>>>>>>>thank you very much in advance
>>>>>>>>Hubert
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>Nagesh Chakka wrote:
>>>>>>>>
>>>>>>>> 
>>>>>>>>
>>>>>>>>                
>>>>>>>>
>>>>>>>>>Hi Hubert,
>>>>>>>>>I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>>>the following files
>>>>>>>>>.phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>>>>files. I have no trouble in running standalone blast. You are 
>>>>>>>>>not required to run formatdb on the downloaded blast databases 
>>>>>>>>>and that may be the reason why the sequences are not included as 
>>>>>>>>>it will also reduce the size of the file.
>>>>>>>>>Did you try to run a blast search, if so is it giving you any 
>>>>>>>>>errors?
>>>>>>>>>Nagesh
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Hubert Prielinger wrote:
>>>>>>>>>
>>>>>>>>> 
>>>>>>>>>
>>>>>>>>>                  
>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>I have downloaded the nr database for doing a blast search 
>>>>>>>>>>locally, now I'm supposed to index the database with formatdb, 
>>>>>>>>>>but it doesn't work...
>>>>>>>>>>The online help says that you need a fasta file that is indexed 
>>>>>>>>>>to use for searching the database, but when I uncompressed the 
>>>>>>>>>>zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>>>>Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>>>>nr database...
>>>>>>>>>>
>>>>>>>>>>Help is very appreciated
>>>>>>>>>>Thank you very much in advance
>>>>>>>>>>
>>>>>>>>>>Hubert
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>Bioperl-l mailing list
>>>>>>>>>>Bioperl-l at portal.open-bio.org
>>>>>>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l

From hubert.prielinger at gmx.at  Tue Jan 24 16:24:51 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:24:51 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D69B23.9010100@gmx.at>

Hi,
I'm very sorry for wasting your time, but I just figured out what 
happened: I have installed the 64-bit version and not the 32-bit version....
Sorry for the inconvenience and thanks for the help....
I'm now trying to fix the problem with the database....

Sorry
Hubert
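
For anyone hitting the same "cannot execute binary file" message, here is a
minimal sketch of the 32-bit versus 64-bit check, working from the text that
the "file" command prints. The helper names (elf_bits, machine_bits,
can_execute) and the list of machine names are assumptions made for
illustration only; they are not part of BioPerl or BLAST.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the word size (32 or 64) from one line of `file` output.
# Returns undef when the line does not describe an ELF binary.
sub elf_bits {
    my ($file_output) = @_;
    return $file_output =~ /\bELF (32|64)-bit\b/ ? $1 : undef;
}

# Guess the word size of the running system from a `uname -m`-style
# machine name. The list is an illustrative assumption, not exhaustive.
sub machine_bits {
    my ($machine) = @_;
    return 64 if $machine =~ /^(?:x86_64|amd64|ppc64|sparc64|aarch64)/;
    return 32;
}

# A 32-bit system cannot run a 64-bit binary; the reverse usually works.
sub can_execute {
    my ($binary_bits, $system_bits) = @_;
    return ( defined($binary_bits) && $binary_bits <= $system_bits ) ? 1 : 0;
}

# Sample line taken from the `file` output posted in this thread.
my $sample = '/home/Hubert/blast/blast-2.2.13/bin/blastall: '
           . 'ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV)';

my $bits = elf_bits($sample);
print "binary is $bits-bit\n";
print "runs on an i686 system: ",
      ( can_execute( $bits, machine_bits('i686') ) ? "yes" : "no" ), "\n";
```

In a real script the sample string would come from running "file" on the
blastall binary, and the machine name from "uname -m".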

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex

From smarkel at scitegic.com  Tue Jan 24 17:09:57 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Tue, 24 Jan 2006 14:09:57 -0800
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D692C3.80306@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com> <43D692C3.80306@gmx.at>
Message-ID: <43D6A5B5.8090106@scitegic.com>

Hubert,

Since you can't run blastall on the command line, your initial
problem has nothing to do with BioPerl.  Once you get blastall
working on the command line, you'll know what directories and
environment variable settings to use when running via BioPerl.
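
As a rough sketch of those settings, assuming the directory layout quoted in
this thread: the blastall directory is put in front of the existing PATH
rather than replacing it, and BLASTDATADIR (the variable named in the
StandAloneBlast documentation quoted earlier) points at the database
directory. The helper prepend_path is hypothetical, written only for this
example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prepend a directory to a PATH-style string without discarding the
# entries that are already there (hypothetical helper).
sub prepend_path {
    my ($dir, $path) = @_;
    $dir =~ s{/+$}{};                          # drop trailing slashes
    $path = '' unless defined $path;
    # keep existing entries, skipping empty ones and exact duplicates of $dir
    my @entries = grep { length($_) && $_ ne $dir } split /:/, $path;
    return join ':', $dir, @entries;
}

# Directories taken from the thread; adjust to the local installation.
my $blast_bin  = '/home/Hubert/blast/blast-2.2.13/bin/';
my $blast_data = '/home/Hubert/blast/blast-2.2.13/data';

$ENV{PATH}         = prepend_path( $blast_bin, $ENV{PATH} );
$ENV{BLASTDATADIR} = $blast_data;

print "PATH starts with: ", ( split /:/, $ENV{PATH} )[0], "\n";
print "BLASTDATADIR:     $ENV{BLASTDATADIR}\n";
```

Whether StandAloneBlast reads these variables when it is loaded is an
assumption here, so a real script would probably make the assignments in a
BEGIN block placed before the "use Bio::Tools::Run::StandAloneBlast;" line.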

What happens when you run the following?

   file /home/Hubert/blast/blast-2.2.13/bin/blastall

Is the executable the correct one for your operating system?

Scott
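
The number 32256 in the quoted "blastall call crashed" message also points
away from BioPerl. Assuming it is the raw wait status that Perl stores in $?
after system() (not verified against the module source), it can be decoded
as in the sketch below; decode_wait_status is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a raw wait status (as found in $? after system()) into its parts.
sub decode_wait_status {
    my ($status) = @_;
    return (
        exit_code => $status >> 8,               # high byte: child's exit code
        signal    => $status & 127,              # low 7 bits: terminating signal
        core_dump => ( $status & 128 ) ? 1 : 0,  # bit 7: core dump flag
    );
}

my %s = decode_wait_status(32256);
printf "exit code %d, signal %d\n", $s{exit_code}, $s{signal};
# prints "exit code 126, signal 0"
```

Exit code 126 is the shell's conventional status for a command that was
found but could not be executed, which matches the "cannot execute binary
file" line printed by sh just above the exception.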

Hubert Prielinger wrote:

> Hi,
> thank you very much for the help. I have tried to run blastall on the 
> command line, but I can't even execute the binary file, even though the 
> blastall executable has every permission set...
> I always get the error message: blastall: cannot execute the binary file
> Does the executable need to be somewhere else, on another path? It is 
> currently located under /home/Hubert/blast/blast-2.2.13/bin
> 
> thanks
> Hubert

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From cjfields at uiuc.edu  Tue Jan 24 17:21:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Jan 2006 16:21:22 -0600
Subject: [Bioperl-l] RemoteBlast.pm and Bio::SearchIO::blast.pm
	-partially resolved
In-Reply-To: <18966F80-B780-4661-953E-613B05B56164@duke.edu>
Message-ID: <000301c62134$81cdc500$15327e82@pyrimidine>

Jason, 

I have worked out all the problems with RemoteBlast.pm and posted a patched
version to Bugzilla (http://bugzilla.bioperl.org/show_bug.cgi?id=1935).  The
main problem was that RemoteBlast::save_output was not looking for XML
output when dumping from the tempfile to the saved file (it only looked for
the text header).  That is fixed.  The other problems mentioned were due to
differences in mapping key=>value pairs between blast and blastxml and a
problem in my own script.  It passed all tests using 'perl t/RemoteBlast.t'
with debugging set.

See if anybody else out there can test them out.
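
To illustrate the kind of check involved (this is not the code from the
patch), a report can be classified by looking at how it starts. The function
name report_format and both patterns are assumptions for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify the start of a BLAST report as 'xml', 'text', or 'unknown'.
# The patterns are illustrative assumptions, not the checks in the patch.
sub report_format {
    my ($head) = @_;
    return 'xml'  if $head =~ /^\s*<\?xml\b/ || $head =~ /<BlastOutput>/;
    return 'text' if $head =~ /^\s*T?BLAST[NPX]\s+\d/m;
    return 'unknown';
}

my @samples = (
    "<?xml version=\"1.0\"?>\n<BlastOutput>\n",   # XML report
    "BLASTN 2.2.13 [Nov-27-2005]\n",              # plain text report
);
print report_format($_), "\n" for @samples;
```

A save routine that only tests for the text header would treat an XML
report as not yet ready, which is the symptom described above.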

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Jason Stajich
> Sent: Tuesday, January 24, 2006 11:16 AM
> To: Chris Fields
> Cc: bioperl-ml List
> Subject: [Bioperl-l] Re: RemoteBlast.pm and Bio::SearchIO::blast.pm -
> partially resolved
> 
> Thanks Chris - I don't know when I'll have time to check in bugs so
> anyone else who has commit access feel free to give these a whirl and
> check in.
> 
> I would propose making the XML default but allowing the text version
> to still be supported in the event that someone has setup their own
> local NCBI BLAST Web interface which still supports the simple Text
> output.
> 
> -j
> 
> On Jan 24, 2006, at 12:09 PM, Chris Fields wrote:
> 
> > I submitted two bugs on Bugzilla to describe recent problems with
> > RemoteBlast.pm and SearchIO::blast.pm
> >
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1935
> >
> > Today I submitted a patched version of Bio::SearchIO::blast.pm
> > which should
> > fix the text parsing issue for old (2.2.12) and new (2.2.13)
> > versions of
> > NCBI's BLAST; the bug link above describes the problem and the
> > fix.  Problem
> > is, I know it will likely break again b/c NCBI will probably change
> > text
> > output in a future BLAST version.  I also agree with Jason about
> > changing
> > the default for SearchIO to XML.  So, does text output parsing through
> > blast.pm need to be deprecated in favor of XML, or should both be
> > available?
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 

From jason.stajich at duke.edu  Tue Jan 24 16:44:34 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 16:44:34 -0500
Subject: [Bioperl-l] new mailing list server
Message-ID: <50E14815-266E-4ACB-8E6E-293C9EB33476@duke.edu>

Chris Dagdigian has switched our mailing lists over to a new server  
to upgrade us to newer hardware.  With the switch, the default mailing  
list server name is 'lists.open-bio.org' instead of  
'portal.open-bio.org'.  That should be the only change you notice at  
the bottom of your mails.  Mail sent to either of those addresses  
should still be delivered (although @bioperl.org is preferred).

We hope this changeover will help improve the performance and  
scalability of our mail and webservices.

We also will aim to move the developer read-write CVS server to a new  
machine in the coming weeks.  We hope this will only be a minor  
inconvenience but will allow us to move to a more recent operating  
system and larger disk space.

If you have questions or concerns they can be directed to support AT  
open-bio.org
-jason
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Tue Jan 24 22:31:38 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 22:31:38 -0500
Subject: [Bioperl-l] new website launched
Message-ID: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>

I am pleased to announce the release of a new website for BioPerl.   
The site is based on the mediawiki software that was developed for  
the wikipedia project.  We intend the site to be a place for  
community input on documentation and design for the BioPerl project.   
There is also a fair amount of documentation started surrounding  
bioinformatics tools and techniques applicable to using BioPerl and  
some of the authors who created these resources.

The website continues to be at the URL http://www.bioperl.org.  The  
DNS updates may take up to 24 hours to reach everyone.

The initial content of the site is the result of the work of myself,
Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We
encourage you to contribute to the site's content by signing up for
an account.

There are several guides covering the style of the site and how to
link to modules.  Module pages, for example, can contain additional
information from the POD:
http://bioperl.org/wiki/Module:Bio::SeqIO

You'll notice that many of the paths have changed but the DIST and
SRC directories continue to be available at http://bioperl.org/DIST and
http://bioperl.org/SRC.  The HOWTOs are now available from
http://bioperl.org/wiki/HOWTOs

The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage  
you to add your questions to it so they can be properly archived and  
addressed.

We also have initiated a News site for Bioperl for posting  
announcements regarding development and software.  I would like to  
see if there are volunteers to post weekly or monthly summaries of  
mailing list traffic and development.
http://www.bioperl.org/news/

Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,  
Torsten Seemann.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From roy at colibase.bham.ac.uk  Wed Jan 25 12:05:29 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Wed, 25 Jan 2006 17:05:29 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601182120.k0ILIl8X022324@portal.open-bio.org>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
Message-ID: <43D7AFD9.2020305@colibase.bham.ac.uk>

Hi all.

I also had need of a function to concatenate two Bio::Seq objects, so had a go
at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted below. I'm
not too sure about the concept of sub-SeqFeatures (I've never seen any sequence
that had more than one level of feature)- I worked on the assumption that little
sub-SeqFeatures can have littler sub-SeqFeatures and so ad infinitum, but as I
don't have an example file I haven't been able to test if this works. Likewise,
although I think the code should cope with Fuzzy and Split locations, I haven't
tested this with any particularly unusual examples.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

  Title   : cat
  Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
  Function: Concatenates an array of Bio::Seq objects, using the first sequence
            as a template for species etc. Adjusts the coordinates of features
            from any additional objects.
  Returns : A sequence object of the same class as the first argument.
  Args    : array of sequence objects

=cut

sub cat {
     my ($self, @seqs) = @_;
     my $seq=shift @seqs;
     $self->throw('Object ['. $seq. '] of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
        unless $seq->isa('Bio::PrimarySeqI');
     for (@seqs) {
        # check each additional sequence ($_), not the template sequence
        $self->throw('Object ['. $_. '] of class ['. ref($_).
                     '] should be a Bio::PrimarySeqI ')
            unless $_->isa('Bio::PrimarySeqI');
        # features of $_ are shifted by the length accumulated so far
        my $length=$seq->length;
        $seq->seq($seq->seq.$_->seq);
        for my $feat ($_->get_SeqFeatures) {
            $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
        }
     }
     return $seq;
}

=head2 _coordAdjust

  Title   : _coordAdjust
  Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
  Function: Recursive subroutine to adjust the coordinates of a feature
            and all its subfeatures.
  Returns : A Bio::SeqFeatureI compliant object.
  Args    : A Bio::SeqFeatureI compliant object,
            the number of bases to add to the coordinates

=cut

sub _coordAdjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object ['. $feat. '] of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
        unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
        # recurse with the same argument order as the signature above
        push @adjsubfeat, $self->_coordAdjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     for my $loc (@loc) {
        # shift the numeric part of each coordinate by $add
        my @coords=($loc->start, $loc->end);
        s/(\d+)/$add+$1/ge for @coords;
        $loc->start(shift @coords);
        $loc->end(shift @coords);
     }
     if (@loc==1) {
        $feat->location($loc[0]);
     } else {
        my $loc=Bio::Location::Split->new;
        $loc->add_sub_Location(@loc);
        $feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
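
For readers without a BioPerl installation to hand, the coordinate
arithmetic can be tried on plain data. The following is only a minimal
sketch, not the BioPerl implementation: the hash layout and the name
cat_plain are hypothetical stand-ins for Bio::Seq objects and the method
above, and the numbers reuse Jan's example (100 bp + 200 bp, feature B at
20..30).

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Bio::Seq objects: each "sequence" is a hash
# with a seq string and a list of features (name, 1-based start and end).
sub cat_plain {
    my @seqs  = @_;
    my $first = shift @seqs;

    # Copy the first sequence so the caller's data is left untouched.
    my %cat = (
        seq      => $first->{seq},
        features => [ map { +{ %$_ } } @{ $first->{features} } ],
    );

    for my $next (@seqs) {
        # Number of bases already in front of $next.
        my $offset = length( $cat{seq} );
        $cat{seq} .= $next->{seq};
        for my $feat ( @{ $next->{features} } ) {
            push @{ $cat{features} }, +{
                %$feat,
                start => $feat->{start} + $offset,
                end   => $feat->{end} + $offset,
            };
        }
    }
    return \%cat;
}

my $seq1 = { seq => 'A' x 100, features => [ { name => 'A', start => 5,  end => 15 } ] };
my $seq2 = { seq => 'C' x 200, features => [ { name => 'B', start => 20, end => 30 } ] };

my $cat = cat_plain( $seq1, $seq2 );
printf "length %d\n", length( $cat->{seq} );
for my $feat ( @{ $cat->{features} } ) {
    printf "feature %s: %d..%d\n", $feat->{name}, $feat->{start}, $feat->{end};
}
```

Unlike the method above, this sketch copies the first sequence rather
than modifying it in place, and it ignores sub-features and split
locations entirely.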

> 
> 
> Jan, 
> 
> It would be easy if someone had written a function to do it. Even writing the 
> function is not hard.  I do not think there is any other way than to go through 
> all features, though.
> 
> In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> 
> E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
>      return a new seq, species and other info based on the first seq in array 
> 
> Could you  write it and post to bugzilla?
> 
> 	-Heikki
> 
> 
> On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
>> Hi all,
>>
>> Does anyone know of an easy way to concatenate two sequences, including
>> recalculation of features positions of the second one? E.g.
>>   seq 1 = 100 bp
>>     feature A: 5..15
>>   seq 2 = 200 bp
>>     feature B: 20..30
>>   => concatenated sequence 3 = 300 bp
>>        feature A: 5..15
>>        feature B: 120..130  <<<<<<<<<<<
>>
>> Annotations (features without range) should be transferred as well.
>>
>> Of course, it must be possible to create a blank sequence and work my
>> way through all features, adding them to a new collection of features
>> and stuff. But I was wondering if a simpler technique is possible.
>>
>> Many thanks,
>> Jan Aerts
>> Bioinformatics Department
>> Roslin Institute
>> Roslin, Scotland, UK
> 
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>   _/  _/  _/  University of Western Cape, South Africa
>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> 

From heikki at sanbi.ac.za  Wed Jan 25 16:11:45 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 23:11:45 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D7AFD9.2020305@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
Message-ID: <200601252311.45582.heikki@sanbi.ac.za>

Thanks Roy!

I'll check the code in tomorrow when I am less sleepy and can go through the 
code in detail. In principle the code looks good. It definitely needs tests. 
If you have written any please do post them.

A few more checks to make sure $seq->alphabet is the same in all sequences 
might be a good idea.

   -Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From heikki at sanbi.ac.za  Wed Jan 25 15:52:42 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 22:52:42 +0200
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
References: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <200601252252.42786.heikki@sanbi.ac.za>

Congratulations and a huge thank you to the production team!

The new website is a big step ahead in readability and ease of editing the 
information.

I for my part have already corrected a few small typos and omissions on the 
new pages. I invite others to do the same.

    -Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From cjfields at uiuc.edu  Wed Jan 25 22:34:01 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 21:34:01 -0600
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <1138119383.3338.68.camel@localhost.localdomain>
Message-ID: <000201c62229$59ed5f50$15327e82@pyrimidine>

Scott,

This popped up, for some reason, when I tried to install a perl module
(Error.pm); maybe it has something to do with the reason PPM can't 'see'
GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
for GMOD, so maybe Sourceforge is redirecting things and this messes with
PPM?  

_____________________________________________
C:\Perl\Scripts>ppm
PPM - Programmer's Package Manager version 3.3.
Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
ActiveState is a division of Sophos.

Entering interactive shell. Using Term::ReadLine::Perl as readline library.

Type 'help' to get started.

ppm> rep
Repositories:
[1] Bioperl
[2] gmod
[3] ActiveState PPM2 Repository
[4] ActiveState Package Repository
[ ] Bribes
[ ] Kobes
[ ] local
ppm> install Error
PPM::PPD::init: not a PPD and not a file:

  The Generic Model Organism Database Project | GMOD

      GMOD

      Generic Software Components for Model
Organism Databases

      Mailing lists |
Bug Reports |
Feature Requests |
Publications |
Meetings |

.... (lots of HTML removed)

This site is maintained by Scott
Cain | Powered by 
drupal
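
The error above suggests PPM received an HTML page where it expected a
package description. A rough way to check what a repository URL returns is
sketched below. It assumes only that a PPD is an XML document whose root
element is SOFTPKG; the function name looks_like_ppd and both sample
strings are hypothetical, and this is not how PPM itself validates its
input.

```perl
use strict;
use warnings;

# Assumption: a PPD is XML with a <SOFTPKG ...> root element, so any
# response without that tag (such as a redirected home page) is not a PPD.
sub looks_like_ppd {
    my ($text) = @_;
    return 0 unless defined $text;
    return $text =~ /<SOFTPKG\b/ ? 1 : 0;
}

# Hypothetical sample responses, for illustration only.
my $ppd  = '<SOFTPKG NAME="Error" VERSION="0,15,0,0"><TITLE>Error</TITLE></SOFTPKG>';
my $html = '<html><head><title>The Generic Model Organism Database Project | GMOD</title></head></html>';

for my $response ( $ppd, $html ) {
    print looks_like_ppd($response) ? "looks like a PPD\n" : "not a PPD\n";
}
```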

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: gmod-gbrowse-admin at lists.sourceforge.net [mailto:gmod-gbrowse-
> admin at lists.sourceforge.net] On Behalf Of Scott Cain
> Sent: Tuesday, January 24, 2006 10:16 AM
> To: Chris Fields
> Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> 
> Hi Chris,
> 
> Is it still misbehaving?  I'll do some testing today, but my ability to
> do so is a little hampered as I am traveling this week.
> 
> Thanks,
> Scott
> 
> 
> On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > Scott,
> >
> > I am trying to find the newest bioperl dev. release (1.51) from PPM for a
> > quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> > repository:
> >
> > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > Repositories:
> > [1] gmod
> > [ ] ActiveState Package Repository
> > [ ] ActiveState PPM2 Repository
> > [ ] Bioperl
> > [ ] Bribes
> > [ ] Kobes
> > [ ] local
> > ppm> search bioperl
> > Searching in Active Repositories
> > No matches for 'bioperl'; see 'help search'.
> > ppm> search *
> > Searching in Active Repositories
> > No matches for '*'; see 'help search'.
> > ppm>
> >
> >
> > Any idea what's going on?  All other repositories work fine.  I can download
> > it and install locally w/o a problem.  I am running the newest ActivePerl
> > (5.8.7.815), WinXP.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
> 

From cjfields at uiuc.edu  Thu Jan 26 00:38:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 23:38:56 -0600
Subject: [Bioperl-l] bioperl-db on Windows (update)
Message-ID: <000001c6223a$cd5539c0$15327e82@pyrimidine>

Hilmar, 

I checked load_seqdatabase.pl with all variants of Root.pm, checking the
debugging output; basically, the only way that I could find to get
load_seqdatabase.pl to work on native Windows is by changing those Root.pm
lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
$class, ...').  I ran debugging on load_seqdatabase.pl using all versions of
Root.pm, with and without Error.pm.  Only those with a comma present worked
in both circumstances.  I don't know why this hasn't popped up before now,
but it seems to be a unique combination of Windows, load_seqdatabase.pl, and
bioperl-db.  It doesn't happen with any scripts of Bioperl on Windows that
I've run into, and debugging other modules (for instance,
Bio::SearchIO::blast, which I recently worked on) doesn't cause this
problem.  

Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
and without modifying Root.pm.

____________________________________________________________

Without Error.pm:
____________________________________________________________
C:\Perl\Scripts>perl -MError
Can't locate Error.pm in @INC (@INC contains:
C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib
C:/Perl/site/lib .).
BEGIN failed--compilation aborted.

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.
____________________________________________________________

With Error.pm:
____________________________________________________________

C:\Perl\Scripts>perl -MError -e ";"

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
  Calling Error::throw

  Calling Error::throw

attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
  Calling Error::throw

attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
  Calling Error::throw

  Calling Error::throw

Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.

____________________________________________________________

Error::throw is called w/o a problem when Error.pm is present (which is what
should happen).  For some reason, that extra comma makes all the difference
in the world.

The line above in BasePersistenceAdaptor.pm is :

$self->debug("attempting to load driver for adaptor class $class\n");

which is found in many modules.  I don't really know why it decides to hang
up here.  I'll try running a few of the Root.pm modifications under Mac OS X
in the next day or so to see what happens.

I also reran a few of Steve Chervitz's recommendations from a previous post;
everything ran fine except in circumstances in which Error.pm was required
with a 'use' statement, and only when Error.pm wasn't present, which is
expected.  Previously, when I ran them, there was a bit of confusion b/c it
seemed that Error.pm was present somewhere.  It was; Steve included it in
bioperl-live/examples/root/lib.  When I deleted it, I got the expected
results.

Anyway, I don't know what else I can do at this point besides check out
everything on Mac OS X.  Any additional checks of the modified Root.pm need
to be made on other systems.  Will filing this as a bug in Bugzilla help?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From s.rayner at att.net  Thu Jan 26 00:58:42 2006
From: s.rayner at att.net (s.rayner at att.net)
Date: Thu, 26 Jan 2006 05:58:42 +0000
Subject: [Bioperl-l] bioperl installation problems with External Modules -
	doesn't see installed modules
Message-ID: <012620060558.15437.43D865110008848F00003C4D21602806519D0A02970E9DD29C@att.net>

I am trying to install the bioperl::bundle to use some of the external Perl modules, 
particularly the Bio::DB::GFF module for use with biodas.

I followed the instructions, both from the bioperl web site for installing the bioperl bundle, and also specific instructions from the biodas web site for installing Bio::DB::GFF.  Namely

   (1) Make sure that CVS is installed on your system.

    (2) Use the following command (all on one line) to login to the server

         % cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl login

          when prompted, the password is 'cvs'

    (3) Check out the bioperl package you are interested in, for most
    users this will be the bioperl-live source tree.  The following
    command should be executed as one line.

         % cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl checkout bioperl-live

    The login and checkout procedure should only have to be done
    once. To update the source directories in the future it should be
    possible just to enter the top level directory and issue the
    following command:

         % cvs update

This will create the directory ``bioperl-live''. Now build and install bioperl with the following recipe:

         % cd bioperl-live
         % perl Makefile.PL
         % make
         % make test
         % make install

The last step will probably need to be run as root.

When I perform either of these steps I get the message that the installation was successful, but bioperl and biodas return a message that the modules have not been installed.

They are physically present on the disk, but the programs don't seem to know where to find them.

Can anyone suggest how to fix this problem?
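
One possible first step for a problem like this is to check whether the
directory holding the installed modules is actually on Perl's search path.
The sketch below is an illustration only: find_module is a hypothetical
helper, and it assumes nothing beyond core Perl (File::Spec and @INC).

```perl
use strict;
use warnings;
use File::Spec;

# Return every directory in the given list that contains the module file,
# e.g. Bio/DB/GFF.pm for Bio::DB::GFF.
sub find_module {
    my ( $module, @dirs ) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';

    my @found;
    for my $dir (@dirs) {
        next if ref $dir;    # skip code references and other @INC hooks
        my $path = File::Spec->catfile( $dir, @parts );
        push @found, $dir if -e $path;
    }
    return @found;
}

my $module = 'Bio::DB::GFF';
my @hits   = find_module( $module, @INC );

if (@hits) {
    print "$module found under: @hits\n";
}
else {
    print "$module not found in any of these directories:\n";
    for my $dir (@INC) {
        print "  $dir\n" unless ref $dir;
    }
    print "If it is installed elsewhere, add that directory with 'use lib' or PERL5LIB.\n";
}
```

If the module turns up on disk but not in this listing, the install
location and the search path of the perl that runs the scripts probably
differ, for example because two perl installations are present.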

thanks

Simon

From heikki at sanbi.ac.za  Thu Jan 26 02:53:22 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 26 Jan 2006 09:53:22 +0200
Subject: [Bioperl-l] Fwd: some doubts in bioperl
Message-ID: <200601260953.22923.heikki@sanbi.ac.za>

----------  Forwarded Message  ----------

Subject: some doubts in bioperl
Date: Monday 23 January 2006 10:16
From: apsara asok 
To: heikki at sanbi.ac.za

Dear Heikki,
I want to clear up some doubts about BioPerl. How can we do pattern
searching using a suffix tree in BioPerl? Do you have any idea? Please help me.
Apsara

-------------------------------------------------------

From roy at colibase.bham.ac.uk  Thu Jan 26 08:18:03 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Thu, 26 Jan 2006 13:18:03 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601252311.45582.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
	<200601252311.45582.heikki@sanbi.ac.za>
Message-ID: <43D8CC0B.10403@colibase.bham.ac.uk>

Heikki Lehvaslaiho wrote:
> Thanks Roy!
> 
> I'll check the code in tomorrow when I am less sleepy and can go through the 
> code in detail. In principle the code looks good. It definitely needs tests. 
> If you have written any please do post them.
Not too sure about how to go about writing tests, any suggestions?

It did occur to me that my _coordAdjust method could be adapted to allow 
the Bio::Seq trunc method to retain sequence features (since there's no 
reason why the $add argument can't be negative). This would probably 
need a bit more work to cope with the situation where a feature overlaps 
the trunc coordinates, for example if we truncate to coordinates 1..400, 
but there's a feature 300..500. I guess the 'correct' behaviour might be 
to convert that feature to a fuzzy location of 300..>400? Or is it 
acceptable to have features with coordinates outside of a sequence?

If we did that then an obvious test would be to cat a sequence to 
itself, then trunc to retain just the second half of the new sequence 
and see if you got back what you started with.
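
A minimal sketch of that round-trip check, assuming toy stand-ins for
the real methods: toy_cat, toy_trunc and as_string are hypothetical, and
sequences are hashes holding a string plus [start, end] features. It
shows the shape of the test only, not the Bio::SeqUtils or Bio::Seq API.

```perl
use strict;
use warnings;

# Toy stand-ins (assumptions, not the BioPerl API): a sequence is a hash
# with a string and a list of [start, end] features.
sub toy_cat {
    my ($x, $y) = @_;
    my $offset = length $x->{seq};
    return {
        seq      => $x->{seq} . $y->{seq},
        features => [ @{ $x->{features} },
                      map { [ $_->[0] + $offset, $_->[1] + $offset ] }
                          @{ $y->{features} } ],
    };
}

sub toy_trunc {
    my ($s, $from, $to) = @_;
    return {
        seq      => substr($s->{seq}, $from - 1, $to - $from + 1),
        # Keep only features fully inside the window, shifted by a
        # negative offset.
        features => [ map  { [ $_->[0] - $from + 1, $_->[1] - $from + 1 ] }
                      grep { $_->[0] >= $from && $_->[1] <= $to }
                      @{ $s->{features} } ],
    };
}

# Flatten a toy sequence to a string so two of them can be compared.
sub as_string {
    my ($s) = @_;
    return join ' ', $s->{seq}, map { join '..', @$_ } @{ $s->{features} };
}

my $orig    = { seq => 'ACGTACGTAC', features => [ [2, 4], [6, 9] ] };
my $len     = length $orig->{seq};
my $doubled = toy_cat($orig, $orig);
my $second  = toy_trunc($doubled, $len + 1, 2 * $len);

print as_string($second) eq as_string($orig) ? "ok\n" : "not ok\n";
```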

> A few more checks to make sure seq->alphabet is the same in all sequences 
> might be a good idea.
That's easy to implement. Just put the line:
	$self->throw('Trying to concatenate sequences with different alphabets: '
	    .$seq->display_id.' ('.$seq->alphabet.') and '
	    .$_->display_id.' ('.$_->alphabet.')')
	    unless $_->alphabet eq $seq->alphabet;

at the start of the for(@seqs) loop of the cat subroutine.
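
To show where such a guard sits in a loop, here is a self-contained
sketch. It is an assumption-laden stand-in: toy_cat_checked is a
hypothetical name, sequences are hashes with id, alphabet and seq keys,
and die replaces $self->throw so the example runs without BioPerl.

```perl
use strict;
use warnings;

# Toy stand-in for the cat loop (hypothetical; real code would use
# Bio::Seq objects and $self->throw instead of hashes and die).
sub toy_cat_checked {
    my ($seq, @seqs) = @_;
    for (@seqs) {
        # Guard first, so nothing is appended from a mismatched sequence.
        die 'Trying to concatenate sequences with different alphabets: '
            . "$seq->{id} ($seq->{alphabet}) and $_->{id} ($_->{alphabet})\n"
            unless $_->{alphabet} eq $seq->{alphabet};
        $seq->{seq} .= $_->{seq};
    }
    return 1;
}

my $dna  = { id => 'dna1',  alphabet => 'dna',     seq => 'ACGT' };
my $dna2 = { id => 'dna2',  alphabet => 'dna',     seq => 'TTGA' };
my $prot = { id => 'prot1', alphabet => 'protein', seq => 'MKV'  };

toy_cat_checked($dna, $dna2);
print "$dna->{seq}\n";                   # prints ACGTTTGA

eval { toy_cat_checked($dna, $prot) };
print "caught: $@" if $@;
```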

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

From hlapp at gmx.net  Thu Jan 26 01:31:43 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 25 Jan 2006 22:31:43 -0800
Subject: [Bioperl-l] bioperl-db on Windows (update)
In-Reply-To: <000001c6223a$cd5539c0$15327e82@pyrimidine>
References: <000001c6223a$cd5539c0$15327e82@pyrimidine>
Message-ID: 

This is a lot of work you did to investigate this, Chris, thanks. Yes,
filing this as a bug report will help, and don't forget to attach this
report of yours with all the tests you did. Really all that's left to
do is test on a couple of Unix platforms, which will happen
semi-automatically by people once we commit the change.

   -hilmar

On 1/25/06, Chris Fields  wrote:
> Hilmar,
>
> I checked load_seqdatabase.pl with all variants of Root.pm, checking the
> debugging output; basically, the only way that I could find to get
> load_seqdatabase.pl to work on native Windows is by changing those Root.pm
> lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
> $class, ...').  I ran debugging on load_seqdatabase.pl using all versions of
> Root.pm, with and without Error.pm.  Only those with a comma present worked
> in both circumstances.  I don't know why this hasn't popped up before now,
> but it seems to be a unique combination of Windows, load_seqdatabase.pl, and
> bioperl-db.  It doesn't happen with any scripts of Bioperl on Windows that
> I've run into, and debugging other modules (for instance,
> Bio::SearchIO::blast, which I recently worked on) doesn't cause this
> problem.
>
> Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
> and without modifying Root.pm.
>
> ____________________________________________________________
>
> Without Error.pm:
> ____________________________________________________________
> C:\Perl\Scripts>perl -MError
> Can't locate Error.pm in @INC (@INC contains: C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib C:/Perl/site/lib .).
> BEGIN failed--compilation aborted.
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass ****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
> Loading input.gpt ...
> attempting to load adaptor class for Bio::Seq::RichSeq
>         attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
> attempting to load adaptor class for Bio::Seq
>         attempting to load module Bio::DB::BioSQL::SeqAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
> attempting to load adaptor class for Bio::Species
>         attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
> Undefined subroutine &Bio::Root::Root::debug called at
> C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
> line 1537,  line 63.
> ____________________________________________________________
>
> With Error.pm:
> ____________________________________________________________
>
> C:\Perl\Scripts>perl -MError -e ";"
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass ****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
> Loading input.gpt ...
> attempting to load adaptor class for Bio::Seq::RichSeq
>         attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
>   Calling Error::throw
>
>   Calling Error::throw
>
> attempting to load adaptor class for Bio::Seq
>         attempting to load module Bio::DB::BioSQL::SeqAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
>   Calling Error::throw
>
> attempting to load adaptor class for Bio::Species
>         attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
>   Calling Error::throw
>
>   Calling Error::throw
>
> Undefined subroutine &Bio::Root::Root::debug called at
> C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
> line 1537,  line 63.
>
> ____________________________________________________________
>
> Error::throw is called w/o a problem when Error.pm is present (which is what
> should happen).  For some reason, that extra comma makes all the difference
> in the world.
>
> The line above in BasePersistenceAdaptor.pm is :
>
> $self->debug("attempting to load driver for adaptor class $class\n");
>
> which is found in many modules.  I don't really know why it decides to hang
> up here.  I'll try running a few of the Root.pm modifications under Mac OS X
> in the next day or so to see what happens.
>
> I also reran a few of Steve Chervitz's recommendations from a previous post;
> everything ran fine except in circumstances in which Error.pm was required
> with a 'use' statement, and only when Error.pm wasn't present, which is
> expected.  Previously, when I ran them, there was a bit of confusion b/c it
> seemed that Error.pm was present somewhere.  It was; Steve included it in
> bioperl-live/examples/root/lib.  When I deleted it, I got the expected
> results.
>
> Anyway, I don't know what else I can do at this point besides check out
> everything on Mac OS X.  Any additional checks of the modified Root.pm need
> to be made on other systems.  Will filing this as a bug in Bugzilla help?
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From cain at cshl.edu  Thu Jan 26 10:41:20 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu, 26 Jan 2006 10:41:20 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <000201c62229$59ed5f50$15327e82@pyrimidine>
References: <000201c62229$59ed5f50$15327e82@pyrimidine>
Message-ID: <1138290080.2894.25.camel@localhost.localdomain>

Hi Chris,

I still don't exactly know what the problem is, but this at least has
given me some insight on some messages in my error_log: I've been seeing
lots of messages about '/icon/somegif.gif' not found and haven't been
able to track down their source (not that I'd really tried, it was an
annoyance that hadn't risen to the level of serious debugging yet).  We
are using mod_rewrite, so that could be part of the problem.  I'll try
to fix it so that the icons display properly and that may have a side
effect of fixing ppm.

Scott

On Wed, 2006-01-25 at 21:34 -0600, Chris Fields wrote:
> Scott,
> 
> This popped up, for some reason, when I tried to install a perl module
> (Error.pm); maybe it has something to do with the reason PPM can't 'see'
> GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
> for GMOD, so maybe Sourceforge is redirecting things and this messes with
> PPM?  
> 
> _____________________________________________
> C:\Perl\Scripts>ppm
> PPM - Programmer's Package Manager version 3.3.
> Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
> ActiveState is a division of Sophos.
> 
> Entering interactive shell. Using Term::ReadLine::Perl as readline library.
> 
> Type 'help' to get started.
> 
> ppm> rep
> Repositories:
> [1] Bioperl
> [2] gmod
> [3] ActiveState PPM2 Repository
> [4] ActiveState Package Repository
> [ ] Bribes
> [ ] Kobes
> [ ] local
> ppm> install Error
> PPM::PPD::init: not a PPD and not a file:
>  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> 
>   The Generic Model Organism Database Project | GMOD
> 
>       GMOD
>       Generic Software Components for Model Organism Databases
> 
>     Mailing lists | Bug Reports | Feature Requests | Publications | Meetings |
> 
> .... (lots of HTML removed)
> 
> This site is maintained by Scott Cain | Powered by drupal
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> > -----Original Message-----
> > From: gmod-gbrowse-admin at lists.sourceforge.net
> > [mailto:gmod-gbrowse-admin at lists.sourceforge.net] On Behalf Of Scott Cain
> > Sent: Tuesday, January 24, 2006 10:16 AM
> > To: Chris Fields
> > Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> > Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> > 
> > Hi Chris,
> > 
> > Is it still misbehaving?  I'll do some testing today, but my ability to
> > do so is a little hampered as I am traveling this week.
> > 
> > Thanks,
> > Scott
> > 
> > 
> > On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > > Scott,
> > >
> > > I am trying to find the newest bioperl dev. release (1.51) from PPM for a
> > > quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> > > repository:
> > >
> > > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > > Repositories:
> > > [1] gmod
> > > [ ] ActiveState Package Repository
> > > [ ] ActiveState PPM2 Repository
> > > [ ] Bioperl
> > > [ ] Bribes
> > > [ ] Kobes
> > > [ ] local
> > > ppm> search bioperl
> > > Searching in Active Repositories
> > > No matches for 'bioperl'; see 'help search'.
> > > ppm> search *
> > > Searching in Active Repositories
> > > No matches for '*'; see 'help search'.
> > > ppm>
> > >
> > >
> > > Any idea what's going on?  All other repositories work fine.  I can download
> > > it and install locally w/o a problem.  I am running the newest ActivePerl
> > > (5.8.7.815), WinXP.
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Gmod-gbrowse mailing list
> > > Gmod-gbrowse at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> > --
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> > 
> > 
> > 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From rbalbi at gmail.com  Thu Jan 26 13:19:57 2006
From: rbalbi at gmail.com (Ricardo Balbi)
Date: Thu, 26 Jan 2006 16:19:57 -0200
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Hi all,

   Could anybody help me with this error?

thanks in advance,
Ricardo

2006/1/26, Aaron J. Mackey :
>
>
> This is a BioPerl "Unflattener" error; it's unable to automatically
> reconstruct the gene/mRNA/exon logic of some (or all) of the
> annotation in your genbank file.  To get help with this, you should
> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
> including a snippet of your genbank file.
>
> -Aaron
>
> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>
> > Hi all,
> >
> >   After making some changes in the gus mapping file to ignore some
> > features of the kinetoplastida database, I went ahead and ran ISF,
> > but without success.
> >
> >   Could somebody help me with this error?
> >
> > thanks in advance,
> > Ricardo
> >
> > ERROR:
> >
> > ------------- EXCEPTION  -------------
> > MSG: structure_type 2 is currently unknown
> > STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:353
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:720
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
> > STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:549
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:459
> > STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:357
> > STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:266
> > STACK toplevel /GUS/gus_home/bin/ga:11
> >
> > --------------------------------------
> >
> > STACK TRACE:
> >  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
> >         Bio::Root::Root::throw('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', 'structure_type 2 is currently unknown') called at /usr/local/bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
> >         Bio::SeqFeature::Tools::Unflattener::unflatten_seq('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq', 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 353
> >         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)', 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
> >         GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)', 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 330
> >         GUS::Supported::Plugin::InsertSequenceFeatures::run('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)', 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 549
> >         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 541
> >         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
> >         GUS::PluginMgr::GusApplication::doMajorMode_Run('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
> >         GUS::PluginMgr::GusApplication::doMajorMode('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
> >         GUS::PluginMgr::GusApplication::parseAndRun('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY(0xa004738)') called at /GUS/gus_home/bin/ga line 11
> >
> >
> >
>
> --
> Dr. Aaron J. Mackey, Ph.D.
> Project Manager, ApiDB Bioinformatics Resource Center
> Penn Genomics Institute, University of Pennsylvania
> email:  amackey at pcbi.upenn.edu
> office: 215-898-1205 (Biology, 212 Goddard Labs)
>          215-746-7018 (PCBI, 1428 Blockley Hall)
> fax:    215-746-6697 (Penn Genomics Institute)
> postal: Penn Genomics Institute
>          Goddard Labs 212
>          415 S. University Avenue
>          Philadelphia, PA  19104-6017
>
>
>

From jason.stajich at duke.edu  Thu Jan 26 14:28:26 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu, 26 Jan 2006 14:28:26 -0500
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <2A475D24-5AC3-4AD5-80CB-0C40DB622283@duke.edu>

I would suggest following Aaron's instructions to

>> including a snippet of your genbank file.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjm at fruitfly.org  Thu Jan 26 14:33:46 2006
From: cjm at fruitfly.org (chris mungall)
Date: Thu, 26 Jan 2006 11:33:46 -0800
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Sorry for the uninformative error message.

The unflattener uses a collection of heuristics to infer a canonical 
gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style 
file. Due to the highly variable nature of some genbank records this 
isn't always possible, and some data massaging is required beforehand. 
I don't know what the context of this message is, but I presume you're 
aware of this from the docs.

The only time I've seen this before was with the genbank submission of 
the pombe genome, which has some very.. unusual features purportedly of 
type mRNA; the actual gene models are encoded using 'gene' and 'CDS' 
features. This confuses the heuristics a little. The only way I've been 
able to deal with this one was to manually remove the mRNA features 
(they appeared to be just fragments and not actual gene models) using 
$unf->remove_types(['mRNA']) beforehand.
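
The effect of dropping one feature type before the hierarchy is
inferred can be illustrated without BioPerl. This sketch does not call
the Unflattener at all: drop_types is a hypothetical helper, and the
features are plain hashes whose type key plays the role of a
seqfeature's primary tag.

```perl
use strict;
use warnings;

# Hypothetical stand-in: features are hashes with a 'type' key, playing the
# role of a seqfeature's primary tag. Returns the features whose type is
# not in the given list.
sub drop_types {
    my ($features, @types) = @_;
    my %drop = map { $_ => 1 } @types;
    return [ grep { !$drop{ $_->{type} } } @$features ];
}

my $features = [
    { type => 'gene', start => 100, end => 900 },
    { type => 'mRNA', start => 150, end => 200 },   # fragment, not a gene model
    { type => 'CDS',  start => 120, end => 880 },
];

my $kept = drop_types($features, 'mRNA');
print join(',', map { $_->{type} } @$kept), "\n";   # prints gene,CDS
```

With the stray mRNA fragments gone, only gene and CDS features remain
for whatever grouping step follows.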

Can you send the accession of the record you're trying this on (or 
email me the file off-list if it isn't too large)? I'll try to get a 
more informative error message in there.


From heikki at sanbi.ac.za  Fri Jan 27 05:06:52 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Fri, 27 Jan 2006 12:06:52 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D8CC0B.10403@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
Message-ID: <200601271206.52875.heikki@sanbi.ac.za>

On Thursday 26 January 2006 15:18, Roy Chaudhuri wrote:
> Heikki Lehvaslaiho wrote:
> > Thanks Roy!
> >
> > I'll check the code in tomorrow when I am less sleepy and can go through
> > the code in detail. In principle the code looks good. It definitely needs
> > tests. If you have written any please do post them.
>
> Not too sure about how to go about writing tests, any suggestions?

I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
methods and a reasonable portion of all edge values to be sure that the 
method works as it should.

Note that the code does not create a new sequence object. It modifies the 
existing one. Therefore it is best not to return that object. The users would 
assign that to a variable that points to the same structure and get confused. 
The method now returns true upon completion.
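
The confusion described here can be seen in a few lines of plain Perl.
This is a sketch under assumptions, not the committed code:
cat_in_place and cat_returning_object are hypothetical names, and
hashes stand in for sequence objects.

```perl
use strict;
use warnings;

# Toy cat that modifies its first argument in place (hypothetical; hashes
# stand in for sequence objects).
sub cat_in_place {
    my ($first, @rest) = @_;
    $first->{seq} .= $_->{seq} for @rest;
    return 1;            # true on success, no object handed back
}

# Variant that returns the modified object, shown only for contrast.
sub cat_returning_object {
    my ($first, @rest) = @_;
    $first->{seq} .= $_->{seq} for @rest;
    return $first;
}

my $x = { seq => 'ACGT' };
my $y = { seq => 'TT' };

my $joined = cat_returning_object($x, $y);
# $joined looks like a new sequence but is the same reference as $x,
# so the "original" has changed too.
print $joined == $x ? "same object\n" : "different objects\n";
print "$x->{seq}\n";     # prints ACGTTT

my $z = { seq => 'GG' };
cat_in_place($z, $y) or die "cat failed";
print "$z->{seq}\n";     # prints GGTT
```

Returning a plain true value makes it harder to mistake the result for
an independent copy.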

Creating a new sequence object is problematic because one needs to add one 
more dependency (e.g. Clone) and it will not work anyway if the sequence 
implementation is using a database back end. It is better the way you have 
written it.

I added code to move over the annotations from secondary sequences, but did 
not do anything to remove duplicates if the same sequence gets added twice. I 
wrote a note about this so that users know to be prepared if that affects 
them.

> It did occur to me that my _coordAdjust method could be adapted to allow
> the Bio::Seq trunc method to retain sequence features (since there's no
> reason why the $add argument can't be negative). This would probably
> need a bit more work to cope with the situation where a feature overlaps
> the trunc coordinates, for example if we truncate to coordinates 1..400,
> but there's a feature 300..500. I guess the 'correct' behaviour might be
> to convert that feature to a fuzzy location of 300..>400? Or is it
> acceptable to have features with coordinates outside of a sequence?

No, feature coordinates should always be within the sequence. Fuzzy is the 
correct solution to this.

> If we did that then an obvious test would be to cat a sequence to
> itself, then trunc to retain just the second half of the new sequence
> and see if you got back what you started with.

Go ahead and try it!

> > A few more checks to make sure seq->alphabet is the same in all sequences
> > might be a good idea.
>
> That's easy to implement. Just put the line:
> 	$self->throw('Trying to concatenate sequences with different alphabets: '
> 	    .$seq->display_id.' ('.$seq->alphabet.') and '
> 	    .$_->display_id.' ('.$_->alphabet.')')
> 	    unless $_->alphabet eq $seq->alphabet;
>
> at the start of the for(@seqs) loop of the cat subroutine.

Added.

Thanks,

	-Heikki
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From wlhsiao at yahoo.ca  Fri Jan 27 05:37:23 2006
From: wlhsiao at yahoo.ca (William Hsiao)
Date: Fri, 27 Jan 2006 05:37:23 -0500 (EST)
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>

Hi Jason,
  Nice new site!  I am wondering if I missed an
obvious link to the module documentation (e.g.
http://doc.bioperl.org/releases/bioperl-1.4/) from the
homepage?  It seems that is the one thing missing from
the old website setup and I am not sure if it's
intentional.  I am developing a set of lecture notes
for a workshop and would like to know if there is a
stable way to navigate to the module documentation.

Thanks

Cheers,

Will

--- Jason Stajich  wrote:

> I am pleased to announce the release of a new
> website for BioPerl.   
> The site is based on the mediawiki software that was
> developed for  
> the wikipedia project.  We intend the site to be a
> place for  
> community input on documentation and design for the
> BioPerl project.   
> There is also a fair amount of documentation started
> surrounding  
> bioinformatics tools and techniques applicable to
> using BioPerl and  
> some of the authors who created these resources.
> 
> The website continues to be at the URL
> http://www.bioperl.org.  The  
> DNS updates may take up to 24 hours to reach
> everyone.
> 
> The initial content of the site is result of the
> work of myself,  
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
> Seemann.  We  
> encourage you to contribute to the site's content by
> signing up for  
> an account.
> 
> There are several guides for style of the site and
> how to link to  
> Modules for example which can contain additional
> information from the  
> POD
> http://bioperl.org/wiki/Module:Bio::SeqIO
> 
> You'll notice that many of the paths have changed
> but the DIST and  
> SRC continues to be available at
> http://bioperl.org/DIST and http://bioperl.org/SRC.
> The HOWTOs are now available from
> http://bioperl.org/wiki/HOWTOs
> 
> The FAQ is available at http://bioperl.org/wiki/FAQ
> and I encourage  
> you to add your questions to it so they can be
> properly archived and  
> addressed.
> 
> We also have initiated a News site for Bioperl for
> posting  
> announcements regarding development and software.  I
> would like to  
> see if there are volunteers to post weekly or
> monthly summaries of  
> mailing list traffic and development.
> http://www.bioperl.org/news/
> 
> 
> Jason Stajich on behalf of Mauricio Herrera Cuadra,
> Brian Osborne,  
> Torsten Seemann.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Fri Jan 27 08:28:52 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 08:28:52 -0500
Subject: [Bioperl-l] new website launched
In-Reply-To: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
References: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
Message-ID: 

Each module is directly linked to that site in the module-level  
pages, see for example:
http://bioperl.org/wiki/Module:Bio::SearchIO

I've added a mention of the doc.bioperl site on the front page.

Note that as part of setting up the site I ensured that there is now  
a standardized URL for the nightly generated Pdoc pages (from CVS  
live) (thanks to Steve Chervitz for suggesting it).

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/
http://doc.bioperl.org/releases/bioperl-current/bioperl-run/
http://doc.bioperl.org/releases/bioperl-current/bioperl-ext/
....etc

The frozen release-based docs will continue to stay up - I never had  
time to make one for the bioperl-1.5.1 but hopefully will do it for  
bioperl 1.5.2 and obviously will make it for the next stable release  
(1.6).

We encourage people to add code snippets using the modules,  
complaints, workarounds, etc. to the module pages on the wiki site.   
Each wiki page has a paired "discussion" page where we suggest  
people put comments, while useful workarounds and example code  
should go on the main page.  I've just added some text about this to  
the "About this site" page.

-jason
On Jan 27, 2006, at 5:37 AM, William Hsiao wrote:

> Hi Jason,
>   Nice new site!  I am wondering if I missed an
> obvious link to the module documentations (e.g
> http://doc.bioperl.org/releases/bioperl-1.4/) from the
> homepage?  It seems that is the one thing missing from
> the old website setup and I am not sure if it's
> intentional.  I am developing a set of lecture notes
> for a workshop and would like to know if there is a
> stable way to navigate to the module documentations.
>
> Thanks
>
> Cheers,
>
> Will

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From davila at fiocruz.br  Thu Jan 26 19:05:35 2006
From: davila at fiocruz.br (Alberto M. R. Dávila)
Date: Thu, 26 Jan 2006 22:05:35 -0200 (BRST)
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <3197.201.17.105.240.1138320335.squirrel@www.redefiocruz.fiocruz.br>

Dear Chris,

Happy 2006 !

I am not sure about the exact record the ISF plugin was trying to
read/parse, but I think it is the first one, anyway I am listing the first
5 GIs of our file for your testing:

85539529
56130985
54300415
54288810
50604596

The whole file is really big (1.4GB) as it contains all the nucleotide
sequences of "kinetoplastida [organism]" from genbank in genbank format.

Hope you can catch "the bug" ;-)

Kindest regards, Alberto

>
> Sorry for the uninformative error message.
>
> The unflattener uses a collection of heuristics to infer a canonical
> gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style
> file. Due to the highly variable nature of some genbank records this
> isn't always possible, and some data massaging is required beforehand.
> I don't know what the context of this message is, but I presume you're
> aware of this from the docs.
>
> The only time I've seen this before was with the genbank submission of
> the pombe genome, which has some very.. unusual features purportedly of
> type mRNA; the actual gene models are encoded using 'gene' and 'CDS'
> features. This confuses the heuristics a little. The only way I've been
> able to deal with this one was to manually remove the mRNA features
> (they appeared to be just fragments and not actual gene models) using
> $unf->remove_types(['mRNA']) beforehand.
>
> Can you send the accession of the record you're trying this on (or
> email me the file off-list if it isn't too large). I'll try and get a
> more informative error message in there.
>
> On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:
>
>> Hi all,
>>
>>    Could anybody help me with this error?
>>
>> thanks in advance,
>> Ricardo
>>
>> 2006/1/26, Aaron J. Mackey :
>>>
>>>
>>> This is a BioPerl "Unflattener" error; it's unable to automatically
>>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>>> annotation in your genbank file.  To get help with this, you should
>>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>>> including a snippet of your genbank file.
>>>
>>> -Aaron
>>>
>>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>>
>>>> Hi all,
>>>>
>>>>   After making some changes in the GUS mapping file to ignore some
>>>> features of the kinetoplastida database, I went on to run the ISF
>>>> plugin, but without success.
>>>>
>>>>   Could somebody help me with this error?
>>>>
>>>> thanks in advance,
>>>> Ricardo
>>>>
>>>> ERROR:
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: structure_type 2 is currently unknown
>>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:353
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:720
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:549
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:357
>>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:266
>>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>>
>>>> --------------------------------------
>>>>
>>>> STACK TRACE:
>>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>>         Bio::Root::Root::throw
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm line 353
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>>
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>>> line 330
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 549
>>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 541
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>>
>>>>
>>>>
>>>
>>> --
>>> Dr. Aaron J. Mackey, Ph.D.
>>> Project Manager, ApiDB Bioinformatics Resource Center
>>> Penn Genomics Institute, University of Pennsylvania
>>> email:  amackey at pcbi.upenn.edu
>>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>>> fax:    215-746-6697 (Penn Genomics Institute)
>>> postal: Penn Genomics Institute
>>>          Goddard Labs 212
>>>          415 S. University Avenue
>>>          Philadelphia, PA  19104-6017
>>>

From roy at colibase.bham.ac.uk  Fri Jan 27 10:31:50 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Fri, 27 Jan 2006 15:31:50 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601271206.52875.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
	<200601271206.52875.heikki@sanbi.ac.za>
Message-ID: <43DA3CE6.4020708@colibase.bham.ac.uk>

> I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
> methods and a reasonable portion of all edge values to be sure that the 
> method works as it should.
Cool, thanks for that. My first proper contribution to BioPerl 8^).
The tests look good- I'll know better for next time.

> Note that the code does not create a new sequence object. It modifies the 
> existing one. Therefore it is best not to return that object. The users would 
> assign that to a variable that points to the same structure and get confused. 
> The method now returns true upon completion.
> 
> Creating a new sequence object is problematic because one needs to add one 
> more dependency (e.g. Clone) and will not work anyway if the sequence 
> implementation is using a database back end. It is better the way you have 
> written it.
Yes, that makes sense. Although with that interface it might be more 
natural in Bio::Seq? If it is a method that will modify a sequence in 
place then it seems more intuitive to call $seq->cat(@seqs) [or even 
$seq->append(@seqs)] rather than Bio::SeqUtils->cat($seq, @seqs).
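
To make the in-place point concrete, here is a minimal sketch that uses plain hashes as stand-ins for sequence objects. The function name cat_in_place and the hash layout are made up for illustration and are not part of Bio::SeqUtils; the sketch only shows why returning the target object could mislead callers.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object: a hash with a 'seq' string.
# The first argument is modified in place; the return value is just "true".
sub cat_in_place {
    my ($target, @others) = @_;
    $target->{seq} .= $_->{seq} for @others;
    return 1;    # success flag, deliberately not the object itself
}

my $first  = { seq => 'ACGT' };
my $second = { seq => 'TTAA' };

cat_in_place($first, $second);
print $first->{seq}, "\n";    # the target now holds the concatenation

# If the function returned $target, a caller would end up with a second
# name for the same hash, and a change through one name shows up in the other.
my $alias = $first;
$alias->{seq} = 'N';
print $first->{seq}, "\n";
```

With either calling style ($seq->cat(@seqs) or Bio::SeqUtils->cat($seq, @seqs)) the caller keeps using the first sequence afterwards, which is what the boolean return value is meant to signal.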

> I added code to move over the annotations from secondary sequences, but did 
> not do anything to remove duplicates if the same sequence gets added twice. I 
> wrote a note about this so that users know to be prepared if that affects 
> them.
I'm not convinced about this- perhaps it should be optional? In practice 
many of the annotations for each subsequence are only going to be 
applicable to that sequence, not the concatenated whole. Some of them 
may also be duplicated even between non-identical sequences. I think 
it'd be better by default to keep just the annotation from the first 
sequence (which probably would still need to be changed, but could at 
least act as a placeholder).

There were a couple of problems with renamed variables/subroutines that 
hadn't all been updated, I've fixed those and pasted the new version below.

> No, feature coordinates should always be within the sequence. Fuzzy is the 
> correct solution to this.
Okay, I'll have a go and let you know how I get on.

Cheers.
Roy.

--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

   Title   : cat
   Usage   : Bio::SeqUtils->cat(@seqs)
   Function: Concatenates an array of Bio::Seq objects, using the first sequence
             as a target. Annotations and sequence features are copied over
             from any additional objects. Adjusts the coordinates of copied
             features.
   Returns : a boolean
   Args    : array of sequence objects

-
Note that annotations have no sequence region. If you concatenate the
same sequence more than once, you will have its annotations
duplicated.

=cut

sub cat {
     my ($self, $seq, @seqs) = @_;
     $self->throw('Object [$seq] '. 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
         unless $seq->isa('Bio::PrimarySeqI');

     for my $catseq (@seqs) {
         $self->throw('Object [$catseq] '. 'of class ['. ref($catseq).
                      '] should be a Bio::PrimarySeqI ')
             unless $catseq->isa('Bio::PrimarySeqI');

         $self->throw('Trying to concatenate sequences with different alphabets: '.
                      $seq->display_id. '('. $seq->alphabet. ') and '.
                      $catseq->display_id. '('. $catseq->alphabet. ')')
             unless $catseq->alphabet eq $seq->alphabet;

         my $length=$seq->length;
         $seq->seq($seq->seq.$catseq->seq);

         # move annotations
         if ($seq->isa("Bio::AnnotatableI") and $catseq->isa("Bio::AnnotatableI")) {
             foreach my $key ( $catseq->annotation->get_all_annotation_keys() ) {

                 foreach my $value ( $catseq->annotation->get_Annotations($key) ) {
                     $seq->annotation->add_Annotation($key, $value);
                 }
             }
         }

         # move SeqFeatures
         if ( $seq->isa('Bio::SeqI') and $catseq->isa('Bio::SeqI')) {
             for my $feat ($catseq->get_SeqFeatures) {
                 $seq->add_SeqFeature($self->_coord_adjust($feat, $length));
             }
         }

     }
     1;
}

=head2 _coord_adjust

   Title   : _coord_adjust
   Usage   : my $newfeat=Bio::SeqUtils->_coord_adjust($feature, 100);
   Function: Recursive subroutine to adjust the coordinates of a feature
             and all its subfeatures.
   Returns : A Bio::SeqFeatureI compliant object.
   Args    : A Bio::SeqFeatureI compliant object,
             the number of bases to add to the coordinates

=cut

sub _coord_adjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object [$feat] '. 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	push @adjsubfeat, Bio::SeqUtils->_coord_adjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     map {
	my @coords=($_->start, $_->end);
	map s/(\d+)/$add+$1/ge, @coords;
	$_->start(shift @coords);
	$_->end(shift @coords);
     } @loc;
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
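
The substitution inside _coord_adjust treats each coordinate as a string and adds the offset to every run of digits. A minimal stand-alone sketch of just that step follows; the helper name shift_coord is hypothetical, and plain strings are used instead of Bio::Location objects, so whether start() and end() actually return strings with fuzzy markers depends on the location class in use.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SeqUtils: add $add to every run of
# digits in a coordinate string, leaving markers such as '<' or '>' alone.
sub shift_coord {
    my ($coord, $add) = @_;
    (my $shifted = $coord) =~ s/(\d+)/$add+$1/ge;
    return $shifted;
}

# Plain and fuzzy-looking coordinates, shifted by the length of the
# sequence that precedes them in the concatenation.
for my $coord ('10', '<5', '>120') {
    print "$coord -> ", shift_coord($coord, 100), "\n";
}
```

Because the /e modifier evaluates the replacement as an expression, the digits are summed numerically while any non-digit characters around them pass through unchanged.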

From lupey+ at pitt.edu  Fri Jan 27 07:52:03 2006
From: lupey+ at pitt.edu (Paul G Cantalupo)
Date: Fri, 27 Jan 2006 07:52:03 -0500 (EST)
Subject: [Bioperl-l] How to search Bioperl-l archives
Message-ID: 

Hello,

Is there a better way to search the bioperl-l archives other than 
searching in each Archive listed on 
http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the 
best answer either.

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer

From jason.stajich at duke.edu  Fri Jan 27 15:48:00 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 15:48:00 -0500
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
References: 
Message-ID: <91EF5237-8A86-40FA-8126-D953DE28DD69@duke.edu>

Google is the best answer we've got...
site:bioperl.org +pipermail +bioperl-l YOUR TERM

We will try to set up the swish-indexed archive again on the new server  
when there is time.  I don't think I'm going to have time for quite a  
while; if someone volunteers to help ChrisD and me with sysadmin work,  
it can of course get done sooner.  The old site is  
http://search.open-bio.org but I don't think the indexes have been updated  
in a while.

-jason

On Jan 27, 2006, at 7:52 AM, Paul G Cantalupo wrote:

> Hello,
>
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is  
> not the
> best answer either.
>
> Thank you,
>
> Paul
>
>
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
>
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Fri Jan 27 15:57:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 14:57:50 -0600
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
Message-ID: <000001c62384$555ea8c0$15327e82@pyrimidine>

There's a link from this page:

http://www.bioperl.org/wiki/Mailing_lists

Two different searches are shown for bioperl-l : Google and Open-Bio.  I use
the Open-Bio b/c of its sorting capabilities (I haven't tried fooling around
with the Google interface yet).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul G Cantalupo
> Sent: Friday, January 27, 2006 6:52 AM
> To: bioperl-l
> Subject: [Bioperl-l] How to search Bioperl-l archives
> 
> Hello,
> 
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the
> best answer either.
> 
> Thank you,
> 
> Paul
> 
> 
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
> 
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer

From cjfields at uiuc.edu  Fri Jan 27 16:02:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 15:02:59 -0600
Subject: [Bioperl-l] RNAMotif parser
Message-ID: <000101c62385$0ddfc870$15327e82@pyrimidine>

Jason,

I have been fiddling with an RNAMotif parser and an ERPIN parser for a
number of years now; I plan on releasing it for inclusion in bioperl or
bioperl-run.  Right now, I think I may base them somewhat on your
Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From rahall2 at ualr.edu  Fri Jan 27 15:34:47 2006
From: rahall2 at ualr.edu (Roger Hall)
Date: Fri, 27 Jan 2006 14:34:47 -0600
Subject: [Bioperl-l] Requesting your issues with
	Module:Bio::Tools::Run::RemoteBlast
Message-ID: <008001c62381$1d844980$d416a790@LIBERAL>

All,

I have a fun little application written around this module to track new hits
for my favorite sequences, but it stopped working some time ago, so I have
finally adopted this orphaned module.

I have received very specific suggestions from Jason and Chris for
implementation, and plan to follow them in order to at least bring this
module into the wonderful world of XML. I would appreciate it if you would
send any additional features (and any known issues) my way.

Thanks!

Roger Hall

Technical Director

MidSouth Bioinformatics Center

University of Arkansas at Little Rock

(501) 569-8074

From cjfields at uiuc.edu  Fri Jan 27 20:03:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:03:15 -0600
Subject: [Bioperl-l] Requesting your issues
	withModule:Bio::Tools::Run::RemoteBlast
In-Reply-To: <008001c62381$1d844980$d416a790@LIBERAL>
Message-ID: <001101c623a6$9eb652d0$15327e82@pyrimidine>

The only real change made to RemoteBlast.pm was to the save_output method;
it wasn't saving XML output because the regex used to check the tempfile
output:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||
			 $l =~/^RPS-BLAST\s*.+$/i ) {
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

didn't check for XML.  I just added a check for XML that is the same as the
XML format check in the retrieve_blast method:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||  # NCBI BLAST
			$l =~/^RPS-BLAST\s*.+$/i || # RPS BLAST
			$l =~/<\?xml version=/) { # NCBI BLAST XML output
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

There is probably a better way to do this, but it works for now.  All other
fixes were made to SearchIO::blast.  That module is where most of the work
is done and which 'broke' recently from the BLAST version change at NCBI.
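
For anyone who wants to try the header check outside the module, here is a minimal self-contained sketch. The function names is_report_top and strip_preamble are made up for illustration and do not exist in RemoteBlast.pm; the sketch works on a list of lines instead of file handles and leaves out the HTML-skipping step.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the header test: true when a line
# looks like the top of a BLAST report, either plain text or XML.
sub is_report_top {
    my ($l) = @_;
    return 1 if $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i;    # NCBI BLAST text
    return 1 if $l =~ /^RPS-BLAST\s*.+$/i;             # RPS BLAST
    return 1 if $l =~ /<\?xml version=/;               # NCBI BLAST XML
    return 0;
}

# Drop everything before the first report line and keep the rest,
# mirroring the $seentop flag in the loop above.
sub strip_preamble {
    my @lines = @_;
    my $seentop = 0;
    my @kept;
    for my $l (@lines) {
        $seentop = 1 if is_report_top($l);
        push @kept, $l if $seentop;
    }
    return @kept;
}

my @raw = ("<p>\n", "QBlastInfoBegin\n",
           "<?xml version=\"1.0\"?>\n", "<BlastOutput>\n");
print strip_preamble(@raw);
```

Keeping the three patterns in one function would also let save_output and retrieve_blast share a single format check, if that turns out to be desirable.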

The only other things I can think of at the moment are the ones Jason
mentioned: switching to XML as the default (which I agree with) and possibly
incorporating the netblast client (blastcl3).  It might be possible to
branch off a similar module specifically geared towards the blastcl3 client,
maybe acting as a wrapper to parse the returned data using SearchIO, but I
don't necessarily think it would be best to include that in RemoteBlast.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Roger Hall
> Sent: Friday, January 27, 2006 2:35 PM
> To: Bioperl-L
> Subject: [Bioperl-l] Requesting your issues
> withModule:Bio::Tools::Run::RemoteBlast
> 
> All,
> 
> 
> 
> I have a fun little application written around this module to track new
> hits
> for my favorite sequences, but it stopped working some time ago, so I have
> finally adopted this orphaned module.
> 
> 
> 
> I have received very specific suggestions from Jason and Chris for
> implementation, and plan to follow them in order to at least bring this
> module into the wonderful world of XML. I would appreciate it if you would
> send any additional features (and any known issues) my way.
> 
> 
> 
> Thanks!
> 
> 
> 
> Roger Hall
> 
> Technical Director
> 
> MidSouth Bioinformatics Center
> 
> University of Arkansas at Little Rock
> 
> (501) 569-8074
> 

From torsten.seemann at infotech.monash.edu.au  Fri Jan 27 20:30:34 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 12:30:34 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000101c62385$0ddfc870$15327e82@pyrimidine>
References: <000101c62385$0ddfc870$15327e82@pyrimidine>
Message-ID: <43DAC93A.1000208@infotech.monash.edu.au>

Chris,

> I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> number of years now; I plan on releasing it for inclusion in bioperl or
> bioperl-run.  Right now, I think I may base them somewhat on your
> Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
> or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

 From my understanding, a module to _parse the output_ of some TOOL goes 
in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in 
Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL 
takes the TOOL output and creates a Bio::Tools::TOOL object with the 
result in it as a convenience.
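
As a rough illustration of that split, here is a toy sketch with two packages. The names My::Tools::Example and My::Tools::Run::Example are invented for this example, and the "program" being run is just an upper-casing stand-in, so nothing here reflects an actual BioPerl module.

```perl
use strict;
use warnings;

# Hypothetical parser class, following the Bio::Tools::TOOL convention:
# it only knows how to read the tool's output.
package My::Tools::Example;
sub new {
    my ($class, %args) = @_;
    my @lines = split /\n/, $args{-text};
    return bless { lines => \@lines }, $class;
}
sub next_line { my $self = shift; return shift @{ $self->{lines} } }

# Hypothetical wrapper class, following the Bio::Tools::Run::TOOL
# convention: it runs the program and hands the output to the parser.
package My::Tools::Run::Example;
sub new { my ($class) = @_; return bless {}, $class }
sub run {
    my ($self, $input) = @_;
    my $output = uc $input;    # stand-in for invoking an external program
    return My::Tools::Example->new(-text => $output);
}

package main;

my $parser = My::Tools::Run::Example->new->run("hit one\nhit two");
while (defined(my $line = $parser->next_line)) {
    print "$line\n";
}
```

The parser can be used on its own with saved output, while the wrapper depends on the parser but not the other way round.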

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From cjfields at uiuc.edu  Fri Jan 27 20:47:48 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:47:48 -0600
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DAC93A.1000208@infotech.monash.edu.au>
Message-ID: <000001c623ac$d7d07db0$15327e82@pyrimidine>

Yeah, forgot about that.  I just remember a discussion at one point a while
back about splitting off sections of bioperl core b/c some thought
bioperl-core was getting too big; I didn't want to get too deep into writing
code w/o asking.  Okay, then, that's settled.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: Torsten Seemann [mailto:torsten.seemann at infotech.monash.edu.au]
> Sent: Friday, January 27, 2006 7:31 PM
> To: Chris Fields
> Cc: 'bioperl-ml List'
> Subject: Re: [Bioperl-l] RNAMotif parser
> 
> Chris,
> 
> > I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> > number of years now; I plan on releasing it for inclusion in bioperl or
> > bioperl-run.  Right now, I think I may base them somewhat on your
> > Bio::Tools::QRNA module.  Should they be in bioperl
> (Bio::Tools::RNAMotif)
> > or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?
> 
>  From my understanding, a module to _parse the output_ of some TOOL goes
> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
> Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
> takes the TOOL output and creates a Bio::Tools::TOOL object with the
> result in it as a convenience.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

From torsten.seemann at infotech.monash.edu.au  Sat Jan 28 05:04:30 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 21:04:30 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000001c623ac$d7d07db0$15327e82@pyrimidine>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
Message-ID: <43DB41AE.30002@infotech.monash.edu.au>

>> From my understanding, a module to _parse the output_ of some TOOL goes
>>in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
>>takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>result in it as a convenience.

> Yeah, forgot about that.  I just remember a discussion at one point a while
> back about splitting off sections of bioperl core b/c some thought
> bioperl-core was getting too big; I didn't want to get too deep into writing
> code w/o asking.  Okay, then, that's settled.  

I think this is still true. Anything in Bio::Tools::Run namespace should 
be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast which 
are in bioperl-live core due to popularity). All the output parsers are 
in bioperl-live core.
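
A minimal sketch of that split, using a made-up tool called MyTool. Both package names, the -lines argument and the faked output line are hypothetical and only illustrate where the parser and the wrapper would live; this is not code from either repository.

```perl
use strict;
use warnings;

# Parser: would live in bioperl-live as Bio/Tools/MyTool.pm (hypothetical).
package Bio::Tools::MyTool;

sub new {
    my ($class, %args) = @_;
    # -lines is an illustrative argument, not a real BioPerl convention
    return bless { lines => [ @{ $args{-lines} || [] } ] }, $class;
}

# Return the next parsed record as a hash ref, or undef when exhausted.
sub next_result {
    my $self = shift;
    my $line = shift @{ $self->{lines} };
    return undef unless defined $line;
    my ($name, $score) = split /\t/, $line;
    return { name => $name, score => $score };
}

# Wrapper: would live in bioperl-run as Bio/Tools/Run/MyTool.pm (hypothetical).
package Bio::Tools::Run::MyTool;

sub new { return bless {}, shift }

# run() would normally execute the external program; here the output is
# faked so that the sketch stays self-contained.
sub run {
    my ($self, $seq_id) = @_;
    my @output = ("$seq_id\t42");
    return Bio::Tools::MyTool->new(-lines => \@output);
}

package main;

my $parser = Bio::Tools::Run::MyTool->new->run('seq1');
while ( my $res = $parser->next_result ) {
    print "$res->{name} scored $res->{score}\n";
}
```

The caller only ever talks to the wrapper's run() and gets a parser object back, which is the convenience described above.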

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From jason.stajich at duke.edu  Sat Jan 28 11:06:06 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat, 28 Jan 2006 11:06:06 -0500
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DB41AE.30002@infotech.monash.edu.au>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
	<43DB41AE.30002@infotech.monash.edu.au>
Message-ID: 

exactly!
On Jan 28, 2006, at 5:04 AM, Torsten Seemann wrote:

>>> From my understanding, a module to _parse the output_ of some  
>>> TOOL goes
>>> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>> Bio::Tools::Run::TOOL. Usually the run() method in  
>>> Bio::Tools::Run::TOOL
>>> takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>> result in it as a convenience.
>
>> Yeah, forgot about that.  I just remember a discussion at one  
>> point a while
>> back about splitting off sections of bioperl core b/c some thought
>> bioperl-core was getting too big; I didn't want to get too deep  
>> into writing
>> code w/o asking.  Okay, then, that's settled.
>
> I think this is still true. Anything in Bio::Tools::Run namespace  
> should
> be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast  
> which
> are in bioperl-live core due to popularity). All the output parsers  
> are
> in bioperl-live core.
>
>
> -- 
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From golharam at umdnj.edu  Sun Jan 29 12:48:34 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sun, 29 Jan 2006 12:48:34 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
Message-ID: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>

I can't figure this out from the documentation.  In fact, I'm not sure
it's possible:

I have a bunch of clustalw alignments in GCG (MSF) format.  Each
alignment consists of three sequences.  I want to get the sequences
including the gaps from the alignment.  

I'm trying to use Bio::AlignIO to read the alignment file, then trying
to get each sequence from the alignment. I tried doing this:

$seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
"align$x.clustalw");
my $aln = $seqio->next_aln();
$seq1 = $aln->next_seq()->seq;

Getting the sequence from the alignment isn't working and I'm not sure
how to do it.  Does anyone have any ideas as to what I might try?

--
Ryan Golhar  -  golharam at umdnj.edu
The Informatics Institute of UMDNJ

From cjfields at uiuc.edu  Sun Jan 29 14:44:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Jan 2006 13:44:22 -0600
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <294C9886-277B-4C35-AF7F-D6ABB3B401A3@uiuc.edu>

Even though you used clustalw for aligning the sequences, the output  
format is GCG (msf) and not clustalw (aln) format, so you need to  
change the '-format' flag you have set:

> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");

to

> $seqio = Bio::AlignIO->new(-format => 'msf', -file =>
> "align$x.clustalw");

See if that works.
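
If it is unclear which format a file is really in, a quick look at its header usually settles it. The following is only a heuristic sketch, assuming that Clustal W output starts with the word CLUSTAL and that GCG MSF files contain a header line with "MSF:" and "Check:"; guess_aln_format is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# Guess the Bio::AlignIO format name from the first lines of a file.
# Heuristic only: Clustal W output starts with "CLUSTAL", while GCG MSF
# files carry a header line containing "MSF:" and "Check:".
sub guess_aln_format {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $format;
    while ( my $line = <$fh> ) {
        if ( $line =~ /^CLUSTAL/ )     { $format = 'clustalw'; last }
        if ( $line =~ /MSF:.*Check:/ ) { $format = 'msf';      last }
        last if $. > 50;    # the header should appear early
    }
    close $fh;
    return $format;         # undef when nothing matched
}

if ( @ARGV ) {
    my $format = guess_aln_format( $ARGV[0] );
    print defined $format ? "$format\n" : "unknown\n";
}
```

The returned string is meant to be passed as the -format argument to Bio::AlignIO->new.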

On Jan 29, 2006, at 11:48 AM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason.stajich at duke.edu  Sun Jan 29 14:49:20 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun, 29 Jan 2006 14:49:20 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw',
                              -file   => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have  
implemented for alignments, since you can get all the sequences out with  
the each_seq method.

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From biophp at biophp.org  Fri Jan 27 08:20:31 2006
From: biophp at biophp.org (Joseba Bikandi)
Date: Fri, 27 Jan 2006 08:20:31 -0500
Subject: [Bioperl-l] BioPHP.org - open source repository of code and scripts
Message-ID: 

Dear Sir/Madam,

I would like to let you know about biophp.org, 
an open source project which may be interesting 
for you. It is a new project which includes 
PHP code (functions) and minitools (copy and
paste one page scripts). 

Sincerely,

......
Joseba Bikandi
biophp at biophp.org

From golharam at umdnj.edu  Mon Jan 30 12:40:58 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Jan 2006 12:40:58 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>
Message-ID: <003701c625c4$5527d790$2f01a8c0@GOLHARMOBILE1>

Thanks.  Here's what I ended up doing:

$seqio = Bio::AlignIO->new(-format => 'msf', -file =>
"alnfile.clustalw");
my $aln = $seqio->next_aln();
@_ = $aln->each_seq_with_id('org1');
$seq1 = $_[0]->seq;
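
A slightly more defensive variant of the same idea, sketched as a small helper that avoids assigning to @_ and copes with an id that is not in the alignment. gapped_seq_for_id is a hypothetical name; the only BioPerl call it relies on is each_seq_with_id, as used above.

```perl
use strict;
use warnings;

# Return the gapped sequence string for the first sequence in $aln whose
# display id is $id, or undef if the alignment has no such sequence.
# $aln is expected to be a Bio::SimpleAlign, as returned by next_aln().
sub gapped_seq_for_id {
    my ($aln, $id) = @_;
    my ($seq) = $aln->each_seq_with_id($id);   # list context: first match
    return defined $seq ? $seq->seq : undef;
}

# Intended use (requires BioPerl; file name taken from the message above):
#   my $alnio = Bio::AlignIO->new(-format => 'msf',
#                                 -file   => 'alnfile.clustalw');
#   my $aln   = $alnio->next_aln;
#   my $seq1  = gapped_seq_for_id($aln, 'org1');
```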

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
Sent: Sunday, January 29, 2006 2:49 PM
To: golharam at umdnj.edu
Cc: 'bioperl-l'
Subject: Re: [Bioperl-l] Parsing clustalw alignments

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw',
                              -file   => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have  
implemented for alignments, since you can get all the sequences out with  
the each_seq method.

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org 
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

From alindeman at gmail.com  Mon Jan 30 23:00:32 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Mon, 30 Jan 2006 22:00:32 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
References: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
Message-ID: <3f3ecb5a0601302000j7a3fbd4y1739a3c1696e30aa@mail.gmail.com>

Hi all--

Is it possible to use two different glyphs (or the same glyph with
different properties) on the same panel track?

Thanks

--A

From Marc.Logghe at DEVGEN.com  Tue Jan 31 03:08:09 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Tue, 31 Jan 2006 09:08:09 +0100
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>

Hi Andy
> Is it possible to use two different glyphs (or the same glyph 
> with different properties) on the same panel track?
Sure it is. This extract comes from the docs of Bio::Graphics::Panel

" There are a large number of glyph types.  By default, each track will
be homogeneous on a single glyph type, but you can mix several glyph
types on the same track by providing a code reference to the -glyph
argument.  Other options passed to add_track() control the color and
size of the glyphs, whether they are allowed to overlap, and other
formatting attributes.  The height of a track is determined from its
contents and cannot be directly influenced."
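
A minimal sketch of such a code reference, assuming the callback receives the feature as its first argument and that features should be distinguished by primary_tag. The tag-to-glyph mapping is an illustrative assumption; choose_glyph is a hypothetical name.

```perl
use strict;
use warnings;

# Pick a glyph name per feature. The mapping from primary_tag to glyph is
# an illustrative assumption; 'generic' is used as the fallback.
my %glyph_for = (
    gene => 'transcript2',
    exon => 'box',
);

sub choose_glyph {
    my ($feature) = @_;
    return $glyph_for{ $feature->primary_tag } || 'generic';
}

# Intended use (requires Bio::Graphics; $panel and @features are assumed
# to exist already):
#   $panel->add_track(\@features,
#                     -glyph   => \&choose_glyph,
#                     -bgcolor => 'blue');
```

Other per-feature properties (colour, height and so on) can be varied in the same way by passing a code reference for the corresponding option.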

HTH,
Marc

From alindeman at gmail.com  Tue Jan 31 14:59:00 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Tue, 31 Jan 2006 13:59:00 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
Message-ID: <3f3ecb5a0601311159k6d7f09d3j65732b5e72019e9d@mail.gmail.com>

Wonderful!

Thanks.

--A

On 1/31/06, Marc Logghe  wrote:
> Hi Andy
> > Is it possible to use two different glyphs (or the same glyph
> > with different properties) on the same panel track?
> Sure it is. This extract comes from the docs of Bio::Graphics::Panel
>
> " There are a large number of glyph types.  By default, each track will
> be homogeneous on a single glyph type, but you can mix several glyph
> types on the same track by providing a code reference to the -glyph
> argument.  Other options passed to add_track() control the color and
> size of the glyphs, whether they are allowed to overlap, and other
> formatting attributes.  The height of a track is determined from its
> contents and cannot be directly influenced."
>
> HTH,
> Marc
>

From hubert.prielinger at gmx.at  Tue Jan 24 15:49:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 14:49:07 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D63FB6.4090505@scitegic.com>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com>
Message-ID: <43D692C3.80306@gmx.at>

Hi,
thank you very much for the help. I have tried to run blastall on the 
command line, but I can't even execute the binary, even though the 
blastall executable has every permission set.
I always get the error message: blastall: cannot execute binary file
Does the executable need to be somewhere else, on another path? Right 
now it is located under /home/Hubert/blast/blast-2.2.13/bin
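
For what it is worth, the number 32256 in the "blastall call crashed" exception quoted further down can be decoded. This sketch assumes the number is the raw wait status that Perl puts in $? after system(); decode_status is a hypothetical helper.

```perl
use strict;
use warnings;

# Split a raw wait status (the value of $? after system()) into its parts.
sub decode_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,      # high byte: exit code of the child
        signal    => $status & 127,     # low 7 bits: terminating signal
        core_dump => ( $status & 128 ) ? 1 : 0,
    };
}

my $info = decode_status(32256);
printf "exit code %d, signal %d\n", $info->{exit_code}, $info->{signal};
```

Under that assumption the exit code is 126, which POSIX shells use for a command that was found but could not be executed. That would match the "cannot execute binary file" message and points at the binary itself rather than at permissions or the path.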

thanks
Hubert

Scott Markel wrote:

> Hubert,
>
> If you look at the MSG line in the exception you can see
> exactly what the command line was.  Nagesh is pointing out
> that you used -d "/nr" and asking if that's what you want.
> I suspect that the '/' shouldn't be there.
>
> Try invoking blastall directly from the command line.  All
> BioPerl is doing is invoking BLAST on your behalf.  The
> same command line that BioPerl uses should also work for
> you on the command line.
>
> Scott
>
> Hubert Prielinger wrote:
>
>> hi,
>> sorry, but what do you mean with is your blast database in /nr...
>> my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>
>>
>>
>> Nagesh Chakka wrote:
>>
>>> Can you just run the blast from the command line.
>>> Is your blast database in "/nr".
>>>
>>> Hubert Prielinger wrote:
>>>
>>>> Hi Nagesh,
>>>> thank you very much, I put my database into the data folder, run 
>>>> the program and got the following error message:
>>>>
>>>> submit Sequence...just do it....
>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>> binary file
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: blastall call crashed: 32256 
>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>> 1000
>>>>
>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>> STACK toplevel 
>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>
>>>>
>>>> --------------------------------------
>>>>
>>>> Why it did not find my binary file, but it is there
>>>>
>>>> regards
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi,
>>>>> The following is from the StandAloneBlast.pm documentation
>>>>> " If the databases which will be searched by BLAST are located in the
>>>>> data subdirectory of the blast program directory (the default
>>>>> installation location), StandAloneBlast will find them; however, 
>>>>> if the
>>>>> database files are located in any other location, environmental 
>>>>> variable
>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>> Please note that I have not used this module before.
>>>>> Nagesh
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>  
>>>>>
>>>>>> Hi,
>>>>>> thank you very much for the help, another questions that raises 
>>>>>> up, do I have to write the path to the database files as well, I 
>>>>>> guess so, but how I do that, the same way I write the path to teh 
>>>>>> blast bin files?
>>>>>> Does anybody know how to set the Composition based statistics 
>>>>>> parameter?
>>>>>> there is my code:
>>>>>>
>>>>>> #!/usr/bin/perl -w
>>>>>>
>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>> use Bio::Seq;
>>>>>> use Bio::SeqIO;
>>>>>> use strict;
>>>>>>
>>>>>> BEGIN
>>>>>> {
>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>> }
>>>>>>
>>>>>>
>>>>>> # parameters
>>>>>> my $expect_value = 20000;
>>>>>> #my $filter_query_sequence = 'F';
>>>>>> my $one_line_description = 1000;
>>>>>> my $alignments = 1000;
>>>>>> # my $strands = 1;
>>>>>> my $count = 1;
>>>>>>
>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>> #my $progress_interval = 100;
>>>>>>
>>>>>>
>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>  -file   => "Perm.txt",
>>>>>>  -format => "raw",
>>>>>> );
>>>>>>
>>>>>> # create factory object and set parameters
>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>
>>>>>> $factory->e($expect_value);
>>>>>> #$factory->F($filter_query_sequence);
>>>>>> $factory->v($one_line_description);
>>>>>> $factory->b($alignments);
>>>>>> #$factory->S($strands);
>>>>>>
>>>>>>
>>>>>> # get query
>>>>>>
>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>      my $filename = "comp_$count.txt";
>>>>>>      my $factory->outfile($filename);
>>>>>>      print $query->seq;
>>>>>>      print "\n";
>>>>>>
>>>>>>  $count++;
>>>>>> }
>>>>>>
>>>>>> thank you very much in advance
>>>>>> Hubert
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nagesh Chakka wrote:
>>>>>>
>>>>>>  
>>>>>>
>>>>>>> Hi Hubert,
>>>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>> the following files
>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>> files. I have no trouble in running standalone blast. You are 
>>>>>>> not required to run formardb on the downloaded blast databases 
>>>>>>> and that may be the reason why the sequences are not included as 
>>>>>>> it will also reduce the size of the file.
>>>>>>> Did you try to run a blast search, if so is it giving you any 
>>>>>>> errors?
>>>>>>> Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hubert Prielinger wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>> locally, now I'm supposed to index the database with formatdb, 
>>>>>>>> but it doesn't work...
>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>> to use for searching the database, but when I uncompressed the 
>>>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>> Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>> nr database...
>>>>>>>>
>>>>>>>> Help is very appreciated
>>>>>>>> Thank you very much in advance
>>>>>>>>
>>>>>>>> Hubert
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>       
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  
>>>>>
>>>>
>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>

From hubert.prielinger at gmx.at  Tue Jan 24 16:15:38 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:15:38 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D698FA.3090904@gmx.at>

Hi Alex,
I did as you recommended and got the following output:

[Hubert@ppc7 ~]$ file /home/Hubert/blast/blast-2.2.13/bin/blastall
/home/Hubert/blast/blast-2.2.13/bin/blastall: ELF 64-bit LSB executable, 
AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked 
(uses shared libs), for GNU/Linux 2.4.1, not stripped
[Hubert@ppc7 ~]$

Does this mean that it is compatible with the operating system?

thanks for help
Hubert

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex
>
> Hubert Prielinger wrote:
>
>>Hi,
>>thank you very much for the help, I have tried to run the blastall on 
>>commandline, but I can't even execute the binary file, nevertheless the 
>>blastall exe file have every permission...
>>I always get the error message: blastall: cannot execute the binary file
>>Need to be the exe file somewhere else, another path...now it is located 
>>under /home/Hubert/blast/blast-2.2.13/bin
>>
>>thanks
>>Hubert
>>
>>
>>
>>
>>
>>Scott Markel wrote:
>>
>>    
>>
>>>Hubert,
>>>
>>>If you look at the MSG line in the exception you can see
>>>exactly what the command line was.  Nagesh is pointing out
>>>that you used -d "/nr" and asking if that's what you want.
>>>I suspect that the '/' shouldn't be there.
>>>
>>>Try invoking blastall directly from the command line.  All
>>>BioPerl is doing is invoking BLAST on your behalf.  The
>>>same command line that BioPerl uses should also work for
>>>you on the command line.
>>>
>>>Scott
>>>
>>>Hubert Prielinger wrote:
>>>
>>>      
>>>
>>>>hi,
>>>>sorry, but what do you mean with is your blast database in /nr...
>>>>my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>>
>>>>
>>>>
>>>>Nagesh Chakka wrote:
>>>>
>>>>        
>>>>
>>>>>Can you just run the blast from the command line.
>>>>>Is your blast database in "/nr".
>>>>>
>>>>>Hubert Prielinger wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Hi Nagesh,
>>>>>>thank you very much, I put my database into the data folder, run 
>>>>>>the program and got the following error message:
>>>>>>
>>>>>>submit Sequence...just do it....
>>>>>>sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>>>binary file
>>>>>>
>>>>>>------------- EXCEPTION  -------------
>>>>>>MSG: blastall call crashed: 32256 
>>>>>>/home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>>>-i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>>>1000
>>>>>>
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>>>STACK toplevel 
>>>>>>/home/Hubert/installed/eclipse/workspace/Database_Search/standalo
>>>>>>ne_blast.pl:46 
>>>>>>
>>>>>>
>>>>>>--------------------------------------
>>>>>>
>>>>>>Why it did not find my binary file, but it is there
>>>>>>
>>>>>>regards
>>>>>>
>>>>>>Nagesh Chakka wrote:
>>>>>>
>>>>>>            
>>>>>>
>>>>>>>Hi,
>>>>>>>The following is from the StandAloneBlast.pm documentation
>>>>>>>" If the databases which will be searched by BLAST are located in the
>>>>>>>data subdirectory of the blast program directory (the default
>>>>>>>installation location), StandAloneBlast will find them; however, 
>>>>>>>if the
>>>>>>>database files are located in any other location, environmental 
>>>>>>>variable
>>>>>>>$BLASTDATADIR will need to be set to point to that directory."
>>>>>>>Please note that I have not used this module before.
>>>>>>>Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>> 
>>>>>>>
>>>>>>>              
>>>>>>>
>>>>>>>>Hi,
>>>>>>>>thank you very much for the help, another questions that raises 
>>>>>>>>up, do I have to write the path to the database files as well, I 
>>>>>>>>guess so, but how I do that, the same way I write the path to teh 
>>>>>>>>blast bin files?
>>>>>>>>Does anybody know how to set the Composition based statistics 
>>>>>>>>parameter?
>>>>>>>>there is my code:
>>>>>>>>
>>>>>>>>#!/usr/bin/perl -w
>>>>>>>>
>>>>>>>>use Bio::Tools::Run::StandAloneBlast;
>>>>>>>>use Bio::Seq;
>>>>>>>>use Bio::SeqIO;
>>>>>>>>use strict;
>>>>>>>>
>>>>>>>>BEGIN
>>>>>>>>{
>>>>>>>>   $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>>>}
>>>>>>>>
>>>>>>>>
>>>>>>>># parameters
>>>>>>>>my $expect_value = 20000;
>>>>>>>>#my $filter_query_sequence = 'F';
>>>>>>>>my $one_line_description = 1000;
>>>>>>>>my $alignments = 1000;
>>>>>>>># my $strands = 1;
>>>>>>>>my $count = 1;
>>>>>>>>
>>>>>>>>my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>>>#my $progress_interval = 100;
>>>>>>>>
>>>>>>>>
>>>>>>>>my $seqio_obj = Bio::SeqIO->new(
>>>>>>>> -file   => "Perm.txt",
>>>>>>>> -format => "raw",
>>>>>>>>);
>>>>>>>>
>>>>>>>># create factory
>>>>>>>> object and set parameters
>>>>>>>>my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>>
>>>>>>>>$factory->e($expect_value);
>>>>>>>>#$factory->F($filter_query_sequence);
>>>>>>>>$factory->v($one_line_description);
>>>>>>>>$factory->b($alignments);
>>>>>>>>#$factory->S($strands);
>>>>>>>>
>>>>>>>>
>>>>>>>># get query
>>>>>>>>
>>>>>>>>while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>>     my $blast_report = $factory->blastall($query);
>>>>>>>>     my $filename = "comp_$count.txt";
>>>>>>>>     my $factory->outfile($filename);
>>>>>>>>     print $query->seq;
>>>>>>>>     print "\n";
>>>>>>>>
>>>>>>>> $count++;
>>>>>>>>}
>>>>>>>>
>>>>>>>>thank you very much in advance
>>>>>>>>Hubert
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>Nagesh Chakka wrote:
>>>>>>>>
>>>>>>>> 
>>>>>>>>
>>>>>>>>                
>>>>>>>>
>>>>>>>>>Hi Hubert,
>>>>>>>>>I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>>>the following files
>>>>>>>>>.phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>>>>files. I have no trouble in running standalone blast. You are 
>>>>>>>>>not required to run formardb on the downloaded blast databases 
>>>>>>>>>and that may be the reason why the sequences are not included as 
>>>>>>>>>it will also reduce the size of the file.
>>>>>>>>>Did you try to run a blast search, if so is it giving you any 
>>>>>>>>>errors?
>>>>>>>>>Nagesh
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Hubert Prielinger wrote:
>>>>>>>>>
>>>>>>>>> 
>>>>>>>>>
>>>>>>>>>                  
>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>I have downloaded the nr database for doing a blast search 
>>>>>>>>>>locally, now I'm supposed to index the database with formatdb, 
>>>>>>>>>>but it doesn't work...
>>>>>>>>>>The online help says that you need a fasta file that is indexed 
>>>>>>>>>>to use for searching the database, but when I uncompressed the 
>>>>>>>>>>zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>>>>Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>>>>nr database...
>>>>>>>>>>
>>>>>>>>>>Help is very appreciated
>>>>>>>>>>Thank you very much in advance
>>>>>>>>>>
>>>>>>>>>>Hubert
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>Bioperl-l mailing list
>>>>>>>>>>Bioperl-l at portal.open-bio.org
>>>>>>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>>>      
>>>>>>>>>>                    
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    
>>>>>>>>>                  
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
>>>>>>>>
>>>>>>> 
>>>>>>>
>>>>>>>              
>>>>>>>
>>>>>          
>>>>>
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l at portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>        
>>>>
>>
>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l at lists.open-bio.org
>>http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>    
>>
>

From hubert.prielinger at gmx.at  Tue Jan 24 16:24:51 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:24:51 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D69B23.9010100@gmx.at>

Hi,
I'm very sorry for wasting your time, but I just figured out what 
happened: I installed the 64-bit version instead of the 32-bit version.
Sorry for the inconvenience, and thanks for the help.
I'm now trying to fix the problem with the database.
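
In case it helps someone else hitting the same error, the check that the "file" command performs can be sketched in a few lines of Perl. This is a rough illustration that only reads the ELF identification bytes (magic number, word size, byte order); elf_info is a hypothetical helper and it does not look at the CPU type.

```perl
use strict;
use warnings;

# Lookup tables for the EI_CLASS and EI_DATA bytes of the ELF header.
my %BITS   = ( 1 => 32,    2 => 64 );
my %ENDIAN = ( 1 => 'LSB', 2 => 'MSB' );

# Read the ELF identification bytes of a file and report its word size
# and byte order. Returns undef if the file is not an ELF binary.
sub elf_info {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read $fh, my $ident, 6;
    close $fh;
    return undef unless defined $got && $got == 6;
    my ($magic, $class, $data) = unpack 'a4 C C', $ident;
    return undef unless $magic eq "\x7fELF";
    return { bits => $BITS{$class}, endian => $ENDIAN{$data} };
}

if ( @ARGV ) {
    my $info = elf_info( $ARGV[0] );
    print $info ? "ELF $info->{bits}-bit $info->{endian}\n" : "not ELF\n";
}
```

Comparing the reported word size with the output of "uname -m" on the machine shows whether a 64-bit binary has been installed on a 32-bit system.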

Sorry
Hubert

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex
>
> Hubert Prielinger wrote:
>
>>Hi,
>>thank you very much for the help, I have tried to run the blastall on 
>>commandline, but I can't even execute the binary file, nevertheless the 
>>blastall exe file have every permission...
>>I always get the error message: blastall: cannot execute the binary file
>>Need to be the exe file somewhere else, another path...now it is located 
>>under /home/Hubert/blast/blast-2.2.13/bin
>>
>>thanks
>>Hubert
>>
>>
>>
>>
>>
>>Scott Markel wrote:
>>
>>    
>>
>>>Hubert,
>>>
>>>If you look at the MSG line in the exception you can see
>>>exactly what the command line was.  Nagesh is pointing out
>>>that you used -d "/nr" and asking if that's what you want.
>>>I suspect that the '/' shouldn't be there.
>>>
>>>Try invoking blastall directly from the command line.  All
>>>BioPerl is doing is invoking BLAST on your behalf.  The
>>>same command line that BioPerl uses should also work for
>>>you on the command line.
>>>
>>>Scott
>>>
>>>Hubert Prielinger wrote:
>>>
>>>      
>>>
>>>>hi,
>>>>sorry, but what do you mean by asking whether my blast database is in /nr?
>>>>My database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>>
>>>>
>>>>
>>>>Nagesh Chakka wrote:
>>>>
>>>>        
>>>>
>>>>>Can you run blast directly from the command line?
>>>>>Is your blast database in "/nr"?
>>>>>
>>>>>Hubert Prielinger wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Hi Nagesh,
>>>>>>thank you very much. I put my database into the data folder, ran 
>>>>>>the program and got the following error message:
>>>>>>
>>>>>>submit Sequence...just do it....
>>>>>>sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>>>binary file
>>>>>>
>>>>>>------------- EXCEPTION  -------------
>>>>>>MSG: blastall call crashed: 32256 
>>>>>>/home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>>>-i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>>>1000
>>>>>>
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>>>STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>>>/usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>>>STACK toplevel 
>>>>>>/home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>>>
>>>>>>
>>>>>>--------------------------------------
>>>>>>
>>>>>>Why did it not find my binary file? It is there.
>>>>>>
>>>>>>regards
>>>>>>
>>>>>>Nagesh Chakka wrote:
>>>>>>
>>>>>>            
>>>>>>
>>>>>>>Hi,
>>>>>>>The following is from the StandAloneBlast.pm documentation
>>>>>>>" If the databases which will be searched by BLAST are located in the
>>>>>>>data subdirectory of the blast program directory (the default
>>>>>>>installation location), StandAloneBlast will find them; however, if 
>>>>>>>the database files are located in any other location, environmental 
>>>>>>>variable $BLASTDATADIR will need to be set to point to that directory."
>>>>>>>Please note that I have not used this module before.
>>>>>>>Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>> 
>>>>>>>
>>>>>>>              
>>>>>>>
>>>>>>>>Hi,
>>>>>>>>thank you very much for the help. Another question has come up: do I 
>>>>>>>>have to give the path to the database files as well? I guess so, but 
>>>>>>>>how do I do that? The same way I set the path to the blast bin 
>>>>>>>>files?
>>>>>>>>Does anybody know how to set the composition-based statistics 
>>>>>>>>parameter?
>>>>>>>>Here is my code:
>>>>>>>>
>>>>>>>>#!/usr/bin/perl -w
>>>>>>>>
>>>>>>>>use Bio::Tools::Run::StandAloneBlast;
>>>>>>>>use Bio::Seq;
>>>>>>>>use Bio::SeqIO;
>>>>>>>>use strict;
>>>>>>>>
>>>>>>>>BEGIN
>>>>>>>>{
>>>>>>>>   $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>>>}
>>>>>>>>
>>>>>>>>
>>>>>>>># parameters
>>>>>>>>my $expect_value = 20000;
>>>>>>>>#my $filter_query_sequence = 'F';
>>>>>>>>my $one_line_description = 1000;
>>>>>>>>my $alignments = 1000;
>>>>>>>># my $strands = 1;
>>>>>>>>my $count = 1;
>>>>>>>>
>>>>>>>>my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>>>#my $progress_interval = 100;
>>>>>>>>
>>>>>>>>
>>>>>>>>my $seqio_obj = Bio::SeqIO->new(
>>>>>>>> -file   => "Perm.txt",
>>>>>>>> -format => "raw",
>>>>>>>>);
>>>>>>>>
>>>>>>>># create factory object and set parameters
>>>>>>>>my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>>
>>>>>>>>$factory->e($expect_value);
>>>>>>>>#$factory->F($filter_query_sequence);
>>>>>>>>$factory->v($one_line_description);
>>>>>>>>$factory->b($alignments);
>>>>>>>>#$factory->S($strands);
>>>>>>>>
>>>>>>>>
>>>>>>>># get query
>>>>>>>>
>>>>>>>>while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>>     # set the output file before running the search
>>>>>>>>     my $filename = "comp_$count.txt";
>>>>>>>>     $factory->outfile($filename);
>>>>>>>>     my $blast_report = $factory->blastall($query);
>>>>>>>>     print $query->seq;
>>>>>>>>     print "\n";
>>>>>>>>
>>>>>>>>     $count++;
>>>>>>>>}
>>>>>>>>
>>>>>>>>thank you very much in advance
>>>>>>>>Hubert
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>Nagesh Chakka wrote:
>>>>>>>>
>>>>>>>> 
>>>>>>>>
>>>>>>>>                
>>>>>>>>
>>>>>>>>>Hi Hubert,
>>>>>>>>>I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>>>the following files:
>>>>>>>>>.phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal. 
>>>>>>>>>I have no trouble running standalone blast. You are 
>>>>>>>>>not required to run formatdb on the downloaded blast databases, 
>>>>>>>>>and that may be the reason why the sequences are not included, as 
>>>>>>>>>it also reduces the size of the file.
>>>>>>>>>Did you try to run a blast search? If so, is it giving you any 
>>>>>>>>>errors?
>>>>>>>>>Nagesh
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Hubert Prielinger wrote:
>>>>>>>>>
>>>>>>>>> 
>>>>>>>>>
>>>>>>>>>                  
>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>I have downloaded the nr database for doing a blast search 
>>>>>>>>>>locally. Now I'm supposed to index the database with formatdb, 
>>>>>>>>>>but it doesn't work.
>>>>>>>>>>The online help says that you need a fasta file that is indexed 
>>>>>>>>>>to use for searching the database, but when I uncompressed the 
>>>>>>>>>>zip file, there were only .phr, .pnd, .pin, .pni and .ppd files.
>>>>>>>>>>Can anybody tell me how to use formatdb with the 
>>>>>>>>>>nr database?
>>>>>>>>>>
>>>>>>>>>>Help is much appreciated
>>>>>>>>>>Thank you very much in advance
>>>>>>>>>>
>>>>>>>>>>Hubert
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>Bioperl-l mailing list
>>>>>>>>>>Bioperl-l at portal.open-bio.org
>>>>>>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l

From smarkel at scitegic.com  Tue Jan 24 17:09:57 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Tue, 24 Jan 2006 14:09:57 -0800
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D692C3.80306@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com> <43D692C3.80306@gmx.at>
Message-ID: <43D6A5B5.8090106@scitegic.com>

Hubert,

Since you can't run blastall on the command line, your initial
problem has nothing to do with BioPerl.  Once you get blastall
working on the command line, you'll know what directories and
environment variable settings to use when running via BioPerl.

What happens when you run the following?

   file /home/Hubert/blast/blast-2.2.13/bin/blastall

Is the executable the correct one for your operating system?

Scott

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com
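
A note on the numbers in this thread: the 32256 in "blastall call crashed:
32256" appears to be a raw wait status of the kind Perl keeps in $?, so the
exit code would be 32256 >> 8 = 126, which shells conventionally use for
"found but cannot be executed".  The sketch below decodes such a status and
reads the ELF class byte of a file, which is roughly the 32-bit/64-bit part
of what the "file" command reports.  Both helper names are made up for
illustration and are not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status (as in Perl's $?) into
# the exit code and the terminating signal number.
sub decode_wait_status {
    my ($status) = @_;
    return ( $status >> 8, $status & 127 );
}

# Hypothetical helper: return 32 or 64 for an ELF file, or undef if the
# file cannot be read or does not start with the ELF magic bytes.
sub elf_class {
    my ($path) = @_;
    open( my $fh, '<:raw', $path ) or return undef;
    my $header = '';
    my $n = read( $fh, $header, 5 );
    close $fh;
    return undef unless defined $n && $n == 5;
    my ( $magic, $class ) = unpack 'a4 C', $header;
    return undef unless $magic eq "\x7fELF";
    # e_ident[EI_CLASS]: 1 = 32-bit objects, 2 = 64-bit objects
    return $class == 1 ? 32 : $class == 2 ? 64 : undef;
}

my ( $exit, $signal ) = decode_wait_status(32256);
print "exit code $exit, signal $signal\n";    # exit code 126, signal 0
```

Calling elf_class() on the blastall path from the thread would be one way to
see from inside a script whether a 64-bit binary has been installed on a
32-bit system; running "file" on it, as suggested above, gives more detail.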

From cjfields at uiuc.edu  Tue Jan 24 17:21:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Jan 2006 16:21:22 -0600
Subject: [Bioperl-l] RemoteBlast.pm and Bio::SearchIO::blast.pm
	-partially resolved
In-Reply-To: <18966F80-B780-4661-953E-613B05B56164@duke.edu>
Message-ID: <000301c62134$81cdc500$15327e82@pyrimidine>

Jason, 

I have worked out all the problems with RemoteBlast.pm and posted a patched
version to Bugzilla (http://bugzilla.bioperl.org/show_bug.cgi?id=1935).  The
main problem was that RemoteBlast::save_output was not looking for XML
output when dumping from the tempfile to the saved file (it only looked for
the text header).  That is fixed.  The other problems mentioned were due to
differences in mapping key=>value pairs between blast and blastxml and a
problem in my own script.  It passed all tests using 'perl t/RemoteBlast.t'
with debugging set.
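
To illustrate the kind of check involved (this is not the patched
RemoteBlast code, only a sketch with a made-up helper name), one could guess
the report format from the first non-blank line of the downloaded text.  It
assumes that XML output starts with an "<?xml" declaration and that text
reports start with the program name, e.g. "BLASTN 2.2.13"; the real tempfile
may carry other lines first.

```perl
use strict;
use warnings;

# Hypothetical helper: return the SearchIO format name that matches the
# first non-blank line of a report, or 'unknown' if neither pattern fits.
sub guess_report_format {
    my ($text) = @_;
    for my $line ( split /\n/, $text ) {
        next unless $line =~ /\S/;
        return 'blastxml' if $line =~ /^\s*<\?xml\b/;
        return 'blast'    if $line =~ /^\s*T?BLAST[NPX]\b/;
        return 'unknown';
    }
    return 'unknown';
}

print guess_report_format(qq{<?xml version="1.0"?>\n<BlastOutput>\n}), "\n";  # blastxml
print guess_report_format("BLASTN 2.2.13 [Nov-27-2005]\n"), "\n";            # blast
```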

See if anybody else out there can test them out.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Jason Stajich
> Sent: Tuesday, January 24, 2006 11:16 AM
> To: Chris Fields
> Cc: bioperl-ml List
> Subject: [Bioperl-l] Re: RemoteBlast.pm and Bio::SearchIO::blast.pm -
> partially resolved
> 
> Thanks Chris - I don't know when I'll have time to check in bugs so
> anyone else who has commit access feel free to give these a whirl and
> check in.
> 
> I would propose making the XML default but allowing the text version
> to still be supported in the event that someone has setup their own
> local NCBI BLAST Web interface which still supports the simple Text
> output.
> 
> -j
> 
> On Jan 24, 2006, at 12:09 PM, Chris Fields wrote:
> 
> > I submitted two bugs on Bugzilla to describe recent problems with
> > RemoteBlast.pm and SearchIO::blast.pm
> >
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1935
> >
> > Today I submitted a patched version of Bio::SearchIO::blast.pm which
> > should fix the text parsing issue for old (2.2.12) and new (2.2.13)
> > versions of NCBI's BLAST; the bug link above describes the problem and
> > the fix.  Problem is, I know it will likely break again b/c NCBI will
> > probably change text output in a future BLAST version.  I also agree
> > with Jason about changing the default for SearchIO to XML.  So, does
> > text output parsing through blast.pm need to be deprecated in favor of
> > XML, or should both be available?
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

From jason.stajich at duke.edu  Tue Jan 24 16:44:34 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 16:44:34 -0500
Subject: [Bioperl-l] new mailing list server
Message-ID: <50E14815-266E-4ACB-8E6E-293C9EB33476@duke.edu>

Chris Dagdigian has switched our mailing lists over to a new server  
to upgrade us to newer hardware.  With the switch, the default mailing  
list server name is now 'lists.open-bio.org' instead of  
'portal.open-bio.org'.  That should be the only change you notice at  
the bottom of your mails.  Mail sent to any of those addresses should  
still get delivered (although @bioperl.org is preferred).

We hope this changeover will help improve the performance and  
scalability of our mail and webservices.

We also will aim to move the developer read-write CVS server to a new  
machine in the coming weeks.  We hope this will only be a minor  
inconvenience but will allow us to move to a more recent operating  
system and larger disk space.

If you have questions or concerns they can be directed to support AT  
open-bio.org
-jason
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Tue Jan 24 22:31:38 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 22:31:38 -0500
Subject: [Bioperl-l] new website launched
Message-ID: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>

I am pleased to announce the release of a new website for BioPerl.   
The site is based on the mediawiki software that was developed for  
the wikipedia project.  We intend the site to be a place for  
community input on documentation and design for the BioPerl project.   
There is also a fair amount of documentation started surrounding  
bioinformatics tools and techniques applicable to using BioPerl and  
some of the authors who created these resources.

The website continues to be at the URL http://www.bioperl.org.  The  
DNS updates may take up to 24 hours to reach everyone.

The initial content of the site is the result of the work of myself,  
Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We  
encourage you to contribute to the site's content by signing up for  
an account.

There are several guides covering the style of the site and, for  
example, how to link to Module pages, which can contain additional  
information from the POD:
http://bioperl.org/wiki/Module:Bio::SeqIO

You'll notice that many of the paths have changed, but the DIST and  
SRC directories continue to be available at http://bioperl.org/DIST and  
http://bioperl.org/SRC.  The HOWTOs are now available from  
http://bioperl.org/wiki/HOWTOs

The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage  
you to add your questions to it so they can be properly archived and  
addressed.

We also have initiated a News site for Bioperl for posting  
announcements regarding development and software.  I would like to  
see if there are volunteers to post weekly or monthly summaries of  
mailing list traffic and development.
http://www.bioperl.org/news/

Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,  
Torsten Seemann.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From roy at colibase.bham.ac.uk  Wed Jan 25 12:05:29 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Wed, 25 Jan 2006 17:05:29 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601182120.k0ILIl8X022324@portal.open-bio.org>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
Message-ID: <43D7AFD9.2020305@colibase.bham.ac.uk>

Hi all.

I also had need of a function to concatenate two Bio::Seq objects, so had a go
at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted below. I'm
not too sure about the concept of sub-SeqFeatures (I've never seen any sequence
that had more than one level of feature)- I worked on the assumption that little
sub-SeqFeatures can have littler sub-SeqFeatures and so ad infinitum, but as I
don't have an example file I haven't been able to test if this works. Likewise,
although I think the code should cope with Fuzzy and Split locations, I haven't
tested this with any particularly unusual examples.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

  Title   : cat
  Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
  Function: Concatenates an array of Bio::Seq objects, using the first sequence
            as a template for species etc. Adjusts the coordinates of features
            from any additional objects.
  Returns : A sequence object of the same class as the first argument.
  Args    : array of sequence objects

=cut

sub cat {
     my ($self, @seqs) = @_;
     my $seq=shift @seqs;
     $self->throw('Object [$seq] '. 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
     unless $seq->isa('Bio::PrimarySeqI');
     for my $catseq (@seqs) {
	# check the sequence being appended, not the template
     	$self->throw('Object [$catseq] '. 'of class ['. ref($catseq).
                  '] should be a Bio::PrimarySeqI ')
	unless $catseq->isa('Bio::PrimarySeqI');
	# features of the appended sequence shift by the length so far
	my $length=$seq->length;
	$seq->seq($seq->seq.$catseq->seq);
	for my $feat ($catseq->get_SeqFeatures) {
	    $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
	}
     }
     return $seq;
}

=head2 _coordAdjust

  Title   : _coordAdjust
  Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
  Function: Recursive subroutine to adjust the coordinates of a feature
            and all its subfeatures.
  Returns : A Bio::SeqFeatureI compliant object.
  Args    : A Bio::SeqFeatureI compliant object,
            the number of bases to add to the coordinates

=cut

sub _coordAdjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object [$feat] '. 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	push @adjsubfeat, Bio::SeqUtils->_coordAdjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     map {
	my @coords=($_->start, $_->end);
	map s/(\d+)/$add+$1/ge, @coords;
	$_->start(shift @coords);
	$_->end(shift @coords);
     } @loc;
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
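
For anyone who wants to check the arithmetic without BioPerl objects, here
is a rough sketch of the same coordinate shift using plain hashes.  The
cat_plain name and the hash layout are made up for illustration; the data
is the example from the original question (100 bp plus 200 bp, feature B at
20..30).

```perl
use strict;
use warnings;

# Plain-hash stand-ins for sequences and features (not BioPerl objects).
# Appends each later sequence to the first and shifts the coordinates of
# its features by the length accumulated so far.
sub cat_plain {
    my ( $first, @rest ) = @_;
    my $seq = $first->{seq};
    my @features;
    push @features, {%$_} for @{ $first->{features} };
    for my $next (@rest) {
        my $offset = length $seq;
        $seq .= $next->{seq};
        for my $feat ( @{ $next->{features} } ) {
            push @features, {
                name  => $feat->{name},
                start => $feat->{start} + $offset,
                end   => $feat->{end} + $offset,
            };
        }
    }
    return { seq => $seq, features => \@features };
}

my $seq1 = { seq => 'A' x 100, features => [ { name => 'A', start => 5,  end => 15 } ] };
my $seq2 = { seq => 'C' x 200, features => [ { name => 'B', start => 20, end => 30 } ] };
my $cat  = cat_plain( $seq1, $seq2 );

printf "%s: %d..%d\n", @{$_}{qw(name start end)} for @{ $cat->{features} };
# A: 5..15
# B: 120..130
```

Unlike the cat method above, this sketch copies the features and leaves the
input hashes untouched.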

> 
> 
> Jan, 
> 
> It would be easy if someone had written a function to do it. Even writing the 
> function is not hard.  I do not think there is any other way than to go 
> through all features, though.
> 
> In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> 
> E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
>      return a new seq, species and other info based on the first seq in array 
> 
> Could you  write it and post to bugzilla?
> 
> 	-Heikki
> 
> 
> On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
>> Hi all,
>>
>> Does anyone know of an easy way to concatenate two sequences, including
>> recalculation of features positions of the second one? E.g.
>>   seq 1 = 100 bp
>>     feature A: 5..15
>>   seq 2 = 200 bp
>>     feature B: 20..30
>>   => concatenated sequence 3 = 300 bp
>>        feature A: 5..15
>>        feature B: 120..130  <<<<<<<<<<<
>>
>> Annotations (features without range) should be transferred as well.
>>
>> Of course, it must be possible to create a blank sequence and work my
>> way through all features, adding them to a new collection of features
>> and stuff. But I was wondering if a simpler technique is possible.
>>
>> Many thanks,
>> Jan Aerts
>> Bioinformatics Department
>> Roslin Institute
>> Roslin, Scotland, UK
>>
>> ---------The obligatory disclaimer--------
>> The information contained in this e-mail (including any attachments) is
>> confidential and is intended for the use of the addressee only.   The
>> opinions expressed within this e-mail (including any attachments) are
>> the opinions of the sender and do not necessarily constitute those of
>> Roslin Institute (Edinburgh) ("the Institute") unless specifically
>> stated by a sender who is duly authorised to do so on behalf of the
>> Institute.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>   _/  _/  _/  University of Western Cape, South Africa
>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> 

From heikki at sanbi.ac.za  Wed Jan 25 16:11:45 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 23:11:45 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D7AFD9.2020305@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
Message-ID: <200601252311.45582.heikki@sanbi.ac.za>

Thanks Roy!

I'll check the code in tomorrow when I am less sleepy and can go through 
it in detail. In principle the code looks good. It definitely needs tests. 
If you have written any, please do post them.

A few more checks to make sure $seq->alphabet is the same in all sequences 
might be a good idea.

   -Heikki

On Wednesday 25 January 2006 19:05, Roy Chaudhuri wrote:
> Hi all.
>
> I also had need of a function to concatenate two Bio::Seq objects, so had a
> go at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted
> below. I'm not too sure about the concept of sub-SeqFeatures (I've never
> seen any sequence that had more than one level of feature)- I worked on the
> assumption that little sub-SeqFeatures can have littler sub-SeqFeatures and
> so ad infinitum, but as I don't have an example file I haven't been able to
> test if this works. Likewise, although I think the code should cope with
> Fuzzy and Split locations, I haven't tested this with any particularly
> unusual examples.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk
>
>
>
> =head2 cat
>
>   Title   : cat
>   Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
>   Function: Concatenates an array of Bio::Seq objects, using the first
> sequence as a template for species etc. Adjusts the coordinates of features
> from any additional objects.
>   Returns : A sequence object of the same class as the first argument.
>   Args    : array of sequence objects
>
>
> =cut
>
> sub cat {
>      my ($self, @seqs) = @_;
>      my $seq=shift @seqs;
>      $self->throw('Object [$seq] '. 'of class ['. ref($seq).
>                   '] should be a Bio::PrimarySeqI ')
>      unless $seq->isa('Bio::PrimarySeqI');
>      for (@seqs) {
>      	$self->throw('Object [$seq] '. 'of class ['. ref($seq).
>                   '] should be a Bio::PrimarySeqI ')
> 	unless $seq->isa('Bio::PrimarySeqI');
> 	my $length=$seq->length;
> 	$seq->seq($seq->seq.$_->seq);
> 	for my $feat ($_->get_SeqFeatures) {
> 	    $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
> 	}
>      }
>      return $seq;
> }
>
> =head2 _coordAdjust
>
>   Title   : _coordAdjust
>   Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
>   Function: Recursive subroutine to adjust the coordinates of a feature
>             and all its subfeatures.
>   Returns : A Bio::SeqFeatureI compliant object.
>   Args    : A Bio::SeqFeatureI compliant object,
>             the number of bases to add to the coordinates
>
>
> =cut
>
> sub _coordAdjust {
>      my ($self, $feat, $add)=@_;
>      $self->throw('Object [$feat] '. 'of class ['. ref($feat).
>                   '] should be a Bio::SeqFeatureI ')
> 	unless $feat->isa('Bio::SeqFeatureI');
>      my @adjsubfeat;
>      for my $subfeat ($feat->remove_SeqFeatures) {
> 	push @adjsubfeat, Bio::SeqUtils->_coordAdjust($add, $subfeat);
>      }
>      my @loc=$feat->location->each_Location;
>      map {
> 	my @coords=($_->start, $_->end);
> 	map s/(\d+)/$add+$1/ge, @coords;
> 	$_->start(shift @coords);
> 	$_->end(shift @coords);
>      } @loc;
>      if (@loc==1) {
> 	$feat->location($loc[0])
>      } else {
> 	my $loc=Bio::Location::Split->new;
> 	$loc->add_sub_Location(@loc);
> 	$feat->location($loc);
>      }
>      $feat->add_SeqFeature($_) for @adjsubfeat;
>      return $feat;
> }
>
> > Jan,
> >
> > It would be easy if someone had written a function to do it. Even writing
> > the function is not hard.  I do not think there is no other way than go
> > through all features, though.
> >
> > In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> >
> > E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
> >      return a new seq, species and other info based on the first seq in
> > array
> >
> > Could you  write it and post to bugzilla?
> >
> > 	-Heikki
> >
> > On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
> >> Hi all,
> >>
> >> Does anyone know of an easy way to concatenate two sequences, including
> >> recalculation of features positions of the second one? E.g.
> >>   seq 1 = 100 bp
> >>     feature A: 5..15
> >>   seq 2 = 200 bp
> >>     feature B: 20..30
> >>   => concatenated sequence 3 = 300 bp
> >>        feature A: 5..15
> >>        feature B: 120..130  <<<<<<<<<<<
> >>
> >> Annotations (features without range) should be transferred as well.
> >>
> >> Of course, it must be possible to create a blank sequence and work my
> >> way through all features, adding them to a new collection of features
> >> and stuff. But I was wondering if a simpler technique is possible.
> >>
> >> Many thanks,
> >> Jan Aerts
> >> Bioinformatics Department
> >> Roslin Institute
> >> Roslin, Scotland, UK
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at portal.open-bio.org
> >> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________
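
For readers following Jan's example above, here is a minimal sketch of the coordinate arithmetic involved. It uses plain hashes rather than BioPerl objects, and cat_seqs() is a hypothetical helper written only for illustration; it is not the code posted to the list.

```perl
use strict;
use warnings;

# Plain-hash stand-ins for sequences and features (hypothetical model,
# not the BioPerl object API).
sub cat_seqs {
    my @seqs   = @_;
    my %joined = ( seq => '', features => [] );
    for my $s (@seqs) {
        # Number of residues already placed in front of this sequence.
        my $offset = length( $joined{seq} );
        for my $f ( @{ $s->{features} } ) {
            # Copy the feature and shift both coordinates by the offset.
            my %shifted = (
                %$f,
                start => $f->{start} + $offset,
                end   => $f->{end} + $offset,
            );
            push @{ $joined{features} }, \%shifted;
        }
        $joined{seq} .= $s->{seq};
    }
    return \%joined;
}

# The numbers from the question: 100 bp + 200 bp, features 5..15 and 20..30.
my $seq1 = { seq => 'A' x 100, features => [ { name => 'A', start => 5,  end => 15 } ] };
my $seq2 = { seq => 'C' x 200, features => [ { name => 'B', start => 20, end => 30 } ] };
my $seq3 = cat_seqs( $seq1, $seq2 );

printf "%s: %d..%d\n", @{$_}{qw(name start end)} for @{ $seq3->{features} };
```

With the inputs from the question, feature A stays at 5..15 and feature B moves to 120..130 on a 300 bp sequence. Split locations, sub-features and annotations are left out of this sketch.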

From heikki at sanbi.ac.za  Wed Jan 25 15:52:42 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 22:52:42 +0200
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
References: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <200601252252.42786.heikki@sanbi.ac.za>

Congratulations and a huge thank you to the production team!

The new website is a big step ahead in readability and ease of editing the 
information.

I for my part have already corrected a few small typos and omissions on the 
new pages. I invite others to do the same.

    -Heikki

On Wednesday 25 January 2006 05:31, Jason Stajich wrote:
> I am pleased to announce the release of a new website for BioPerl.
> The site is based on the mediawiki software that was developed for
> the wikipedia project.  We intend the site to be a place for
> community input on documentation and design for the BioPerl project.
> There is also a fair amount of documentation started surrounding
> bioinformatics tools and techniques applicable to using BioPerl and
> some of the authors who created these resources.
>
> The website continues to be at the URL http://www.bioperl.org.  The
> DNS updates may take up to 24 hours to reach everyone.
>
> The initial content of the site is the result of the work of myself,
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We
> encourage you to contribute to the site's content by signing up for
> an account.
>
> There are several style guides for the site, covering for example how to
> link to Modules, whose pages can contain additional information from the
> POD:
> http://bioperl.org/wiki/Module:Bio::SeqIO
>
> You'll notice that many of the paths have changed, but the DIST and
> SRC directories continue to be available at http://bioperl.org/DIST and
> http://bioperl.org/SRC.  The HOWTOs are now available from
> http://bioperl.org/wiki/HOWTOs
>
> The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage
> you to add your questions to it so they can be properly archived and
> addressed.
>
> We also have initiated a News site for Bioperl for posting
> announcements regarding development and software.  I would like to
> see if there are volunteers to post weekly or monthly summaries of
> mailing list traffic and development.
> http://www.bioperl.org/news/
>
>
> Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,
> Torsten Seemann.
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From cjfields at uiuc.edu  Wed Jan 25 22:34:01 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 21:34:01 -0600
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <1138119383.3338.68.camel@localhost.localdomain>
Message-ID: <000201c62229$59ed5f50$15327e82@pyrimidine>

Scott,

This popped up, for some reason, when I tried to install a perl module
(Error.pm); maybe it has something to do with the reason PPM can't 'see'
GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
for GMOD, so maybe Sourceforge is redirecting things and this messes with
PPM?  

_____________________________________________
C:\Perl\Scripts>ppm
PPM - Programmer's Package Manager version 3.3.
Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
ActiveState is a division of Sophos.

Entering interactive shell. Using Term::ReadLine::Perl as readline library.

Type 'help' to get started.

ppm> rep
Repositories:
[1] Bioperl
[2] gmod
[3] ActiveState PPM2 Repository
[4] ActiveState Package Repository
[ ] Bribes
[ ] Kobes
[ ] local
ppm> install Error
PPM::PPD::init: not a PPD and not a file:

  The Generic Model Organism Database Project | GMOD

      GMOD

      Generic Software Components for Model
Organism Databases

      Mailing lists |
Bug Reports |
Feature Requests |
Publications |
Meetings |

.... (lots of HTML removed)

This site is maintained by Scott
Cain | Powered by 
drupal

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: gmod-gbrowse-admin at lists.sourceforge.net [mailto:gmod-gbrowse-
> admin at lists.sourceforge.net] On Behalf Of Scott Cain
> Sent: Tuesday, January 24, 2006 10:16 AM
> To: Chris Fields
> Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> 
> Hi Chris,
> 
> Is it still misbehaving?  I'll do some testing today, but my ability to
> do so is a little hampered as I am traveling this week.
> 
> Thanks,
> Scott
> 
> 
> On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > Scott,
> >
> > I am trying to find the newest bioperl dev. release (1.51) from PPM for a
> > quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> > repository:
> >
> > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > Repositories:
> > [1] gmod
> > [ ] ActiveState Package Repository
> > [ ] ActiveState PPM2 Repository
> > [ ] Bioperl
> > [ ] Bribes
> > [ ] Kobes
> > [ ] local
> > ppm> search bioperl
> > Searching in Active Repositories
> > No matches for 'bioperl'; see 'help search'.
> > ppm> search *
> > Searching in Active Repositories
> > No matches for '*'; see 'help search'.
> > ppm>
> >
> >
> > Any idea what's going on?  All other repositories work fine.  I can download
> > it and install locally w/o a problem.  I am running the newest ActivePerl
> > (5.8.7.815), WinXP.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
> > _______________________________________________
> > Gmod-gbrowse mailing list
> > Gmod-gbrowse at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
> 
> 
> 

From cjfields at uiuc.edu  Thu Jan 26 00:38:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 23:38:56 -0600
Subject: [Bioperl-l] bioperl-db on Windows (update)
Message-ID: <000001c6223a$cd5539c0$15327e82@pyrimidine>

Hilmar, 

I checked load_seqdatabase.pl with all variants of Root.pm and checked the
debugging output; basically, the only way that I could find to get
load_seqdatabase.pl to work on native Windows is by changing those Root.pm
lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
$class, ...').  I ran debugging on load_seqdatabase.pl using all versions of
Root.pm, with and without Error.pm.  Only those with a comma present worked
in both circumstances.  I don't know why this hasn't popped up before now,
but it seems to be a unique combination of Windows, load_seqdatabase.pl, and
bioperl-db.  It doesn't happen with any scripts of Bioperl on Windows that
I've run into, and debugging other modules (for instance,
Bio::SearchIO::blast, which I recently worked on) doesn't cause this
problem.  

Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
and without modifying Root.pm.

____________________________________________________________

Without Error.pm:
____________________________________________________________
C:\Perl\Scripts>perl -MError
Can't locate Error.pm in @INC (@INC contains:
C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib
C:/Perl/site/lib .).
BEGIN failed--compilation aborted.

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.
____________________________________________________________

With Error.pm:
____________________________________________________________

C:\Perl\Scripts>perl -MError -e ";"

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
  Calling Error::throw

  Calling Error::throw

attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
  Calling Error::throw

attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
  Calling Error::throw

  Calling Error::throw

Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.

____________________________________________________________

Error::throw is called w/o a problem when Error.pm is present (which is what
should happen).  For some reason, that extra comma makes all the difference
in the world.

The line above in BasePersistenceAdaptor.pm is :

$self->debug("attempting to load driver for adaptor class $class\n");

which is found in many modules.  I don't really know why it decides to hang
up here.  I'll try running a few of the Root.pm modifications under Mac OS X
in the next day or so to see what happens.

I also reran a few of Steve Chervitz's recommendations from a previous post;
everything ran fine except in circumstances in which Error.pm was required
with a 'use' statement, and only when Error.pm wasn't present, which is
expected.  Previously, when I ran them, there was a bit of confusion b/c it
seemed that Error.pm was present somewhere.  It was; Steve included it in
bioperl-live/examples/root/lib.  When I deleted it, I got the expected
results.

Anyway, I don't know what else I can do at this point besides check out
everything on Mac OS X.  Any additional checks of the modified Root.pm need
to be made on other systems.  Will filing this as a bug in Bugzilla help?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
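
Part of the confusion described above came from a copy of Error.pm sitting under bioperl-live/examples/root/lib. A minimal sketch for checking which file Perl actually picks up, if any; module_origin() is a hypothetical helper name used only here, and the result depends entirely on the local @INC.

```perl
use strict;
use warnings;

# Report whether a module can be loaded and, if so, which file Perl used.
# module_origin() is a hypothetical helper, not part of BioPerl.
sub module_origin {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Bio::Root::Root -> Bio/Root/Root.pm
    # require dies when the file is not in @INC, so trap that with eval.
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

my $origin = module_origin('Error');
my $report = defined $origin
  ? "Error.pm loaded from $origin\n"
  : "Error.pm not found in \@INC\n";
print $report;
```

Running this before and after removing a stray copy shows whether the "without Error.pm" test really runs without it.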

From s.rayner at att.net  Thu Jan 26 00:58:42 2006
From: s.rayner at att.net (s.rayner at att.net)
Date: Thu, 26 Jan 2006 05:58:42 +0000
Subject: [Bioperl-l] bioperl installation problems with External Modules -
	doesn't see installed modules
Message-ID: <012620060558.15437.43D865110008848F00003C4D21602806519D0A02970E9DD29C@att.net>

I am trying to install the BioPerl bundle to use some of the external Perl modules, 
particularly the Bio::DB::GFF module for use with BioDAS.

I followed the instructions, both from the BioPerl web site for installing the BioPerl bundle, and also the specific instructions from the BioDAS web site for installing Bio::DB::GFF.  Namely:

   (1) Make sure that CVS is installed on your system.

    (2) Use the following command (all on one line) to login to the server

         % cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl login

          when prompted, the password is 'cvs'

    (3) Check out the bioperl package you are interested in, for most
    users this will be the bioperl-live source tree.  The following
    command should be executed as one line.

         % cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl checkout bioperl-live

    The login and checkout procedure should only have to be done
    once. To update the source directories in the future it should be
    possible just to enter the top level directory and issue the
    following command:

         % cvs update

This will create the directory ``bioperl-live''. Now build and install bioperl with the following recipe:

         % cd bioperl-live
         % perl Makefile.PL
         % make
         % make test
         % make install

The last step will probably need to be run as root.

When I perform either of these steps I get the message that the installation was successful, but BioPerl and BioDAS report that the modules have not been installed.

They are physically present on the disk, but the programs don't seem to know where to find them.

Can anyone suggest how to fix this problem?

thanks

Simon
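
One common cause of "installed but not found" is that the install directory is not in Perl's module search path (@INC); whether that applies here is an assumption. A minimal sketch for checking it, where dirs_containing() is a hypothetical helper written for this note:

```perl
use strict;
use warnings;
use File::Spec;

# Return every directory in the given search path that holds the module file.
# dirs_containing() is a hypothetical helper, not part of BioPerl.
sub dirs_containing {
    my ( $module, @dirs ) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';    # Bio::DB::GFF -> Bio / DB / GFF.pm
    # @INC may hold code references as well as paths; skip those.
    return grep { !ref($_) && -f File::Spec->catfile( $_, @parts ) } @dirs;
}

my @hits = dirs_containing( 'Bio::DB::GFF', @INC );
my $report = @hits
  ? "Bio::DB::GFF found in: @hits\n"
  : "Bio::DB::GFF is not in any \@INC directory:\n  " . join( "\n  ", @INC ) . "\n";
print $report;
```

If the module turns up on disk in a directory that is not listed, adding that directory through the PERL5LIB environment variable or a "use lib" line in the calling script makes it visible to Perl.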

From heikki at sanbi.ac.za  Thu Jan 26 02:53:22 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 26 Jan 2006 09:53:22 +0200
Subject: [Bioperl-l] Fwd: some doubts in bioperl
Message-ID: <200601260953.22923.heikki@sanbi.ac.za>

----------  Forwarded Message  ----------

Subject: some doubts in bioperl
Date: Monday 23 January 2006 10:16
From: apsara asok 
To: heikki at sanbi.ac.za

Dear Heikki,
I would like to clear up some doubts about BioPerl. How can we do pattern
searching with a suffix tree in BioPerl?
Do you have any idea? Please help me.
apsara

-------------------------------------------------------

From roy at colibase.bham.ac.uk  Thu Jan 26 08:18:03 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Thu, 26 Jan 2006 13:18:03 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601252311.45582.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
	<200601252311.45582.heikki@sanbi.ac.za>
Message-ID: <43D8CC0B.10403@colibase.bham.ac.uk>

Heikki Lehvaslaiho wrote:
> Thanks Roy!
> 
> I'll check the code in tomorrow when I am less sleepy and can go through the 
> code in detail. In principle the code looks good. It definitely needs tests. 
> If you have written any please do post them.
Not too sure about how to go about writing tests, any suggestions?

It did occur to me that my _coordAdjust method could be adapted to allow 
the Bio::Seq trunc method to retain sequence features (since there's no 
reason why the $add argument can't be negative). This would probably 
need a bit more work to cope with the situation where a feature overlaps 
the trunc coordinates, for example if we truncate to coordinates 1..400, 
but there's a feature 300..500. I guess the 'correct' behaviour might be 
to convert that feature to a fuzzy location of 300..>400? Or is it 
acceptable to have features with coordinates outside of a sequence?

If we did that then an obvious test would be to cat a sequence to 
itself, then trunc to retain just the second half of the new sequence 
and see if you got back what you started with.

> A few more checks to make sure seq->alphabet is the same in all sequences 
> might be a good idea.
That's easy to implement. Just put the line:
	$self->throw('Trying to concatenate sequences with different alphabets: '
		. $seq->display_id . ' (' . $seq->alphabet . ') and '
		. $_->display_id . ' (' . $_->alphabet . ')')
		unless $_->alphabet eq $seq->alphabet;

at the start of the for(@seqs) loop of the cat subroutine.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk
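
On the question of how to write tests: the round trip suggested above (cat a sequence to itself, truncate to the second half, compare with the original) could be sketched with Test::More roughly as follows. The plain-hash model and the helpers shift_features(), cat_pair() and trunc_with_features() are assumptions made for illustration; a real test would call the Bio::SeqUtils methods instead.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical model: { seq => string, features => [ [start, end], ... ] }.

# Shift every feature by $add, which may be negative.
sub shift_features {
    my ( $add, @features ) = @_;
    return map { [ $_->[0] + $add, $_->[1] + $add ] } @features;
}

# Concatenate two sequences, moving the second one's features along.
sub cat_pair {
    my ( $x, $y ) = @_;
    my @moved = shift_features( length( $x->{seq} ), @{ $y->{features} } );
    return +{
        seq      => $x->{seq} . $y->{seq},
        features => [ @{ $x->{features} }, @moved ],
    };
}

# Keep only features lying wholly inside start..end and re-base them to 1.
sub trunc_with_features {
    my ( $s, $start, $end ) = @_;
    my @kept = grep { $_->[0] >= $start && $_->[1] <= $end } @{ $s->{features} };
    return +{
        seq      => substr( $s->{seq}, $start - 1, $end - $start + 1 ),
        features => [ shift_features( 1 - $start, @kept ) ],
    };
}

my $orig    = { seq => 'ACGTACGTAC', features => [ [ 2, 5 ], [ 7, 9 ] ] };
my $doubled = cat_pair( $orig, $orig );
my $len     = length( $orig->{seq} );
my $back    = trunc_with_features( $doubled, $len + 1, 2 * $len );

is( length( $doubled->{seq} ), 2 * $len, 'concatenation doubles the length' );
is( $back->{seq}, $orig->{seq}, 'second half has the original residues' );
is_deeply( $back->{features}, $orig->{features}, 'features survive the round trip' );
```

Features that overlap the truncation boundary are simply dropped in this sketch; how they should be handled is the open question raised above.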

From hlapp at gmx.net  Thu Jan 26 01:31:43 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 25 Jan 2006 22:31:43 -0800
Subject: [Bioperl-l] bioperl-db on Windows (update)
In-Reply-To: <000001c6223a$cd5539c0$15327e82@pyrimidine>
References: <000001c6223a$cd5539c0$15327e82@pyrimidine>
Message-ID: 

This is a lot of work you did to investigate this, Chris, thanks. Yes,
filing this as a bug report will help, and don't forget to attach this
report of yours with all the tests you did. Really all that's left to
do is test on a couple of Unix platforms, which will happen
semi-automatically by people once we commit the change.

   -hilmar


--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From cain at cshl.edu  Thu Jan 26 10:41:20 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu, 26 Jan 2006 10:41:20 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <000201c62229$59ed5f50$15327e82@pyrimidine>
References: <000201c62229$59ed5f50$15327e82@pyrimidine>
Message-ID: <1138290080.2894.25.camel@localhost.localdomain>

Hi Chris,

I still don't exactly know what the problem is, but this at least has
given me some insight on some messages in my error_log: I've been seeing
lots of messages about '/icon/somegif.gif' not found and haven't been
able to track down their source (not that I'd really tried, it was an
annoyance that hadn't risen to the level of serious debugging yet).  We
are using mod_rewrite, so that could be part of the problem.  I'll try
to fix it so that the icons display properly and that may have a side
effect of fixing ppm.

Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
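
The symptom in this thread is that PPM was handed an HTML page where it expected a PPD file. A rough way to tell the two apart, assuming a PPD is an XML document whose root element is SOFTPKG; classify_body() and both sample strings are invented for illustration, and nothing here contacts the GMOD server.

```perl
use strict;
use warnings;

# Classify a response body as a PPD, an HTML page, or neither.
# classify_body() is a hypothetical helper, not part of PPM.
sub classify_body {
    my ($body) = @_;
    return 'ppd'  if $body =~ /<SOFTPKG\b/i;
    return 'html' if $body =~ /<!DOCTYPE\s+html|<html\b/i;
    return 'unknown';
}

# Invented sample bodies, not real server responses.
my %samples = (
    ppd  => qq{<?xml version="1.0"?>\n<SOFTPKG NAME="Example">\n</SOFTPKG>\n},
    html => qq{<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">\n<html></html>\n},
);

printf "%-4s sample classified as: %s\n", $_, classify_body( $samples{$_} )
  for sort keys %samples;
```

Feeding it whatever the repository URL returns for a package request would show whether a rewrite rule is sending back the site's home page.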

From rbalbi at gmail.com  Thu Jan 26 13:19:57 2006
From: rbalbi at gmail.com (Ricardo Balbi)
Date: Thu, 26 Jan 2006 16:19:57 -0200
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Hi all,

   Could anybody help me with this error?

thanks in advance,
Ricardo

2006/1/26, Aaron J. Mackey :
>
>
> This is a BioPerl "Unflattener" error; it's unable to automatically
> reconstruct the gene/mRNA/exon logic of some (or all) of the
> annotation in your genbank file.  To get help with this, you should
> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
> including a snippet of your genbank file.
>
> -Aaron
>
> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>
> > Hi all,
> >
> >   After making some changes in the GUS mapping file to ignore some
> > features of the kinetoplastida database, I went on to run
> > the ISF, however without success.
> >
> >   Could somebody help me with this error?
> >
> > thanks in advance,
> > Ricardo
> >
> > ERROR:
> >
> > ------------- EXCEPTION  -------------
> > MSG: structure_type 2 is currently unknown
> > STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm:353
> > STACK
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees /G
> > US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
> > 720
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
> > STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
> > 549
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
> > lib/perl/GUS/PluginMgr/GusApplication.pm:459
> > STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:357
> > STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:266
> > STACK toplevel /GUS/gus_home/bin/ga:11
> >
> > --------------------------------------
> >
> > STACK TRACE:
> >  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
> >         Bio::Root::Root::throw
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
> > 'structure_type 2 is currently unknown') called at /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
> >         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm line 353
> >         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
> > perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
> >
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
> > line 330
> >         GUS::Supported::Plugin::InsertSequenceFeatures::run
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 549
> >         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 541
> >         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
> >         GUS::PluginMgr::GusApplication::doMajorMode_Run
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
> >         GUS::PluginMgr::GusApplication::doMajorMode
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
> >         GUS::PluginMgr::GusApplication::parseAndRun
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
> > (0xa004738)') called at /GUS/gus_home/bin/ga line 11
> >
> >
> >
>
> --
> Dr. Aaron J. Mackey, Ph.D.
> Project Manager, ApiDB Bioinformatics Resource Center
> Penn Genomics Institute, University of Pennsylvania
> email:  amackey at pcbi.upenn.edu
> office: 215-898-1205 (Biology, 212 Goddard Labs)
>          215-746-7018 (PCBI, 1428 Blockley Hall)
> fax:    215-746-6697 (Penn Genomics Institute)
> postal: Penn Genomics Institute
>          Goddard Labs 212
>          415 S. University Avenue
>          Philadelphia, PA  19104-6017
>
>
>

From jason.stajich at duke.edu  Thu Jan 26 14:28:26 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu, 26 Jan 2006 14:28:26 -0500
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <2A475D24-5AC3-4AD5-80CB-0C40DB622283@duke.edu>

I would suggest following Aaron's instructions to

>> including a snippet of your genbank file.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjm at fruitfly.org  Thu Jan 26 14:33:46 2006
From: cjm at fruitfly.org (chris mungall)
Date: Thu, 26 Jan 2006 11:33:46 -0800
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Sorry for the uninformative error message.

The unflattener uses a collection of heuristics to infer a canonical 
gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style 
file. Due to the highly variable nature of some genbank records this 
isn't always possible, and some data massaging is required beforehand. 
I don't know what the context of this message is, but I presume you're 
aware of this from the docs.

The only time I've seen this before was with the genbank submission of 
the pombe genome, which has some very.. unusual features purportedly of 
type mRNA; the actual gene models are encoded using 'gene' and 'CDS' 
features. This confuses the heuristics a little. The only way I've been 
able to deal with this one was to manually remove the mRNA features 
(they appeared to be just fragments and not actual gene models) using 
$unf->remove_types(['mRNA']) beforehand.
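
A minimal sketch of that workaround follows. It assumes the named-argument
form remove_types(-seq => ..., -types => [...]) as I recall it from the
Unflattener POD, so check it against your installed version. The helper name
unflatten_without_mrna and the command-line file argument are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop the mRNA features, then unflatten the record.
sub unflatten_without_mrna {
    my ($seq) = @_;

    # Loaded lazily so the file still compiles where BioPerl is absent.
    require Bio::SeqFeature::Tools::Unflattener;
    my $unf = Bio::SeqFeature::Tools::Unflattener->new;

    # Assumed argument names (see the note above).
    $unf->remove_types(-seq => $seq, -types => ['mRNA']);

    # Same arguments as the unflatten_seq call in the quoted stack trace.
    $unf->unflatten_seq(-seq => $seq, -use_magic => 1);

    return $seq;
}

# Usage: perl this_script.pl records.gb
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        unflatten_without_mrna($seq);
        my @top = $seq->get_SeqFeatures;
        printf "%s: %d top-level features\n", $seq->display_id, scalar @top;
    }
}
```

Removing features is lossy, so this only makes sense when the mRNA features
really are fragments rather than gene models, as they were in the pombe case.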

Can you send the accession of the record you're trying this on (or 
email me the file off-list if it isn't too large)? I'll try to get a 
more informative error message in there.

On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:

> Hi all,
>
>    Anybody could help me with this error ?
>
> thanks in advance,
> Ricardo
>
> 2006/1/26, Aaron J. Mackey :
>>
>>
>> This is a BioPerl "Unflattener" error; it's unable to automatically
>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>> annotation in your genbank file.  To get help with this, you should
>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>> including a snippet of your genbank file.
>>
>> -Aaron
>>
>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>
>>> Hi all,
>>>
>>>   After making some changes in the gus mapping file to ignore some
>>> features of the kinetoplastida database, I followed in the
>>> execution of the ISF, however without success.
>>>
>>>   Somebody could help me with this error?
>>>
>>> thanks in advance,
>>> Ricardo
>>>
>>> ERROR:
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: structure_type 2 is currently unknown
>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>> InsertSequenceFeatures.pm:353
>>> STACK
>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees 
>>> /G
>>> US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
>>> 720
>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
>>> 549
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
>>> lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
>>> perl/GUS/PluginMgr/GusApplication.pm:357
>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
>>> perl/GUS/PluginMgr/GusApplication.pm:266
>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>
>>> --------------------------------------
>>>
>>> STACK TRACE:
>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>         Bio::Root::Root::throw
>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>> InsertSequenceFeatures.pm line 353
>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>
>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>> line 330
>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>> GusApplication.pm line 549
>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>> GusApplication.pm line 541
>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>
>>>
>>>
>>
>> --
>> Dr. Aaron J. Mackey, Ph.D.
>> Project Manager, ApiDB Bioinformatics Resource Center
>> Penn Genomics Institute, University of Pennsylvania
>> email:  amackey at pcbi.upenn.edu
>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>> fax:    215-746-6697 (Penn Genomics Institute)
>> postal: Penn Genomics Institute
>>          Goddard Labs 212
>>          415 S. University Avenue
>>          Philadelphia, PA  19104-6017
>>
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From heikki at sanbi.ac.za  Fri Jan 27 05:06:52 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Fri, 27 Jan 2006 12:06:52 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D8CC0B.10403@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
Message-ID: <200601271206.52875.heikki@sanbi.ac.za>

On Thursday 26 January 2006 15:18, Roy Chaudhuri wrote:
> Heikki Lehvaslaiho wrote:
> > Thanks Roy!
> >
> > I'll check the code in tomorrow when I am less sleepy and can go through
> > the code in detail. In principle the code looks good. It definitely needs
> > tests. If you have written any please do post them.
>
> Not too sure about how to go about writing tests, any suggestions?

I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
methods and a reasonable portion of all edge values to be sure that the 
method works as it should.

Note that the code does not create a new sequence object. It modifies the 
existing one. Therefore it is best not to return that object. The users would 
assign that to a variable that points to the same structure and get confused. 
The method now returns true upon completion.

Creating a new sequence object is problematic because one needs to add one 
more dependency (e.g. Clone) and will not work anyway if the sequence 
implementation is using a database back end. It is better the way you have 
written it.

I added code to move over the annotations from secondary sequences, but did 
not do anything to remove duplicates if the same sequence gets added twice. I 
wrote a note about this so that users know to be prepared if that affects 
them.
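
To make the in-place behaviour concrete, here is a rough sketch. It uses plain
hashes as stand-ins for sequence objects, so it is not the committed
Bio::SeqUtils code; the name cat_in_place and the hash keys are made up for
illustration.

```perl
use strict;
use warnings;

# Plain hashes stand in for sequence objects; this is not BioPerl code.
sub cat_in_place {
    my ($first, @rest) = @_;
    for my $next (@rest) {
        die "Different alphabets: $first->{id} ($first->{alphabet}) and "
          . "$next->{id} ($next->{alphabet})\n"
            unless $next->{alphabet} eq $first->{alphabet};

        my $offset = length $first->{seq};
        $first->{seq} .= $next->{seq};

        # Copy the features over, shifted by the length accumulated so far.
        push @{ $first->{features} },
            map { +{ %$_, start => $_->{start} + $offset,
                          end   => $_->{end}   + $offset } }
            @{ $next->{features} };
    }
    return 1;    # true on success; the caller keeps using $first
}

my $left  = { id => 'left',  alphabet => 'dna', seq => 'ACGT',
              features => [ { name => 'f1', start => 1, end => 4 } ] };
my $right = { id => 'right', alphabet => 'dna', seq => 'GGCC',
              features => [ { name => 'f2', start => 2, end => 3 } ] };

cat_in_place($left, $right);
print "$left->{seq}\n";
print "$_->{name}: $_->{start}..$_->{end}\n" for @{ $left->{features} };
```

Because only a true value comes back, nobody is tempted to treat the result
as a second, independent sequence. Nothing is de-duplicated either, so adding
the same sequence twice copies its features twice.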

> It did occur to me that my _coordAdjust method could be adapted to allow
> the Bio::Seq trunc method to retain sequence features (since there's no
> reason why the $add argument can't be negative). This would probably
> need a bit more work to cope with the situation where a feature overlaps
> the trunc coordinates, for example if we truncate to coordinates 1..400,
> but there's a feature 300..500. I guess the 'correct' behaviour might be
> to convert that feature to a fuzzy location of 300..>400? Or is it
> acceptable to have features with coordinates outside of a sequence?

No, feature coordinates should always be within the sequence. Fuzzy is the 
correct solution to this.
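
As a rough illustration of the fuzzy approach (plain Perl only, not the
Bio::Location API; the function name clip_feature is hypothetical), clipping
a feature to the truncation window could look like this:

```perl
use strict;
use warnings;

# Clip a feature (1-based, inclusive) to a truncation window and shift it
# into the new coordinate system. Returns a GenBank-style location string,
# or nothing if the feature lies entirely outside the window.
sub clip_feature {
    my ($fstart, $fend, $wstart, $wend) = @_;
    return if $fend < $wstart || $fstart > $wend;

    my $start_fuzzy = $fstart < $wstart;    # feature began before the window
    my $end_fuzzy   = $fend   > $wend;      # feature runs past the window

    my $start = ($start_fuzzy ? $wstart : $fstart) - $wstart + 1;
    my $end   = ($end_fuzzy   ? $wend   : $fend)   - $wstart + 1;

    return ($start_fuzzy ? '<' : '') . $start . '..'
         . ($end_fuzzy   ? '>' : '') . $end;
}

# The case from this thread: feature 300..500, truncated to 1..400.
print clip_feature(300, 500, 1, 400), "\n";
```

Features that fall wholly outside the window are dropped here; whether that
is the right policy for trunc is a separate decision.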

> If we did that then an obvious test would be to cat a sequence to
> itself, then trunc to retain just the second half of the new sequence
> and see if you got back what you started with.

Go ahead and try it!

> > A few more checks to make sure $seq->alphabet is the same in all sequences
> > might be a good idea.
>
> That's easy to implement. Just put the line:
> 	$self->throw('Trying to concatenate sequences with different alphabets:
> '.$seq->display_id.' ('.$seq->alphabet.') and ' .$_->display_id.'
> ('.$_->alphabet.')') unless $_->alphabet eq $seq->alphabet;
>
> at the start of the for(@seqs) loop of the cat subroutine.

Added.

Thanks,

	-Heikki
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From wlhsiao at yahoo.ca  Fri Jan 27 05:37:23 2006
From: wlhsiao at yahoo.ca (William Hsiao)
Date: Fri, 27 Jan 2006 05:37:23 -0500 (EST)
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>

Hi Jason,
  Nice new site!  I am wondering if I missed an
obvious link to the module documentations (e.g
http://doc.bioperl.org/releases/bioperl-1.4/) from the
homepage?  It seems that is the one thing missing from
the old website setup and I am not sure if it's
intentional.  I am developing a set of lecture notes
for a workshop and would like to know if there is a
stable way to navigate to the module documentations.

Thanks

Cheers,

Will

--- Jason Stajich  wrote:

> I am pleased to announce the release of a new
> website for BioPerl.   
> The site is based on the mediawiki software that was
> developed for  
> the wikipedia project.  We intend the site to be a
> place for  
> community input on documentation and design for the
> BioPerl project.   
> There is also a fair amount of documentation started
> surrounding  
> bioinformatics tools and techniques applicable to
> using BioPerl and  
> some of the authors who created these resources.
> 
> The website continues to be at the URL
> http://www.bioperl.org.  The  
> DNS updates may take up to 24 hours to reach
> everyone.
> 
> The initial content of the site is result of the
> work of myself,  
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
> Seemann.  We  
> encourage you to contribute to the site's content by
> signing up for  
> an account.
> 
> There are several guides for style of the site and
> how to link to  
> Modules for example which can contain additional
> information from the  
> POD
> http://bioperl.org/wiki/Module:Bio::SeqIO
> 
> You'll notice that many of the paths have changed
> but the DIST and  
> SRC continues to be available at
> http://bioperl.org/DIST and http:// 
> bioperl.org/SRC.  The HOWTOs are now available from
> http:// 
> bioperl.org/wiki/HOWTOs
> 
> The FAQ is available at http://bioperl.org/wiki/FAQ
> and I encourage  
> you to add your questions to it so they can be
> properly archived and  
> addressed.
> 
> We also have initiated a News site for Bioperl for
> posting  
> announcements regarding development and software.  I
> would like to  
> see if there are volunteers to post weekly or
> monthly summaries of  
> mailing list traffic and development.
> http://www.bioperl.org/news/
> 
> 
> Jason Stajich on behalf of Mauricio Herrera Cuadra,
> Brian Osborne,  
> Torsten Seemann.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From jason.stajich at duke.edu  Fri Jan 27 08:28:52 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 08:28:52 -0500
Subject: [Bioperl-l] new website launched
In-Reply-To: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
References: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
Message-ID: 

Each module is directly linked to that site in the module-level  
pages, see for example:
http://bioperl.org/wiki/Module:Bio::SearchIO

I've added a mention of the doc.bioperl site on the front page.

Note that as part of setting up the site I ensured that there is now  
a standardized URL for the nightly generated Pdoc pages (from CVS  
live) (thanks to Steve Chervitz for suggesting it).

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/
http://doc.bioperl.org/releases/bioperl-current/bioperl-run/
http://doc.bioperl.org/releases/bioperl-current/bioperl-ext/
....etc

The frozen release-based docs will stay up. I never had time to make  
a set for bioperl-1.5.1, but I hope to do it for bioperl 1.5.2 and  
will certainly make one for the next stable release (1.6).

We encourage people to add snippets of code using modules,  
complaints, workarounds, etc on the module pages on the wiki site.   
There is a "discussion" paired for each wiki page where we would  
suggest people put comments, while useful workarounds/example code  
should go on the main page.  I've just added some text about this to  
the "About this site" page.

-jason
On Jan 27, 2006, at 5:37 AM, William Hsiao wrote:

> Hi Jason,
>   Nice new site!  I am wondering if I missed an
> obvious link to the module documentations (e.g
> http://doc.bioperl.org/releases/bioperl-1.4/) from the
> homepage?  It seems that is the one thing missing from
> the old website setup and I am not sure if it's
> intentional.  I am developing a set of lecture notes
> for a workshop and would like to know if there is a
> stable way to navigate to the module documentations.
>
> Thanks
>
> Cheers,
>
> Will
>
>
> --- Jason Stajich  wrote:
>
>> I am pleased to announce the release of a new
>> website for BioPerl.
>> The site is based on the mediawiki software that was
>> developed for
>> the wikipedia project.  We intend the site to be a
>> place for
>> community input on documentation and design for the
>> BioPerl project.
>> There is also a fair amount of documentation started
>> surrounding
>> bioinformatics tools and techniques applicable to
>> using BioPerl and
>> some of the authors who created these resources.
>>
>> The website continues to be at the URL
>> http://www.bioperl.org.  The
>> DNS updates may take up to 24 hours to reach
>> everyone.
>>
>> The initial content of the site is result of the
>> work of myself,
>> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
>> Seemann.  We
>> encourage you to contribute to the site's content by
>> signing up for
>> an account.
>>
>> There are several guides for style of the site and
>> how to link to
>> Modules for example which can contain additional
>> information from the
>> POD
>> http://bioperl.org/wiki/Module:Bio::SeqIO
>>
>> You'll notice that many of the paths have changed
>> but the DIST and
>> SRC continues to be available at
>> http://bioperl.org/DIST and http://
>> bioperl.org/SRC.  The HOWTOs are now available from
>> http://
>> bioperl.org/wiki/HOWTOs
>>
>> The FAQ is available at http://bioperl.org/wiki/FAQ
>> and I encourage
>> you to add your questions to it so they can be
>> properly archived and
>> addressed.
>>
>> We also have initiated a News site for Bioperl for
>> posting
>> announcements regarding development and software.  I
>> would like to
>> see if there are volunteers to post weekly or
>> monthly summaries of
>> mailing list traffic and development.
>> http://www.bioperl.org/news/
>>
>>
>> Jason Stajich on behalf of Mauricio Herrera Cuadra,
>> Brian Osborne,
>> Torsten Seemann.
>>
>> --
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From davila at fiocruz.br  Thu Jan 26 19:05:35 2006
From: davila at fiocruz.br (Alberto M. R. Dávila)
Date: Thu, 26 Jan 2006 22:05:35 -0200 (BRST)
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <3197.201.17.105.240.1138320335.squirrel@www.redefiocruz.fiocruz.br>

Dear Chris,

Happy 2006 !

I am not sure exactly which record the ISF plugin was trying to
read/parse, but I think it is the first one. In any case, here are the
first 5 GIs of our file for your testing:

85539529
56130985
54300415
54288810
50604596

The whole file is really big (1.4GB) as it contains all the nucleotide
sequences of "kinetoplastida [organism]" from genbank in genbank format.

Hope you can catch "the bug" ;-)

Kindest regards, Alberto

>
> Sorry for the uninformative error message.
>
> The unflattener uses a collection of heuristics to infer a canonical
> gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style
> file. Due to the highly variable nature of some genbank records this
> isn't always possible, and some data massaging is required beforehand.
> I don't know what the context of this message is, but I presume you're
> aware of this from the docs.
>
> The only time I've seen this before was with the genbank submission of
> the pombe genome, which has some very.. unusual features purportedly of
> type mRNA; the actual gene models are encoded using 'gene' and 'CDS'
> features. This confuses the heuristics a little. The only way I've been
> able to deal with this one was to manually remove the mRNA features
> (they appeared to be just fragments and not actual gene models) using
> $unf->remove_types(['mRNA']) beforehand.
>
> Can you send the accession of the record you're trying this on (or
> email me the file off-list if it isn't too large). I'll try and get a
> more informative error message in there.
>
> On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:
>
>> Hi all,
>>
>>    Anybody could help me with this error ?
>>
>> thanks in advance,
>> Ricardo
>>
>> 2006/1/26, Aaron J. Mackey :
>>>
>>>
>>> This is a BioPerl "Unflattener" error; it's unable to automatically
>>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>>> annotation in your genbank file.  To get help with this, you should
>>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>>> including a snippet of your genbank file.
>>>
>>> -Aaron
>>>
>>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>>
>>>> Hi all,
>>>>
>>>>   After making some changes in the gus mapping file to ignore some
>>>> features of the kinetoplastida database, I followed in the
>>>> execution of the ISF, however without success.
>>>>
>>>>   Somebody could help me with this error?
>>>>
>>>> thanks in advance,
>>>> Ricardo
>>>>
>>>> ERROR:
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: structure_type 2 is currently unknown
>>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm:353
>>>> STACK
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> /G
>>>> US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
>>>> 720
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
>>>> 549
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
>>>> lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:357
>>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:266
>>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>>
>>>> --------------------------------------
>>>>
>>>> STACK TRACE:
>>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>>         Bio::Root::Root::throw
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm line 353
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>>
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>>> line 330
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 549
>>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 541
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>>
>>>>
>>>>
>>>
>>> --
>>> Dr. Aaron J. Mackey, Ph.D.
>>> Project Manager, ApiDB Bioinformatics Resource Center
>>> Penn Genomics Institute, University of Pennsylvania
>>> email:  amackey at pcbi.upenn.edu
>>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>>> fax:    215-746-6697 (Penn Genomics Institute)
>>> postal: Penn Genomics Institute
>>>          Goddard Labs 212
>>>          415 S. University Avenue
>>>          Philadelphia, PA  19104-6017
>>>

From roy at colibase.bham.ac.uk  Fri Jan 27 10:31:50 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Fri, 27 Jan 2006 15:31:50 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601271206.52875.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
	<200601271206.52875.heikki@sanbi.ac.za>
Message-ID: <43DA3CE6.4020708@colibase.bham.ac.uk>

> I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
> methods and a reasonable portion of all edge values to be sure that the 
> method works as it should.
Cool, thanks for that. My first proper contribution to BioPerl 8^).
The tests look good- I'll know better for next time.

> Note that the code does not create a new sequence object. It modifies the 
> existing one. Therefore it is best not to return that object. The users would 
> assign that to a variable that points to the same structure and get confused. 
> The method now returns true upon completion.
> 
> Creating a new sequence object is problematic because one needs to add one 
> more dependency (e.g. Clone) and will not work anyway if the sequence 
> implementation is using a database back end. It is better the way you have 
> written it.
Yes, that makes sense. Although with that interface it might be more 
natural in Bio::Seq? If it is a method that will modify a sequence in 
place then it seems more intuitive to call $seq->cat(@seqs) [or even 
$seq->append(@seqs)] rather than Bio::SeqUtils->cat($seq, @seqs).
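
To illustrate the in-place behaviour being discussed, here is a minimal
sketch in plain Perl. The hash references and the cat_in_place function are
hypothetical stand-ins, not BioPerl objects or the committed method; the
point is only that the first argument is modified and a true value comes
back, so the caller has nothing new to assign.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for sequence objects: plain hash references.
my $target = { id => 'seqA', seq => 'ACGT' };
my $extra  = { id => 'seqB', seq => 'TTAA' };

# Appends the residues of every additional record to the first one, in place.
# Returns a true value rather than the modified record.
sub cat_in_place {
    my ($first, @rest) = @_;
    $first->{seq} .= $_->{seq} for @rest;
    return 1;
}

my $ok = cat_in_place($target, $extra);
print "ok=$ok seq=$target->{seq}\n";
```

Whether this lives as a class method or as an instance method on the
sequence is an interface choice; the in-place semantics are the same.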

> I added code to move over the annotations from secondary sequences, but did 
> not do anything to remove duplicates if the same sequence gets added twice. I 
> wrote a note about this so that users know to be prepared if that affects 
> them.
I'm not convinced about this- perhaps it should be optional? In practice 
many of the annotations for each subsequence are only going to be 
applicable to that sequence, not the concatenated whole. Some of them 
may also be duplicated even between non-identical sequences. I think 
it'd be better by default to keep just the annotation from the first 
sequence (which probably would still need to be changed, but could at 
least act as a placeholder).

There were a couple of problems with renamed variables/subroutines that 
hadn't all been updated, I've fixed those and pasted the new version below.

> No, feature coordinates should always be within the sequence. Fuzzy is the 
> correct solution to this.
Okay, I'll have a go and let you know how I get on.

Cheers.
Roy.

--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

   Title   : cat
   Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
   Function: Concatenates an array of Bio::Seq objects, using the first sequence
             as a target. Annotations and sequence features are copied over
             from any additional objects. Adjusts the coordinates of copied
             features.
   Returns : a boolean
   Args    : array of sequence objects

-
Note that annotations have no sequence region. If you concatenate the
same sequence more than once, you will have its annotations
duplicated.

=cut

sub cat {
     my ($self, $seq, @seqs) = @_;
     $self->throw('Object [$seq] '. 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
         unless $seq->isa('Bio::PrimarySeqI');

     for my $catseq (@seqs) {
         $self->throw('Object [$catseq] '. 'of class ['. ref($catseq).
                      '] should be a Bio::PrimarySeqI ')
             unless $catseq->isa('Bio::PrimarySeqI');

         $self->throw('Trying to concatenate sequences with different alphabets: '.
                      $seq->display_id. '('. $seq->alphabet. ') and '.
                      $catseq->display_id. '('. $catseq->alphabet. ')')
             unless $catseq->alphabet eq $seq->alphabet;

         my $length=$seq->length;
         $seq->seq($seq->seq.$catseq->seq);

         # move annotations
         if ($seq->isa("Bio::AnnotatableI") and $catseq->isa("Bio::AnnotatableI")) {
             foreach my $key ( $catseq->annotation->get_all_annotation_keys() ) {

                 foreach my $value ( $catseq->annotation->get_Annotations($key) ) {
                     $seq->annotation->add_Annotation($key, $value);
                 }
             }
         }

         # move SeqFeatures
         if ( $seq->isa('Bio::SeqI') and $catseq->isa('Bio::SeqI')) {
             for my $feat ($catseq->get_SeqFeatures) {
                 $seq->add_SeqFeature($self->_coord_adjust($feat, $length));
             }
         }

     }
     1;
}

=head2 _coord_adjust

   Title   : _coord_adjust
   Usage   : my $newfeat=Bio::SeqUtils->_coord_adjust($feature, 100);
   Function: Recursive subroutine to adjust the coordinates of a feature
             and all its subfeatures.
   Returns : A Bio::SeqFeatureI compliant object.
   Args    : A Bio::SeqFeatureI compliant object,
             the number of bases to add to the coordinates

=cut

sub _coord_adjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object [$feat] '. 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	push @adjsubfeat, Bio::SeqUtils->_coord_adjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     map {
	my @coords=($_->start, $_->end);
	map s/(\d+)/$add+$1/ge, @coords;
	$_->start(shift @coords);
	$_->end(shift @coords);
     } @loc;
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
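
For anyone who wants to see the coordinate shift without a BioPerl install,
here is a rough sketch of the same idea in plain Perl. The hash-based
feature records, the 'parts' key and the shift_coords name are all
hypothetical; unlike _coord_adjust above, this version returns shifted
copies instead of modifying the feature it is given.

```perl
use strict;
use warnings;

# Returns a copy of a feature record with start/end moved by $offset,
# recursing into any nested records stored under the 'parts' key.
sub shift_coords {
    my ($feat, $offset) = @_;
    my %moved = %$feat;
    $moved{start} += $offset;
    $moved{end}   += $offset;
    $moved{parts} = [ map { shift_coords($_, $offset) } @{ $feat->{parts} || [] } ];
    return \%moved;
}

# A feature on the second sequence, in that sequence's own coordinates.
my $gene = {
    name  => 'gene1',
    start => 5,
    end   => 20,
    parts => [
        { name => 'exon1', start => 5,  end => 10 },
        { name => 'exon2', start => 15, end => 20 },
    ],
};

# The first sequence was 100 residues long, so everything moves by 100.
my $moved = shift_coords($gene, 100);
print "$moved->{name}: $moved->{start}-$moved->{end}\n";
print "  $_->{name}: $_->{start}-$_->{end}\n" for @{ $moved->{parts} };
```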

From lupey+ at pitt.edu  Fri Jan 27 07:52:03 2006
From: lupey+ at pitt.edu (Paul G Cantalupo)
Date: Fri, 27 Jan 2006 07:52:03 -0500 (EST)
Subject: [Bioperl-l] How to search Bioperl-l archives
Message-ID: 

Hello,

Is there a better way to search the bioperl-l archives other than 
searching in each Archive listed on 
http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the 
best answer either.

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer

From jason.stajich at duke.edu  Fri Jan 27 15:48:00 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 15:48:00 -0500
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
References: 
Message-ID: <91EF5237-8A86-40FA-8126-D953DE28DD69@duke.edu>

Google is the best answer we've got...
site:bioperl.org +pipermail +bioperl-l YOUR TERM

We will try to set up the swish-indexed archive again on the new server
when there is time.  I don't think I'm going to have time for quite a
while; if someone volunteers to help ChrisD and me with the sysadmin
work, it can of course get done sooner.  The old site is
http://search.open-bio.org but I don't think the indexes have been
updated in a while.

-jason

On Jan 27, 2006, at 7:52 AM, Paul G Cantalupo wrote:

> Hello,
>
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is  
> not the
> best answer either.
>
> Thank you,
>
> Paul
>
>
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
>
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Fri Jan 27 15:57:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 14:57:50 -0600
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
Message-ID: <000001c62384$555ea8c0$15327e82@pyrimidine>

There's a link from this page:

http://www.bioperl.org/wiki/Mailing_lists

Two different searches are shown for bioperl-l : Google and Open-Bio.  I use
the Open-Bio b/c of its sorting capabilities (I haven't tried fooling around
with the Google interface yet).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul G Cantalupo
> Sent: Friday, January 27, 2006 6:52 AM
> To: bioperl-l
> Subject: [Bioperl-l] How to search Bioperl-l archives
> 
> Hello,
> 
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the
> best answer either.
> 
> Thank you,
> 
> Paul
> 
> 
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
> 
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer

From cjfields at uiuc.edu  Fri Jan 27 16:02:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 15:02:59 -0600
Subject: [Bioperl-l] RNAMotif parser
Message-ID: <000101c62385$0ddfc870$15327e82@pyrimidine>

Jason,

I have been fiddling with an RNAMotif parser and an ERPIN parser for a
number of years now; I plan on releasing it for inclusion in bioperl or
bioperl-run.  Right now, I think I may base them somewhat on your
Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From rahall2 at ualr.edu  Fri Jan 27 15:34:47 2006
From: rahall2 at ualr.edu (Roger Hall)
Date: Fri, 27 Jan 2006 14:34:47 -0600
Subject: [Bioperl-l] Requesting your issues with
	Module:Bio::Tools::Run::RemoteBlast
Message-ID: <008001c62381$1d844980$d416a790@LIBERAL>

All,

I have a fun little application written around this module to track new hits
for my favorite sequences, but it stopped working some time ago, so I have
finally adopted this orphaned module.

I have received very specific suggestions from Jason and Chris for
implementation, and plan to follow them in order to at least bring this
module into the wonderful world of XML. I would appreciate it if you would
send any additional features (and any known issues) my way.

Thanks!

Roger Hall

Technical Director

MidSouth Bioinformatics Center

University of Arkansas at Little Rock

(501) 569-8074

From cjfields at uiuc.edu  Fri Jan 27 20:03:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:03:15 -0600
Subject: [Bioperl-l] Requesting your issues
	withModule:Bio::Tools::Run::RemoteBlast
In-Reply-To: <008001c62381$1d844980$d416a790@LIBERAL>
Message-ID: <001101c623a6$9eb652d0$15327e82@pyrimidine>

The only real change made to RemoteBlast.pm was to the save_output method;
it wasn't saving XML output because the regex used to check the tempfile
output:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||
			 $l =~/^RPS-BLAST\s*.+$/i ) {
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

didn't check for XML.  I just added a check for XML that is the same as the
XML format check in the retrieve_blast method:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||  # NCBI BLAST
			$l =~/^RPS-BLAST\s*.+$/i || # RPS BLAST
			$l =~/<\?xml version=/) { # NCBI BLAST XML output
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

There is probably a better way to do this, but it works for now.  All other
fixes were made to SearchIO::blast.  That module is where most of the work
is done and which 'broke' recently from the BLAST version change at NCBI.
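
As a rough, standalone sketch of that check (not the actual RemoteBlast.pm
code), the three patterns can be pulled into a small predicate and tested
on a few lines. The function names is_report_top and strip_preamble and the
sample lines are made up for illustration, and the first 'next if' test
from the loop above is left out because its pattern did not survive in this
message.

```perl
use strict;
use warnings;

# True if the line looks like the first line of a BLAST report,
# using the three patterns quoted above.
sub is_report_top {
    my ($line) = @_;
    return 1 if $line =~ /^(?:[T]?BLAST[NPX])\s*.+$/i;   # NCBI BLAST text
    return 1 if $line =~ /^RPS-BLAST\s*.+$/i;            # RPS BLAST
    return 1 if $line =~ /<\?xml version=/;              # NCBI BLAST XML
    return 0;
}

# Keeps everything from the first recognized top line onwards.
sub strip_preamble {
    my @lines = @_;
    my $seentop = 0;
    my @kept;
    for my $l (@lines) {
        $seentop = 1 if !$seentop && is_report_top($l);
        push @kept, $l if $seentop;
    }
    return @kept;
}

my @text = strip_preamble("junk\n", "BLASTN 2.2.12 [Aug-07-2005]\n", "Query= test\n");
my @xml  = strip_preamble("junk\n", "<?xml version=\"1.0\"?>\n", "<BlastOutput>\n");
print scalar(@text), " text lines kept, ", scalar(@xml), " XML lines kept\n";
```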

The only things I can think of at the moment are things that Jason
mentioned, switching to XML as the default (I agree with) and possibly
incorporating the netblast client (blastcl3).  It might be possible to
branch off a similar module specifically geared towards the blastcl3 client,
maybe acting as a wrapper to parse the returned data using SearchIO, but I
don't necessarily think it would be best to include in RemoteBlast. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Roger Hall
> Sent: Friday, January 27, 2006 2:35 PM
> To: Bioperl-L
> Subject: [Bioperl-l] Requesting your issues
> withModule:Bio::Tools::Run::RemoteBlast
> 
> All,
> 
> 
> 
> I have a fun little application written around this module to track new
> hits
> for my favorite sequences, but it stopped working some time ago, so I have
> finally adopted this orphaned module.
> 
> 
> 
> I have received very specific suggestions from Jason and Chris for
> implementation, and plan to follow them in order to at least bring this
> module into the wonderful world of XML. I would appreciate it if you would
> send any additional features (and any known issues) my way.
> 
> 
> 
> Thanks!
> 
> 
> 
> Roger Hall
> 
> Technical Director
> 
> MidSouth Bioinformatics Center
> 
> University of Arkansas at Little Rock
> 
> (501) 569-8074
> 

From torsten.seemann at infotech.monash.edu.au  Fri Jan 27 20:30:34 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 12:30:34 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000101c62385$0ddfc870$15327e82@pyrimidine>
References: <000101c62385$0ddfc870$15327e82@pyrimidine>
Message-ID: <43DAC93A.1000208@infotech.monash.edu.au>

Chris,

> I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> number of years now; I plan on releasing it for inclusion in bioperl or
> bioperl-run.  Right now, I think I may base them somewhat on your
> Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
> or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

 From my understanding, a module to _parse the output_ of some TOOL goes 
in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in 
Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL 
takes the TOOL output and creates a Bio::Tools::TOOL object with the 
result in it as a convenience.
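
A toy sketch of that split, in case it helps. The package names
My::Tools::Example and My::Tools::Run::Example and the tab-separated
"output" are invented for illustration; a real wrapper would execute the
external program instead of faking its output.

```perl
use strict;
use warnings;

# Hypothetical parser module: the kind of code that would live in the
# core distribution under a name like Bio::Tools::TOOL.
package My::Tools::Example;

sub new {
    my ($class, %args) = @_;
    my @lines = split /\n/, $args{-text};
    my $self  = { lines => \@lines, pos => 0 };
    return bless $self, $class;
}

# Returns the next hit as a hash reference, or undef when exhausted.
sub next_hit {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{lines} };
    my ($name, $score) = split /\t/, $self->{lines}[ $self->{pos}++ ];
    return { name => $name, score => $score };
}

# Hypothetical wrapper module: the kind of code that would live in
# bioperl-run under a name like Bio::Tools::Run::TOOL.
package My::Tools::Run::Example;

sub new {
    my ($class) = @_;
    my $self = {};
    return bless $self, $class;
}

# Fakes the tool output so the sketch runs without any tool installed,
# then hands it to the parser and returns the parser object.
sub run {
    my ($self, $seq) = @_;
    my $output = "motif1\t" . length($seq) . "\n";
    return My::Tools::Example->new(-text => $output);
}

package main;

my $parser = My::Tools::Run::Example->new->run('ACGU');
my $hit    = $parser->next_hit;
print "$hit->{name} $hit->{score}\n";
```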

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From cjfields at uiuc.edu  Fri Jan 27 20:47:48 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:47:48 -0600
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DAC93A.1000208@infotech.monash.edu.au>
Message-ID: <000001c623ac$d7d07db0$15327e82@pyrimidine>

Yeah, forgot about that.  I just remember a discussion at one point a while
back about splitting off sections of bioperl core b/c some thought
bioperl-core was getting too big; I didn't want to get too deep into writing
code w/o asking.  Okay, then, that's settled.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: Torsten Seemann [mailto:torsten.seemann at infotech.monash.edu.au]
> Sent: Friday, January 27, 2006 7:31 PM
> To: Chris Fields
> Cc: 'bioperl-ml List'
> Subject: Re: [Bioperl-l] RNAMotif parser
> 
> Chris,
> 
> > I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> > number of years now; I plan on releasing it for inclusion in bioperl or
> > bioperl-run.  Right now, I think I may base them somewhat on your
> > Bio::Tools::QRNA module.  Should they be in bioperl
> (Bio::Tools::RNAMotif)
> > or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?
> 
>  From my understanding, a module to _parse the output_ of some TOOL goes
> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
> Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
> takes the TOOL output and creates a Bio::Tools::TOOL object with the
> result in it as a convenience.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

From torsten.seemann at infotech.monash.edu.au  Sat Jan 28 05:04:30 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 21:04:30 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000001c623ac$d7d07db0$15327e82@pyrimidine>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
Message-ID: <43DB41AE.30002@infotech.monash.edu.au>

>> From my understanding, a module to _parse the output_ of some TOOL goes
>>in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
>>takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>result in it as a convenience.

> Yeah, forgot about that.  I just remember a discussion at one point a while
> back about splitting off sections of bioperl core b/c some thought
> bioperl-core was getting too big; I didn't want to get too deep into writing
> code w/o asking.  Okay, then, that's settled.  

I think this is still true. Anything in Bio::Tools::Run namespace should 
be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast which 
are in bioperl-live core due to popularity). All the output parsers are 
in bioperl-live core.

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From jason.stajich at duke.edu  Sat Jan 28 11:06:06 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat, 28 Jan 2006 11:06:06 -0500
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DB41AE.30002@infotech.monash.edu.au>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
	<43DB41AE.30002@infotech.monash.edu.au>
Message-ID: 

exactly!
On Jan 28, 2006, at 5:04 AM, Torsten Seemann wrote:

>>> From my understanding, a module to _parse the output_ of some  
>>> TOOL goes
>>> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>> Bio::Tools::Run::TOOL. Usually the run() method in  
>>> Bio::Tools::Run::TOOL
>>> takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>> result in it as a convenience.
>
>> Yeah, forgot about that.  I just remember a discussion at one  
>> point a while
>> back about splitting off sections of bioperl core b/c some thought
>> bioperl-core was getting too big; I didn't want to get too deep  
>> into writing
>> code w/o asking.  Okay, then, that's settled.
>
> I think this is still true. Anything in Bio::Tools::Run namespace  
> should
> be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast  
> which
> are in bioperl-live core due to popularity). All the output parsers  
> are
> in bioperl-live core.
>
>
> -- 
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From golharam at umdnj.edu  Sun Jan 29 12:48:34 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sun, 29 Jan 2006 12:48:34 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
Message-ID: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>

I can't figure this out from the documentation.  In fact, I'm not sure
it's possible:

I have a bunch of clustalw alignments in GCG (MSF) format.  Each
alignment consists of three sequences.  I want to get the sequences
including the gaps from the alignment.  

I'm trying to use Bio::AlignIO to read the alignment file, then trying
to get each sequence from the alignment. I tried doing this:

$seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
"align$x.clustalw");
my $aln = $seqio->next_aln();
$seq1 = $aln->next_seq()->seq;

Getting the sequence from the alignment isn't working and I'm not sure
how to do it.  Does anyone have any ideas as to what I might try?

--
Ryan Golhar  -  golharam at umdnj.edu
The Informatics Institute of UMDNJ

From cjfields at uiuc.edu  Sun Jan 29 14:44:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Jan 2006 13:44:22 -0600
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <294C9886-277B-4C35-AF7F-D6ABB3B401A3@uiuc.edu>

Even though you used clustalw for aligning the sequences, the output  
format is GCG (msf) and not clustalw (aln) format, so you need to  
change the '-format' flag you have set:

> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");

to

> $seqio = Bio::AlignIO->new(-format => 'msf', -file =>
> "align$x.clustalw");

See if that works.

On Jan 29, 2006, at 11:48 AM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason.stajich at duke.edu  Sun Jan 29 14:49:20 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun, 29 Jan 2006 14:49:20 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw', -file => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have
implemented for alignments, since you can get them all out with the
each_seq method.
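
If it helps to see the column-counting idea from the SYNOPSIS without any
alignment file at hand, here is a plain-Perl sketch. The gapped strings,
the %aligned hash and the column_counts function are hypothetical stand-ins
for what each_seq and subseq would give you; positions are 1-based to match
subseq.

```perl
use strict;
use warnings;

# Hypothetical gapped rows standing in for the sequences of one alignment.
my %aligned = (
    org1 => 'ACG-TA',
    org2 => 'AC--TA',
    org3 => 'ACGGTA',
);

# Counts the characters seen in one alignment column (1-based position).
sub column_counts {
    my ($rows, $pos) = @_;
    my %count;
    $count{ substr($_, $pos - 1, 1) }++ for values %$rows;
    return \%count;
}

my $col4 = column_counts(\%aligned, 4);
print join(', ', map { "$_=$col4->{$_}" } sort keys %$col4), "\n";
```

The gap characters are kept as they are, which is what you want when the
goal is to get the sequences out including the gaps.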

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From biophp at biophp.org  Fri Jan 27 08:20:31 2006
From: biophp at biophp.org (Joseba Bikandi)
Date: Fri, 27 Jan 2006 08:20:31 -0500
Subject: [Bioperl-l] BioPHP.org - open source repository of code and scripts
Message-ID: 

Dear Sir/Madam,

I would like to let you know about biophp.org, 
an open source project which may be interesting 
for you. It is a new project which includes 
PHP code (functions) and minitools (copy and
paste one page scripts). 

Sincerely,

......
Joseba Bikandi
biophp at biophp.org

From golharam at umdnj.edu  Mon Jan 30 12:40:58 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Jan 2006 12:40:58 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>
Message-ID: <003701c625c4$5527d790$2f01a8c0@GOLHARMOBILE1>

Thanks.  Here's what I ended up doing:

$seqio = Bio::AlignIO->new(-format => 'msf', -file =>
"alnfile.clustalw");
my $aln = $seqio->next_aln();
@_ = $aln->each_seq_with_id('org1');
$seq1 = $_[0]->seq;

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
Sent: Sunday, January 29, 2006 2:49 PM
To: golharam at umdnj.edu
Cc: 'bioperl-l'
Subject: Re: [Bioperl-l] Parsing clustalw alignments

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw', -file => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have
implemented for alignments, since you can get them all out with the
each_seq method.

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure

> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each 
> alignment consists of three sequences.  I want to get the sequences 
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying

> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file => 
> "align$x.clustalw"); my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure

> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From alindeman at gmail.com  Mon Jan 30 23:00:32 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Mon, 30 Jan 2006 22:00:32 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
References: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
Message-ID: <3f3ecb5a0601302000j7a3fbd4y1739a3c1696e30aa@mail.gmail.com>

Hi all--

Is it possible to use two different glyphs (or the same glyph with
different properties) on the same panel track?

Thanks

--A

From Marc.Logghe at DEVGEN.com  Tue Jan 31 03:08:09 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Tue, 31 Jan 2006 09:08:09 +0100
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>

Hi Andy
> Is it possible to use two different glyphs (or the same glyph 
> with different properties) on the same panel track?
Sure it is. This extract comes from the docs of Bio::Graphics::Panel

" There are a large number of glyph types.  By default, each track will
be homogeneous on a single glyph type, but you can mix several glyph
types on the same track by providing a code reference to the -glyph
argument.  Other options passed to add_track() control the color and
size of the glyphs, whether they are allowed to overlap, and other
formatting attributes.  The height of a track is determined from its
contents and cannot be directly influenced."
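
A small sketch of what such a code reference might look like, runnable
without Bio::Graphics. The My::Feature class is a made-up stand-in for a
real feature object, and I am assuming here that the callback is handed the
feature as its first argument and returns a glyph name; 'transcript2' and
'generic' are used as example glyph names.

```perl
use strict;
use warnings;

# Minimal stand-in for a feature object; real features would come from BioPerl.
package My::Feature;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub primary_tag { return $_[0]{tag} }

package main;

# Code reference of the kind that could be given as the -glyph argument:
# it inspects each feature and returns the name of the glyph to draw.
my $choose_glyph = sub {
    my ($feature) = @_;
    return $feature->primary_tag eq 'gene' ? 'transcript2' : 'generic';
};

for my $tag (qw(gene repeat)) {
    my $f = My::Feature->new(tag => $tag);
    print "$tag -> ", $choose_glyph->($f), "\n";
}
```

In a real script the same reference would be passed along with the other
track options, e.g. as the value of -glyph in the add_track() call.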

HTH,
Marc

From alindeman at gmail.com  Tue Jan 31 14:59:00 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Tue, 31 Jan 2006 13:59:00 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
Message-ID: <3f3ecb5a0601311159k6d7f09d3j65732b5e72019e9d@mail.gmail.com>

Wonderful!

Thanks.

--A

On 1/31/06, Marc Logghe  wrote:
> Hi Andy
> > Is it possible to use two different glyphs (or the same glyph
> > with different properties) on the same panel track?
> Sure it is. This extract comes from the docs of Bio::Graphics::Panel
>
> " There are a large number of glyph types.  By default, each track will
> be homogeneous on a single glyph type, but you can mix several glyph
> types on the same track by providing a code reference to the -glyph
> argument.  Other options passed to add_track() control the color and
> size of the glyphs, whether they are allowed to overlap, and other
> formatting attributes.  The height of a track is determined from its
> contents and cannot be directly influenced."
>
> HTH,
> Marc
>

From hubert.prielinger at gmx.at  Tue Jan 24 20:49:07 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 14:49:07 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D63FB6.4090505@scitegic.com>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com>
Message-ID: <43D692C3.80306@gmx.at>

Hi,
thank you very much for the help. I have tried to run blastall on the 
command line, but I can't even execute the binary, even though the 
blastall executable has every permission set.
I always get the error message: blastall: cannot execute binary file
Does the executable need to be somewhere else, on another path? Right now 
it is located under /home/Hubert/blast/blast-2.2.13/bin
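
A side note on the number in the exception quoted further down: assuming the 32256 after "blastall call crashed:" is the raw wait status returned by Perl's system() (the thread does not say so explicitly), it can be decoded in the shell. This is only a sketch, and decode_status is a made-up helper name, not part of BLAST or BioPerl:

```shell
#!/bin/sh
# Sketch only: decode_status is a hypothetical helper.
# A raw wait status keeps the exit code in the high byte and the
# terminating signal (if any) in the low 7 bits.
decode_status() {
    status=$1
    echo "exit=$((status >> 8)) signal=$((status & 127))"
}

decode_status 32256   # 32256 = 126 * 256
```

This prints exit=126 signal=0. Exit code 126 is what a POSIX shell returns when a command was found but could not be executed, which fits the "cannot execute binary file" line from sh.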

thanks
Hubert

Scott Markel wrote:

> Hubert,
>
> If you look at the MSG line in the exception you can see
> exactly what the command line was.  Nagesh is pointing out
> that you used -d "/nr" and asking if that's what you want.
> I suspect that the '/' shouldn't be there.
>
> Try invoking blastall directly from the command line.  All
> BioPerl is doing is invoking BLAST on your behalf.  The
> same command line that BioPerl uses should also work for
> you on the command line.
>
> Scott
>
> Hubert Prielinger wrote:
>
>> hi,
>> sorry, but what do you mean by asking whether my blast database is in /nr?
>> My database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>
>>
>>
>> Nagesh Chakka wrote:
>>
>>> Can you just run the blast from the command line?
>>> Is your blast database in "/nr"?
>>>
>>> Hubert Prielinger wrote:
>>>
>>>> Hi Nagesh,
>>>> thank you very much. I put my database into the data folder, ran 
>>>> the program and got the following error message:
>>>>
>>>> submit Sequence...just do it....
>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>> binary file
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: blastall call crashed: 32256 
>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>> 1000
>>>>
>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>> STACK toplevel 
>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>
>>>>
>>>> --------------------------------------
>>>>
>>>> Why did it not find my binary file? It is there.
>>>>
>>>> regards
>>>>
>>>> Nagesh Chakka wrote:
>>>>
>>>>> Hi,
>>>>> The following is from the StandAloneBlast.pm documentation
>>>>> " If the databases which will be searched by BLAST are located in the
>>>>> data subdirectory of the blast program directory (the default
>>>>> installation location), StandAloneBlast will find them; however, 
>>>>> if the
>>>>> database files are located in any other location, environmental 
>>>>> variable
>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>> Please note that I have not used this module before.
>>>>> Nagesh
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>  
>>>>>
>>>>>> Hi,
>>>>>> thank you very much for the help. Another question that comes up: 
>>>>>> do I have to give the path to the database files as well? I guess 
>>>>>> so, but how do I do that? The same way I give the path to the 
>>>>>> blast bin files?
>>>>>> Does anybody know how to set the composition-based statistics 
>>>>>> parameter?
>>>>>> Here is my code:
>>>>>>
>>>>>> #!/usr/bin/perl -w
>>>>>>
>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>> use Bio::Seq;
>>>>>> use Bio::SeqIO;
>>>>>> use strict;
>>>>>>
>>>>>> BEGIN
>>>>>> {
>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>> }
>>>>>>
>>>>>>
>>>>>> # parameters
>>>>>> my $expect_value = 20000;
>>>>>> #my $filter_query_sequence = 'F';
>>>>>> my $one_line_description = 1000;
>>>>>> my $alignments = 1000;
>>>>>> # my $strands = 1;
>>>>>> my $count = 1;
>>>>>>
>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>> #my $progress_interval = 100;
>>>>>>
>>>>>>
>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>  -file   => "Perm.txt",
>>>>>>  -format => "raw",
>>>>>> );
>>>>>>
>>>>>> # create factory object and set parameters
>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>
>>>>>> $factory->e($expect_value);
>>>>>> #$factory->F($filter_query_sequence);
>>>>>> $factory->v($one_line_description);
>>>>>> $factory->b($alignments);
>>>>>> #$factory->S($strands);
>>>>>>
>>>>>>
>>>>>> # get query
>>>>>>
>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>      my $filename = "comp_$count.txt";
>>>>>>      # set the output file before running the search;
>>>>>>      # 'my' in front of a method call is a syntax error
>>>>>>      $factory->outfile($filename);
>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>      print $query->seq;
>>>>>>      print "\n";
>>>>>>
>>>>>>      $count++;
>>>>>> }
>>>>>>
>>>>>> thank you very much in advance
>>>>>> Hubert
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nagesh Chakka wrote:
>>>>>>
>>>>>>  
>>>>>>
>>>>>>> Hi Hubert,
>>>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>> the following files:
>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal. 
>>>>>>> I have no trouble running standalone blast. You are 
>>>>>>> not required to run formatdb on the downloaded blast databases, 
>>>>>>> and that may be the reason why the sequences are not included, as 
>>>>>>> it also reduces the size of the file.
>>>>>>> Did you try to run a blast search? If so, is it giving you any 
>>>>>>> errors?
>>>>>>> Nagesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hubert Prielinger wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>> locally. Now I'm supposed to index the database with formatdb, 
>>>>>>>> but it doesn't work...
>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>> for use in searching the database, but when I uncompressed the 
>>>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd files....
>>>>>>>> Is there anybody who can tell me how to use formatdb with the 
>>>>>>>> nr database...
>>>>>>>>
>>>>>>>> Help is much appreciated
>>>>>>>> Thank you very much in advance
>>>>>>>>
>>>>>>>> Hubert
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>       
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  
>>>>>
>>>>
>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>

From hubert.prielinger at gmx.at  Tue Jan 24 21:15:38 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:15:38 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D698FA.3090904@gmx.at>

hi alex,
I did as you recommended and got the following output:

[Hubert@ppc7 ~]$ file /home/Hubert/blast/blast-2.2.13/bin/blastall
/home/Hubert/blast/blast-2.2.13/bin/blastall: ELF 64-bit LSB executable, 
AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked 
(uses shared libs), for GNU/Linux 2.4.1, not stripped
[Hubert@ppc7 ~]$

Does this mean that it is compatible with the operating system?
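
Not a definitive answer, but a rough way to read that output: the word size reported by file has to be one the machine can run. A minimal sketch, assuming a Linux system; elf_bits, arch_bits and can_run are made-up helper names, and the list of machine names is incomplete:

```shell
#!/bin/sh
# Sketch only: elf_bits, arch_bits and can_run are hypothetical helpers.

# Pull "32" or "64" out of a `file` description such as
# "ELF 64-bit LSB executable, AMD x86-64, ...". Prints nothing for non-ELF input.
elf_bits() {
    printf '%s\n' "$1" | sed -n 's/.*ELF \([0-9][0-9]\)-bit.*/\1/p'
}

# Map a `uname -m` machine name to its word size (common Linux names only).
arch_bits() {
    case "$1" in
        x86_64|amd64|aarch64|ppc64|ppc64le|s390x|sparc64) echo 64 ;;
        i?86|armv7l|ppc) echo 32 ;;
        *) echo unknown ;;
    esac
}

# A 64-bit binary cannot run on a 32-bit system; the reverse often works.
# Only the word size is compared here, not the CPU family.
can_run() {
    binary_bits=$1
    machine_bits=$2
    if [ "$binary_bits" = 64 ] && [ "$machine_bits" = 32 ]; then
        echo no
    else
        echo maybe
    fi
}

desc='ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV)'
can_run "$(elf_bits "$desc")" "$(arch_bits i686)"
```

On the real machine the two inputs would come from `file -b` on the blastall binary and from `uname -m`. The CPU family has to match as well (the SPARC binary in Alex's example would not run on x86 hardware), which this sketch does not check.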

thanks for help
Hubert

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex

From hubert.prielinger at gmx.at  Tue Jan 24 21:24:51 2006
From: hubert.prielinger at gmx.at (Hubert Prielinger)
Date: Tue, 24 Jan 2006 15:24:51 -0600
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D6B09A.3040207@atgc.org>
References: <43D54838.5050301@gmx.at>	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>	<43D58D06.5080501@anu.edu.au>
	<43D585CF.5070902@gmx.at>	<43D63FB6.4090505@scitegic.com>
	<43D692C3.80306@gmx.at> <43D6B09A.3040207@atgc.org>
Message-ID: <43D69B23.9010100@gmx.at>

Hi,
I'm very sorry for wasting your time, but I just figured out what 
happened: I have installed the 64 bit version and not the 32 bit version....
Sorry for the inconvenience and thanks for the help....
I'm now trying to fix the problem with the database....
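
One way to check the database side from the shell, as a sketch only: it assumes that the -d "/nr" in the exception came from an empty data directory being joined with the database name (the thread does not confirm how StandAloneBlast builds that path), and db_path and has_protein_db are made-up helper names:

```shell
#!/bin/sh
# Sketch only: db_path and has_protein_db are hypothetical helpers.

# Join a data directory and a database name with a slash.
# An empty directory yields "/nr", the same string seen after -d in the exception.
db_path() {
    printf '%s/%s\n' "$1" "$2"
}

# Succeed if directory $1 holds a protein index (.pin) or alias file (.pal)
# for the database named $2.
has_protein_db() {
    [ -e "$1/$2.pin" ] || [ -e "$1/$2.pal" ]
}

data_dir=${BLASTDATADIR:-}
echo "database path would be: $(db_path "$data_dir" nr)"
if has_protein_db "$data_dir" nr; then
    echo "nr index files found"
else
    echo "no nr index files in '$data_dir'"
fi
```

With BLASTDATADIR exported as /home/Hubert/blast/blast-2.2.13/data (the directory mentioned earlier in the thread), the first line should show that directory in front of nr.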

Sorry
Hubert

Alexander Kozik wrote:

> try Unix command "file", for example:
>
>
> bash-2.03$ file /usr/local/genome/bin/blastall
>
> /usr/local/genome/bin/blastall: ELF 64-bit MSB executable SPARCV9 
> Version 1, UltraSPARC1 Extensions Required, dynamically linked, stripped
>
> bash-2.03$
>
> it will tell if it's compatible with the operating system
>
> -Alex

From smarkel at scitegic.com  Tue Jan 24 22:09:57 2006
From: smarkel at scitegic.com (Scott Markel)
Date: Tue, 24 Jan 2006 14:09:57 -0800
Subject: [Bioperl-l] formatdb with the nr database
In-Reply-To: <43D692C3.80306@gmx.at>
References: <43D54838.5050301@gmx.at>
	<43D5693C.1020805@anu.edu.au>	<43D56203.2060806@gmx.at>
	<1138062266.2534.2.camel@vogon>	<43D571B1.3020008@gmx.at>
	<43D58D06.5080501@anu.edu.au> <43D585CF.5070902@gmx.at>
	<43D63FB6.4090505@scitegic.com> <43D692C3.80306@gmx.at>
Message-ID: <43D6A5B5.8090106@scitegic.com>

Hubert,

Since you can't run blastall on the command line, your initial
problem has nothing to do with BioPerl.  Once you get blastall
working on the command line, you'll know what directories and
environment variable settings to use when running via BioPerl.
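
A sketch of rebuilding that command line by hand, using only the flags shown in the MSG line of the exception. blast_cmd is a made-up helper, query.txt and out.txt are placeholder file names, and the database is given as plain nr on the assumption that the leading slash was unintended:

```shell
#!/bin/sh
# Sketch only: blast_cmd is a hypothetical helper. It prints the command line
# instead of running it, so the string can be inspected first.
blast_cmd() {
    # $1 = database, $2 = query file, $3 = output file
    printf '%s -p blastp -d "%s" -i %s -e 20000 -o %s -v 1000 -b 1000\n' \
        /home/Hubert/blast/blast-2.2.13/bin/blastall "$1" "$2" "$3"
}

blast_cmd nr query.txt out.txt
```

Pasting the printed line into the shell gives the same invocation BioPerl attempted, apart from the database argument and the temporary file names.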

What happens when you run the following?

   file /home/Hubert/blast/blast-2.2.13/bin/blastall

Is the executable the correct one for your operating system?

Scott

Hubert Prielinger wrote:

> Hi,
> thank you very much for the help. I have tried to run blastall on the 
> command line, but I can't even execute the binary, even though the 
> blastall executable has every permission set.
> I always get the error message: blastall: cannot execute binary file
> Does the executable need to be somewhere else, on another path? Right now 
> it is located under /home/Hubert/blast/blast-2.2.13/bin
> 
> thanks
> Hubert
> 
> 
> 
> 
> 
> Scott Markel wrote:
> 
>> Hubert,
>>
>> If you look at the MSG line in the exception you can see
>> exactly what the command line was.  Nagesh is pointing out
>> that you used -d "/nr" and asking if that's what you want.
>> I suspect that the '/' shouldn't be there.
>>
>> Try invoking blastall directly from the command line.  All
>> BioPerl is doing is invoking BLAST on your behalf.  The
>> same command line that BioPerl uses should also work for
>> you on the command line.
>>
>> Scott
>>
>> Hubert Prielinger wrote:
>>
>>> hi,
>>> sorry, but what do you mean with is your blast database in /nr...
>>> my database is located in the path /home/Hubert/blast/blast-2.2.13/data
>>>
>>>
>>>
>>> Nagesh Chakka wrote:
>>>
>>>> Can you just run the blast from the command line.
>>>> Is your blast database in "/nr".
>>>>
>>>> Hubert Prielinger wrote:
>>>>
>>>>> Hi Nagesh,
>>>>> thank you very much, I put my database into the data folder, run 
>>>>> the program and got the following error message:
>>>>>
>>>>> submit Sequence...just do it....
>>>>> sh: /home/Hubert/blast/blast-2.2.13/bin/blastall: cannot execute 
>>>>> binary file
>>>>>
>>>>> ------------- EXCEPTION  -------------
>>>>> MSG: blastall call crashed: 32256 
>>>>> /home/Hubert/blast/blast-2.2.13/bin/blastall -p  blastp  -d  "/nr"  
>>>>> -i  /tmp/QTZfYMbgLM  -e  20000  -o  /tmp/v3YwWvONZ1  -v  1000  -b  
>>>>> 1000
>>>>>
>>>>> STACK Bio::Tools::Run::StandAloneBlast::_runblast 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:759
>>>>> STACK Bio::Tools::Run::StandAloneBlast::_generic_local_blast 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:706
>>>>> STACK Bio::Tools::Run::StandAloneBlast::blastall 
>>>>> /usr/lib/perl5/site_perl/5.8.6/Bio/Tools/Run/StandAloneBlast.pm:557
>>>>> STACK toplevel 
>>>>> /home/Hubert/installed/eclipse/workspace/Database_Search/standalone_blast.pl:46 
>>>>>
>>>>>
>>>>> --------------------------------------
>>>>>
>>>>> Why it did not find my binary file, but it is there
>>>>>
>>>>> regards
>>>>>
>>>>> Nagesh Chakka wrote:
>>>>>
>>>>>> Hi,
>>>>>> The following is from the StandAloneBlast.pm documentation
>>>>>> " If the databases which will be searched by BLAST are located in the
>>>>>> data subdirectory of the blast program directory (the default
>>>>>> installation location), StandAloneBlast will find them; however, 
>>>>>> if the
>>>>>> database files are located in any other location, environmental 
>>>>>> variable
>>>>>> $BLASTDATADIR will need to be set to point to that directory."
>>>>>> Please note that I have not used this module before.
>>>>>> Nagesh
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, 2006-01-23 at 17:08 -0600, Hubert Prielinger wrote:
>>>>>>  
>>>>>>
>>>>>>> Hi,
>>>>>>> thank you very much for the help, another questions that raises 
>>>>>>> up, do I have to write the path to the database files as well, I 
>>>>>>> guess so, but how I do that, the same way I write the path to teh 
>>>>>>> blast bin files?
>>>>>>> Does anybody know how to set the Composition based statistics 
>>>>>>> parameter?
>>>>>>> there is my code:
>>>>>>>
>>>>>>> #!/usr/bin/perl -w
>>>>>>>
>>>>>>> use Bio::Tools::Run::StandAloneBlast;
>>>>>>> use Bio::Seq;
>>>>>>> use Bio::SeqIO;
>>>>>>> use strict;
>>>>>>>
>>>>>>> BEGIN
>>>>>>> {
>>>>>>>    $ENV{PATH}=":/home/Hubert/blast/blast-2.2.13/bin/:";
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> # parameters
>>>>>>> my $expect_value = 20000;
>>>>>>> #my $filter_query_sequence = 'F';
>>>>>>> my $one_line_description = 1000;
>>>>>>> my $alignments = 1000;
>>>>>>> # my $strands = 1;
>>>>>>> my $count = 1;
>>>>>>>
>>>>>>> my @params = ('program' => 'blastp', 'database' => 'nr');
>>>>>>> #my $progress_interval = 100;
>>>>>>>
>>>>>>>
>>>>>>> my $seqio_obj = Bio::SeqIO->new(
>>>>>>>  -file   => "Perm.txt",
>>>>>>>  -format => "raw",
>>>>>>> );
>>>>>>>
>>>>>>> # create factory object and set parameters
>>>>>>> my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
>>>>>>>
>>>>>>> $factory->e($expect_value);
>>>>>>> #$factory->F($filter_query_sequence);
>>>>>>> $factory->v($one_line_description);
>>>>>>> $factory->b($alignments);
>>>>>>> #$factory->S($strands);
>>>>>>>
>>>>>>>
>>>>>>> # get query
>>>>>>>
>>>>>>> while ( my $query = $seqio_obj->next_seq ) {
>>>>>>>      my $blast_report = $factory->blastall($query);
>>>>>>>      my $filename = "comp_$count.txt";
>>>>>>>      my $factory->outfile($filename);
>>>>>>>      print $query->seq;
>>>>>>>      print "\n";
>>>>>>>
>>>>>>>  $count++;
>>>>>>> }
>>>>>>>
>>>>>>> thank you very much in advance
>>>>>>> Hubert
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Nagesh Chakka wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Hi Hubert,
>>>>>>>> I downloaded the nr.00.tar.gz file a week ago. I was able to get 
>>>>>>>> the following files
>>>>>>>> .phr, .pin, .pnd, .pni, .ppd, .ppi, .psd, .psi, .psq, .pal 
>>>>>>>> files. I have no trouble in running standalone blast. You are 
>>>>>>>> not required to run formardb on the downloaded blast databases 
>>>>>>>> and that may be the reason why the sequences are not included as 
>>>>>>>> it will also reduce the size of the file.
>>>>>>>> Did you try to run a blast search, if so is it giving you any 
>>>>>>>> errors?
>>>>>>>> Nagesh
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hubert Prielinger wrote:
>>>>>>>>
>>>>>>>>  
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> I have downloaded the nr database for doing a blast search 
>>>>>>>>> locally, now I'm supposed to index the database with formatdb, 
>>>>>>>>> but it doesn't work...
>>>>>>>>> The online help says that you need a fasta file that is indexed 
>>>>>>>>> to use for searching the database, but when I uncompressed the 
>>>>>>>>> zip file, there were only .phr, .pnd, .pin, .pni, .ppd file....
>>>>>>>>> Is there anybody who can tell me, how to use formatdb with the 
>>>>>>>>> nr database...
>>>>>>>>>
>>>>>>>>> Help is very appreciated
>>>>>>>>> Thank you very much in advance
>>>>>>>>>
>>>>>>>>> Hubert
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioperl-l mailing list
>>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com

From cjfields at uiuc.edu  Tue Jan 24 22:21:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Jan 2006 16:21:22 -0600
Subject: [Bioperl-l] RemoteBlast.pm and Bio::SearchIO::blast.pm
	-partially resolved
In-Reply-To: <18966F80-B780-4661-953E-613B05B56164@duke.edu>
Message-ID: <000301c62134$81cdc500$15327e82@pyrimidine>

Jason, 

I have worked out all the problems with RemoteBlast.pm and posted a patched
version to Bugzilla (http://bugzilla.bioperl.org/show_bug.cgi?id=1935).  The
main problem was that RemoteBlast::save_output was not looking for XML
output when dumping from the tempfile to the saved file (it only looked for
the text header).  That is fixed.  The other problems mentioned were due to
differences in mapping key=>value pairs between blast and blastxml and a
problem in my own script.  It passed all tests using 'perl t/RemoteBlast.t'
with debugging set.

See if anybody else out there can test them out.
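
A minimal sketch of the kind of check described above, not the actual patch
posted to Bugzilla: decide whether a retrieved report is XML or plain text by
looking at its first non-blank line.  The sub name report_format and the
header patterns are assumptions for illustration; the text pattern assumes
reports that begin with the program name (e.g. BLASTN).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RemoteBlast.pm): guess the format of a
# BLAST report from its first non-blank line.
sub report_format {
    my ($content) = @_;
    for my $line (split /\n/, $content) {
        next unless $line =~ /\S/;                        # skip blank lines
        return 'xml'  if $line =~ /^\s*<\?xml/;           # XML declaration
        return 'text' if $line =~ /^\s*T?BLAST[NPX]\b/i;  # e.g. "BLASTN 2.2.13"
        return 'unknown';                                 # anything else
    }
    return 'unknown';                                     # empty input
}

print report_format("<?xml version=\"1.0\"?>\n<BlastOutput>\n"), "\n";
print report_format("BLASTN 2.2.13 [Nov-27-2005]\n"), "\n";
```

Only the first non-blank line is inspected, so the check stays cheap even for
large reports.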

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
> bounces at portal.open-bio.org] On Behalf Of Jason Stajich
> Sent: Tuesday, January 24, 2006 11:16 AM
> To: Chris Fields
> Cc: bioperl-ml List
> Subject: [Bioperl-l] Re: RemoteBlast.pm and Bio::SearchIO::blast.pm -
> partially resolved
> 
> Thanks Chris - I don't know when I'll have time to check in bugs so
> anyone else who has commit access feel free to give these a whirl and
> check in.
> 
> I would propose making the XML default but allowing the text version
> to still be supported in the event that someone has setup their own
> local NCBI BLAST Web interface which still supports the simple Text
> output.
> 
> -j
> 
> On Jan 24, 2006, at 12:09 PM, Chris Fields wrote:
> 
> > I submitted two bugs on Bugzilla to describe recent problems with
> > RemoteBlast.pm and SearchIO::blast.pm
> >
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> > http://bugzilla.bioperl.org/show_bug.cgi?id=1935
> >
> > Today I submitted a patched version of Bio::SearchIO::blast.pm which
> > should fix the text parsing issue for old (2.2.12) and new (2.2.13)
> > versions of NCBI's BLAST; the bug link above describes the problem and
> > the fix.  Problem is, I know it will likely break again b/c NCBI will
> > probably change text output in a future BLAST version.  I also agree
> > with Jason about changing the default for SearchIO to XML.  So, does
> > text output parsing through blast.pm need to be deprecated in favor of
> > XML, or should both be available?
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Tue Jan 24 21:44:34 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 16:44:34 -0500
Subject: [Bioperl-l] new mailing list server
Message-ID: <50E14815-266E-4ACB-8E6E-293C9EB33476@duke.edu>

Chris Dagdigian has switched our mailing lists over to a new server
to upgrade us to newer hardware.  With the switch, the default mailing
list server name is now 'lists.open-bio.org' instead of
'portal.open-bio.org'.  That should be the only change you notice at
the bottom of your mails.  All mail should get delivered to any of
those addresses (although @bioperl.org is preferred).

We hope this changeover will help improve the performance and  
scalability of our mail and webservices.

We also will aim to move the developer read-write CVS server to a new  
machine in the coming weeks.  We hope this will only be a minor  
inconvenience but will allow us to move to a more recent operating  
system and larger disk space.

If you have questions or concerns they can be directed to support AT  
open-bio.org
-jason
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From jason.stajich at duke.edu  Wed Jan 25 03:31:38 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Tue, 24 Jan 2006 22:31:38 -0500
Subject: [Bioperl-l] new website launched
Message-ID: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>

I am pleased to announce the release of a new website for BioPerl.   
The site is based on the mediawiki software that was developed for  
the wikipedia project.  We intend the site to be a place for  
community input on documentation and design for the BioPerl project.   
There is also a fair amount of documentation started surrounding  
bioinformatics tools and techniques applicable to using BioPerl and  
some of the authors who created these resources.

The website continues to be at the URL http://www.bioperl.org.  The  
DNS updates may take up to 24 hours to reach everyone.

The initial content of the site is the result of the work of myself,
Mauricio Herrera Cuadra, Brian Osborne, and Torsten Seemann.  We
encourage you to contribute to the site's content by signing up for
an account.

There are several style guides for the site, including how to link to
module pages, which can carry additional information beyond the POD,
for example:
http://bioperl.org/wiki/Module:Bio::SeqIO

You'll notice that many of the paths have changed, but the DIST and
SRC directories continue to be available at http://bioperl.org/DIST
and http://bioperl.org/SRC.  The HOWTOs are now available from
http://bioperl.org/wiki/HOWTOs

The FAQ is available at http://bioperl.org/wiki/FAQ and I encourage  
you to add your questions to it so they can be properly archived and  
addressed.

We also have initiated a News site for Bioperl for posting  
announcements regarding development and software.  I would like to  
see if there are volunteers to post weekly or monthly summaries of  
mailing list traffic and development.
http://www.bioperl.org/news/

Jason Stajich on behalf of Mauricio Herrera Cuadra, Brian Osborne,  
Torsten Seemann.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From roy at colibase.bham.ac.uk  Wed Jan 25 17:05:29 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Wed, 25 Jan 2006 17:05:29 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601182120.k0ILIl8X022324@portal.open-bio.org>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
Message-ID: <43D7AFD9.2020305@colibase.bham.ac.uk>

Hi all.

I also had need of a function to concatenate two Bio::Seq objects, so had a go
at this. My naive attempt (intended to go in Bio::SeqUtils) is pasted below. I'm
not too sure about the concept of sub-SeqFeatures (I've never seen any sequence
that had more than one level of feature)- I worked on the assumption that little
sub-SeqFeatures can have littler sub-SeqFeatures and so ad infinitum, but as I
don't have an example file I haven't been able to test if this works. Likewise,
although I think the code should cope with Fuzzy and Split locations, I haven't
tested this with any particularly unusual examples.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

  Title   : cat
  Usage   : my $catseq = Bio::SeqUtils->cat(@seqs)
  Function: Concatenates an array of Bio::Seq objects, using the first sequence
            as a template for species etc. Adjusts the coordinates of features
            from any additional objects.
  Returns : A sequence object of the same class as the first argument.
  Args    : array of sequence objects

=cut

sub cat {
     my ($self, @seqs) = @_;
     my $seq=shift @seqs;
     $self->throw("Object [$seq] of class [". ref($seq).
                  '] should be a Bio::PrimarySeqI ')
     unless $seq->isa('Bio::PrimarySeqI');
     for my $catseq (@seqs) {
	# check each additional sequence, not the first one again
     	$self->throw("Object [$catseq] of class [". ref($catseq).
                  '] should be a Bio::PrimarySeqI ')
	unless $catseq->isa('Bio::PrimarySeqI');
	# features of $catseq are shifted by the current length of $seq
	my $length=$seq->length;
	$seq->seq($seq->seq.$catseq->seq);
	for my $feat ($catseq->get_SeqFeatures) {
	    $seq->add_SeqFeature($self->_coordAdjust($feat, $length));
	}
     }
     return $seq;
}

=head2 _coordAdjust

  Title   : _coordAdjust
  Usage   : my $newfeat=Bio::SeqUtils->_coordAdjust($feature, 100);
  Function: Recursive subroutine to adjust the coordinates of a feature
            and all its subfeatures.
  Returns : A Bio::SeqFeatureI compliant object.
  Args    : A Bio::SeqFeatureI compliant object,
            the number of bases to add to the coordinates

=cut

sub _coordAdjust {
     my ($self, $feat, $add)=@_;
     $self->throw("Object [$feat] of class [". ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	# recurse with the arguments in the documented order: feature, offset
	push @adjsubfeat, $self->_coordAdjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     for my $subloc (@loc) {
	my @coords=($subloc->start, $subloc->end);
	# add the offset to every number found in each coordinate
	s/(\d+)/$add+$1/ge for @coords;
	$subloc->start(shift @coords);
	$subloc->end(shift @coords);
     }
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
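
As a sanity check on the intended arithmetic, here is a plain-Perl sketch (no
BioPerl objects) of the example from Jan's original question quoted below.
Sequences are represented by hypothetical hashes holding a length and a list
of [start, end] pairs, and cat_features is a made-up name; every feature is
shifted by the combined length of the sequences that precede it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence: a length plus [start, end] pairs.
sub cat_features {
    my @seqs = @_;
    my ($offset, @features) = (0);
    for my $seq (@seqs) {
        push @features, map { [ $_->[0] + $offset, $_->[1] + $offset ] }
                        @{ $seq->{features} };
        $offset += $seq->{length};    # later sequences start after this one
    }
    return { length => $offset, features => \@features };
}

my $cat = cat_features(
    { length => 100, features => [ [ 5, 15 ] ] },     # seq 1, feature A
    { length => 200, features => [ [ 20, 30 ] ] },    # seq 2, feature B
);
print "$cat->{length} bp: ",
      join(', ', map { "$_->[0]..$_->[1]" } @{ $cat->{features} }), "\n";
# prints "300 bp: 5..15, 120..130"
```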

> 
> 
> Jan, 
> 
> It would be easy if someone had written a function to do it. Even writing the 
> function is not hard.  I do not think there is any other way than to go through 
> all features, though.
> 
> In my opinion this would be an excellent addition to Bio::Seq::Utilities.
> 
> E.g. cat($arrayrefofsequences, optional_seq_class_to_create)
>      return a new seq, species and other info based on the first seq in array 
> 
> Could you  write it and post to bugzilla?
> 
> 	-Heikki
> 
> 
> On Tuesday 17 January 2006 11:54, jan aerts (RI) wrote:
>> Hi all,
>>
>> Does anyone know of an easy way to concatenate two sequences, including
>> recalculation of features positions of the second one? E.g.
>>   seq 1 = 100 bp
>>     feature A: 5..15
>>   seq 2 = 200 bp
>>     feature B: 20..30
>>   => concatenated sequence 3 = 300 bp
>>        feature A: 5..15
>>        feature B: 120..130  <<<<<<<<<<<
>>
>> Annotations (features without range) should be transferred as well.
>>
>> Of course, it must be possible to create a blank sequence and work my
>> way through all features, adding them to a new collection of features
>> and stuff. But I was wondering if a simpler technique is possible.
>>
>> Many thanks,
>> Jan Aerts
>> Bioinformatics Department
>> Roslin Institute
>> Roslin, Scotland, UK
>>
>> ---------The obligatory disclaimer--------
>> The information contained in this e-mail (including any attachments) is
>> confidential and is intended for the use of the addressee only.   The
>> opinions expressed within this e-mail (including any attachments) are
>> the opinions of the sender and do not necessarily constitute those of
>> Roslin Institute (Edinburgh) ("the Institute") unless specifically
>> stated by a sender who is duly authorised to do so on behalf of the
>> Institute.
> 
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>   _/  _/  _/  University of Western Cape, South Africa
>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> 

From heikki at sanbi.ac.za  Wed Jan 25 21:11:45 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 23:11:45 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D7AFD9.2020305@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
Message-ID: <200601252311.45582.heikki@sanbi.ac.za>

Thanks Roy!

I'll check the code in tomorrow when I am less sleepy and can go through the 
code in detail. In principle the code looks good. It definitely needs tests. 
If you have written any please do post them.

A few more checks to make sure seq->alphabet is the same in all sequences 
might be a good idea.
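
A rough sketch of such a check, assuming only that each object has an
alphabet method (as Bio::PrimarySeqI objects do).  The sub name
check_same_alphabet and the Mock::Seq package are made up for illustration;
inside cat() the die would presumably become a $self->throw.

```perl
use strict;
use warnings;

# Hypothetical helper: die unless all sequences report the same alphabet.
sub check_same_alphabet {
    my @seqs = @_;
    my %seen;
    $seen{ $_->alphabet }++ for @seqs;
    die 'Trying to concatenate sequences with different alphabets: '
        . join(', ', sort keys %seen) . "\n"
        if scalar(keys %seen) > 1;
    return 1;
}

# Minimal mock so the sketch runs without BioPerl installed.
package Mock::Seq;
sub new      { my ($class, $alphabet) = @_; return bless { alphabet => $alphabet }, $class }
sub alphabet { return $_[0]{alphabet} }

package main;
my @ok  = (Mock::Seq->new('dna'), Mock::Seq->new('dna'));
my @bad = (Mock::Seq->new('dna'), Mock::Seq->new('protein'));
print "same alphabet: ok\n" if check_same_alphabet(@ok);
eval { check_same_alphabet(@bad) };
print "mixed alphabets: $@" if $@;
```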

   -Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From heikki at sanbi.ac.za  Wed Jan 25 20:52:42 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 25 Jan 2006 22:52:42 +0200
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
References: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <200601252252.42786.heikki@sanbi.ac.za>

Congratulations and a huge thank you to the production team!

The new website is a big step ahead in readability and ease of editing the 
information.

I for my part have already corrected a few small typos and omissions on the 
new pages. I invite others to do the same.

    -Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From cjfields at uiuc.edu  Thu Jan 26 03:34:01 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 21:34:01 -0600
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <1138119383.3338.68.camel@localhost.localdomain>
Message-ID: <000201c62229$59ed5f50$15327e82@pyrimidine>

Scott,

This popped up, for some reason, when I tried to install a perl module
(Error.pm); maybe it has something to do with the reason PPM can't 'see'
GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
for GMOD, so maybe Sourceforge is redirecting things and this messes with
PPM?  

_____________________________________________
C:\Perl\Scripts>ppm
PPM - Programmer's Package Manager version 3.3.
Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
ActiveState is a division of Sophos.

Entering interactive shell. Using Term::ReadLine::Perl as readline library.

Type 'help' to get started.

ppm> rep
Repositories:
[1] Bioperl
[2] gmod
[3] ActiveState PPM2 Repository
[4] ActiveState Package Repository
[ ] Bribes
[ ] Kobes
[ ] local
ppm> install Error
PPM::PPD::init: not a PPD and not a file:

  The Generic Model Organism Database Project | GMOD

      GMOD

      Generic Software Components for Model
Organism Databases

      Mailing lists |
Bug Reports |
Feature Requests |
Publications |
Meetings |

.... (lots of HTML removed)

This site is maintained by Scott
Cain | Powered by 
drupal
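
The error suggests the repository URL is returning an HTML page where PPM
expects a PPD, which is an XML file containing a SOFTPKG element.  A minimal
sketch of telling the two apart; the sub name looks_like_ppd is made up, and
fetching the content from the repository is left out:

```perl
use strict;
use warnings;

# Hypothetical check: does this content look like a PPD (XML with a
# SOFTPKG element) rather than an HTML page?
sub looks_like_ppd {
    my ($content) = @_;
    return 0 if $content =~ /<html\b/i;    # HTML page, e.g. a redirect target
    return $content =~ /<SOFTPKG\b/ ? 1 : 0;
}

print looks_like_ppd('<SOFTPKG NAME="Error" VERSION="0,15,0,0"></SOFTPKG>')
    ? "PPD\n" : "not a PPD\n";
print looks_like_ppd('<html><head><title>GMOD</title></head></html>')
    ? "PPD\n" : "not a PPD\n";
```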

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: gmod-gbrowse-admin at lists.sourceforge.net [mailto:gmod-gbrowse-
> admin at lists.sourceforge.net] On Behalf Of Scott Cain
> Sent: Tuesday, January 24, 2006 10:16 AM
> To: Chris Fields
> Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> 
> Hi Chris,
> 
> Is it still misbehaving?  I'll do some testing today, but my ability to
> do so is a little hampered as I am traveling this week.
> 
> Thanks,
> Scott
> 
> 
> On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > Scott,
> >
> > I am trying to find the newest bioperl dev. Release (1.51) from PPM for a
> > quick write-up on installing bioperl-db on Windows.  I tried using the GMOD
> > repository:
> >
> > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > Repositories:
> > [1] gmod
> > [ ] ActiveState Package Repository
> > [ ] ActiveState PPM2 Repository
> > [ ] Bioperl
> > [ ] Bribes
> > [ ] Kobes
> > [ ] local
> > ppm> search bioperl
> > Searching in Active Repositories
> > No matches for 'bioperl'; see 'help search'.
> > ppm> search *
> > Searching in Active Repositories
> > No matches for '*'; see 'help search'.
> > ppm>
> >
> >
> > Any idea what's going on?  All other repositories work fine.  I can download
> > it and install locally w/o a problem.  I am running the newest ActivePerl
> > (5.8.7.815), WinXP.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
> > Gmod-gbrowse mailing list
> > Gmod-gbrowse at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Thu Jan 26 05:38:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Jan 2006 23:38:56 -0600
Subject: [Bioperl-l] bioperl-db on Windows (update)
Message-ID: <000001c6223a$cd5539c0$15327e82@pyrimidine>

Hilmar, 

I checked load_seqdatabase.pl with all variables of Root.pm and checking
debugging output; basically, the only way that I could find to get
load_seqdatabase.pl to work on native Windows is by changing those Root.pm
lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
$class, ...').  I ran debugging on load_seqdatabase.pl using all versions of
Root.pm, with and without Error.pm.  Only those with a comma present worked
in both circumstances.  I don't know why this hasn't popped up before now,
but it seems to be a unique combination of Windows, load_seqdatabase.pl, and
bioperl-db.  It doesn't happen with any scripts of Bioperl on Windows that
I've run into, and debugging other modules (for instance,
Bio::SearchIO::blast, which I recently worked on) doesn't cause this
problem.  

Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
and without modifying Root.pm.

____________________________________________________________

Without Error.pm:
____________________________________________________________
C:\Perl\Scripts>perl -MError
Can't locate Error.pm in @INC (@INC contains:
C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib
C:/Perl/site/lib .).
BEGIN failed--compilation aborted.

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug
input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.
____________________________________________________________

With Error.pm:
____________________________________________________________

C:\Perl\Scripts>perl -MError -e ";"

C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
****** -driver mysql -format genbank -namespace test -testonly -safe -debug
input.gpt
Loading input.gpt ...
attempting to load adaptor class for Bio::Seq::RichSeq
        attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
  Calling Error::throw

  Calling Error::throw

attempting to load adaptor class for Bio::Seq
        attempting to load module Bio::DB::BioSQL::SeqAdaptor
instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
  Calling Error::throw

attempting to load adaptor class for Bio::Species
        attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
  Calling Error::throw

  Calling Error::throw

Undefined subroutine &Bio::Root::Root::debug called at
C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 1537,  line 63.

____________________________________________________________

Error::throw is called w/o a problem when Error.pm is present (which is what
should happen).  For some reason, that extra comma makes all the difference
in the world.

The line above in BasePersistenceAdaptor.pm is :

$self->debug("attempting to load driver for adaptor class $class\n");

which is found in many modules.  I don't really know why it decides to hang
up here.  I'll try running a few of the Root.pm modifications under Mac OS X
in the next day or so to see what happens.

I also reran a few of Steve Chervitz's recommendations from a previous post;
everything ran fine except in circumstances in which Error.pm was required
with a 'use' statement, and only when Error.pm wasn't present, which is
expected.  Previously, when I ran them, there was a bit of confusion b/c it
seemed that Error.pm was present somewhere.  It was; Steve included it in
bioperl-live/examples/root/lib.  When I deleted it, I got the expected
results.

Anyway, I don't know what else I can do at this point besides check out
everything on Mac OS X.  Any additional checks of the modified Root.pm need
to be made on other systems.  Will filing this as a bug in Bugzilla help?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From s.rayner at att.net  Thu Jan 26 05:58:42 2006
From: s.rayner at att.net (s.rayner at att.net)
Date: Thu, 26 Jan 2006 05:58:42 +0000
Subject: [Bioperl-l] bioperl installation problems with External Modules -
	doesn't see installed modules
Message-ID: <012620060558.15437.43D865110008848F00003C4D21602806519D0A02970E9DD29C@att.net>

I am trying to install Bundle::BioPerl in order to use some of the external perl modules, 
particularly the Bio::DB::GFF module for use with biodas.

I followed the instructions, both from the bioperl web site for installing the bioperl bundle and the specific instructions from the biodas web site for installing Bio::DB::GFF, namely:

   (1) Make sure that CVS is installed on your system.

    (2) Use the following command (all on one line) to login to the server

         % cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl login

          when prompted, the password is 'cvs'

    (3) Check out the bioperl package you are interested in, for most
    users this will be the bioperl-live source tree.  The following
    command should be executed as one line.

         % cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl checkout bioperl-live

    The login and checkout procedure should only have to be done
    once. To update the source directories in the future it should be
    possible just to enter the top level directory and issue the
    following command:

         % cvs update

This will create the directory ``bioperl-live''. Now build and install bioperl with the following recipe:

         % cd bioperl-live
         % perl Makefile.PL
         % make
         % make test
         % make install

The last step will probably need to be run as root.

When I perform either of these steps I get the message that the installation was successful, but bioperl and biodas report that the modules have not been installed.

They are physically present on the disk, but the programs don't seem to know where to find them.

Can anyone suggest how to fix this problem?

thanks

Simon
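
A common cause of "installed but not found" is that the modules were installed into a directory that perl does not search. The sketch below prints the directories in @INC and reports whether a module file can be found in any of them. The helper name find_module is made up for this example, and the two module names are simply the ones mentioned in this thread.

```perl
use strict;
use warnings;

# Return every directory in @dirs that contains the file for $module.
sub find_module {
    my ($module, @dirs) = @_;
    (my $rel_path = "$module.pm") =~ s{::}{/}g;   # Bio::DB::GFF -> Bio/DB/GFF.pm
    return grep { !ref($_) && -e "$_/$rel_path" } @dirs;
}

print "perl searches these directories:\n";
print "  $_\n" for grep { !ref($_) } @INC;

for my $module (qw(Bio::Root::Root Bio::DB::GFF)) {
    my @hits = find_module($module, @INC);
    print "$module: ", (@hits ? "found in @hits" : "not found in \@INC"), "\n";
}
```

If a module is on disk but its directory is not listed, adding that directory through the PERL5LIB environment variable or a 'use lib' line is the usual remedy.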

From heikki at sanbi.ac.za  Thu Jan 26 07:53:22 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 26 Jan 2006 09:53:22 +0200
Subject: [Bioperl-l] Fwd: some doubts in bioperl
Message-ID: <200601260953.22923.heikki@sanbi.ac.za>

----------  Forwarded Message  ----------

Subject: some doubts in bioperl
Date: Monday 23 January 2006 10:16
From: apsara asok 
To: heikki at sanbi.ac.za

Dear Heikki,
                  I would like to clear up some doubts about bioperl. How can
we do pattern searching using a suffix tree in bioperl?
Do you have any idea? Please help me.
apsara

-------------------------------------------------------

From roy at colibase.bham.ac.uk  Thu Jan 26 13:18:03 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Thu, 26 Jan 2006 13:18:03 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601252311.45582.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<43D7AFD9.2020305@colibase.bham.ac.uk>
	<200601252311.45582.heikki@sanbi.ac.za>
Message-ID: <43D8CC0B.10403@colibase.bham.ac.uk>

Heikki Lehvaslaiho wrote:
> Thanks Roy!
> 
> I'll check the code in tomorrow when I am less sleepy and can go through the 
> code in detail. In principle the code looks good. It definitely needs tests. 
> If you have written any please do post them.
Not too sure about how to go about writing tests, any suggestions?

It did occur to me that my _coordAdjust method could be adapted to allow 
the Bio::Seq trunc method to retain sequence features (since there's no 
reason why the $add argument can't be negative). This would probably 
need a bit more work to cope with the situation where a feature overlaps 
the trunc coordinates, for example if we truncate to coordinates 1..400, 
but there's a feature 300..500. I guess the 'correct' behaviour might be 
to convert that feature to a fuzzy location of 300..>400? Or is it 
acceptable to have features with coordinates outside of a sequence?

If we did that then an obvious test would be to cat a sequence to 
itself, then trunc to retain just the second half of the new sequence 
and see if you got back what you started with.

> A few more checks to make sure $seq->alphabet is the same in all sequences 
> might be a good idea.
That's easy to implement. Just put the line:
	$self->throw('Trying to concatenate sequences with different alphabets: '
		. $seq->display_id . ' (' . $seq->alphabet . ') and '
		. $_->display_id . ' (' . $_->alphabet . ')')
		unless $_->alphabet eq $seq->alphabet;

at the start of the for(@seqs) loop of the cat subroutine.

Roy.
--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

From hlapp at gmx.net  Thu Jan 26 06:31:43 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 25 Jan 2006 22:31:43 -0800
Subject: [Bioperl-l] bioperl-db on Windows (update)
In-Reply-To: <000001c6223a$cd5539c0$15327e82@pyrimidine>
References: <000001c6223a$cd5539c0$15327e82@pyrimidine>
Message-ID: 

This is a lot of work you did to investigate this Chris, thanks. Yes
filing as a bug report will help, and don't forget to attach this
report of yours with all the tests you did. Really all that's left to
do is test on a couple of Unix platforms, which will happen
semi-automatically by people once we commit the change.

   -hilmar

On 1/25/06, Chris Fields  wrote:
> Hilmar,
>
> I checked load_seqdatabase.pl with all variants of Root.pm, checking the
> debugging output; basically, the only way that I could find to get
> load_seqdatabase.pl to work on native Windows is by changing those Root.pm
> lines by adding a comma (i.e. three lines, from 'throw $class ...' to 'throw
> $class, ...').  I ran debugging on load_seqdatabase.pl using all versions of
> Root.pm, with and without Error.pm.  Only those with a comma present worked
> in both circumstances.  I don't know why this hasn't popped up before now,
> but it seems to be a unique combination of Windows, load_seqdatabase.pl, and
> bioperl-db.  It doesn't happen with any scripts of Bioperl on Windows that
> I've run into, and debugging other modules (for instance,
> Bio::SearchIO::blast, which I recently worked on) doesn't cause this
> problem.
>
> Here's the debugging output for load_seqdatabase.pl, with and w/o Error.pm
> and without modifying Root.pm.
>
> ____________________________________________________________
>
> Without Error.pm:
> ____________________________________________________________
> C:\Perl\Scripts>perl -MError
> Can't locate Error.pm in @INC (@INC contains:
> C:\Perl\src\bioperl\bioperl-live C:\Perl\src\bioperl\bioperl-db C:/Perl/lib
> C:/P
> erl/site/lib .).
> BEGIN failed--compilation aborted.
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
> ****** -driver mysql -format genbank -namespace tes
> t -testonly -safe -debug input.gpt
> Loading input.gpt ...
> attempting to load adaptor class for Bio::Seq::RichSeq
>         attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
> attempting to load adaptor class for Bio::Seq
>         attempting to load module Bio::DB::BioSQL::SeqAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
> attempting to load adaptor class for Bio::Species
>         attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
> Undefined subroutine &Bio::Root::Root::debug called at
> C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
> line 1537,  line 63.
> ____________________________________________________________
>
> With Error.pm:
> ____________________________________________________________
>
> C:\Perl\Scripts>perl -MError -e ";"
>
> C:\Perl\Scripts>load_seqdatabase.pl -dbname biotest -dbuser root -dbpass
> ****** -driver mysql -format genbank -namespace tes
> t -testonly -safe -debug input.gpt
> Loading input.gpt ...
> attempting to load adaptor class for Bio::Seq::RichSeq
>         attempting to load module Bio::DB::BioSQL::RichSeqAdaptor
>   Calling Error::throw
>
>   Calling Error::throw
>
> attempting to load adaptor class for Bio::Seq
>         attempting to load module Bio::DB::BioSQL::SeqAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SeqAdaptor
>   Calling Error::throw
>
> attempting to load adaptor class for Bio::Species
>         attempting to load module Bio::DB::BioSQL::SpeciesAdaptor
> instantiating adaptor class Bio::DB::BioSQL::SpeciesAdaptor
>   Calling Error::throw
>
>   Calling Error::throw
>
> Undefined subroutine &Bio::Root::Root::debug called at
> C:\Perl\src\bioperl\bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
> line 1537,  line 63.
>
> ____________________________________________________________
>
> Error::throw is called w/o a problem when Error.pm is present (which is what
> should happen).  For some reason, that extra comma makes all the difference
> in the world.
>
> The line above in BasePersistenceAdaptor.pm is :
>
> $self->debug("attempting to load driver for adaptor class $class\n");
>
> which is found in many modules.  I don't really know why it decides to hang
> up here.  I'll try running a few of the Root.pm modifications under Mac OS X
> in the next day or so to see what happens.
>
> I also reran a few of Steve Chervitz's recommendations from a previous post;
> everything ran fine except in circumstances in which Error.pm was required
> with a 'use' statement, and only when Error.pm wasn't present, which is
> expected.  Previously, when I ran them, there was a bit of confusion b/c it
> seemed that Error.pm was present somewhere.  It was; Steve included it in
> bioperl-live/examples/root/lib.  When I deleted it, I got the expected
> results.
>
> Anyway, I don't know what else I can do at this point besides check out
> everything on Mac OS X.  Any additional checks of the modified Root.pm need
> to be made on other systems.  Will filing this as a bug in Bugzilla help?
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------

From cain at cshl.edu  Thu Jan 26 15:41:20 2006
From: cain at cshl.edu (Scott Cain)
Date: Thu, 26 Jan 2006 10:41:20 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] GMOD PPM repository not working
In-Reply-To: <000201c62229$59ed5f50$15327e82@pyrimidine>
References: <000201c62229$59ed5f50$15327e82@pyrimidine>
Message-ID: <1138290080.2894.25.camel@localhost.localdomain>

Hi Chris,

I still don't exactly know what the problem is, but this at least has
given me some insight on some messages in my error_log: I've been seeing
lots of messages about '/icon/somegif.gif' not found and haven't been
able to track down their source (not that I'd really tried, it was an
annoyance that hadn't risen to the level of serious debugging yet).  We
are using mod_rewrite, so that could be part of the problem.  I'll try
to fix it so that the icons display properly and that may have a side
effect of fixing ppm.

Scott

On Wed, 2006-01-25 at 21:34 -0600, Chris Fields wrote:
> Scott,
> 
> This popped up, for some reason, when I tried to install a perl module
> (Error.pm); maybe it has something to do with the reason PPM can't 'see'
> GMOD's repository.  It crashes PPM pretty nicely!  Looks like the home page
> for GMOD, so maybe Sourceforge is redirecting things and this messes with
> PPM?  
> 
> _____________________________________________
> C:\Perl\Scripts>ppm
> PPM - Programmer's Package Manager version 3.3.
> Copyright (c) 2001 ActiveState Corp. All Rights Reserved.
> ActiveState is a division of Sophos.
> 
> Entering interactive shell. Using Term::ReadLine::Perl as readline library.
> 
> Type 'help' to get started.
> 
> ppm> rep
> Repositories:
> [1] Bioperl
> [2] gmod
> [3] ActiveState PPM2 Repository
> [4] ActiveState Package Repository
> [ ] Bribes
> [ ] Kobes
> [ ] local
> ppm> install Error
> PPM::PPD::init: not a PPD and not a file:
>  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> 
>   The Generic Model Organism Database Project | GMOD
> 
>       GMOD
>       Generic Software Components for Model Organism Databases
> 
>     Mailing lists | Bug Reports | Feature Requests | Publications | Meetings |
> 
> .... (lots of HTML removed)
> 
> This site is maintained by Scott Cain | Powered by drupal
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> > -----Original Message-----
> > From: gmod-gbrowse-admin at lists.sourceforge.net [mailto:gmod-gbrowse-
> > admin at lists.sourceforge.net] On Behalf Of Scott Cain
> > Sent: Tuesday, January 24, 2006 10:16 AM
> > To: Chris Fields
> > Cc: 'Gbrowse (E-mail)'; bioperl-l at portal.open-bio.org
> > Subject: Re: [Gmod-gbrowse] GMOD PPM repository not working
> > 
> > Hi Chris,
> > 
> > Is it still misbehaving?  I'll do some testing today, but my ability to
> > do so is a little hampered as I am traveling this week.
> > 
> > Thanks,
> > Scott
> > 
> > 
> > On Wed, 2006-01-18 at 10:51 -0600, Chris Fields wrote:
> > > Scott,
> > >
> > > I am trying to find the newest bioperl dev. Release (1.51) from PPM for
> > a
> > > quick write-up on installing bioperl-db on Windows.  I tried using the
> > GMOD
> > > repository:
> > >
> > > ppm> rep add gmod http://www.gmod.org/ggb/ppm
> > > Repositories:
> > > [1] gmod
> > > [ ] ActiveState Package Repository
> > > [ ] ActiveState PPM2 Repository
> > > [ ] Bioperl
> > > [ ] Bribes
> > > [ ] Kobes
> > > [ ] local
> > > ppm> search bioperl
> > > Searching in Active Repositories
> > > No matches for 'bioperl'; see 'help search'.
> > > ppm> search *
> > > Searching in Active Repositories
> > > No matches for '*'; see 'help search'.
> > > ppm>
> > >
> > >
> > > Any idea what's going on?  All other repositories work fine.  I can
> > download
> > > it and install locally w/o a problem.  I am running the newest
> > ActivePerl
> > > (5.8.7.815), WinXP.
> > >
> > > Christopher Fields
> > > Postdoctoral Researcher - Switzer Lab
> > > Dept. of Biochemistry
> > > University of Illinois Urbana-Champaign
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Gmod-gbrowse mailing list
> > > Gmod-gbrowse at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> > --
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> > 
> > 
> > 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From rbalbi at gmail.com  Thu Jan 26 18:19:57 2006
From: rbalbi at gmail.com (Ricardo Balbi)
Date: Thu, 26 Jan 2006 16:19:57 -0200
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Hi all,

   Could anybody help me with this error?

thanks in advance,
Ricardo

2006/1/26, Aaron J. Mackey :
>
>
> This is a BioPerl "Unflattener" error; it's unable to automatically
> reconstruct the gene/mRNA/exon logic of some (or all) of the
> annotation in your genbank file.  To get help with this, you should
> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
> including a snippet of your genbank file.
>
> -Aaron
>
> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>
> > Hi all,
> >
> >   After making some changes in the gus mapping file to ignore some
> > features of the kinetoplastida database, I went on to run the ISF,
> > but without success.
> >
> >   Could somebody help me with this error?
> >
> > thanks in advance,
> > Ricardo
> >
> > ERROR:
> >
> > ------------- EXCEPTION  -------------
> > MSG: structure_type 2 is currently unknown
> > STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm:353
> > STACK
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees /G
> > US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
> > 720
> > STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
> > STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
> > 549
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
> > STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
> > lib/perl/GUS/PluginMgr/GusApplication.pm:459
> > STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:357
> > STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
> > perl/GUS/PluginMgr/GusApplication.pm:266
> > STACK toplevel /GUS/gus_home/bin/ga:11
> >
> > --------------------------------------
> >
> > STACK TRACE:
> >  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
> >         Bio::Root::Root::throw
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
> > 'structure_type 2 is currently unknown') called at /usr/local/
> > bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
> >         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
> > ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
> > GUS/gus_home/lib/perl/GUS/Supported/Plugin/
> > InsertSequenceFeatures.pm line 353
> >         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
> > perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
> >
> > GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
> > gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
> > line 330
> >         GUS::Supported::Plugin::InsertSequenceFeatures::run
> > ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
> > 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 549
> >         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
> > GusApplication.pm line 541
> >         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
> >         GUS::PluginMgr::GusApplication::doMajorMode_Run
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
> >         GUS::PluginMgr::GusApplication::doMajorMode
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
> > 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
> > gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
> >         GUS::PluginMgr::GusApplication::parseAndRun
> > ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
> > (0xa004738)') called at /GUS/gus_home/bin/ga line 11
> >
> >
> >
>
> --
> Dr. Aaron J. Mackey, Ph.D.
> Project Manager, ApiDB Bioinformatics Resource Center
> Penn Genomics Institute, University of Pennsylvania
> email:  amackey at pcbi.upenn.edu
> office: 215-898-1205 (Biology, 212 Goddard Labs)
>          215-746-7018 (PCBI, 1428 Blockley Hall)
> fax:    215-746-6697 (Penn Genomics Institute)
> postal: Penn Genomics Institute
>          Goddard Labs 212
>          415 S. University Avenue
>          Philadelphia, PA  19104-6017
>
>
>

From jason.stajich at duke.edu  Thu Jan 26 19:28:26 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Thu, 26 Jan 2006 14:28:26 -0500
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <2A475D24-5AC3-4AD5-80CB-0C40DB622283@duke.edu>

I would suggest following Aaron's instructions to

>> including a snippet of your genbank file.

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjm at fruitfly.org  Thu Jan 26 19:33:46 2006
From: cjm at fruitfly.org (chris mungall)
Date: Thu, 26 Jan 2006 11:33:46 -0800
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: 

Sorry for the uninformative error message.

The unflattener uses a collection of heuristics to infer a canonical 
gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style 
file. Due to the highly variable nature of some genbank records this 
isn't always possible, and some data massaging is required beforehand. 
I don't know what the context of this message is, but I presume you're 
aware of this from the docs.

The only time I've seen this before was with the genbank submission of 
the pombe genome, which has some very.. unusual features purportedly of 
type mRNA; the actual gene models are encoded using 'gene' and 'CDS' 
features. This confuses the heuristics a little. The only way I've been 
able to deal with this one was to manually remove the mRNA features 
(they appeared to be just fragments and not actual gene models) using 
$unf->remove_types(['mRNA']) beforehand.

Can you send the accession of the record you're trying this on (or 
email me the file off-list if it isn't too large)? I'll try and get a 
more informative error message in there.


From heikki at sanbi.ac.za  Fri Jan 27 10:06:52 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Fri, 27 Jan 2006 12:06:52 +0200
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <43D8CC0B.10403@colibase.bham.ac.uk>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
Message-ID: <200601271206.52875.heikki@sanbi.ac.za>

On Thursday 26 January 2006 15:18, Roy Chaudhuri wrote:
> Heikki Lehvaslaiho wrote:
> > Thanks Roy!
> >
> > I'll check the code in tomorrow when I am less sleepy and can go through
> > the code in detail. In principle the code looks good. It definitely needs
> > tests. If you have written any please do post them.
>
> Not too sure about how to go about writing tests, any suggestions?

I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
methods and a reasonable portion of all edge values to be sure that the 
method works as it should.

Note that the code does not create a new sequence object. It modifies the 
existing one. Therefore it is best not to return that object. The users would 
assign that to a variable that points to the same structure and get confused. 
The method now returns true upon completion.
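
The aliasing problem described above can be illustrated without BioPerl. In this sketch plain hash references stand in for sequence objects, and both subs (cat_in_place and cat_returning_first) are made-up names, not the committed method.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical stand-in for an in-place cat(): appends to the first record.
sub cat_in_place {
    my ($first, @others) = @_;
    $first->{seq} .= $_->{seq} for @others;
    return 1;   # signal success without handing back an alias
}

# The variant that returns the modified first record instead.
sub cat_returning_first {
    my ($first, @others) = @_;
    cat_in_place($first, @others);
    return $first;
}

my $first  = { id => 'a', seq => 'ACGT' };
my $second = { id => 'b', seq => 'TTTT' };

# A caller may expect $joined to be a new record, but it is $first itself.
my $joined = cat_returning_first($first, $second);
print "joined and first are the same object\n"
    if refaddr($joined) == refaddr($first);
print "first is now $first->{seq}\n";
```

Returning a plain true value avoids suggesting that a separate object was created.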

Creating a new sequence object is problematic because one needs to add one 
more dependency (e.g. Clone) and will not work anyway if the sequence 
implementation is using a database back end. It is better the way you have 
written it.

I added code to move over the annotations from secondary sequences, but did 
not do anything to remove duplicates if the same sequence gets added twice. I 
wrote a note about this so that users know to be prepared if that affects 
them.

> It did occur to me that my _coordAdjust method could be adapted to allow
> the Bio::Seq trunc method to retain sequence features (since there's no
> reason why the $add argument can't be negative). This would probably
> need a bit more work to cope with the situation where a feature overlaps
> the trunc coordinates, for example if we truncate to coordinates 1..400,
> but there's a feature 300..500. I guess the 'correct' behaviour might be
> to convert that feature to a fuzzy location of 300..>400? Or is it
> acceptable to have features with coordinates outside of a sequence?

No, feature coordinates should always be within the sequence. Fuzzy is the 
correct solution to this.
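
A minimal sketch of the clipping arithmetic, in plain Perl and independent of the BioPerl location classes: the function name clip_feature and its string output are assumptions made for this example. A feature that extends past the truncation window is clipped and marked fuzzy with '<' or '>', and coordinates are shifted so the window starts at 1.

```perl
use strict;
use warnings;

# Clip a feature (start, end) to a truncation window and renumber it
# relative to the window. Returns undef if the feature lies outside.
sub clip_feature {
    my ($start, $end, $trunc_start, $trunc_end) = @_;
    return undef if $end < $trunc_start || $start > $trunc_end;

    my $fuzzy_start = $start < $trunc_start;   # feature begins before the window
    my $fuzzy_end   = $end   > $trunc_end;     # feature ends after the window

    my $new_start = ($fuzzy_start ? $trunc_start : $start) - $trunc_start + 1;
    my $new_end   = ($fuzzy_end   ? $trunc_end   : $end)   - $trunc_start + 1;

    return ($fuzzy_start ? '<' : '') . $new_start . '..'
         . ($fuzzy_end   ? '>' : '') . $new_end;
}

# The case from the discussion: truncate to 1..400 with a feature at 300..500.
print clip_feature(300, 500, 1, 400), "\n";
```

For the case discussed above this yields the fuzzy location 300..>400.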

> If we did that then an obvious test would be to cat a sequence to
> itself, then trunc to retain just the second half of the new sequence
> and see if you got back what you started with.

Go ahead and try it!
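
A sketch of what such a round-trip test could look like with Test::More, assuming a BioPerl checkout that includes the Bio::SeqUtils cat method discussed in this thread. It checks only the sequence string; retaining features through trunc is the part that still needs the work described above. A copy is concatenated rather than the object itself, and the ids and the sequence are made up.

```perl
use strict;
use warnings;
use Test::More tests => 3;

use Bio::Seq;
use Bio::SeqUtils;

my $residues = 'ACGTACGTAA';

# Two separate objects with the same content.
my $seq  = Bio::Seq->new(-id => 'orig', -seq => $residues, -alphabet => 'dna');
my $copy = Bio::Seq->new(-id => 'copy', -seq => $residues, -alphabet => 'dna');

# cat modifies $seq in place and returns true.
ok(Bio::SeqUtils->cat($seq, $copy), 'cat returns true');
is($seq->length, 2 * length($residues), 'length doubled');

# Keep only the second half and compare with the starting sequence.
my $second_half = $seq->trunc(length($residues) + 1, 2 * length($residues));
is($second_half->seq, $residues, 'second half matches the original');
```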

> > A few more checks to make sure $seq->alphabet is the same in all sequences
> > might be a good idea.
>
> That's easy to implement. Just put the line:
> 	$self->throw('Trying to concatenate sequences with different alphabets:
> '.$seq->display_id.' ('.$seq->alphabet.') and ' .$_->display_id.'
> ('.$_->alphabet.')') unless $_->alphabet eq $seq->alphabet;
>
> at the start of the for(@seqs) loop of the cat subroutine.

Added.

Thanks,

	-Heikki
> Roy.
> --
> Dr. Roy Chaudhuri
> Bioinformatics Research Fellow
> Division of Immunity and Infection
> University of Birmingham, U.K.
>
> http://xbase.bham.ac.uk
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From wlhsiao at yahoo.ca  Fri Jan 27 10:37:25 2006
From: wlhsiao at yahoo.ca (William Hsiao)
Date: Fri, 27 Jan 2006 05:37:25 -0500 (EST)
Subject: [Bioperl-l] new website launched
In-Reply-To: <79628361-F026-461F-A156-0B6810DB0B52@duke.edu>
Message-ID: <20060127103725.49913.qmail@web32405.mail.mud.yahoo.com>

Hi Jason,
  Nice new site!  I am wondering if I missed an
obvious link to the module documentations (e.g
http://doc.bioperl.org/releases/bioperl-1.4/) from the
homepage?  It seems that is the one thing missing from
the old website setup and I am not sure if it's
intentional.  I am developing a set of lecture notes
for a workshop and would like to know if there is a
stable way to navigate to the module documentations.

Thanks

Cheers,

Will

--- Jason Stajich  wrote:

> I am pleased to announce the release of a new
> website for BioPerl.   
> The site is based on the mediawiki software that was
> developed for  
> the wikipedia project.  We intend the site to be a
> place for  
> community input on documentation and design for the
> BioPerl project.   
> There is also a fair amount of documentation started
> surrounding  
> bioinformatics tools and techniques applicable to
> using BioPerl and  
> some of the authors who created these resources.
> 
> The website continues to be at the URL
> http://www.bioperl.org.  The  
> DNS updates may take up to 24 hours to reach
> everyone.
> 
> The initial content of the site is result of the
> work of myself,  
> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
> Seemann.  We  
> encourage you to contribute to the site's content by
> signing up for  
> an account.
> 
> There are several guides for style of the site and
> how to link to  
> Modules for example which can contain additional
> information from the  
> POD
> http://bioperl.org/wiki/Module:Bio::SeqIO
> 
> You'll notice that many of the paths have changed
> but the DIST and  
> SRC continues to be available at
> http://bioperl.org/DIST and http://bioperl.org/SRC.
> The HOWTOs are now available from
> http://bioperl.org/wiki/HOWTOs
> 
> The FAQ is available at http://bioperl.org/wiki/FAQ
> and I encourage  
> you to add your questions to it so they can be
> properly archived and  
> addressed.
> 
> We also have initiated a News site for Bioperl for
> posting  
> announcements regarding development and software.  I
> would like to  
> see if there are volunteers to post weekly or
> monthly summaries of  
> mailing list traffic and development.
> http://www.bioperl.org/news/
> 
> 
> Jason Stajich on behalf of Mauricio Herrera Cuadra,
> Brian Osborne,  
> Torsten Seemann.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

From jason.stajich at duke.edu  Fri Jan 27 13:28:52 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 08:28:52 -0500
Subject: [Bioperl-l] new website launched
In-Reply-To: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
References: <20060127103723.99495.qmail@web32402.mail.mud.yahoo.com>
Message-ID: 

Each module is directly linked to that site in the module-level  
pages, see for example:
http://bioperl.org/wiki/Module:Bio::SearchIO

I've added a mention of the doc.bioperl site on the front page.

Note that as part of setting up the site I ensured that there is now  
a standardized URL for the nightly generated Pdoc pages (from CVS  
live) (thanks to Steve Chervitz for suggesting it).

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/
http://doc.bioperl.org/releases/bioperl-current/bioperl-run/
http://doc.bioperl.org/releases/bioperl-current/bioperl-ext/
....etc

The frozen release-based docs will continue to stay up - I never had  
time to make one for the bioperl-1.5.1 but hopefully will do it for  
bioperl 1.5.2 and obviously will make it for the next stable release  
(1.6).

We encourage people to add snippets of code using modules,  
complaints, workarounds, etc on the module pages on the wiki site.   
There is a "discussion" paired for each wiki page where we would  
suggest people put comments, while useful workarounds/example code  
should go on the main page.  I've just added some text about this to  
the "About this site" page.

-jason
On Jan 27, 2006, at 5:37 AM, William Hsiao wrote:

> Hi Jason,
>   Nice new site!  I am wondering if I missed an
> obvious link to the module documentations (e.g
> http://doc.bioperl.org/releases/bioperl-1.4/) from the
> homepage?  It seems that is the one thing missing from
> the old website setup and I am not sure if it's
> intentional.  I am developing a set of lecture notes
> for a workshop and would like to know if there is a
> stable way to navigate to the module documentations.
>
> Thanks
>
> Cheers,
>
> Will
>
>
> --- Jason Stajich  wrote:
>
>> I am pleased to announce the release of a new
>> website for BioPerl.
>> The site is based on the mediawiki software that was
>> developed for
>> the wikipedia project.  We intend the site to be a
>> place for
>> community input on documentation and design for the
>> BioPerl project.
>> There is also a fair amount of documentation started
>> surrounding
>> bioinformatics tools and techniques applicable to
>> using BioPerl and
>> some of the authors who created these resources.
>>
>> The website continues to be at the URL
>> http://www.bioperl.org.  The
>> DNS updates may take up to 24 hours to reach
>> everyone.
>>
>> The initial content of the site is result of the
>> work of myself,
>> Mauricio Herrera Cuadra, Brian Osborne, and Torsten
>> Seemann.  We
>> encourage you to contribute to the site's content by
>> signing up for
>> an account.
>>
>> There are several guides for style of the site and
>> how to link to
>> Modules for example which can contain additional
>> information from the
>> POD
>> http://bioperl.org/wiki/Module:Bio::SeqIO
>>
>> You'll notice that many of the paths have changed
>> but the DIST and
>> SRC continues to be available at
>> http://bioperl.org/DIST and http://bioperl.org/SRC.
>> The HOWTOs are now available from
>> http://bioperl.org/wiki/HOWTOs
>>
>> The FAQ is available at http://bioperl.org/wiki/FAQ
>> and I encourage
>> you to add your questions to it so they can be
>> properly archived and
>> addressed.
>>
>> We also have initiated a News site for Bioperl for
>> posting
>> announcements regarding development and software.  I
>> would like to
>> see if there are volunteers to post weekly or
>> monthly summaries of
>> mailing list traffic and development.
>> http://www.bioperl.org/news/
>>
>>
>> Jason Stajich on behalf of Mauricio Herrera Cuadra,
>> Brian Osborne,
>> Torsten Seemann.
>>
>> --
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From davila at fiocruz.br  Fri Jan 27 00:05:35 2006
From: davila at fiocruz.br (Alberto M. R. Dávila)
Date: Thu, 26 Jan 2006 22:05:35 -0200 (BRST)
Subject: [Bioperl-l] [GUSDEV] Another error executing ISF
In-Reply-To: 
References: 

Message-ID: <3197.201.17.105.240.1138320335.squirrel@www.redefiocruz.fiocruz.br>

Dear Chris,

Happy 2006 !

I am not sure about the exact record the ISF plugin was trying to
read/parse, but I think it is the first one, anyway I am listing the first
5 GIs of our file for your testing:

85539529
56130985
54300415
54288810
50604596

The whole file is really big (1.4GB) as it contains all the nucleotide
sequences of "kinetoplastida [organism]" from genbank in genbank format.
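
Since the whole file is too big to mail around, one option is to cut out just the first few records. A minimal sketch, assuming each record ends with a line holding only "//" and Unix line endings; copy_first_records is a made-up helper and the sample data is a toy stand-in for real GenBank records:

```perl
use strict;
use warnings;

# Copy the first $n records of a GenBank flat file to another handle.
# Setting the input record separator to "\n//\n" reads one whole record
# per iteration, so the full file is never loaded into memory.
sub copy_first_records {
    my ($in, $out, $n) = @_;
    local $/ = "\n//\n";
    my $count = 0;
    while (my $record = <$in>) {
        next unless $record =~ /\S/;    # skip trailing blank chunks
        print {$out} $record;
        last if ++$count >= $n;
    }
    return $count;
}

# Usage with in-memory handles; a real run would open the big file instead.
my $data    = "LOCUS A\nORIGIN\n//\nLOCUS B\nORIGIN\n//\nLOCUS C\nORIGIN\n//\n";
my $excerpt = '';
open my $in,  '<', \$data    or die "cannot read: $!";
open my $out, '>', \$excerpt or die "cannot write: $!";
my $copied = copy_first_records($in, $out, 2);
close $out;
print "copied $copied records\n";
```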

Hope you can catch "the bug" ;-)

Kindest regards, Alberto

>
> Sorry for the uninformative error message.
>
> The unflattener uses a collection of heuristics to infer a canonical
> gene-mRNA-CDS-exon feature hierarchy from a flattened genbank style
> file. Due to the highly variable nature of some genbank records this
> isn't always possible, and some data massaging is required beforehand.
> I don't know what the context of this message is, but I presume you're
> aware of this from the docs.
>
> The only time I've seen this before was with the genbank submission of
> the pombe genome, which has some very.. unusual features purportedly of
> type mRNA; the actual gene models are encoded using 'gene' and 'CDS'
> features. This confuses the heuristics a little. The only way I've been
> able to deal with this one was to manually remove the mRNA features
> (they appeared to be just fragments and not actual gene models) using
> $unf->remove_types(['mRNA']) beforehand.
>
> Can you send the accession of the record you're trying this on (or
> email me the file off-list if it isn't too large). I'll try and get a
> more informative error message in there.
>
> On Jan 26, 2006, at 10:19 AM, Ricardo Balbi wrote:
>
>> Hi all,
>>
>>    Anybody could help me with this error ?
>>
>> thanks in advance,
>> Ricardo
>>
>> 2006/1/26, Aaron J. Mackey :
>>>
>>>
>>> This is a BioPerl "Unflattener" error; it's unable to automatically
>>> reconstruct the gene/mRNA/exon logic of some (or all) of the
>>> annotation in your genbank file.  To get help with this, you should
>>> post a message to the BioPerl mailing list (bioperl-l at bioperl.org),
>>> including a snippet of your genbank file.
>>>
>>> -Aaron
>>>
>>> On Jan 26, 2006, at 11:53 AM, Ricardo Balbi wrote:
>>>
>>>> Hi all,
>>>>
>>>>   After making some changes in the gus mapping file to ignore some
>>>> features of the kinetoplastida database, I followed in the
>>>> execution of the ISF, however without success.
>>>>
>>>>   Somebody could help me with this error?
>>>>
>>>> thanks in advance,
>>>> Ricardo
>>>>
>>>> ERROR:
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: structure_type 2 is currently unknown
>>>> STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm:1419
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::unflatten /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm:353
>>>> STACK
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> /G
>>>> US/gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:
>>>> 720
>>>> STACK GUS::Supported::Plugin::InsertSequenceFeatures::run /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm:330
>>>> STACK (eval) /GUS/gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:
>>>> 549
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm:541
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode_Run /GUS/gus_home/
>>>> lib/perl/GUS/PluginMgr/GusApplication.pm:459
>>>> STACK GUS::PluginMgr::GusApplication::doMajorMode /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:357
>>>> STACK GUS::PluginMgr::GusApplication::parseAndRun /GUS/gus_home/lib/
>>>> perl/GUS/PluginMgr/GusApplication.pm:266
>>>> STACK toplevel /GUS/gus_home/bin/ga:11
>>>>
>>>> --------------------------------------
>>>>
>>>> STACK TRACE:
>>>>  at /usr/local/bioperl15/Bio/Root/Root.pm line 342
>>>>         Bio::Root::Root::throw
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)',
>>>> 'structure_type 2 is currently unknown') called at /usr/local/
>>>> bioperl15/Bio/SeqFeature/Tools/Unflattener.pm line 1419
>>>>         Bio::SeqFeature::Tools::Unflattener::unflatten_seq
>>>> ('Bio::SeqFeature::Tools::Unflattener=HASH(0xb5e124c)', '-seq',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', '-use_magic', 1) called at /
>>>> GUS/gus_home/lib/perl/GUS/Supported/Plugin/
>>>> InsertSequenceFeatures.pm line 353
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::unflatten
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)') called at /GUS/gus_home/lib/
>>>> perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm line 720
>>>>
>>>> GUS::Supported::Plugin::InsertSequenceFeatures::processFeatureTrees
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'Bio::Seq::RichSeq=HASH(0xb820b14)', 140, 177) called at /GUS/
>>>> gus_home/lib/perl/GUS/Supported/Plugin/InsertSequenceFeatures.pm
>>>> line 330
>>>>         GUS::Supported::Plugin::InsertSequenceFeatures::run
>>>> ('GUS::Supported::Plugin::InsertSequenceFeatures=HASH(0xa011adc)',
>>>> 'HASH(0xb047e98)') called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 549
>>>>         eval {...} called at /GUS/gus_home/lib/perl/GUS/PluginMgr/
>>>> GusApplication.pm line 541
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_RunOrReport
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures', 1) called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 459
>>>>         GUS::PluginMgr::GusApplication::doMajorMode_Run
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 357
>>>>         GUS::PluginMgr::GusApplication::doMajorMode
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)',
>>>> 'GUS::Supported::Plugin::InsertSequenceFeatures') called at /GUS/
>>>> gus_home/lib/perl/GUS/PluginMgr/GusApplication.pm line 266
>>>>         GUS::PluginMgr::GusApplication::parseAndRun
>>>> ('GUS::PluginMgr::GusApplication=HASH(0x9ff0c20)', 'ARRAY
>>>> (0xa004738)') called at /GUS/gus_home/bin/ga line 11
>>>>
>>>>
>>>>
>>>
>>> --
>>> Dr. Aaron J. Mackey, Ph.D.
>>> Project Manager, ApiDB Bioinformatics Resource Center
>>> Penn Genomics Institute, University of Pennsylvania
>>> email:  amackey at pcbi.upenn.edu
>>> office: 215-898-1205 (Biology, 212 Goddard Labs)
>>>          215-746-7018 (PCBI, 1428 Blockley Hall)
>>> fax:    215-746-6697 (Penn Genomics Institute)
>>> postal: Penn Genomics Institute
>>>          Goddard Labs 212
>>>          415 S. University Avenue
>>>          Philadelphia, PA  19104-6017
>>>

From roy at colibase.bham.ac.uk  Fri Jan 27 15:31:50 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Fri, 27 Jan 2006 15:31:50 +0000
Subject: [Bioperl-l] concatenate two embl sequence files
In-Reply-To: <200601271206.52875.heikki@sanbi.ac.za>
References: <200601182120.k0ILIl8X022324@portal.open-bio.org>
	<200601252311.45582.heikki@sanbi.ac.za>
	<43D8CC0B.10403@colibase.bham.ac.uk>
	<200601271206.52875.heikki@sanbi.ac.za>
Message-ID: <43DA3CE6.4020708@colibase.bham.ac.uk>

> I've committed the code and tests. See t/SeqUtils.t. The idea is to test all 
> methods and a reasonable portion of all edge values to be sure that the 
> method works as it should.
Cool, thanks for that. My first proper contribution to BioPerl 8^).
The tests look good- I'll know better for next time.

> Note that the code does not create a new sequence object. It modifies the 
> existing one. Therefore it is best not to return that object. The users would 
> assign that to a variable that points to the same structure and get confused. 
> The method now returns true upon completion.
> 
> Creating a new sequence object is problematic because one needs to add one 
> more dependency (e.g. Clone) and will not work anyway if the sequence 
> implementation is using a database back end. It is better the way you have 
> written it.
Yes, that makes sense. Although with that interface it might be more 
natural in Bio::Seq? If it is a method that will modify a sequence in 
place then it seems more intuitive to call $seq->cat(@seqs) [or even 
$seq->append(@seqs)] rather than Bio::SeqUtils->cat($seq, @seqs).
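
To show what I mean by the two call styles, here is a toy comparison. The Toy:: packages are invented for this sketch and have nothing to do with the real Bio::Seq or Bio::SeqUtils code:

```perl
use strict;
use warnings;

# Hypothetical toy classes, only to compare the two call styles.
package Toy::Seq;
sub new { my ($class, $s) = @_; return bless { seq => $s }, $class }
sub seq { return $_[0]{seq} }

# Instance-method style: the object being modified is the invocant.
sub append {
    my ($self, @others) = @_;
    $self->{seq} .= $_->seq for @others;
    return 1;
}

package Toy::SeqUtils;

# Utility-class style: the target is simply the first argument.
sub cat {
    my ($class, $target, @others) = @_;
    return $target->append(@others);
}

package main;

my $x = Toy::Seq->new('AAA');
my $y = Toy::Seq->new('CCC');

$x->append($y);                # reads as "modify $x"
Toy::SeqUtils->cat($x, $y);    # same effect, target less obvious
print $x->seq, "\n";           # AAACCCCCC
```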

> I added code to move over the annotations from secondary sequences, but did 
> not do anything to remove duplicates if the same sequence gets added twice. I 
> wrote a note about this so that users know to be prepared if that affects 
> them.
I'm not convinced about this- perhaps it should be optional? In practice 
many of the annotations for each subsequence are only going to be 
applicable to that sequence, not the concatenated whole. Some of them 
may also be duplicated even between non-identical sequences. I think 
it'd be better by default to keep just the annotation from the first 
sequence (which probably would still need to be changed, but could at 
least act as a placeholder).
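
Something along these lines is what I have in mind. This is only a sketch with annotations modelled as a hash of key => [values]; cat_annotations and its $merge flag are hypothetical, not an existing interface:

```perl
use strict;
use warnings;

# Hypothetical sketch: annotations are a hash of key => [values].
# By default only the first sequence's annotations are kept; passing
# a true $merge copies the others over as well (duplicates included).
sub cat_annotations {
    my ($merge, $first, @rest) = @_;
    my %result;
    $result{$_} = [ @{ $first->{$_} } ] for keys %$first;
    return \%result unless $merge;
    for my $ann (@rest) {
        push @{ $result{$_} }, @{ $ann->{$_} } for keys %$ann;
    }
    return \%result;
}

my $ann1 = { comment => ['contig 1'], reference => ['Smith 2005'] };
my $ann2 = { comment => ['contig 2'], reference => ['Smith 2005'] };

my $default = cat_annotations(0, $ann1, $ann2);
my $merged  = cat_annotations(1, $ann1, $ann2);

my $n_default = @{ $default->{reference} };
my $n_merged  = @{ $merged->{reference} };
print "$n_default vs $n_merged\n";    # 1 vs 2
```

With merging switched on, the shared reference shows up twice, which is exactly the duplication I would rather avoid by default.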

There were a couple of problems with renamed variables/subroutines that 
hadn't all been updated, I've fixed those and pasted the new version below.

> No, feature coordinates should always be within the sequence. Fuzzy is the 
> correct solution to this.
Okay, I'll have a go and let you know how I get on.

Cheers.
Roy.

--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.

http://xbase.bham.ac.uk

=head2 cat

   Title   : cat
   Usage   : Bio::SeqUtils->cat(@seqs)
   Function: Concatenates an array of Bio::Seq objects, using the first
             sequence as a target. Annotations and sequence features are
             copied over from any additional objects. Adjusts the
             coordinates of copied features.
   Returns : a boolean
   Args    : array of sequence objects

Note that annotations have no sequence region. If you concatenate the
same sequence more than once, you will have its annotations
duplicated.

=cut

sub cat {
     my ($self, $seq, @seqs) = @_;
     $self->throw('Object [$seq] '. 'of class ['. ref($seq).
                  '] should be a Bio::PrimarySeqI ')
         unless $seq->isa('Bio::PrimarySeqI');

     for my $catseq (@seqs) {
         $self->throw('Object [$catseq] '. 'of class ['. ref($catseq).
                      '] should be a Bio::PrimarySeqI ')
             unless $catseq->isa('Bio::PrimarySeqI');

         $self->throw('Trying to concatenate sequences with different alphabets: '.
                      $seq->display_id. '('. $seq->alphabet. ') and '.
                      $catseq->display_id. '('. $catseq->alphabet. ')')
             unless $catseq->alphabet eq $seq->alphabet;

         my $length=$seq->length;
         $seq->seq($seq->seq.$catseq->seq);

         # move annotations
         if ($seq->isa("Bio::AnnotatableI") and
             $catseq->isa("Bio::AnnotatableI")) {
             foreach my $key ( $catseq->annotation->get_all_annotation_keys() ) {

                 foreach my $value ( $catseq->annotation->get_Annotations($key) ) {
                     $seq->annotation->add_Annotation($key, $value);
                 }
             }
         }

         # move SeqFeatures
         if ( $seq->isa('Bio::SeqI') and $catseq->isa('Bio::SeqI')) {
             for my $feat ($catseq->get_SeqFeatures) {
                 $seq->add_SeqFeature($self->_coord_adjust($feat, $length));
             }
         }

     }
     1;
}

=head2 _coord_adjust

   Title   : _coord_adjust
   Usage   : my $newfeat=Bio::SeqUtils->_coord_adjust($feature, 100);
   Function: Recursive subroutine to adjust the coordinates of a feature
             and all its subfeatures.
   Returns : A Bio::SeqFeatureI compliant object.
   Args    : A Bio::SeqFeatureI compliant object,
             the number of bases to add to the coordinates

=cut

sub _coord_adjust {
     my ($self, $feat, $add)=@_;
     $self->throw('Object [$feat] '. 'of class ['. ref($feat).
                  '] should be a Bio::SeqFeatureI ')
	unless $feat->isa('Bio::SeqFeatureI');
     my @adjsubfeat;
     for my $subfeat ($feat->remove_SeqFeatures) {
	push @adjsubfeat, Bio::SeqUtils->_coord_adjust($subfeat, $add);
     }
     my @loc=$feat->location->each_Location;
     map {
	my @coords=($_->start, $_->end);
	map s/(\d+)/$add+$1/ge, @coords;
	$_->start(shift @coords);
	$_->end(shift @coords);
     } @loc;
     if (@loc==1) {
	$feat->location($loc[0])
     } else {
	my $loc=Bio::Location::Split->new;
	$loc->add_sub_Location(@loc);
	$feat->location($loc);
     }
     $feat->add_SeqFeature($_) for @adjsubfeat;
     return $feat;
}
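
A note on the s/(\d+)/$add+$1/ge in _coord_adjust: it shifts only the digits of each coordinate, so any non-digit characters around them are left alone. A small sketch on plain strings shows the effect. The sample values are made up for illustration and are not output from any Bio::Location class:

```perl
use strict;
use warnings;

# Shift every run of digits in a coordinate string by $add, leaving any
# surrounding characters (such as "<" or ">") untouched.
sub shift_coord {
    my ($coord, $add) = @_;
    (my $shifted = $coord) =~ s/(\d+)/$add + $1/ge;
    return $shifted;
}

print join(' ', map { shift_coord($_, 100) } 5, '<5', '>400', '10.20'), "\n";
# 105 <105 >500 110.120
```

A negative $add works the same way, which is what would make the routine reusable for truncation.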

From lupey+ at pitt.edu  Fri Jan 27 12:52:03 2006
From: lupey+ at pitt.edu (Paul G Cantalupo)
Date: Fri, 27 Jan 2006 07:52:03 -0500 (EST)
Subject: [Bioperl-l] How to search Bioperl-l archives
Message-ID: 

Hello,

Is there a better way to search the bioperl-l archives other than 
searching in each Archive listed on 
http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the 
best answer either.

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer

From jason.stajich at duke.edu  Fri Jan 27 20:48:00 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Fri, 27 Jan 2006 15:48:00 -0500
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
References: 
Message-ID: <91EF5237-8A86-40FA-8126-D953DE28DD69@duke.edu>

Google is the best answer we've got...
site:bioperl.org +pipermail +bioperl-l YOUR TERM

We will try to set up the swish-indexed archive again on the new server
when there is time.  I don't think I'm going to have time for quite a
while; if someone volunteers to help out ChrisD and me with sys
admining, it can of course get done sooner.  The old site is
http://search.open-bio.org but I don't think the indexes have been
updated in a while.

-jason

On Jan 27, 2006, at 7:52 AM, Paul G Cantalupo wrote:

> Hello,
>
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is  
> not the
> best answer either.
>
> Thank you,
>
> Paul
>
>
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
>
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From cjfields at uiuc.edu  Fri Jan 27 20:57:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 14:57:50 -0600
Subject: [Bioperl-l] How to search Bioperl-l archives
In-Reply-To: 
Message-ID: <000001c62384$555ea8c0$15327e82@pyrimidine>

There's a link from this page:

http://www.bioperl.org/wiki/Mailing_lists

Two different searches are shown for bioperl-l : Google and Open-Bio.  I use
the Open-Bio b/c of its sorting capabilities (I haven't tried fooling around
with the Google interface yet).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul G Cantalupo
> Sent: Friday, January 27, 2006 6:52 AM
> To: bioperl-l
> Subject: [Bioperl-l] How to search Bioperl-l archives
> 
> Hello,
> 
> Is there a better way to search the bioperl-l archives other than
> searching in each Archive listed on
> http://bioperl.org/pipermail/bioperl-l/. I've found that Google is not the
> best answer either.
> 
> Thank you,
> 
> Paul
> 
> 
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
> Department of Biological Sciences
> University of Pittsburgh
> Pittsburgh, PA 15260
> Work: 412-624-4687
> Fax: 412-624-4759
> 
> Ask me about Toastmasters: www.toastmasters.org
> Midday Club Treasurer
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu  Fri Jan 27 21:02:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 15:02:59 -0600
Subject: [Bioperl-l] RNAMotif parser
Message-ID: <000101c62385$0ddfc870$15327e82@pyrimidine>

Jason,

I have been fiddling with an RNAMotif parser and an ERPIN parser for a
number of years now; I plan on releasing them for inclusion in bioperl or
bioperl-run.  Right now, I think I may base them somewhat on your
Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

From rahall2 at ualr.edu  Fri Jan 27 20:34:47 2006
From: rahall2 at ualr.edu (Roger Hall)
Date: Fri, 27 Jan 2006 14:34:47 -0600
Subject: [Bioperl-l] Requesting your issues with
	Module:Bio::Tools::Run::RemoteBlast
Message-ID: <008001c62381$1d844980$d416a790@LIBERAL>

All,

I have a fun little application written around this module to track new hits
for my favorite sequences, but it stopped working some time ago, so I have
finally adopted this orphaned module.

I have received very specific suggestions from Jason and Chris for
implementation, and plan to follow them in order to at least bring this
module into the wonderful world of XML. I would appreciate it if you would
send any additional features (and any known issues) my way.

Thanks!

Roger Hall

Technical Director

MidSouth Bioinformatics Center

University of Arkansas at Little Rock

(501) 569-8074

From cjfields at uiuc.edu  Sat Jan 28 01:03:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:03:15 -0600
Subject: [Bioperl-l] Requesting your issues
	withModule:Bio::Tools::Run::RemoteBlast
In-Reply-To: <008001c62381$1d844980$d416a790@LIBERAL>
Message-ID: <001101c623a6$9eb652d0$15327e82@pyrimidine>

The only real change made to RemoteBlast.pm was to the save_output method;
it wasn't saving XML output because the regex used to check the tempfile
output:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||
			 $l =~/^RPS-BLAST\s*.+$/i ) {
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

didn't check for XML.  I just added a check for XML that is the same as the
XML format check in the retrieve_blast method:

	while(my $l = ) {
		next if ($l =~ //);
		if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||  # NCBI BLAST
			$l =~/^RPS-BLAST\s*.+$/i || # RPS BLAST
			$l =~/<\?xml version=/) { # NCBI BLAST XML output
			$seentop=1;
		}
		next if !$seentop;
		if( $seentop ) {
			print SAVEOUT $l;
		}
	}

There is probably a better way to do this, but it works for now.  All other
fixes were made to SearchIO::blast.  That module is where most of the work
is done and which 'broke' recently from the BLAST version change at NCBI.
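
For anyone who wants to play with the header check outside the module, here is a standalone sketch. The three patterns are the ones quoted above; the sub name and the sample lines are made up for illustration:

```perl
use strict;
use warnings;

# Return only the lines from the first recognised report header onwards.
sub strip_leading_junk {
    my @lines = @_;
    my $seentop = 0;
    my @kept;
    for my $l (@lines) {
        $seentop = 1
            if $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i    # NCBI BLAST text
            or $l =~ /^RPS-BLAST\s*.+$/i             # RPS BLAST
            or $l =~ /<\?xml version=/;              # NCBI BLAST XML
        push @kept, $l if $seentop;
    }
    return @kept;
}

my @text = strip_leading_junk("<p>QBlastInfo\n", "BLASTN 2.2.12 [Aug-07-2005]\n", "Query= test\n");
my @xml  = strip_leading_junk("junk\n", qq{<?xml version="1.0"?>\n}, "<BlastOutput>\n");
print scalar(@text), " text lines, ", scalar(@xml), " XML lines kept\n";
```

Without the third pattern nothing in an XML report ever sets $seentop, so every line is skipped and the saved file comes out empty.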

The only things I can think of at the moment are things that Jason
mentioned, switching to XML as the default (I agree with) and possibly
incorporating the netblast client (blastcl3).  It might be possible to
branch off a similar module specifically geared towards the blastcl3 client,
maybe acting as a wrapper to parse the returned data using SearchIO, but I
don't necessarily think it would be best to include in RemoteBlast. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Roger Hall
> Sent: Friday, January 27, 2006 2:35 PM
> To: Bioperl-L
> Subject: [Bioperl-l] Requesting your issues
> withModule:Bio::Tools::Run::RemoteBlast
> 
> All,
> 
> 
> 
> I have a fun little application written around this module to track new
> hits
> for my favorite sequences, but it stopped working some time ago, so I have
> finally adopted this orphaned module.
> 
> 
> 
> I have received very specific suggestions from Jason and Chris for
> implementation, and plan to follow them in order to at least bring this
> module into the wonderful world of XML. I would appreciate it if you would
> send any additional features (and any known issues) my way.
> 
> 
> 
> Thanks!
> 
> 
> 
> Roger Hall
> 
> Technical Director
> 
> MidSouth Bioinformatics Center
> 
> University of Arkansas at Little Rock
> 
> (501) 569-8074
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From torsten.seemann at infotech.monash.edu.au  Sat Jan 28 01:30:34 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 12:30:34 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000101c62385$0ddfc870$15327e82@pyrimidine>
References: <000101c62385$0ddfc870$15327e82@pyrimidine>
Message-ID: <43DAC93A.1000208@infotech.monash.edu.au>

Chris,

> I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> number of years now; I plan on releasing it for inclusion in bioperl or
> bioperl-run.  Right now, I think I may base them somewhat on your
> Bio::Tools::QRNA module.  Should they be in bioperl (Bio::Tools::RNAMotif)
> or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?

 From my understanding, a module to _parse the output_ of some TOOL goes 
in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in 
Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL 
takes the TOOL output and creates a Bio::Tools::TOOL object with the 
result in it as a convenience.
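
A minimal sketch of that split, assuming nothing about any real module: the
package names My::Tools::TOOL and My::Tools::Run::TOOL, the -text argument,
and the canned "output" are all invented for illustration.  The run() method
hands back a parser object, as described above.

```perl
use strict;
use warnings;

# Parser side: plays the role of Bio::Tools::TOOL (hypothetical name).
package My::Tools::TOOL;
sub new {
    my ($class, %args) = @_;
    # -text is an invented argument holding the raw program output
    my @hits = grep { length } split /\n/, ($args{-text} // '');
    return bless { hits => \@hits }, $class;
}
sub next_hit { my $self = shift; return shift @{ $self->{hits} } }

# Wrapper side: plays the role of Bio::Tools::Run::TOOL (hypothetical name).
package My::Tools::Run::TOOL;
sub new { my $class = shift; return bless {}, $class }
sub run {
    my ($self, $input) = @_;
    # A real wrapper would execute the external program here; this
    # stand-in just makes up two lines of "output" from the input.
    my $raw = "hit1 for $input\nhit2 for $input\n";
    return My::Tools::TOOL->new(-text => $raw);    # hand back a parser object
}

package main;

my $parser = My::Tools::Run::TOOL->new->run('seqA');
while ( defined( my $hit = $parser->next_hit ) ) {
    print "$hit\n";
}
```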

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From cjfields at uiuc.edu  Sat Jan 28 01:47:48 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Jan 2006 19:47:48 -0600
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DAC93A.1000208@infotech.monash.edu.au>
Message-ID: <000001c623ac$d7d07db0$15327e82@pyrimidine>

Yeah, forgot about that.  I just remember a discussion at one point a while
back about splitting off sections of bioperl core b/c some thought
bioperl-core was getting too big; I didn't want to get too deep into writing
code w/o asking.  Okay, then, that's settled.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 
> -----Original Message-----
> From: Torsten Seemann [mailto:torsten.seemann at infotech.monash.edu.au]
> Sent: Friday, January 27, 2006 7:31 PM
> To: Chris Fields
> Cc: 'bioperl-ml List'
> Subject: Re: [Bioperl-l] RNAMotif parser
> 
> Chris,
> 
> > I have been fiddling with an RNAMotif parser and an ERPIN parser for a
> > number of years now; I plan on releasing it for inclusion in bioperl or
> > bioperl-run.  Right now, I think I may base them somewhat on your
> > Bio::Tools::QRNA module.  Should they be in bioperl
> (Bio::Tools::RNAMotif)
> > or bioperl-run (Bio::Tools::Run::RNAMotif) namespace?
> 
>  From my understanding, a module to _parse the output_ of some TOOL goes
> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
> Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
> takes the TOOL output and creates a Bio::Tools::TOOL object with the
> result in it as a convenience.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

From torsten.seemann at infotech.monash.edu.au  Sat Jan 28 10:04:30 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 28 Jan 2006 21:04:30 +1100
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <000001c623ac$d7d07db0$15327e82@pyrimidine>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
Message-ID: <43DB41AE.30002@infotech.monash.edu.au>

>> From my understanding, a module to _parse the output_ of some TOOL goes
>>in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>Bio::Tools::Run::TOOL. Usually the run() method in Bio::Tools::Run::TOOL
>>takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>result in it as a convenience.

> Yeah, forgot about that.  I just remember a discussion at one point a while
> back about splitting off sections of bioperl core b/c some thought
> bioperl-core was getting too big; I didn't want to get too deep into writing
> code w/o asking.  Okay, then, that's settled.  

I think this is still true. Anything in Bio::Tools::Run namespace should 
be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast which 
are in bioperl-live core due to popularity). All the output parsers are 
in bioperl-live core.

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010

From jason.stajich at duke.edu  Sat Jan 28 16:06:06 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sat, 28 Jan 2006 11:06:06 -0500
Subject: [Bioperl-l] RNAMotif parser
In-Reply-To: <43DB41AE.30002@infotech.monash.edu.au>
References: <000001c623ac$d7d07db0$15327e82@pyrimidine>
	<43DB41AE.30002@infotech.monash.edu.au>
Message-ID: 

exactly!
On Jan 28, 2006, at 5:04 AM, Torsten Seemann wrote:

>>> From my understanding, a module to _parse the output_ of some  
>>> TOOL goes
>>> in Bio::Tools::TOOL. The wrapper to _run_ TOOL goes in
>>> Bio::Tools::Run::TOOL. Usually the run() method in  
>>> Bio::Tools::Run::TOOL
>>> takes the TOOL output and creates a Bio::Tools::TOOL object with the
>>> result in it as a convenience.
>
>> Yeah, forgot about that.  I just remember a discussion at one  
>> point a while
>> back about splitting off sections of bioperl core b/c some thought
>> bioperl-core was getting too big; I didn't want to get too deep  
>> into writing
>> code w/o asking.  Okay, then, that's settled.
>
> I think this is still true. Anything in Bio::Tools::Run namespace  
> should
> be in bioperl-run CVS (except for RemoteBlast and StandAloneBlast  
> which
> are in bioperl-live core due to popularity). All the output parsers  
> are
> in bioperl-live core.
>
>
> -- 
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia
> http://www.vicbioinformatics.com/
> Phone: +61 3 9905 9010

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From golharam at umdnj.edu  Sun Jan 29 17:48:34 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sun, 29 Jan 2006 12:48:34 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
Message-ID: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>

I can't figure this out from the documentation.  In fact, I'm not sure
it's possible:

I have a bunch of clustalw alignments in GCG (MSF) format.  Each
alignment consists of three sequences.  I want to get the sequences
including the gaps from the alignment.  

I'm trying to use Bio::AlignIO to read the alignment file, then trying
to get each sequence from the alignment. I tried doing this:

$seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
"align$x.clustalw");
my $aln = $seqio->next_aln();
$seq1 = $aln->next_seq()->seq;

Getting the sequence from the alignment isn't working and I'm not sure
how to do it.  Does anyone have any ideas as to what I might try?

--
Ryan Golhar  -  golharam at umdnj.edu
The Informatics Institute of UMDNJ

From cjfields at uiuc.edu  Sun Jan 29 19:44:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Jan 2006 13:44:22 -0600
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <294C9886-277B-4C35-AF7F-D6ABB3B401A3@uiuc.edu>

Even though you used clustalw for aligning the sequences, the output  
format is GCG (msf) and not clustalw (aln) format, so you need to  
change the '-format' flag you have set:

> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");

to

> $seqio = Bio::AlignIO->new(-format => 'msf', -file =>
> "align$x.clustalw");

See if that works.

On Jan 29, 2006, at 11:48 AM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason.stajich at duke.edu  Sun Jan 29 19:49:20 2006
From: jason.stajich at duke.edu (Jason Stajich)
Date: Sun, 29 Jan 2006 14:49:20 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
References: <00ce01c624fc$3a8f34a0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw',
                              -file   => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have  
implemented for alignments, since you can get all the sequences out with  
the each_seq method.

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each
> alignment consists of three sequences.  I want to get the sequences
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

From biophp at biophp.org  Fri Jan 27 13:20:31 2006
From: biophp at biophp.org (Joseba Bikandi)
Date: Fri, 27 Jan 2006 08:20:31 -0500
Subject: [Bioperl-l] BioPHP.org - open source repository of code and scripts
Message-ID: 

Dear Sir/Madam,

I would like to let you know about biophp.org, 
an open source project which may be interesting 
for you. It is a new project which includes 
PHP code (functions) and minitools (copy and
paste one page scripts). 

Sincerely,

......
Joseba Bikandi
biophp at biophp.org

From golharam at umdnj.edu  Mon Jan 30 17:40:58 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Jan 2006 12:40:58 -0500
Subject: [Bioperl-l] Parsing clustalw alignments
In-Reply-To: <21817F7A-8552-4F24-8094-86D830A506BB@duke.edu>
Message-ID: <003701c625c4$5527d790$2f01a8c0@GOLHARMOBILE1>

Thanks.  Here's what I ended up doing:

$seqio = Bio::AlignIO->new(-format => 'msf', -file =>
"alnfile.clustalw");
my $aln = $seqio->next_aln();
@_ = $aln->each_seq_with_id('org1');
$seq1 = $_[0]->seq;
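
For reference, the same lookup can be wrapped in a small helper that does
not overwrite @_.  This is only a sketch: the helper name gapped_seq_for_id
is made up, and the file name is taken from the command line rather than
hard-coded.  BioPerl is only loaded when a file name is actually given.

```perl
use strict;
use warnings;

# Hypothetical helper: return the gapped sequence string of the first
# sequence in $aln whose display id is $id, or undef if there is none.
# $aln is expected to behave like a Bio::SimpleAlign object.
sub gapped_seq_for_id {
    my ($aln, $id) = @_;
    my ($seq) = $aln->each_seq_with_id($id);    # list context: first match
    return defined $seq ? $seq->seq : undef;
}

# Only touches BioPerl when a file name is given on the command line.
if (@ARGV) {
    require Bio::AlignIO;
    my $alnio = Bio::AlignIO->new(-format => 'msf', -file => $ARGV[0]);
    my $aln   = $alnio->next_aln
        or die "no alignment found in $ARGV[0]\n";
    my $str = gapped_seq_for_id($aln, 'org1');  # 'org1' as in the message
    print defined $str ? "$str\n" : "no sequence named org1\n";
}
```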

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
Sent: Sunday, January 29, 2006 2:49 PM
To: golharam at umdnj.edu
Cc: 'bioperl-l'
Subject: Re: [Bioperl-l] Parsing clustalw alignments

See the Bio::SimpleAlign documentation for information on how to  
interact with an alignment

Here is some code from the SYNOPSIS
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }

So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw',
                              -file   => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
  # get the sequences out from the alignment
   print "sequence as a string", $seq->seq, "\n";
   }
}

next_seq is part of the API for sequence streams, not something we have  
implemented for alignments, since you can get all the sequences out with  
the each_seq method.

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure

> its possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each 
> alignment consists of three sequences.  I want to get the sequences 
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying

> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file => 
> "align$x.clustalw"); my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure

> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12


From alindeman at gmail.com  Tue Jan 31 04:00:32 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Mon, 30 Jan 2006 22:00:32 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
References: <3f3ecb5a0601301442o573053aes23934617edb14729@mail.gmail.com>
Message-ID: <3f3ecb5a0601302000j7a3fbd4y1739a3c1696e30aa@mail.gmail.com>

Hi all--

Is it possible to use two different glyphs (or the same glyph with
different properties) on the same panel track?

Thanks

--A

From Marc.Logghe at DEVGEN.com  Tue Jan 31 08:08:09 2006
From: Marc.Logghe at DEVGEN.com (Marc Logghe)
Date: Tue, 31 Jan 2006 09:08:09 +0100
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
Message-ID: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>

Hi Andy
> Is it possible to use two different glyphs (or the same glyph 
> with different properties) on the same panel track?
Sure it is. This extract comes from the docs of Bio::Graphics::Panel

" There are a large number of glyph types.  By default, each track will
be homogeneous on a single glyph type, but you can mix several glyph
types on the same track by providing a code reference to the -glyph
argument.  Other options passed to add_track() control the color and
size of the glyphs, whether they are allowed to overlap, and other
formatting attributes.  The height of a track is determined from its
contents and cannot be directly influenced."
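
A small sketch of what such a code reference might look like.  The chooser
name choose_glyph and the tag-to-glyph mapping are made up; 'arrow' and
'generic' are, as far as I know, glyph names that ship with Bio::Graphics.
The add_track() call is shown only as a comment because it assumes an
existing panel and feature list.

```perl
use strict;
use warnings;

# Hypothetical chooser: pick a glyph name from the feature's primary tag.
sub choose_glyph {
    my $feature = shift;
    return $feature->primary_tag eq 'gene' ? 'arrow' : 'generic';
}

# Assumed usage with an existing Bio::Graphics::Panel in $panel and a
# list of features in @features (not run here):
#
#   $panel->add_track(\@features,
#                     -glyph   => \&choose_glyph,
#                     -bgcolor => 'blue');

# Quick check with a stand-in object that only knows primary_tag.
{
    package Demo::Feature;
    sub new { my ($class, $tag) = @_; return bless { tag => $tag }, $class }
    sub primary_tag { return $_[0]{tag} }
}
print choose_glyph( Demo::Feature->new('gene') ), "\n";
```

My understanding is that other track options can take a code reference in
the same way, which would cover the "same glyph with different properties"
case, but check the Bio::Graphics::Panel docs before relying on that.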

HTH,
Marc

From alindeman at gmail.com  Tue Jan 31 19:59:00 2006
From: alindeman at gmail.com (Andy Lindeman)
Date: Tue, 31 Jan 2006 13:59:00 -0600
Subject: [Bioperl-l] Bio::Graphics: Different glyphs on same track
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA6746AB5@ANTARESIA.be.devgen.com>
Message-ID: <3f3ecb5a0601311159k6d7f09d3j65732b5e72019e9d@mail.gmail.com>

Wonderful!

Thanks.

--A

On 1/31/06, Marc Logghe wrote:
> Hi Andy
> > Is it possible to use two different glyphs (or the same glyph
> > with different properties) on the same panel track?
> Sure it is. This extract comes from the docs of Bio::Graphics::Panel
>
> " There are a large number of glyph types.  By default, each track will
> be homogeneous on a single glyph type, but you can mix several glyph
> types on the same track by providing a code reference to the -glyph
> argument.  Other options passed to add_track() control the color and
> size of the glyphs, whether they are allowed to overlap, and other
> formatting attributes.  The height of a track is determined from its
> contents and cannot be directly influenced."
>
> HTH,
> Marc
>

Request ID	1137626804-16566-100302560340.BLASTQ4
Status	Searching
Submitted at	Wed Jan 18 18:26:44 2006
Current time	Wed Jan 18 18:36:46 2006
Time since submission	00:10:01

Request ID	1137632221-28820-85178967709.BLASTQ1
Status	Searching
Submitted at	Wed Jan 18 19:57:01 2006
Current time	Wed Jan 18 19:57:03 2006
Time since submission	00:00:01
