From maj at fortinbras.us  Sun Nov  1 23:47:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 1 Nov 2009 23:47:15 -0500
Subject: [Bioperl-l] annotations
Message-ID: <5150801225E0484D95DC51B2D00AE519@NewLife>

I'm cogitating on features and annotations. For a RichSeq, one gets the set of annotations by

$seq->annotation->get_Annotations

while getting features by 

$seq->get_Features

Is there a reason not to have a method in SeqI 

sub get_Annotations { shift->annotation->get_Annotations }

to allow a user to do what seems natural from a user's perspective, viz. $seq->get_Annotations? I imagine this might save hundreds of hours of frustration, integrated over all newbies.
MAJ
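
A minimal sketch of the delegation idea above, using stand-in classes so it runs without BioPerl. My::AnnotationCollection and My::Seq are hypothetical names for illustration only; they are not the real SeqI or annotation collection classes. The only difference from the one-liner in the message is that arguments are passed through, on the assumption that callers may want to hand a key to the collection.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for an annotation collection (hypothetical, illustration only).
package My::AnnotationCollection;
sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}
sub get_Annotations {
    my $self = shift;
    return @{ $self->{items} };
}

# Stand-in for a sequence class that holds an annotation collection.
package My::Seq;
sub new {
    my ($class, %args) = @_;
    return bless { annotation => $args{annotation} }, $class;
}
sub annotation { return $_[0]->{annotation} }

# The proposed convenience method: forward the call, and any arguments,
# to the annotation collection.
sub get_Annotations {
    my $self = shift;
    return $self->annotation->get_Annotations(@_);
}

package main;

my $seq = My::Seq->new(
    annotation => My::AnnotationCollection->new('comment A', 'reference B'),
);
print join(', ', $seq->get_Annotations), "\n";
```

Run as-is, this prints "comment A, reference B"; the caller never has to know that the list lives one object further down.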

From cjfields at illinois.edu  Mon Nov  2 08:08:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Nov 2009 07:08:54 -0600
Subject: [Bioperl-l] annotations
In-Reply-To: <5150801225E0484D95DC51B2D00AE519@NewLife>
References: <5150801225E0484D95DC51B2D00AE519@NewLife>
Message-ID: <6920A9E1-D221-4CF8-9866-0ADBDB254C19@illinois.edu>

On Nov 1, 2009, at 10:47 PM, Mark A. Jensen wrote:

> I'm cogitating on features and annotations. For a RichSeq, one gets  
> the set of annotations by
>
> $seq->annotation->get_Annotations
>
> while getting features by
>
> $seq->get_Features
>
> Is there a reason not to have a method in SeqI
>
> sub get_Annotations { shift->annotation->get_Annotations }
>
> to allow a user to do what seems natural from a user's perspective,  
> viz. $seq->get_Annotations? I imagine this might save hundreds of  
> hours of frustration, integrated over all newbies.
> MAJ

One could add the methods to delegate to annotation() (that's  
essentially what I'm planning on doing for Biome).

chris

From kiekyon.huang at gmail.com  Tue Nov  3 10:14:39 2009
From: kiekyon.huang at gmail.com (Kie Kyon Huang)
Date: Tue, 3 Nov 2009 23:14:39 +0800
Subject: [Bioperl-l] render_blast problem
Message-ID: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>

Hi,

I was trying to follow the HOWTO:Graphics at
http://www.bioperl.org/wiki/HOWTO:Graphics

When running the command line in cygwin

$ perl render_blast1.pl data1.txt | display -

I get the following error line,

bash: display: command not found

I also tried

$ perl render_blast1.pl data1.txt > data1.png

However, I was unable to open the data1.png file using Microsoft
Office Picture Manager or Windows Photo Gallery.

Thanks

Huang

From biopython at maubp.freeserve.co.uk  Tue Nov  3 10:45:37 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 15:45:37 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
Message-ID: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>

On Tue, Nov 3, 2009 at 3:14 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
> Hi,
>
> I was trying to follow the HOWTO:Graphics at
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> When running the command line in cygwin
>
> $ perl render_blast1.pl data1.txt | display -
>
> I get the following error line,
>
> bash: display: command not found

That makes sense on Windows, since display is a Unix
command line tool.

> I also tried
>
> $ perl render_blast1.pl data1.txt > data1.png

Based on the wiki, I think that ought to have worked.

> However, I was unable to open the data1.png file using Microsoft
> Office Picture Manager or Windows Photo Gallery.

Did you do this step?:
>> Important!  If you are on a Windows platform, you need to put
>> STDOUT into binary mode so that the PNG file does not go
>> through Window's carriage return/linefeed transformations.
>> Before the final print statement, put the statement
>> binmode(STDOUT). This advice also applies to certain older
>> versions of RedHat, which ship with a patched (and possibly
>> broken) version of Perl.

(BioPerl devs - couldn't that be added to the default
render_blast1.pl script with an if statement checking for
Windows?)

Peter
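
One way the platform check suggested above could look, as a sketch rather than the actual change to render_blast1.pl. The helper names needs_binmode and print_image are hypothetical, and the list of $^O values treated as "needs binary mode" is an assumption.

```perl
use strict;
use warnings;

# Return true on platforms where text-mode output may rewrite line endings.
# The set of $^O values checked here is an assumption for this sketch.
sub needs_binmode {
    my ($os) = @_;
    return $os =~ /^(?:MSWin32|cygwin|dos|os2)$/ ? 1 : 0;
}

# Hypothetical helper: write already-encoded image bytes to a filehandle,
# switching the handle to binary mode first where that matters.
sub print_image {
    my ($fh, $bytes) = @_;
    binmode($fh) if needs_binmode($^O);
    print {$fh} $bytes;
    return;
}
```

In a script like the one in the HOWTO, the final print would then become something like print_image(\*STDOUT, $panel->png). Since binmode is harmless on Unix filehandles, calling binmode(STDOUT) unconditionally is a simpler alternative to the check.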

From biopython at maubp.freeserve.co.uk  Tue Nov  3 11:04:59 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 16:04:59 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
	<a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
Message-ID: <320fb6e00911030804r62e50da6w373bbb61e9823f28@mail.gmail.com>

Mailing list CC'd - solved :)

On Tue, Nov 3, 2009 at 3:55 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
>
> OK, that fixed it.
> I sometimes forget what platform I am on.
> Thanks

Great.

Peter

From amackey at virginia.edu  Tue Nov  3 12:09:00 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Tue, 3 Nov 2009 12:09:00 -0500
Subject: [Bioperl-l] svn errors?
Message-ID: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>

[ajm6q at lc4 bioperl-live]$ svn update
svn: Decompression of svndiff data failed


I'll admit to not having run svn update in a while; a clean, anonymous svn co
failed with the same message:

[...]
A    bioperl-live/Bio/Structure/StructureI.pm
A    bioperl-live/Bio/Structure/IO
svn: Decompression of svndiff data failed

-Aaron

P.S. I used this command:
svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live

From cjfields at illinois.edu  Tue Nov  3 12:17:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:17:10 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <8C5FC42D-F957-45AC-9AAC-876ACC9D77E0@illinois.edu>

Aaron,

Yep, this was reported to support (a couple of users on #bioperl  
reported the same problem).  Chris D. is looking into it.

I'm wondering if it's worth setting up a second mirror to github for  
this purpose.

chris

> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Tue Nov  3 12:19:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:19:56 -0600
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
Message-ID: <8336341C-C7B4-4740-A7C3-E2DE5FDAF651@illinois.edu>


On Nov 3, 2009, at 9:45 AM, Peter wrote:

> ...
> Did you do this step?:
>>> Important!  If you are on a Windows platform, you need to put
>>> STDOUT into binary mode so that the PNG file does not go
>>> through Window's carriage return/linefeed transformations.
>>> Before the final print statement, put the statement
>>> binmode(STDOUT). This advice also applies to certain older
>>> versions of RedHat, which ship with a patched (and possibly
>>> broken) version of Perl.
>
> (BioPerl devs - couldn't that be added to the default
> render_blast1.pl script with an if statement checking for
> Windows?)
>
> Peter

Yes, that should be added.  I'll work on it.

chris

From mauricio at open-bio.org  Tue Nov  3 12:20:52 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Tue, 03 Nov 2009 11:20:52 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <4AF06674.30506@open-bio.org>

Hi Aaron,

This was reported a few days ago. Chris Dagdigian is working today on a 
fix for it.

Mauricio.


From rachitasharma at gmail.com  Tue Nov  3 17:12:11 2009
From: rachitasharma at gmail.com (Rachita Sharma)
Date: Tue, 3 Nov 2009 14:12:11 -0800
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>

I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");

while( my $result = $in->next_result ) {
    while( my $hit = $result->next_hit ) {
        $sth->execute($result->query_name, $hit->name, $hit->significance);
        print "Query executed!\n";
    }
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------

From cjfields at illinois.edu  Tue Nov  3 22:42:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 21:42:55 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
References: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
Message-ID: <DD8E7843-7181-45AD-95B1-FD877D0A5D4E@illinois.edu>

Rachita,

You'll have to give us more to go on than this.  The best thing to do  
is file a bug report and attach an example PSI-BLAST report and code  
that causes the problem.  The $sth->execute(...) is a bit odd, but  
that shouldn't cause the error in question.

Also, make sure to stipulate the OS, version of BioPerl, and perl  
version.

chris
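
For the details requested above (OS, BioPerl version, perl version), a small script along these lines can gather them for the bug report. This is a sketch: the sub name environment_report is hypothetical, and it assumes that the BioPerl version is exposed as $Bio::Root::Version::VERSION.

```perl
use strict;
use warnings;

# Collect the details a bug report usually needs.
sub environment_report {
    my %info = (
        os   => $^O,
        perl => sprintf('%vd', $^V),
    );
    # Load Bio::Root::Version lazily so the script still runs on a machine
    # where BioPerl is not installed.
    if (eval { require Bio::Root::Version; 1 }) {
        no warnings 'once';
        $info{bioperl} = $Bio::Root::Version::VERSION;
    }
    else {
        $info{bioperl} = 'not installed';
    }
    return %info;
}

my %info = environment_report();
printf "%-8s %s\n", $_, $info{$_} for sort keys %info;
```

The three lines it prints can be pasted into the report next to the example PSI-BLAST output and the failing code.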



From alexl at users.sourceforge.net  Wed Nov  4 02:30:21 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Wed, 04 Nov 2009 02:30:21 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
Message-ID: <msd43yycfm.fsf@allele2.localdomain>

Does the version of ExtUtils::Manifest really need to be strictly
greater than or equal to 1.52?

Currently this blocks me updating the Fedora package of BioPerl to
1.6.1, because the version of perl that Fedora ships is on 1.51 and
hence the build fails with:

Checking prerequisites...
 - ERROR: ExtUtils::Manifest (1.51_01) is installed, but we need version >= 1.52

Full logs are here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1787483
http://koji.fedoraproject.org/koji/getfile?taskID=1787483&name=build.log

This is true even with the version of Perl in rawhide/F-12 etc.
(ExtUtils::Manifest is in the base perl package).

If it really is necessary, I would like to be armed with a good argument
why it needs to be updated, since the Perl package maintainer would have
to update the entire Perl package simply to get a more recent version of
one small subpackage.

Regards,
Alex

From jluis.lavin at unavarra.es  Wed Nov  4 03:43:35 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 09:43:35 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in a
 single list query
Message-ID: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>


Hello all,

I'm a newbie who is having terrible trouble trying to retrieve a list of
multiple sequences from the NCBI and write them to a single file in Fasta
format.
The code I've written seems to read my list and retrieve the sequences, but
it kind of overwrites them so that I only get the last sequence on the list.
I've been told to ask the people on this mailing list for help, since you
may have come across this problem too, or at least will know how to solve
it...

Here is my code, which basically consists of a STDIN prompt for the name of
the list, which is read into an array, and a loop that retrieves one sequence
per iteration (stopping when the list ends) and writes that sequence to a
fasta file. I only get one sequence back, although it seems to perform the
retrieval for each of the sequences in the list...


#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
print "Enter your list name:";
my $archivo=<STDIN>;
chomp $archivo;
die ("Can't open input\n") unless (open(INFILE, $archivo));
my @lista = <INFILE>;
foreach my $seq (@lista) {
    if ($seq eq '') {
        die ("empty list")
        }
    else {
my $db = new Bio::DB::GenPept("-format" => "Fasta");
my $seqobj = $db->get_Seq_by_acc($seq);
my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;


An example list of sequences can be this one:

YP_003107578.1
YP_003106103.1
YP_003106552.1
YP_003106560.1
YP_003107053.1
YP_003107450.1
YP_003108000.1
YP_003105023.1
YP_003105264.1

Thanks in advance for your help ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From e.osimo at gmail.com  Wed Nov  4 04:54:52 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Wed, 4 Nov 2009 10:54:52 +0100
Subject: [Bioperl-l] Bio::Graphics and picture format
Message-ID: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>

Hello everyone,
do you know if it is possible to generate an image with Bio::Graphics in a
vector format? Is there a list of available formats?
Thanks
Emanuele

From David.Messina at sbc.su.se  Wed Nov  4 04:52:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 10:52:53 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>

>
> The code I've written seems to read my list and retrieve the sequences, but
> it kind of overwrites them so that I only get the last sequence on the list.
>

With this line

my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta", -format => 'fasta');


you are opening the filehandle for the output file inside your loop, so each
time it is writing over the previous file with an empty file. Then, you
write a single sequence to that file with this line

$out->write_seq($seqobj);


So when you are done, you just have the last sequence in the output file.

If you move the opening of the output filehandle outside the loop (it needs
to be done only once), then it should work as you expect.

Also, I notice the newline characters are not being removed from your
sequence IDs  (actually I'm a little surprised that the sequences are being
retrieved). Just to be safe, you may want to add the line

chomp @lista;


after

my @lista = <INFILE>;


Dave
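
Putting the two suggestions above together, the script might be restructured roughly as follows. This is a sketch, not the poster's final code: read_accessions and fetch_to_fasta are hypothetical names, the BioPerl modules are loaded lazily inside the sub, and the eval around get_Seq_by_acc is an assumption that a failed lookup may either throw or return nothing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read accession numbers from a filehandle: strip line endings and
# skip blank lines.
sub read_accessions {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;    # removes "\n" and any stray "\r"
        push @ids, $line if length $line;
    }
    return @ids;
}

# Fetch each accession and write all records to one FASTA file.
sub fetch_to_fasta {
    my ($list_file, $out_file) = @_;
    require Bio::DB::GenPept;    # loaded here so read_accessions has no dependencies
    require Bio::SeqIO;

    open my $in, '<', $list_file or die "Can't open $list_file: $!\n";
    my @ids = read_accessions($in);
    close $in;
    die "empty list\n" unless @ids;

    # Create the database handle and the output stream once, before the loop.
    my $db  = Bio::DB::GenPept->new;
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');

    for my $id (@ids) {
        # Assumption: a failed lookup either throws or returns nothing.
        my $seq = eval { $db->get_Seq_by_acc($id) };
        if ($seq) { $out->write_seq($seq) }
        else      { warn "No sequence returned for $id\n" }
    }
    return scalar @ids;
}

# Run only when a list file is given on the command line.
fetch_to_fasta($ARGV[0], 'extracted_seqs.fasta') if @ARGV;
```

Because the output stream is opened once with '>', each run starts from an empty file and ends with every retrieved sequence in it.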


From jluis.lavin at unavarra.es  Wed Nov  4 05:14:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:14:40 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
Message-ID: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>

Thank you very, very much Dave,
I've had a really frustrating time trying to find out what I was doing
wrong; it was so frustrating that I was about to quit BioPerl.
Now I can try to focus on BLAST parsing for my comparative genomic analysis.

You're great on this mailing list, because you give fast and neat advice
on all the questions asked here by newbies like me ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From hrh at fmi.ch  Wed Nov  4 05:05:17 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Wed, 04 Nov 2009 11:05:17 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <C717106D.54F2%hrh@fmi.ch>

Hi

try

my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
                                     ^

this way you no longer overwrite your existing file, but append the next
sequence.

Regards, Hans
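
The difference between the two open modes can be seen without BioPerl at all. The sketch below reopens a plain file on every loop iteration, once with '>' and once with '>>'; write_each is a hypothetical helper and the record names are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write each record by reopening the file with the given mode ('>' or '>>'),
# then return the lines the file ends up holding.
sub write_each {
    my ($mode, $path, @records) = @_;
    for my $rec (@records) {
        open my $fh, $mode, $path or die "Can't open $path: $!\n";
        print {$fh} "$rec\n";
        close $fh;
    }
    open my $in, '<', $path or die "Can't read $path: $!\n";
    chomp(my @lines = <$in>);
    close $in;
    return @lines;
}

my $dir = tempdir(CLEANUP => 1);
my @ids = qw(seq1 seq2 seq3);

my @truncated = write_each('>',  "$dir/overwrite.txt", @ids);
my @appended  = write_each('>>', "$dir/append.txt",    @ids);

print "'>'  keeps: @truncated\n";
print "'>>' keeps: @appended\n";
```

With '>' only seq3 survives, while '>>' keeps all three. One caveat with the append approach: a second run of the script adds to whatever the first run left behind, so the output file needs to be removed between runs.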




From jluis.lavin at unavarra.es  Wed Nov  4 05:25:38 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:25:38 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 asingle list query
In-Reply-To: <C717106D.54F2%hrh@fmi.ch>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<C717106D.54F2%hrh@fmi.ch>
Message-ID: <1834.130.206.164.153.1257330338.squirrel@webmail.unavarra.es>

Thank you very much for your answer, Hans!!!
It works perfectly; also a neat and fast solution, like Dave's.

Blessings to you all ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From scott at scottcain.net  Wed Nov  4 08:26:02 2009
From: scott at scottcain.net (Scott Cain)
Date: Wed, 4 Nov 2009 08:26:02 -0500
Subject: [Bioperl-l] Bio::Graphics and picture format
In-Reply-To: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
References: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
Message-ID: <0FB17FBC-16BE-4A9F-AC75-983D3B4ECE7D@scottcain.net>

Hi Emanuele,

It is possible to use GD::SVG instead of GD to generate SVG graphics.   
To use it, you provide an argument of "-image_class  GD::SVG" to the  
constructor of Bio::Graphics::Panel.  See the perldoc of  
Bio::Graphics::Panel for more info.

Scott
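
As a rough illustration of the -image_class option described above, the sketch below builds a one-track panel and returns SVG text. It assumes Bio::Graphics and GD::SVG are installed; render_svg is a hypothetical name, and the panel dimensions, glyph, and colour are arbitrary choices for the example.

```perl
use strict;
use warnings;

# Render a single track of features as SVG text. Bio::Graphics and GD::SVG
# must be installed; they are loaded lazily when the sub is called.
sub render_svg {
    my (@ranges) = @_;    # each element is [start, end]
    require Bio::Graphics;
    require Bio::SeqFeature::Generic;

    my $panel = Bio::Graphics::Panel->new(
        -length      => 1000,
        -width       => 800,
        -image_class => 'GD::SVG',    # vector output instead of GD bitmaps
    );
    my $track = $panel->add_track(
        -glyph   => 'generic',
        -bgcolor => 'blue',
    );
    for my $range (@ranges) {
        $track->add_feature(
            Bio::SeqFeature::Generic->new(
                -start => $range->[0],
                -end   => $range->[1],
            )
        );
    }
    return $panel->svg;
}

# Example use: perl panel_svg.pl --demo > features.svg
print render_svg([100, 300], [450, 700]) if @ARGV && $ARGV[0] eq '--demo';
```

The only change from a bitmap script is the -image_class argument and the call to svg in place of png.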



-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From b3sn7 at UNB.ca  Tue Nov  3 12:30:24 2009
From: b3sn7 at UNB.ca (Sharma, Rachita)
Date: Tue,  3 Nov 2009 13:30:24 -0400
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <1257269424.4af068b045434@webmail.unb.ca>


I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(	-format => 'blast',
				-file => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");


while( my $result = $in->next_result ) {
while( my $hit = $result->next_hit ) {

$sth->execute($result->query_name, $hit->name, $hit->significance);
print "Query executed!\n";  

}
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


*******************************
Rachita Sharma
Research Assistant (PhD Student)
University of New Brunswick, NB, CANADA
email: Rachita.Sharma at unb.ca
Phone no: 503-895-3619
*******************************


From cjfields at illinois.edu  Wed Nov  4 08:53:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:53:35 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <msd43yycfm.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
Message-ID: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>

Alex,

Not sure why ExtUtils::Manifest can't be bundled as a separate perl  
package alone.  It is part of perl core but it's also available on  
CPAN separately from perl itself:

http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

This is the commit message for that BTW.  This allows spaces in file  
names for the MANIFEST.  v1.52 is a bug fix and is required.

http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

chris
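
To check what a given perl installation actually provides before running the build, something like the following can be used. It is a sketch: meets_minimum is a hypothetical helper, and it relies on the standard VERSION method, which dies when the requested minimum is not met.

```perl
use strict;
use warnings;
use ExtUtils::Manifest ();

# True if the loaded module satisfies the minimum version. The VERSION
# method dies when the requirement is not met, so the call is wrapped
# in eval.
sub meets_minimum {
    my ($module, $minimum) = @_;
    return eval { $module->VERSION($minimum); 1 } ? 1 : 0;
}

printf "ExtUtils::Manifest %s installed; >= 1.52: %s\n",
    ExtUtils::Manifest->VERSION,
    meets_minimum('ExtUtils::Manifest', '1.52') ? 'yes' : 'no';
```

On a system like the one in the build log, with 1.51_01 installed, the check reports 'no', which matches the prerequisite error.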



From cjfields at illinois.edu  Wed Nov  4 08:55:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:55:34 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <1257269424.4af068b045434@webmail.unb.ca>
References: <1257269424.4af068b045434@webmail.unb.ca>
Message-ID: <70E34111-4E70-463D-86EE-06926EA57073@illinois.edu>

Rachita,

Asked and answered yesterday.  Please submit as a bug.

chris



From David.Messina at sbc.su.se  Wed Nov  4 09:11:43 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 15:11:43 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es> 
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com> 
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>

Aw shucks, José, glad I could be of help. There are plenty of people who
answer questions around here, but my timezone sometimes gives me an
advantage for the European ones. :)


Dave


From daniel.gaston at gmail.com  Wed Nov  4 09:45:04 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 10:45:04 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040645j1b28e727p5d7bf47a04db160b@mail.gmail.com>

Hi Everyone,

I have recently been playing around with SwissProt format flatfiles and want
to extract sequences based on subcellular localization. I notice in going
through the code for swiss.pm and swissdriver.pm that in both (more so in
swissdriver.pm) there are several steps where organelle information based on
the OG line could be extracted and added to the data structure but isn't. It
seems that in both cases the OG line is being added in to the generic
lumping of data from the OC, OS, and OX lines in order to extract species
names and taxonomy information but getting rid of everything else. Is there
a particular reason for this or just a simple oversight? On the surface at
least it looks like a relatively simple modification to make although I
admit that I am not terribly adept at manipulating these SeqIO
data structures.

Thanks for your time,

Dan

From daniel.gaston at gmail.com  Wed Nov  4 12:12:10 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 13:12:10 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040912pfd2483fwe44cd098beed73c7@mail.gmail.com>

Sorry folks, it appears I was just being a bonehead and didn't look closely
enough at the Bio::Annotation and Bio::Species objects that store all of this
data.

Dan


From jluis.lavin at unavarra.es  Thu Nov  5 10:28:23 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:28:23 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
Message-ID: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>


Hello to all,

I'm trying to write a script to retrieve a list of sequences from a local
FASTA file (for example a FASTA archive where all the protein models of an
organism are stored). I would use this file as a kind of "local
database" (sorry if I am mixing up a few concepts...)
I've been reading the BioPerl HOWTOs and I came across the
Bio::Index::Fasta tool.
If I didn't misunderstand what I read (which is quite possible given my low
level of programming), this indexing tool should do the job.
I wrote a couple of scripts based on the documentation I read about this
tool, but I don't seem to be able to create the index file to be used
later (to retrieve the sequences from).
-First of all, I want to ask the people in this forum whether
Bio::Index::Fasta is the right tool to choose for this task.
-Then I'll beg you to take a look at my scripts, because I can't seem to
catch the bug...

Best wishes to you all and thanks in advance ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 10:39:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:39:05 -0500
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <A28922858F64480ABD8A6696E269023C@NewLife>

José -- It looks like this is a good solution to your problem. Please send your
script so we can look at it.
cheers Mark


From jluis.lavin at unavarra.es  Thu Nov  5 10:46:36 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:46:36 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct
 use]
Message-ID: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>


---------------------------- Original message ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Date:    Thu, 5 November 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts: the first one creates the index and
the second one retrieves the list of sequences from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2)And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not a total scum...

Thanks in advance ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN



From maj at fortinbras.us  Thu Nov  5 10:37:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:37:53 -0500
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI ina
	single list query
In-Reply-To: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
	<628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
Message-ID: <49075FDFF6764EE48E932D95EB994221@NewLife>

True, Dave, you compete only with crazed east coast core developers who're doing 
"just one more thing" at 2am....


From hrh at fmi.ch  Thu Nov  5 11:02:48 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 05 Nov 2009 17:02:48 +0100
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <C718B5B8.5561%hrh@fmi.ch>


Jluis

> -Then I'll beg you to take a look at my scripts, because I don't seem to
> catch the bug...

you haven't attached/included any scripts, have you?


Anyway, have you considered using BLAST indices (created with the additional
flag "-o") together with the tool 'fastacmd' (which is also included in the
NCBI blast binaries) as a simple (and very fast) alternative for fetching
sequences?


Regards, Hans


From maj at fortinbras.us  Thu Nov  5 11:02:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:02:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
Message-ID: <1984ED07F36C446284B25F617964B6C6@NewLife>

Hey José,
The first thing that jumps out is the index file name. Looks
like you create it as
PC9.fasta.idx
But you read it as
PC9.fasta
Not an unusual mistake. Do
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and see if it works.
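
Putting the naming fix together with the rest of the workflow, a minimal sketch of both steps might look like the following. It assumes BioPerl is installed; the helper names (index_name_for, read_ids, build_index, extract_seqs) and the command-line layout are made up for illustration and are not part of BioPerl. Besides the file name, it also handles two other things that look worth checking in the posted scripts: the name read from STDIN still carries its newline, and make_index() is handed a freshly declared, empty variable instead of the FASTA file. IDs are read from the list file line by line rather than through a glob.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: derive the index name from the FASTA name, dropping
# any trailing newline/whitespace (e.g. when the name was read from <STDIN>).
sub index_name_for {
    my ($fasta) = @_;
    $fasta =~ s/\s+\z//;
    return "$fasta.idx";
}

# Hypothetical helper: read one ID per line, skipping blank lines.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;
        push @ids, $line if length $line;
    }
    return @ids;
}

# Step 1: build the index for one FASTA file.
sub build_index {
    my ($fasta) = @_;
    require Bio::Index::Fasta;    # loaded only when actually indexing
    my $inx = Bio::Index::Fasta->new(
        -filename   => index_name_for($fasta),
        -write_flag => 1,
    );
    # Pass the FASTA file itself; an absolute path is used to be safe.
    $inx->make_index(File::Spec->rel2abs($fasta));
    return;
}

# Step 2: fetch every listed ID and write the sequences to one FASTA file.
sub extract_seqs {
    my ($fasta, $id_file, $out_file) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;
    my $inx = Bio::Index::Fasta->new(-filename => index_name_for($fasta));
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');
    open my $fh, '<', $id_file or die "Cannot open $id_file: $!";
    for my $id (read_ids($fh)) {
        my $seq = $inx->fetch($id);
        if ($seq) { $out->write_seq($seq) }
        else      { warn "ID not found in index: $id\n" }
    }
    close $fh;
    return;
}

# Nothing runs unless arguments are given.
if (@ARGV) {
    my ($mode, $fasta, $id_file, $out_file) = @ARGV;
    if    ($mode eq 'index') { build_index($fasta) }
    elsif ($mode eq 'fetch') { extract_seqs($fasta, $id_file, $out_file) }
    else  { die "Usage: $0 index FASTA | $0 fetch FASTA IDS OUT\n" }
}
```

Saved under a name of your choice, it would be run once as "index PC9.fasta" and then as "fetch PC9.fasta LCS.txt extracted_seqs.fasta". The output stream is opened once, before the loop, so the sequences end up in a single file without needing append mode.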
MAJ


From jluis.lavin at unavarra.es  Thu Nov  5 11:21:57 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 17:21:57 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
 correct use]
In-Reply-To: <1984ED07F36C446284B25F617964B6C6@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
Message-ID: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>

Thank you very much Mark, that's a good point :$
I guess your correction refers to the second script, doesn't it?

If so, there is still a problem with the first script: it doesn't
create the PC9.fasta.idx file; instead it creates two files named:
-PC9.fasta.idx.pag
-PC9.fasta.idx.dir

which seem to be clearly related to some kind of indexing process...but,
unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
find it anywhere...
Forgive me if I'm talking nonsense...

Thank you very much again for your help ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 11:39:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:39:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
	<2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
Message-ID: <A1ACC4B552514872B77208248B31977C@NewLife>

Yes, these are the files created by SDBM, Perl's internal db manager. You should
be able to open the index simply with
$inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and the dbm will know what to do--
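
To see where the .pag/.dir pair comes from, here is a small sketch that uses Perl's bundled SDBM_File module directly, without BioPerl. It assumes the index is backed by SDBM as described above; the function name sdbm_files_for, the temporary directory and the dummy record are made up for illustration. It creates a database under a base name and lists which files actually exist afterwards.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # DBM implementation bundled with Perl
use File::Temp qw(tempdir);

# Hypothetical helper: create a tiny SDBM database under the given base
# name and report which of the candidate files appear on disk.
sub sdbm_files_for {
    my ($base) = @_;
    tie(my %db, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666)
        or die "Cannot tie $base: $!";
    $db{seq1} = 'dummy record';  # placeholder, not a real index record
    untie %db;
    return grep { -e $_ } ($base, "$base.pag", "$base.dir");
}

my $dir = tempdir(CLEANUP => 1);
print "$_\n" for sdbm_files_for("$dir/PC9.fasta.idx");
```

On a typical Unix or Windows Perl only the two suffixed files should be listed: the base name is what you pass to the library, and no file with exactly that name is created.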
cheers MAJ


From jluis.lavin at unavarra.es  Thu Nov  5 12:48:12 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 18:48:12 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <C718B5B8.5561%hrh@fmi.ch>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
	<C718B5B8.5561%hrh@fmi.ch>
Message-ID: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>

Thanks a lot for your help Hans,
It's a little bit too hard for me to understand this awesome information
you've just given me and turn it into a script...I hope I can use it in the
near future anyway ;)
The issue here is that the sequences I'm indexing are not generated by the
NCBI nor stored there...although I believe you're just referring to the tool
itself and not to a retrieval from the NCBI.

Thanks again, you're all great at giving advice to newbies like me ;)

Best wishes to you all




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From florent.angly at gmail.com  Thu Nov  5 13:00:19 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 05 Nov 2009 10:00:19 -0800
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<C718B5B8.5561%hrh@fmi.ch>
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
Message-ID: <4AF312B3.9060009@gmail.com>

Hans-Rudolf was talking about a way to retrieve sequences from a BLAST 
database. If you use BLAST locally, then your database is local too.
More info here: 
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
Florent




From valiente at lsi.upc.edu  Fri Nov  6 03:06:48 2009
From: valiente at lsi.upc.edu (valiente at lsi.upc.edu)
Date: Fri, 6 Nov 2009 09:06:48 +0100 (CET)
Subject: [Bioperl-l] Bio::SeqIO::genbank.pm
Message-ID: <45737.147.83.59.225.1257494808.squirrel@webmail.lsi.upc.edu>


There is a line in Bio::SeqIO::genbank.pm that converts the data in the
classification lines into a classification array by splitting only on ';'
or '.', so that a classification of 2 or more words will still get matched:

    my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;\.]+/, $class_lines;

but this will break organism names that have a dot inside, such as
"Salmonella enterica subsp. enterica serovar Typhimurium", which is now
being broken into "Salmonella enterica subsp" and "enterica serovar
Typhimurium".

Changing [;\.] to [;] solves this issue:

    my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;]+/, $class_lines;

Does anybody want to test it further before I commit this change?

Thanks,
Gabriel
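
For anyone who wants to try the two patterns outside genbank.pm, here is a minimal standalone sketch. The helper name split_classification and the sample string are made up for illustration; only the two regular expressions come from the message above.

```perl
use strict;
use warnings;

# Split a classification string on the given separator pattern and tidy
# each element the same way the quoted genbank.pm line does.
# (Hypothetical helper, not part of BioPerl.)
sub split_classification {
    my ($separator, $class_lines) = @_;
    return map {
        my $taxon = $_;
        $taxon =~ s/^\s+//;        # trim leading whitespace
        $taxon =~ s/\s+$//;        # trim trailing whitespace
        $taxon =~ s/\s{2,}/ /g;    # collapse runs of whitespace
        $taxon;
    } split $separator, $class_lines;
}

# Made-up input for illustration; real records carry a longer lineage.
my $class_lines = 'Bacteria; Proteobacteria; '
    . 'Salmonella enterica subsp. enterica serovar Typhimurium';

my @current  = split_classification(qr/(?<!subgen)[;\.]+/, $class_lines);
my @proposed = split_classification(qr/(?<!subgen)[;]+/,   $class_lines);

print scalar(@current),  " elements: ", join(' | ', @current),  "\n";
print scalar(@proposed), " elements: ", join(' | ', @proposed), "\n";
```

One thing such a test can also show: if the input ends with a period, splitting on ';' alone leaves that period attached to the last element, so that case may need separate handling.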

From jluis.lavin at unavarra.es  Fri Nov  6 03:44:45 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Fri, 6 Nov 2009 09:44:45 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <4AF312B3.9060009@gmail.com>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<
	C718B5B8.5561%hrh@fmi.ch> 
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
	<4AF312B3.9060009@gmail.com>
Message-ID: <1222.130.206.164.153.1257497085.squirrel@webmail.unavarra.es>

Thank you for the info Florent!
I'll try to read all the information at the link you provided and
figure out how to make it work and whether it is worthwhile for me. I work
with several sequence files that come from multiple databases (JGI, BROAD,
Genolevures or NCBI), and the protein IDs in each of those databases
differ from NCBI's. Maybe it would be easier to write a script that
lets me enter a FASTA file with all the protein models of a single
organism, parse it, and then extract the sequences in a given list (using
the "ID style" of the particular database) than to create a BLAST index for
each organism I need to work with...Did I explain the issue correctly?
Anyway, since I don't know anything about this tool that Hans and you
pointed me to, I can easily be wrong...
Thank you for showing me the local BLAST index tool; I'll read the
documentation carefully and study all its possibilities.

Best wishes

JL


On Thu, 5 November 2009, 19:00, Florent Angly wrote:
> Hans-Rudolf was talking about a way to retrieve sequences from a BLAST
> database. If you use BLAST locally, then your database is local too.
> More info here:
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
> Florent
>
>
> jluis.lavin at unavarra.es wrote:
>> Thanks a lot for your help Hans,
>> It's a little bit too hard to understand and turn into a script this
>> awesome
>> information you've just given me...I hope I can use it in the near future
>> anyway ;)
>> The issue here is that the sequences I'm indexing are not generated by
>> the
>> NCBI nor stored there...although I believe you're just referring to the
>> tool
>> itself and not to a retrieval from the NCBI.
>>
>> Thanks again, you're all great at giving advice to newbies like me ;)
>>
>> Best wishes to you all
>>
>>
>> On Thu, 5 November 2009, 17:02, Hotz, Hans-Rudolf wrote:
>>
>>>
>>> Jluis
>>>
>>>
>>>> -Then I'll beg you to take a look at my scripts, because I don't seem
>>>> to
>>>> catch the bug...
>>>>
>>> you haven't attached/included any scripts, have you?
>>>
>>>
>>> Anyway, have you considered using BLAST indices (created with the
>>> additional
>>> flag "-o") together with the tool 'fastacmd' (which also included in
>>> the
>>> NCBI blast binaries) as a simple (and very fast) alternative for
>>> fetching
>>> sequences.
>>>
>>>
>>> Regards, Hans
>>>
>>>
>>>
>>>
>>
>>
>>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Fri Nov  6 07:45:01 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 07:45:01 -0500
Subject: [Bioperl-l] Bioperl
In-Reply-To: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
References: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
Message-ID: <AE7A03CA8F45495C9F8D940AC0EC6D69@NewLife>

Hi Resmi-
You should look at http://bioperl.org/ under "Installation" for 
information on getting and installing BioPerl. An introduction 
to working with trees in BioPerl is at this link:
http://www.bioperl.org/wiki/HOWTO:Trees
cheers, 
Mark

----- Original Message ----- 
  From: Resmi S. 
  To: maj at fortinbras.us 
  Sent: Friday, November 06, 2009 7:27 AM
  Subject: Bioperl


  Respected Sir,
  I am Resmi S, studying II MSc Bioinformatics. I am now doing my project on Phylogenetic Tree Construction using BioPerl. I am not very familiar with the BioPerl modules, so could you please send me the names of the BioPerl modules needed for my project? I also need to know where I can get these modules. If that is CPAN, then please send me the location or link. I kindly request you to send me the details soon.

  Yours Sincerely,
     Resmi S,
     II MSc Bioinformatics,
     School of Biotechnology,
     Amrita Vishwa Vidyapeetham,
      Email : amm08bi019 at students.amrita.ac.in




From robert.bradbury at gmail.com  Fri Nov  6 12:35:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 6 Nov 2009 12:35:22 -0500
Subject: [Bioperl-l] Function that determines serious mutations
Message-ID: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>

Is there a function in the library (or has someone written one) that can
take a genbank entry and determine which mutations are harmful?

It would be used to produce a table summary of:
  GENE          # SNP      # BadSNP

One kind of gets this from NCBI: if you look up a gene name in the "GENE" db
and then go to the "GeneView" on the dbSNP page, it has the information I want,
but largely in a graphical format, while I simply want numbers I can dump
into a spreadsheet.

I don't think it would be hard: fetch the gene, run through the features for
the SNP database, figure out whether they are good or bad SNPs, accumulate
the statistics and dump them.  I think the functions available are flexible
enough to do it, but I can't believe nobody has already done it.  It could be
a bit more complex in that one could do an analysis to see if the mutations
are in a conserved domain or are mutations that code for Cysteine or Methionine
(or other potentially "critical" amino acids), but since "critical" is in the
eye of the beholder there would have to be some kind of callback to a
scoring function.
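
To make the accumulate-and-score idea concrete, here is a minimal plain-Perl sketch. Everything in it is an assumption for illustration: the record fields (gene, ref_aa, alt_aa), the function name tally_snps and the Cys/Met rule are hypothetical, and no BioPerl or dbSNP structure is implied. In practice the records would have to be pulled from the SNP features of the fetched entry.

```perl
use strict;
use warnings;

# Tally SNPs per gene. $is_bad is a caller-supplied code ref that decides
# whether a single SNP record counts as "bad". All field names here are
# hypothetical stand-ins, not a BioPerl or dbSNP structure.
sub tally_snps {
    my ($snps, $is_bad) = @_;
    my %tally;
    for my $snp (@$snps) {
        my $row = $tally{ $snp->{gene} } ||= { snp => 0, bad => 0 };
        $row->{snp}++;
        $row->{bad}++ if $is_bad->($snp);
    }
    return \%tally;
}

# Made-up records purely for illustration.
my @snps = (
    { gene => 'GENE1', ref_aa => 'C', alt_aa => 'S' },
    { gene => 'GENE1', ref_aa => 'L', alt_aa => 'L' },
    { gene => 'GENE2', ref_aa => 'A', alt_aa => 'V' },
);

# One arbitrary scoring rule: an amino acid change involving Cys or Met.
my $touches_cys_or_met = sub {
    my ($snp) = @_;
    return $snp->{ref_aa} ne $snp->{alt_aa}
        && ($snp->{ref_aa} . $snp->{alt_aa}) =~ /[CM]/;
};

my $tally = tally_snps(\@snps, $touches_cys_or_met);

printf "%-10s %6s %9s\n", 'GENE', '# SNP', '# BadSNP';
printf "%-10s %6d %9d\n", $_, $tally->{$_}{snp}, $tally->{$_}{bad}
    for sort keys %$tally;
```

Swapping in a different code ref changes what counts as "bad" without touching the tallying loop, which is the point of the callback.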

Thanks,
Robert

From nevoband at igb.uiuc.edu  Fri Nov  6 15:58:05 2009
From: nevoband at igb.uiuc.edu (kleenix)
Date: Fri, 6 Nov 2009 12:58:05 -0800 (PST)
Subject: [Bioperl-l]  StandAloneBlast Unallowed parameter
Message-ID: <26230896.post@talk.nabble.com>


I'm not sure if I'm doing this wrong. I am trying to use the -m parameter of
blastall through the BioPerl StandAloneBlast class.
When I add 'm'=>0 to @params I get an "Unallowed parameter" error.
Am I adding the parameter wrong? I'm using StandAloneBlast version 1.51.

Thanks

-Nevo


From veronica.xiaoyu at gmail.com  Fri Nov  6 17:25:04 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 6 Nov 2009 17:25:04 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change the
	description's name of each hit?
Message-ID: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>

Hi,

I'm using Bio::SearchIO::Writer::HTMLResultWriter to help me convert BLAST
output files into HTML.

Does anybody know how to parse and change the description of each hit?

I can read a hit's description with $hit->description, but it does not seem
to allow the description to be modified.

Thank you very much,
Xiaoyu

From maj at fortinbras.us  Fri Nov  6 19:40:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 19:40:17 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change
	thedescription's name of each hit?
In-Reply-To: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
References: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
Message-ID: <11592B31D9924FA7A8638D90AE4A3F4A@NewLife>

Xiaoyu-
That method should work to change the description; are you doing

$hit->description('This is my new description');

This method returns the old description when you change the value:

$hit->description('old');
$str = $hit->description('new'); # $str eq 'old'
$str = $hit->description;            # $str eq 'new'
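
The return-the-previous-value behaviour is easy to trip over, so here is a tiny stand-in class that mimics it. My::MockHit is hypothetical and is not BioPerl code; it only reproduces the semantics shown in the three lines above.

```perl
use strict;
use warnings;

# Stand-in class, NOT BioPerl code: it only mimics the behaviour described
# above, where the combined getter/setter hands back the previous value.
package My::MockHit;

sub new {
    my ($class, %args) = @_;
    return bless { description => $args{description} // '' }, $class;
}

sub description {
    my $self     = shift;
    my $previous = $self->{description};
    $self->{description} = shift if @_;    # set only when a value is passed
    return $previous;
}

package main;

my $hit      = My::MockHit->new(description => 'old');
my $returned = $hit->description('new');   # sets 'new', returns the old value

print "returned by the setting call: $returned\n";
print "stored afterwards: ", $hit->description, "\n";
```

The practical consequence is not to use the return value of a setting call as if it were the new description; call the method again without arguments to read it back.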

MAJ

----- Original Message ----- 
From: "Xiaoyu Liang" <veronica.xiaoyu at gmail.com>
To: <Bioperl-l at lists.open-bio.org>
Sent: Friday, November 06, 2009 5:25 PM
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change 
thedescription's name of each hit?


> Hi,
>
> I'm using Bio::SearchIO::Writer HTMLResultWriter help me parse BLAST out
> file into HTML.
>
> Anybody knows how to parse and change the description name of each hit?
>
> By using hit->description can call hits' description, but it is not allowed
> to be modified.
>
> Thank you very much,
> Xiaoyu
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From Daniel.Lang at biologie.uni-freiburg.de  Sun Nov  8 09:50:48 2009
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Sun, 08 Nov 2009 15:50:48 +0100
Subject: [Bioperl-l] arguments to call back functions in GBrowse2
Message-ID: <4AF6DAC8.8070204@biologie.uni-freiburg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Lincoln,

a while back (May 29, 2009; 09:08pm) you replied to an even older thread
("Re: Access the parent of a Bio::DB::SeqFeature within a gbrowse config
callback function").

I missed your reply and did not follow it up back then, sorry!

I'm currently facing the same issue again with gbrowse2. I have a
callback function for "balloon click". Following your last reply I
expected 5 arguments, but I am getting only three: $feature,$panel,$track.

In principle, I am using the latest releases/checkouts...
Which modules do I need to look at/update for this functionality?

Furthermore, is there a possibility to share global variables between
gbrowse2 and slaves? Should this work via init_code?
Should modules initialized in a conf be in the scope of a slave?

If not can I introduce modules via the slave config files, or do I need
to alter the slave scripts?


Thanks, again!

Cheers,
Daniel


PS: gbrowse2 rocks!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkr22sUACgkQmJnbCpJAG3A2MgCdG61bNRGMFVWExagzMFejKMjO
FiUAn16nQNemDGSy8nJBS5dUHQMnDgrP
=ODxn
-----END PGP SIGNATURE-----

From maj at fortinbras.us  Sun Nov  8 11:09:43 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:09:43 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
Message-ID: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>

Hi All- 
Any plans in the works for a _possibly_fastq sequence guesser?
MAJ

From maj at fortinbras.us  Sun Nov  8 11:20:55 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:20:55 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
In-Reply-To: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
References: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
Message-ID: <E2407ED235C24BFF9A03377416109318@NewLife>

Never mind; got it covered-- MAJ
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "bioperl-l" <bioperl-l at lists.open-bio.org>
Sent: Sunday, November 08, 2009 11:09 AM
Subject: [Bioperl-l] GuessSeqFormat: fastq?


> Hi All- 
> Any plans in the works for a _possibly_fastq sequence guesser?
> MAJ

From saikari78 at gmail.com  Mon Nov  9 10:47:10 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 15:47:10 +0000
Subject: [Bioperl-l] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090747p6702c62fibd7e8310d3a72dae@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari

From saikari78 at gmail.com  Mon Nov  9 11:05:57 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:05:57 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari

From cjfields at illinois.edu  Mon Nov  9 11:27:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 10:27:10 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
Message-ID: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>

On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:

> Hi,
>
> I'm using Bioperl to retrieve records from PubChem.
> I'm trying to find a way-but have been unsuccessful- to retrieve  
> from a
> compound record, the reference to the protein(s) that can synthesize  
> the
> compound.
> Thanks very much.
>
> saikari

The below bioperl script returns the GI for proteins that correspond  
to the substance passed on the command line; invoke using 'perl  
pc_substance.pl substance_requested'.  It probably needs more fiddling  
to catch everything but it should get you started.

For other bits and pieces (such as how to retrieve the raw sequence  
files), please see the EUtilities HOWTO:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

----------------------------------------

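#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

# substance name (search term) taken from the command line
my $substance = shift or die "Usage: $0 substance_requested\n";

# step 1: esearch against pcsubstance, keeping the hits on the NCBI
# history server
my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                      -db => 'pcsubstance',
                                      -term => $substance,
                                      -usehistory => 'y');

my $hist = $eutil->next_History || die "No history data returned\n";

# step 2: elink from the stored pcsubstance hits to the protein database
$eutil->reset_parameters(-eutil => 'elink',
                        -history => $hist,
                        -db      => 'protein',
                        -dbfrom  => 'pcsubstance',
                        -retmax  => 1000);

# print the linked protein GIs as a comma-separated list
say join(',',$eutil->get_ids);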

From saikari78 at gmail.com  Mon Nov  9 11:41:20 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:41:20 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
Message-ID: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>

Fabulous!. Huge help.
saikari

On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu> wrote:

>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>
> Hi,
>>
>> I'm using Bioperl to retrieve records from PubChem.
>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>> compound record, the reference to the protein(s) that can synthesize the
>> compound.
>> Thanks very much.
>>
>> saikari
>>
>
> The below bioperl script returns the GI for proteins that correspond to the
> substance passed on the command line; invoke using 'perl pc_substance.pl
> substance_requested'.  It probably needs more fiddling to catch everything
> but it should get you started.
>
> For other bits and pieces (such as how to retrieve the raw sequence files),
> please see the EUtilities HOWTO:
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> chris
>
> ----------------------------------------
>
> #!/usr/bin/perl -w
>
> use 5.010;
> use strict;
> use warnings;
> use Bio::DB::EUtilities;
>
> my $substance = shift;
>
> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                     -db => 'pcsubstance',
>                                     -term => $substance,
>                                     -usehistory => 'y');
>
> my $hist = $eutil->next_History || die;
>
> $eutil->reset_parameters(-eutil => 'elink',
>                       -history => $hist,
>                       -db      => 'protein',
>                       -dbfrom  => 'pcsubstance',
>                       -retmax  => 1000);
>
> say join(',',$eutil->get_ids);
>

From gc11song at gmail.com  Mon Nov  9 13:08:48 2009
From: gc11song at gmail.com (Guangchun Song)
Date: Mon, 9 Nov 2009 12:08:48 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
Message-ID: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>

Hello,

I'm a new BioPerl user.  I'm working on a project to determine the
status of all putative SNPs (non-synonymous vs. synonymous) and to
predict the translational effect of non-synonymous mutations as benign
or malicious.  I'm trying to use BioPerl to get the DNA sequence and
translate it to protein sequence for the SNPs that are in a gene's coding
region.  Could someone tell me how to do it?

Thanks,

-Guangchun Song

From robert.bradbury at gmail.com  Mon Nov  9 16:15:33 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 9 Nov 2009 16:15:33 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
Message-ID: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>

On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>
> I'm a new bioperl user.  I'm working on a project: To determine the
> status of all putative SNPs such as non-synonymous vs. synonymous, and
> predict the translational effect of non-synonymous mutations as benign
> or malicious.  I'm trying to use bioperl to get the DNA sequence and
> translate to protein sequence for the SNPs that are in gene's coding
> region.  Could someone tell me how to do it?
>
>
I too would like to know if this information is available.  I've recently
been working with the dbSNP results from NCBI but they display the results
in a graphical format rather than data that one can play with and ask
questions of like "What is the most disease causing gene in the Human
Genome?" or "What are the critical proteins damaged by gene defects in the
Human Genome?" ... "In terms of premature deaths, extended health care
requirements, loss of quality of life, etc.?"

The same types of questions can be applied to the dog and cat genomes, where
there is emotional value, or the cow, horse, pig, etc. genomes, where there is
economic value.

The value of BioPerl would increase significantly if there were
functionality that would allow easy access to "these mutations may have
negative/positive impact" (which means you need a function that qualifies
mutations by degree) and allow for impact to be subjectively determined
(implying there must be some callback function to provide a user
quality/impact rating).

For example:
   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
@critical_domain, $callback)
Where $callback could "rate" differences about the protein and position and
the "type of interest" (e.g. metal binding amino acids, structural changing
amino acids, critical catalysis amino acids, etc.).

A default callback would be based on some evolving definition of "critical"
changes which result in human disease for example.
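
As a rough illustration of the proposed interface, here is a plain-Perl sketch. It is not an existing BioPerl function. Two assumptions are made: the two protein strings are already aligned and of equal length, and the lists are passed as array references, because Perl would otherwise flatten @critical_aa and @critical_domain into one argument list. The sequences, the domain coordinates and the rating formula are all made up.

```perl
use strict;
use warnings;

# Compare two aligned, equal-length protein strings position by position
# and let a caller-supplied callback rate each difference.
# (Hypothetical sketch of the interface proposed above.)
sub protein_compare {
    my ($query, $reference, $critical_aa, $critical_domains, $callback) = @_;
    die "sequences must have equal length\n"
        unless length($query) == length($reference);

    my %is_critical = map { $_ => 1 } @$critical_aa;
    my @differences;
    for my $i (0 .. length($reference) - 1) {
        my $ref_aa = substr $reference, $i, 1;
        my $new_aa = substr $query,     $i, 1;
        next if $ref_aa eq $new_aa;

        my $pos  = $i + 1;    # 1-based residue position
        my %diff = (
            position    => $pos,
            ref_aa      => $ref_aa,
            new_aa      => $new_aa,
            critical_aa => ($is_critical{$ref_aa} || $is_critical{$new_aa}) ? 1 : 0,
            in_domain   => (grep { $pos >= $_->[0] && $pos <= $_->[1] }
                                 @$critical_domains) ? 1 : 0,
        );
        $diff{score} = $callback->(\%diff);    # user-defined impact rating
        push @differences, \%diff;
    }
    return @differences;
}

# An arbitrary default rating: domain hits weigh twice as much.
my $default_rating = sub {
    my ($diff) = @_;
    return $diff->{critical_aa} + 2 * $diff->{in_domain};
};

my @differences = protein_compare(
    'MKCLAVHE',    # made-up "my gene" product
    'MKSLAVRE',    # made-up reference product
    [ 'C', 'M' ],  # residues treated as critical
    [ [ 6, 8 ] ],  # critical domains as [start, end] pairs
    $default_rating,
);

printf "pos %d: %s -> %s (score %d)\n",
    @{$_}{qw(position ref_aa new_aa score)} for @differences;
```

The callback receives one hash reference per difference, so a user with a different notion of "critical" only has to supply a different code ref.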

This is a "required" capability to be able to determine things like the
"adaptability" of a species -- those with fewest critical mutation points
may have better adaptability to mutation increasing circumstances.

Please pardon any errors in perl syntax/usage; it's been a while since I've
written perl and I'd really rather be coding in C.

Robert

From maj at fortinbras.us  Mon Nov  9 16:56:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 9 Nov 2009 16:56:24 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA
	sequencesaround novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <3ED3D387B5DE4248A218D42882369925@NewLife>

I agree that BioPerl would significantly increase in value with
such a module; in fact, the BioTeam would probably buy us out.
My opinion is that the entire GWAS enterprise is the search for
such a callback function, for humans anyway. For those engaged
in this quest, if BioPerl doesn't provide a Maserati, it at least provides
good Italian-made (among others) parts.
MAJ
----- Original Message ----- 
From: "Robert Bradbury" <robert.bradbury at gmail.com>
To: "Guangchun Song" <gc11song at gmail.com>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Monday, November 09, 2009 4:15 PM
Subject: Re: [Bioperl-l] how to get the protein sequences from DNA 
sequencesaround novel SNPs?


> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>>
>> I'm a new bioperl user.  I'm working on a project: To determine the
>> status of all putative SNPs such as non-synonymous vs. synonymous, and
>> predict the translational effect of non-synonymous mutations as benign
>> or malicious.  I'm trying to use bioperl to get the DNA sequence and
>> translate to protein sequence for the SNPs that are in gene's coding
>> region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've recently
> been working with the dbSNP results from NCBI but they display the results
> in a graphical format rather than data that one can play with and ask
> questions of like "What is the most disease causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat genomes where
> there is emotional value or the cow, horse, pig, etc. genomes where there is
> economic value?
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may have
> negative/positive impact" (which means you need a function that qualifies
> mutations by degree) and allow for impact to be subjectively determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
> @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and position and
> the "type of interest" (e.g. metal binding amino acids, structural changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of "critical"
> changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like the
> "adaptability" of a species -- those with fewest critical mutation points
> may have better adaptability to mutation increasing circumstances.
>
> Please pardon any errors in perl syntax/usage its been a while since I've
> written perl and I'd really rather be coding in C.
>
> Robert


From alexl at users.sourceforge.net  Mon Nov  9 18:44:07 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Mon, 09 Nov 2009 18:44:07 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu> (Chris
	Fields's message of "Wed, 4 Nov 2009 07:53:35 -0600")
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
Message-ID: <nmocnbuuuw.fsf@allele2.localdomain>

>>>>> Chris Fields  writes:

> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
> perl package alone.  It is part of perl core but it's also available
> on CPAN separately from perl itself:

> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

Hi Chris,

Yes, in principle it would be possible to have this split out as a
separate package (currently it's a "subpackage" under the main perl
package), unfortunately that's just not the way it's currently done in
Fedora (probably because it's part of the core set and they like to
update all relevant packages in one step) and I have little control over
that.

As I suspected, the perl maintainer is not at all enthusiastic for
updating the whole of perl just for that package (except for rawhide
which would mean that bioperl 1.6.1 would not be available until F-13,
about 6 months from now).  See:

http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1

Obviously I am not happy with this situation either, because it will
freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
recommend any temporary workarounds in the meantime?

> This is the commit message for that BTW.  This allows spaces in file
> names for the MANIFEST.  v1.52 is a bug fix and is required.

> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

Perhaps I could create a patch that renamed files with spaces in them to
ones with no spaces and then rename them again upon installation.

Can you point me to which files are the problematic ones that triggered
the dependency for 1.52?  Perhaps I can figure a workaround.

Meanwhile I will press the maintainer of perl in Fedora to perhaps
reconsider his position (e.g. if another update for perl is going out
for another reason, like a security update, perhaps he could roll in the
1.52 update at the same time).

Cheers,
Alex

From cjfields at illinois.edu  Mon Nov  9 19:50:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 18:50:00 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <nmocnbuuuw.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
	<nmocnbuuuw.fsf@allele2.localdomain>
Message-ID: <29EA2398-F60B-48F2-AFE7-39A44011C451@illinois.edu>

On Nov 9, 2009, at 5:44 PM, Alex Lancaster wrote:

>>>>>> Chris Fields  writes:
>
>> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
>> perl package alone.  It is part of perl core but it's also available
>> on CPAN separately from perl itself:
>
>> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm
>
> Hi Chris,
>
> Yes, in principle it would be possible to have this split out as a
> separate package (currently it's a "subpackage" under the main perl
> package), unfortunately that's just not the way it's currently done in
> Fedora (probably because it's part of the core set and they like to
> update all relevant packages in one step) and I have little control  
> over
> that.
>
> As I suspected, the perl maintainer is not at all enthusiastic for
> updating the whole of perl just for that package (except for rawhide
> which would mean that bioperl 1.6.1 would not be available until F-13,
> about 6 months from now).  See:
>
> http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1
>
> Obviously I am not happy with this situation either, because it will
> freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
> recommend any temporary workarounds in the meantime?

Well, if you don't absolutely require the MANIFEST for the final  
package you can forego the requirement.  The file in question that  
triggered the requirement is a data file used only for testing:

t/data/test 2.txt

>> This is the commit message for that BTW.  This allows spaces in file
>> names for the MANIFEST.  v1.52 is a bug fix and is required.
>
>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
>
> Perhaps I could create a patch that renamed files with spaces in  
> them to
> ones with no spaces and then rename them again upon installation.
>
> Can you point me to which files are the problematic ones that  
> triggered
> the dependency for 1.52?  Perhaps I can figure a workaround.
>
> Meanwhile I will press the maintainer of perl in Fedora to perhaps
> reconsider his position (e.g. if another update for perl is going out
> for another reason, like a security update, perhaps he could roll in  
> the
> 1.52 update at the same time).
>
> Cheers,
> Alex

I would point out that this is a fairly significant bug fix for  
ExtUtils::Manifest.  A newer point release of perl is now available  
(5.10.1) that contains the fix and has a fix for a performance  
regression that popped up in 5.10.0.

chris

From jay at jays.net  Mon Nov  9 19:05:51 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 9 Nov 2009 18:05:51 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
Message-ID: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>

Many thanks to Ewan Birney et al. for Bio::Index::*

I can throw away my awful grep based index-by-accession stuff.   :)

Any chance someone has also written an organism based index mechanism?  
Something like...

while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
    print $seq->display_id . "\n";
}

Thanks,

j


From cjfields at illinois.edu  Mon Nov  9 22:55:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 21:55:01 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
Message-ID: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>

On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:

> Many thanks to Ewan Birney et al. for Bio::Index::*
>
> I can throw away my awful grep based index-by-accession stuff.   :)
>
> Any chance someone has also written an organism based index  
> mechanism? Something like...
>
> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>   print $seq->display_id . "\n";
> }
>
> Thanks,
>
> j

It should work via id_parser(); from Bio::Index::GenBank:

    $inx->id_parser(\&get_id);
    # make the index
    $inx->make_index($file_name);

    # here is where the retrieval key is specified
    sub get_id {
       my $line = shift;
       $line =~ /clone="(\S+)"/;
       $1;
    }

Change the code ref to deal with the line you want and parse the name
out.  Caveat: this may not be absolutely perfect (it only passes in a
line at a time, and some species lines will wrap).  Also, I'm not sure
how this would work in cases where multiple sequences from the same
species are present.
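
A minimal sketch of such a code ref for organism lines, assuming the
parser is handed one line of the GenBank flat file at a time as described
above.  The name get_genus is hypothetical (it is not part of BioPerl),
and keying on the genus alone means many records share one key, so the
caveat about multiple sequences from the same species still applies.

```perl
use strict;
use warnings;

# Hypothetical key parser: return the genus from a GenBank ORGANISM line,
# or an empty list for any other line.
sub get_genus {
    my $line = shift;
    return $line =~ /^\s+ORGANISM\s+(\S+)/ ? $1 : ();
}

# Assumed wiring, mirroring the id_parser() usage quoted above:
#   $inx->id_parser(\&get_genus);
#   $inx->make_index($file_name);
```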

The other option is to preparse everything and tie a hash to store a  
species->UID map, then use that along with your Bio::Index index to  
grab what you need.
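
The preparse option could be sketched roughly as follows.  This is plain
Perl over the flat-file text, not a BioPerl API; organism_map is a
hypothetical name, and an ordinary in-memory hash stands in for the tied
hash (tying it to a DBM file would be what makes the map persistent).
Only the first line of a wrapped ORGANISM entry is captured.

```perl
use strict;
use warnings;

# Build { organism name => [accessions] } from GenBank flat-file text.
sub organism_map {
    my ($fh) = @_;
    my (%map, $acc);
    while (my $line = <$fh>) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $acc = $1;                      # first accession of this record
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.+?)\s*$/ && defined $acc) {
            push @{ $map{$1} }, $acc;       # several records per organism
        }
        elsif ($line =~ m{^//}) {
            undef $acc;                     # end of record
        }
    }
    return \%map;
}

# Assumed usage together with an existing Bio::Index::GenBank index $inx:
#   my $map = organism_map($fh);
#   for my $org (grep { /Xanthomonas/ } keys %$map) {
#       print $inx->fetch($_)->display_id, "\n" for @{ $map->{$org} };
#   }
```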

chris

From cjfields at illinois.edu  Mon Nov  9 23:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 22:58:32 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <435BA1A8-2CCB-4D7A-8909-84F8135C439F@illinois.edu>

On Nov 9, 2009, at 3:15 PM, Robert Bradbury wrote:

> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>>
>> I'm a new bioperl user.  I'm working on a project to determine the
>> status of all putative SNPs (non-synonymous vs. synonymous) and to
>> predict the translational effect of non-synonymous mutations as benign
>> or malicious.  I'm trying to use bioperl to get the DNA sequence and
>> translate it to protein sequence for the SNPs that are in a gene's
>> coding region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've recently
> been working with the dbSNP results from NCBI, but they display the results
> in a graphical format rather than as data that one can play with and ask
> questions of, like "What is the most disease-causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat genomes,
> where there is emotional value, or the cow, horse, pig, etc. genomes,
> where there is economic value.
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may have
> negative/positive impact" (which means you need a function that qualifies
> mutations by degree) and allow for impact to be subjectively determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
>                                     @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and position and
> the "type of interest" (e.g. metal-binding amino acids, structure-changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of "critical"
> changes which result in human disease, for example.
>
> This is a "required" capability to be able to determine things like the
> "adaptability" of a species -- those with the fewest critical mutation
> points may have better adaptability to mutation-increasing circumstances.
>
> Please pardon any errors in perl syntax/usage; it's been a while since I've
> written perl and I'd really rather be coding in C.
>
> Robert

I will say that most of the information from the SNP database is  
available in various formats (see following link under 'Retrieval  
Types'):

http://www.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html

You can access this information, as well as the full XML, using  
something like the following script.

chris

------------------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $term = shift;
my $eutil  = Bio::DB::EUtilities->new(-eutil    => 'esearch',
                                       -db       => 'snp',
                                       -term     => $term,
                                       -usehistory => 'y',
                                       -retmax   => 100);

my $hist = $eutil->next_History || die "No history returned";

# for SNP XML, change retmode to 'xml'
$eutil->set_parameters(-eutil   => 'efetch',
                        -history => $hist,
                        -retmode => 'text',
                        -rettype => 'flt');

# dumps to STDOUT
say $eutil->get_Response->content;
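
On the synonymous vs. non-synonymous part of the original question, here
is a minimal sketch in plain Perl.  It is not a BioPerl API: classify_snp
and its arguments are hypothetical, it assumes you already know the
reference codon containing the SNP and the SNP's position within that
codon, and it uses only the standard genetic code (NCBI translation
table 1).

```perl
use strict;
use warnings;

# Standard genetic code, with bases ordered T, C, A, G at each position.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Classify a single-base change inside one codon.
# $pos is the 0-based position within the codon, $alt the new base.
sub classify_snp {
    my ($codon, $pos, $alt) = @_;
    my $ref    = uc($codon);
    my $mutant = $ref;
    substr($mutant, $pos, 1) = uc($alt);
    return $codon_table{$ref} eq $codon_table{$mutant}
        ? 'synonymous'
        : 'non-synonymous';
}

print classify_snp('GAA', 2, 'G'), "\n";   # GAA (E) -> GAG (E)
print classify_snp('GAA', 0, 'A'), "\n";   # GAA (E) -> AAA (K)
```

For whole coding sequences, the translate method on a BioPerl sequence
object would likely be the more convenient route than a hand-built table.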


From jluis.lavin at unavarra.es  Tue Nov 10 05:43:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Tue, 10 Nov 2009 11:43:40 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
 itscorrect use]
In-Reply-To: <A1ACC4B552514872B77208248B31977C@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
Message-ID: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>

Hello again,

I tried what Mark suggested and modified the line he pointed out, but there's
still a problem that I believe is due to the sequence names.
The sequence headers in my FASTA file have this format:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1

The part to the right of the pipe changes depending on the program used to
create the gene model, for example:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1
>PleosPC9_1_123413|genemark.2731_g
>PleosPC9_1_52065|e_gw1.3.64.1

So I guess I need to parse my IDs somehow so that the program picks up only
the first part of the FASTA header (the "protein name") and is not confused
by the other side of the pipe...

This is the corrected code I wrote following Mark's indications, but I
still don't have any idea how to handle the parsing issue...

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
#LCS.txt is my sequences list
@ARGV = <LCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

Thanks in advance

P.S. Might there be a faster way of extracting those sequences using plain Perl?




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From saikari78 at gmail.com  Tue Nov 10 06:41:11 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Tue, 10 Nov 2009 11:41:11 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
Message-ID: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>

Thanks again very much for your help and the script.
I've been trying it; however, I fail to find any protein record linked to a
record in the pcsubstance database.
Do you think that it is because no links have been defined between the 2
databases, or that I am just unlucky and that no link exists for the
particular records I'm testing?
Thanks again

saikari

On Mon, Nov 9, 2009 at 4:41 PM, saikari keitele <saikari78 at gmail.com> wrote:

> Fabulous!. Huge help.
> saikari
>
>   On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu>wrote:
>
>>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>>
>> Hi,
>>>
>>> I'm using Bioperl to retrieve records from PubChem.
>>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>>> compound record, the reference to the protein(s) that can synthesize the
>>> compound.
>>> Thanks very much.
>>>
>>> saikari
>>>
>>
>> The below bioperl script returns the GI for proteins that correspond to
>> the substance passed on the command line; invoke using 'perl
>> pc_substance.pl substance_requested'.  It probably needs more fiddling to
>> catch everything but it should get you started.
>>
>> For other bits and pieces (such as how to retrieve the raw sequence
>> files), please see the EUtilities HOWTO:
>>
>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>>
>> chris
>>
>> ----------------------------------------
>>
>> #!/usr/bin/perl -w
>>
>> use 5.010;
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $substance = shift;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -db => 'pcsubstance',
>>                                     -term => $substance,
>>                                     -usehistory => 'y');
>>
>> my $hist = $eutil->next_History || die;
>>
>> $eutil->reset_parameters(-eutil => 'elink',
>>                       -history => $hist,
>>                       -db      => 'protein',
>>                       -dbfrom  => 'pcsubstance',
>>                       -retmax  => 1000);
>>
>> say join(',',$eutil->get_ids);
>>
>
>

From heyne at informatik.uni-freiburg.de  Tue Nov 10 07:55:06 2009
From: heyne at informatik.uni-freiburg.de (Steffen Heyne)
Date: Tue, 10 Nov 2009 13:55:06 +0100
Subject: [Bioperl-l] problem with alignments and sequence locations
Message-ID: <4AF962AA.7060908@informatik.uni-freiburg.de>

Hi,

I'm using Bioperl for my research and it is very useful! Thank you!

Currently I have a problem with the location tags of sequences. I read in
seed alignments from Rfam (in Stockholm format, but I think it is similar
for other formats).

If the location is like:

AB194432.1/908-846

the start/end values are changed to

$seq->start = 846
$seq->end = 908

and therefore the new location (e.g.$seq->get_nse) is:

AB194432.1/846-908

The $seq->strand tag is correctly set to -1 in this case, but if the
alignment is written out again (clustal, stockholm,...) this strand info
is lost and the sequences have this "wrong" location. But this
information is important with respect to the sequence accession number.

Is there a way to set the location back to the original one, or is this
behavior desired? Any manual setting with $seq->start($val) failed due
to automatic checking.

I'm using bioperl 1.6.1

Thanks!

steffen


-- 
---
Steffen Heyne, Dipl.-Bioinf.
Lehrstuhl für Bioinformatik
Institut für Informatik
Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany

Tel: (+49) 761 203 8239
Fax: (+49) 761 203 7462
Mail: heyne at informatik.uni-freiburg.de

From cjfields at illinois.edu  Tue Nov 10 08:58:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 07:58:52 -0600
Subject: [Bioperl-l] problem with alignments and sequence locations
In-Reply-To: <4AF962AA.7060908@informatik.uni-freiburg.de>
References: <4AF962AA.7060908@informatik.uni-freiburg.de>
Message-ID: <DF72C01A-410F-4391-B33E-4884D7CB859E@illinois.edu>

On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with the location tags of sequences. I read
> in seed alignments from Rfam (in Stockholm format, but I think it is
> similar for other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g.$seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the
> alignment is written out again (clustal, stockholm,...) this strand
> info is lost and the sequences have this "wrong" location. But this
> information is important with respect to the sequence accession number.
>
> Is there a way to set the location back to the original one, or is
> this behavior desired? Any manual setting with $seq->start($val)
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We recently discussed amending the NSE format
because of this (the subject came up over the last few months or so), but
it has fallen through the cracks.  Fortunately it is very easy to fix (the
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in?  It will change  
output for only those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
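
The proposed convention can be sketched in plain Perl as below.  This is
only an illustration of the mapping described above, not the LocatableSeq
code; parse_nse and format_nse are hypothetical helper names.

```perl
use strict;
use warnings;

# Parse "name/from-to"; a reversed range is read as strand -1, and start
# is always the smaller coordinate.
sub parse_nse {
    my ($nse) = @_;
    my ($name, $from, $to) = $nse =~ m{^(.+)/(\d+)-(\d+)$} or return;
    return $from > $to
        ? { name => $name, start => $to,   end => $from, strand => -1 }
        : { name => $name, start => $from, end => $to,   strand =>  1 };
}

# Write the location back out, reversing the range again for strand -1.
sub format_nse {
    my ($loc) = @_;
    my ($from, $to) = $loc->{strand} < 0
        ? @{$loc}{qw(end start)}
        : @{$loc}{qw(start end)};
    return "$loc->{name}/$from-$to";
}
```

With this pair, reading AB194432.1/908-846 and writing it out again gives
back the same string, so the strand information survives the round trip.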

chris

From cjfields at illinois.edu  Tue Nov 10 09:05:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 08:05:51 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
	<a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
Message-ID: <738F6320-B87A-4541-B9FA-20273ABA96B9@illinois.edu>

On Nov 10, 2009, at 5:41 AM, saikari keitele wrote:

> Thanks again very much for your help and the script.
> I've been trying it; however, I fail to find any protein record
> linked to a record in the pcsubstance database.
> Do you think that it is because no links have been defined between
> the 2 databases, or that I am just unlucky and that no link exists
> for the particular records I'm testing?
> Thanks again
>
> saikari

It's probably that no links have been defined.  I have found similar
problems in the past with PubChem, in that not all substances have
proteins associated with them.  Most proteins linked to are those with
a deposited structure.

There are a few other databases to check out; KEGG and the BioCyc dbs
(like EcoCyc) come to mind.  I don't think we have a generic remote
query engine set up for any of those, unfortunately (unless there is
one I'm unaware of), but I know BioCyc comes with its own set of
tools (including perl- and java-based query tools) and can be set up
locally, which is likely much faster and more in line with what you
need.

chris

...

From jason at bioperl.org  Tue Nov 10 13:47:02 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:47:02 -0800
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
	itscorrect use]
In-Reply-To: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
	<3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
Message-ID: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>

Page 44 of the slides linked below has the custom ID info, or look at the
documentation for Bio::DB::Fasta; there is a similar syntax for
Bio::Index::Fasta if you read the perldoc for the module.

  http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf
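
For the headers in question, a custom ID parser might look like the
sketch below.  It assumes the parser receives the FASTA header line and
returns the key to index under; get_protein_id is a hypothetical name,
and the id_parser() wiring in the comment should be checked against the
Bio::Index::Fasta perldoc.

```perl
use strict;
use warnings;

# Keep only the part of the header before the first pipe (or whitespace),
# e.g. ">PleosPC9_1_103820|fgenesh1_pg.3_#_1" gives "PleosPC9_1_103820".
sub get_protein_id {
    my $header = shift;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : ();
}

# Assumed wiring when building the index:
#   $inx->id_parser(\&get_protein_id);
#   $inx->make_index('PC9.fasta');
```

The index has to be rebuilt after changing the parser, since the keys are
stored at indexing time.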

Don't re-open SeqIO each time; just do it once at the beginning,
outside of the loop, and then call write_seq within the loop.

One nuance of OO programming versus procedural is that some state can
persist in an object, but conceptually you want to open a filehandle
once and just keep writing to it.
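
A sketch of the retrieval script restructured along those lines.  The sub
names read_ids and extract_seqs are hypothetical; the BioPerl calls are
the ones already used in the script quoted in this thread.  Note also
that @ARGV = <LCS.txt> is a file glob, which yields the file name rather
than the lines of the file, so the ID list is read explicitly here.

```perl
use strict;
use warnings;

# Read one ID per line, trimming whitespace and skipping blank lines.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;
        push @ids, $line if length $line;
    }
    return @ids;
}

# Fetch every ID from the index and write to ONE output stream.
sub extract_seqs {
    my ($index_file, $out_file, @ids) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;
    my $inx = Bio::Index::Fasta->new(-filename => $index_file);
    # opened once, before the loop
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');
    for my $id (@ids) {
        my $seq = $inx->fetch($id);
        if ($seq) { $out->write_seq($seq) }
        else      { warn "ID not found in index: $id\n" }
    }
    return;
}

# Assumed usage:
#   open my $fh, '<', 'LCS.txt' or die "Cannot open LCS.txt: $!";
#   extract_seqs('PC9.fasta.idx', 'index_extracted.fasta', read_ids($fh));
```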

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Tue Nov 10 13:50:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:50:00 -0800
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>

You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html

On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:

> On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:
>
>> Many thanks to Ewan Birney et al. for Bio::Index::*
>>
>> I can throw away my awful grep based index-by-accession stuff.   :)
>>
>> Any chance someone has also written an organism based index  
>> mechanism? Something like...
>>
>> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>>  print $seq->display_id . "\n";
>> }
>>
>> Thanks,
>>
>> j
>
> It should work via id_parser(); from Bio::Index::GenBank:
>
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
>
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }
>
> Change the code ref to deal with the line you want and parse the name
> out.  Caveat: this may not be absolutely perfect (it only passes in  
> a line at a time, and some species lines will wrap).  Also not sure  
> how this would work in cases where multiple sequences from the same  
> species are present.
>
> The other option is to preparse everything and tie a hash to store a  
> species->UID map, then use that along with your Bio::Index index to  
> grab what you need.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
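
A minimal sketch of the second option described above (pre-parsing a species-to-ID map), in plain Perl. It assumes the usual GenBank flat-file layout with ACCESSION and ORGANISM lines and records ending in "//". The record text, the accession numbers and the helper names organism_map and accessions_for are made up for illustration, and an in-memory hash stands in for the tied hash:

```perl
use strict;
use warnings;

# Build organism => [accessions] from GenBank flat-file text read from $fh.
sub organism_map {
    my ($fh) = @_;
    my ( %by_organism, $acc );
    while ( my $line = <$fh> ) {
        if ( $line =~ /^ACCESSION\s+(\S+)/ ) {
            $acc = $1;                          # primary accession of this record
        }
        elsif ( defined $acc && $line =~ /^\s+ORGANISM\s+(.+?)\s*$/ ) {
            push @{ $by_organism{$1} }, $acc;   # single-line organism names only
        }
        elsif ( $line =~ m{^//} ) {
            undef $acc;                         # end of record
        }
    }
    return \%by_organism;
}

# Accessions of every organism whose name matches $regex.
sub accessions_for {
    my ( $map, $regex ) = @_;
    return map { @{ $map->{$_} } } grep { /$regex/ } sort keys %$map;
}

# Placeholder records, not real GenBank entries.
my $genbank = <<'END';
LOCUS       DEMO0001
ACCESSION   DEMO0001
SOURCE      Xanthomonas campestris
  ORGANISM  Xanthomonas campestris
            Bacteria; Proteobacteria.
//
LOCUS       DEMO0002
ACCESSION   DEMO0002
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
//
LOCUS       DEMO0003
ACCESSION   DEMO0003
SOURCE      Xanthomonas oryzae
  ORGANISM  Xanthomonas oryzae
//
END

open my $fh, '<', \$genbank or die "cannot open in-memory file: $!";
my $map = organism_map($fh);
print "$_\n" for accessions_for( $map, qr/Xanthomonas/ );
```

Each accession found this way could then be passed to $inx->fetch($acc) on the Bio::Index::GenBank index, assuming the index is keyed by accession. Organism names that wrap onto a second line are not handled here.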

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jluis.lavin at unavarra.es  Wed Nov 11 10:01:18 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 11 Nov 2009 16:01:18 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
 anditscorrect use]
In-Reply-To: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.sq
	uirrel@webmail.unavarra.es><A1ACC4B552514872B77208248B31977C@NewLife><3471.
	130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
	<E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
Message-ID: <2979.130.206.164.153.1257951678.squirrel@webmail.unavarra.es>

Hi once again,
I have modified the script following the instructions Jason gave me (at
least as far as I understood them; remember it is my first time trying to
learn a programming language... and I'm not the smartest guy in the class,
hehe), but it seems I didn't fix the problem...
Here's the new code I wrote:

#!/c:/Perl -w
	use strict;
        use Bio::Index::Fasta;
	use Bio::DB::Fasta;
	use Bio::SeqIO;
	use IO::File;

# assign files to scalars
my $index_file = 'PC91.fasta';
my $id_list = 'LCS2.txt';

# open index file
my $db = Bio::DB::Fasta->new($index_file) or die;

# open the id list
my $in = IO::File->new($id_list) or die;

# open FASTA to write
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');

# retrieve ids loop
foreach my $id ($in) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = my $inx->fetch($id);
$out->write_seq($seqobj);
}
}

# parse fasta headers
sub my_makeid {
my $id = shift;
if ( $id =~ /^>[^:]+:(\S+)/ ) {
return $1;
} elsif ($id =~ /^>(\S+)/) {
return $1;
} else {
warn("cannot parse ID for $id\n");
}
}
exit;

Would anyone please take a look at it?

Thanks in advance ;)
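
For reference, a minimal plain-Perl sketch of the extraction, which also touches on the earlier question of whether plain Perl could do the job. It assumes the wanted key is the text before the first "|" in each header, as in the example headers quoted further down; make_id, extract_records and the toy sequences are made up for illustration:

```perl
use strict;
use warnings;

# Key for a FASTA header line: the text between '>' and the first '|'
# (or the first whitespace, if there is no pipe).
sub make_id {
    my ($header) = @_;
    return $header =~ /^>([^|\s]+)/ ? $1 : undef;
}

# Copy every record whose key is in %$wanted from $in_fh to $out_fh.
# Returns the number of records written.
sub extract_records {
    my ( $in_fh, $out_fh, $wanted ) = @_;
    my ( $keep, $count ) = ( 0, 0 );
    while ( my $line = <$in_fh> ) {
        if ( $line =~ /^>/ ) {
            my $id = make_id($line);
            $keep = defined $id && $wanted->{$id};
            $count++ if $keep;
        }
        print {$out_fh} $line if $keep;
    }
    return $count;
}

# Toy data; the sequences are placeholders.
my $fasta = <<'END';
>PleosPC9_1_103820|fgenesh1_pg.3_#_1
MKTAYIAK
>PleosPC9_1_123413|genemark.2731_g
MSSLLQ
>PleosPC9_1_52065|e_gw1.3.64.1
MAVT
END
my $id_list = "PleosPC9_1_123413\nPleosPC9_1_52065\n";

# Read the ID list line by line, dropping line endings and blank lines.
open my $ids_fh, '<', \$id_list or die "cannot open ID list: $!";
my %wanted;
while ( my $id = <$ids_fh> ) {
    $id =~ s/\s+\z//;
    $wanted{$id} = 1 if length $id;
}

open my $in_fh, '<', \$fasta or die "cannot open FASTA: $!";
my $written = extract_records( $in_fh, \*STDOUT, \%wanted );
print STDERR "wrote $written records\n";
```

A sub of the same shape as make_id can also be handed to Bio::DB::Fasta through its -makeid option, after which $db->get_Seq_by_id('PleosPC9_1_103820') should find the record. Either way, the ID list has to be read line by line and stripped of line endings before lookup.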


On Tue, November 10, 2009, 19:47, Jason Stajich wrote:
> Page 44 has the custom ID info or look at documentation for
> Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if
> you read the perldoc for the module.
>
>   http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf
>
> Don't re-open SeqIO each time; just do it once at the beginning,
> outside of the loop, and then call write_seq within the loop.
>
> This is one nuance of OO programming versus procedural: there
> is some outside state information that can persist in an object, but
> conceptually, you want to open a filehandle once and just keep writing
> to it.
>
> -jason
> On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:
>
>> Hello again,
>>
>> I tried what Mark told me, modifying the code line he pointed out,
>> but there's still a problem that I believe must be due to the
>> sequence names.
>> The sequence headers in my FASTA file have this format:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>
>> The part to the right of the pipe changes depending on the program
>> used to create the gene model, for example:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>> PleosPC9_1_123413|genemark.2731_g
>>> PleosPC9_1_52065|e_gw1.3.64.1
>>
>> So I guess I need to parse my IDs somehow so that the program detects
>> only the first part of the FASTA header (the "protein name") and does
>> not get confused by the other side of the pipe...
>>
>> This is the corrected code I wrote following Mark's indications, but I
>> still don't have any idea about the parsing issue...
>>
>> #!/c:/Perl -w
>> use Bio::Index::Fasta;
>> use strict;
>> #PC9.fasta is my genomic file
>> my $Index_File_Name ="PC9.fasta";
>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> #LCS.txt is my sequences list
>> @ARGV = <LCS.txt>;
>> foreach  my $id (@ARGV) {
>> if ($id eq ''){
>> die ("empty list")
>> }
>> else {
>> my $seqobj = $inx->fetch($id);
>> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
>> -format => 'fasta');
>> $out->write_seq($seqobj);
>> }
>> }
>> exit;
>> }
>>
>> Thanks in advance
>>
>> P.S. Might there be a faster way of extracting those sequences using
>> plain Perl?
>>
>>
>>
>>
>> On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
>>> Yes, these are files created by the SDBM, Perl's internal db
>>> manager. You
>>> should
>>> be able to
>>> open the index by simply
>>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>> and the dbm will know what to do--
>>> cheers MAJ
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 11:21 AM
>>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
>>> and its
>>> correct
>>> use]
>>>
>>>
>>>> Thank you very much Mark, that's a good point :$
>>>> I guess your correction refers to the second script, doesn't it?
>>>>
>>>> If so, there is still a problem with the first script: it doesn't
>>>> create the PC9.fasta.idx file; instead it creates two files named:
>>>> -PC9.fasta.idx.pag
>>>> -PC9.fasta.idx.dir
>>>>
>>>> which seem to be clearly related to some kind of indexing
>>>> process... but unless the PC9.fasta.idx file is only virtual or
>>>> remains hidden, I can't find it anywhere...
>>>> Forgive me if I'm talking nonsense...
>>>>
>>>> Thank you very much again for your help ;)
>>>>
>>>>
>>>> On Thu, November 5, 2009, 17:02, Mark A. Jensen wrote:
>>>>> Hey José,
>>>>> The first thing that jumps out is the index file name. Looks
>>>>> like you create it as
>>>>> PC9.fasta.idx
>>>>> But you read it as
>>>>> PC9.fasta
>>>>> Not an unusual mistake. Do
>>>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>>>> and see if it works.
>>>>> MAJ
>>>>> ----- Original Message -----
>>>>> From: <jluis.lavin at unavarra.es>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Thursday, November 05, 2009 10:46 AM
>>>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
>>>>> its
>>>>> correct
>>>>> use]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------- Original message
>>>>> ----------------------------
>>>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its
>>>>> correct
>>>>> use
>>>>> From:    jluis.lavin at unavarra.es
>>>>> Date:    Thu, November 5, 2009, 16:46
>>>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> Hi Mark,
>>>>>
>>>>> I've actually got two scripts: the first one creates the index and
>>>>> the second one retrieves the sequence list from the indexed
>>>>> file.
>>>>>
>>>>> 1)Here is the Index creation script:
>>>>>
>>>>> #!/c:/Perl -w
>>>>> use strict;
>>>>> use Bio::Index::Fasta;
>>>>> use strict;
>>>>>
>>>>> print "Enter file for indexing: \n";
>>>>> my $Index_File_Name = <STDIN>;
>>>>> my $inx = Bio::Index::Fasta->new(-filename =>
>>>>> $Index_File_Name.".idx",
>>>>>    -write_flag => 1);
>>>>> $inx->make_index(my $File_Name);
>>>>>
>>>>> 2)And here is the sequence retrieval script:
>>>>>
>>>>> #!/c:/Perl -w
>>>>> use Bio::Index::Fasta;
>>>>> use strict;
>>>>> #PC9.fasta is my genomic file
>>>>> my $Index_File_Name ="PC9.fasta";
>>>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>>>> #LCS.txt is my sequences list
>>>>> @ARGV = <lCS.txt>;
>>>>> foreach  my $id (@ARGV) {
>>>>> if ($id eq ''){
>>>>> die ("empty list")
>>>>> }
>>>>> else {
>>>>> my $seqobj = $inx->fetch($id);
>>>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>>>> -format => 'fasta');
>>>>> $out->write_seq($seqobj);
>>>>> }
>>>>> }
>>>>> exit;
>>>>> }
>>>>>
>>>>> I hope this code is not a total scum...
>>>>>
>>>>> Thanks in advance ;)
>>>>>
>>>>>
>>>>>
>>>>> On Thu, November 5, 2009, 16:39, Mark A. Jensen wrote:
>>>>>> José -- It looks like this is a good solution to your problem.
>>>>>> Please send your script so we can look at it-
>>>>>> cheers Mark
>>>>>> ----- Original Message -----
>>>>>> From: <jluis.lavin at unavarra.es>
>>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>>>> Subject: [Bioperl-l] A question about iBio::Index: and its
>>>>>> correct use
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello to all,
>>>>>>
>>>>>> I'm trying to write a script to retrieve a list of sequences
>>>>>> from a local FASTA file (for example, a FASTA archive where all
>>>>>> the protein models of an organism are stored). I would use this
>>>>>> file as a kind of "local database" (sorry if I mix up a few
>>>>>> concepts...)
>>>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>>>> Bio::Index::Fasta tool.
>>>>>> If I didn't misunderstand what I read (which is quite possible
>>>>>> given my low level of programming), this indexing tool should do
>>>>>> the job.
>>>>>> I wrote a couple of scripts based on the documentation I read
>>>>>> about this tool, but I don't seem to be able to create the index
>>>>>> file to be used later (to retrieve the sequences from).
>>>>>> -First of all, I want to ask the people in this forum whether
>>>>>> Bio::Index::Fasta is the right choice for this task.
>>>>>> -Then I'd ask you to take a look at my scripts, because I can't
>>>>>> seem to catch the bug...
>>>>>>
>>>>>> Best wishes to you all and thanks in advance ;)
>>>>>>
>>>>>> --
>>>>>> José Luis Lavín Trueba, PhD
>>>>>>
>>>>>> Dpto. de Producción Agraria
>>>>>> Grupo de Genética y Microbiología
>>>>>> Universidad Pública de Navarra
>>>>>> 31006 Pamplona
>>>>>> Navarra
>>>>>> SPAIN
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>>>>
>>>>>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Wed Nov 11 18:48:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 11 Nov 2009 18:48:33 -0500
Subject: [Bioperl-l] Maq assembly wrapper ready for beta testing
Message-ID: <4057E5A862B845EA8BB153888075590C@NewLife>

Hi All-

New modules are available in the core and in bioperl-run for
working with Heng Li's short read assembler "maq"
(http://maq.sourceforge.net/maq-man.shtml). Bio::Tools::Run::Maq
allows a quick assembly call with a canned maq pipeline, and also
allows individual maq commands to be called separately. 
It uses Bio::Assembly::IO::maq  (a read-only module) to deliver
a Bio::Assembly::Scaffold from maq output. 

If you're interested, see
http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_maq
and update your core and bioperl-run. The code inherits from Florent's
excellent new Bio::Tools::Run::AssemblerBase -- kudos to him!!

Tests are in bioperl-run/trunk/t/Maq.t; see them for myriad examples.
Send me the bugs.
MAJ

From clarsen at vecna.com  Thu Nov 12 12:22:26 2009
From: clarsen at vecna.com (Chris Larsen)
Date: Thu, 12 Nov 2009 12:22:26 -0500
Subject: [Bioperl-l] Polyproteins, ribo slippage,
	and mat_peptide in  viruses?
In-Reply-To: <320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
References: <B0218AEF-3CEB-4E06-B8DF-7B302D024797@vecna.com>
	<320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
Message-ID: <7BBAE077-4D76-46C2-BF66-363F5A017278@vecna.com>

All,

This is a short followup on the prior thread of discussion, regarding
computing mature peptide sequences for viruses. The topic has gone
underwater for the time being as we solve some problems with source
data. The biopython effort and contributors on this board have
given good guidance, and we now have scripts that function (thanks
mostly to pcock); however, the source data on which everything relies
is suspect:

   mat_peptide	15118..16914	<===
		/product="nsp13"	
		/note="helicase"
I can tell you the virus community does not want to rely heavily on
those position numbers. Furthermore, we have found fewer complete source
genomes for viruses than for bacteria, and more virus-to-virus variation
in the data fields annotated in the GBK file (Gene, CDS, ORF, Protein,
Polyprotein, mat_peptide, db_xref). In fact, the community will have
to come together significantly on how these molecules are defined in
public repositories before a mature scripting effort becomes
reliable, public and well received. Because of the variation in
viruses, it's not even clear at this point what a 'gene' is. I will
let you know how we proceed when more sequence data has been fully
analyzed, and we can think about making any Perl-based solution a new
viral protein module.
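
A small sanity check that can be run on coordinates like the ones above: a minimal sketch, assuming a plain start..end location (1-based, inclusive, no joins or complements). parse_location and span_report are hypothetical helpers. The check only tests whether the span is a whole number of codons, which says nothing about whether the annotated cleavage sites themselves are right:

```perl
use strict;
use warnings;

# Parse a simple "start..end" feature location; returns () if the
# location is anything more complex (join, complement, ...).
sub parse_location {
    my ($loc) = @_;
    return $loc =~ /^<?(\d+)\.\.>?(\d+)$/ ? ( $1, $2 ) : ();
}

# Length of the span in nucleotides, the number of whole codons,
# and the leftover bases (0 when the span is in frame).
sub span_report {
    my ($loc) = @_;
    my ( $start, $end ) = parse_location($loc) or return;
    my $length = $end - $start + 1;
    return {
        length    => $length,
        codons    => int( $length / 3 ),
        remainder => $length % 3,
    };
}

my $report = span_report('15118..16914');
printf "%d nt, %d codons, %d left over\n",
    @{$report}{qw(length codons remainder)};
```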

Thanks,

Chris

-- 

Christopher Larsen, Ph.D.
Sr. Scientist / Grants Manager
Vecna Technologies
6404 Ivy Lane #500
Greenbelt, MD 20770
Phone: (240) 965-4525
Fax: (240) 547-6133
240-737-4525


From David.Messina at sbc.su.se  Thu Nov 12 14:20:54 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 Nov 2009 20:20:54 +0100
Subject: [Bioperl-l] highest PAML version supported?
Message-ID: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>

Hi everyone,

What is the latest version of PAML (specifically codeml) that I can use with
bioperl-live and bioperl-run?

I looked around and couldn't find where (or if) this is documented.


With PAML version 4.3a against the current trunk of both -live and -run I
see this:
------------- EXCEPTION Bio::Root::NotImplemented -------------
MSG: Unknown format of PAML output did not see seqtype
STACK Bio::Tools::Phylo::PAML::_parse_summary
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
STACK Bio::Tools::Phylo::PAML::next_result
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
STACK toplevel ../bin/cluster_kaks:251
---------------------------------------------------------------

...which I suspect (but haven't confirmed) is due to a change in the file
format.


Dave

From jason at bioperl.org  Thu Nov 12 14:29:22 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 12 Nov 2009 11:29:22 -0800
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
Message-ID: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>

prolly 3.15 or so.

it really needs a maintainer!!!

On Nov 12, 2009, at 11:20 AM, Dave Messina wrote:

> Hi everyone,
>
> What is the latest version of PAML (specifically codeml) that I can  
> use with
> bioperl-live and bioperl-run?
>
> I looked around and couldn't find where (or if) this is documented.
>
>
> With PAML version 4.3a against the current trunk of both -live and - 
> run I
> see this:
> ------------- EXCEPTION Bio::Root::NotImplemented -------------
> MSG: Unknown format of PAML output did not see seqtype
> STACK Bio::Tools::Phylo::PAML::_parse_summary
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
> STACK Bio::Tools::Phylo::PAML::next_result
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
> STACK toplevel ../bin/cluster_kaks:251
> ---------------------------------------------------------------
>
> ...which I suspect (but haven't confirmed) is due to a change in the  
> file
> format.
>
>
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From scott at scottcain.net  Fri Nov 13 09:48:43 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 13 Nov 2009 09:48:43 -0500
Subject: [Bioperl-l] January GMOD meeting announcement
Message-ID: <4536f7700911130648j40eb2d82g2594adaccf476d73@mail.gmail.com>

Hello,

I am pleased to announce that the January GMOD meeting will be taking
place on January 14 and 15 in San Diego at the Best Western Seven Seas
(the same location as last year).  Please see this page for
registration information:

  http://gmod.org/wiki/January_2010_GMOD_Meeting

When you go to that page, please take a moment to add suggestions for
the agenda.  There is no registration fee for this meeting, however
there is limited space, so please register early.

The proprietors of the Best Western have given us an excellent room
rate, and extended it to the previous week, so that people attending
the GMOD meeting and the Plant and Animal Genome meeting before it may
stay at the Best Western the entire time.

Please direct follow up questions to the gmod-devel mailing list:
https://lists.sourceforge.net/lists/listinfo/gmod-devel

Thanks and I look forward to seeing you in San Diego!
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

From j.inoue at ucl.ac.uk  Sat Nov 14 14:20:29 2009
From: j.inoue at ucl.ac.uk (Jun Inoue)
Date: Sat, 14 Nov 2009 19:20:29 +0000
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
Message-ID: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>

Dear All,

I just started to learn BioPerl for phylogenetics.
Usually I am using perl v5.10.0 on my Mac OS 10.5.8.
I would like to ask for a hint on how to calculate the branch lengths
from root to tip for all species in a tree in Newick format.

Please see the following web site.
I am explaining what I want to do and
showing my easy script (not completed).
http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html

Thank you for your help.

Best,
Jun Inoue
http://www.geocities.jp/ancientfishtree/index_eng.html

From maj at fortinbras.us  Sat Nov 14 16:47:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 14 Nov 2009 16:47:37 -0500
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
In-Reply-To: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
References: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
Message-ID: <3BC179984D5E49868C4F12D181D82B8D@NewLife>

Hi Jun,

Some hints: incorporate

@leaves = $tree->get_leaf_nodes;

and

use Bio::Tree::TreeFunctionsI;
$distance = $tree->distance( $node_a, $node_b );

cheers, Mark
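
For readers without BioPerl to hand, the quantity being asked for can be sketched in plain Perl. This is a swapped-in illustration, not the BioPerl route hinted at above. It assumes a well-formed Newick string with unquoted labels and a branch length after each colon; root_to_tip and the example tree are hypothetical:

```perl
use strict;
use warnings;

# Return a hash ref mapping each leaf label to the sum of branch
# lengths on the path from the root down to that leaf.
sub root_to_tip {
    my ($newick) = @_;
    my %dist;               # leaf label => branch length summed so far
    my @stack = ( [] );     # leaf labels inside each currently open clade
    my $label  = qr/[^():,;\s]+/;
    my $length = qr/:\s*([-+.\deE]+)/;

    pos($newick) = 0;
    while (1) {
        if ( $newick =~ /\G\s*\(/gc ) {
            push @stack, [];                    # a new clade opens
        }
        elsif ( $newick =~ /\G\s*,/gc ) {
            next;                               # sibling separator
        }
        elsif ( $newick =~ /\G\s*\)\s*(?:$label)?\s*(?:$length)?/gc ) {
            my $len   = defined $1 ? $1 : 0;    # branch above this clade
            my $clade = pop @stack;
            $dist{$_} += $len for @$clade;      # every leaf below gains it
            push @{ $stack[-1] }, @$clade;
        }
        elsif ( $newick =~ /\G\s*($label)\s*(?:$length)?/gc ) {
            my ( $name, $len ) = ( $1, defined $2 ? $2 : 0 );
            $dist{$name} = $len;                # the leaf's own branch
            push @{ $stack[-1] }, $name;
        }
        else {
            last;                               # ';' or end of input
        }
    }
    return \%dist;
}

my $tip = root_to_tip('((A:1,B:2):0.5,C:3);');
printf "%s\t%s\n", $_, $tip->{$_} for sort keys %$tip;
```

Within BioPerl, the same sum can be had by starting from each node returned by get_leaf_nodes and adding up branch_length while following ancestor up to the root.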

----- Original Message ----- 
From: "Jun Inoue" <j.inoue at ucl.ac.uk>
To: <bioperl-l at lists.open-bio.org>
Cc: "?? ?" <j.inoue at ucl.ac.uk>
Sent: Saturday, November 14, 2009 2:20 PM
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths


> Dear All,
>
> I just started to learn BioPerl for phylogenetics.
> Usually I am using perl v5.10.0 on my Mac OS 10.5.8.
> I would like to ask you a hint to calculate the Branch lengths
> from root to tip for all species in NEWICK TREE format.
>
> Please see the following web site.
> I am explaining what I want to do and
> showing my easy script (not completed).
> http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html
>
> Thank you for your help.
>
> Best,
> Jun Inoue
> http://www.geocities.jp/ancientfishtree/index_eng.html
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From jay at jays.net  Sun Nov 15 20:23:38 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 15 Nov 2009 19:23:38 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <F8052B51-85FB-44B9-9254-9AD1E964FA7B@jays.net>

On Nov 9, 2009, at 9:55 PM, Chris Fields wrote:
> It should work via id_parser(); from Bio::Index::GenBank:
> 
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
> 
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }

This worked great for me today (tackling a different problem than the original).  Thanks!!

j


From veronica.xiaoyu at gmail.com  Fri Nov 13 15:35:48 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 13 Nov 2009 15:35:48 -0500
Subject: [Bioperl-l] Bio::Graphics::Panel question
Message-ID: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>

Hi,

I'm using Bio::Graphics to parse BLAST results and generate images. But
sometimes, in the middle of the output image, a hit's color is white,
even though I set it to another color. I have attached a picture as an
example. This doesn't occur all the time; usually it works well. I'm
wondering whether I did something wrong, or whether it depends on the
BLAST result?

Thank you,
Xiaoyu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BLAST_problem.jpg
Type: image/jpeg
Size: 51888 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091113/57550aa9/attachment-0001.jpg>

From ryan_bogard at hms.harvard.edu  Sun Nov 15 22:30:22 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Sun, 15 Nov 2009 19:30:22 -0800 (PST)
Subject: [Bioperl-l]  Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
Message-ID: <26366421.post@talk.nabble.com>


In advance, any advice would be greatly appreciated! I have installed
bioperl-588pm via fink but I am having difficulties calling the modules in
a script. The following is added to .profile (bash):
PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB

If I change this to /sw/lib/perl5 then I get an @INC error, as Bio::Perl
cannot be located.

The environment variables are as follows:

MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
INFOPATH=/sw/share/info:/sw/info:/usr/share/info


This is the perl script I'm attempting to run:
#!/sw/bin/perl5.8.8
use strict;
use Bio::Perl;
my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

Here is the error output:

dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

dyld: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

Trace/BPT trap

I have looked through many forum postings and attempted the solutions
offered in those instances, but none seem to work in my case. I'm not sure
if it's because I have perl 5.10.0 installed while attempting to call
bioperl 5.8.8; however, others seem to have it working just fine.

Thank you, Ryan 
-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
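
One way to check the suspicion voiced above (library directories built for one perl being loaded by another) is to compare the running interpreter with the version numbers embedded in PERL5LIB. A minimal sketch, assuming version-named directories like the /sw/lib/perl5/5.8.8 path shown; mismatched_lib_dirs is a hypothetical helper, and a hit is only a hint, not proof of an incompatible build:

```perl
use strict;
use warnings;
use Config;

# Return the directories whose path contains a perl version number
# (e.g. ".../5.8.8") that differs from $perl_version.
sub mismatched_lib_dirs {
    my ( $perl_version, @dirs ) = @_;
    my @suspect;
    for my $dir (@dirs) {
        next unless $dir =~ m{(?:^|/)(\d+\.\d+\.\d+)(?:/|$)};
        push @suspect, $dir if $1 ne $perl_version;
    }
    return @suspect;
}

my $running = sprintf '%vd', $^V;    # version of the perl running this script
print "running perl $running ($Config{archname}) from $^X\n";

my @lib_dirs = split /:/, ( defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '' );
print "possibly built for another perl: $_\n"
    for mismatched_lib_dirs( $running, @lib_dirs );
```

Running it once with each interpreter on the machine (for example the one named in the script's first line and the one found first in PATH) shows which of them, if any, matches the directories in PERL5LIB.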


From e.osimo at gmail.com  Mon Nov 16 02:04:40 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 16 Nov 2009 08:04:40 +0100
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>

Hello Ryan,
unfortunately, if you upgraded to 10.6 without formatting, I have to tell
you that you'll be in big trouble with perl and with everything you
installed from the command line, because in the upgrade process everything
in the system folders (perl and bioperl included) is
erased without being uninstalled, so you'll find a lot of folders with the
same name but no contents.
I suggest that you, as I did, format your machine and reinstall 10.6 from
scratch. Then you'll be able to install mysql (I had to install
mysql-5.4.3-beta-osx10.5, the only one that works on 10.6), and, working
with the perl 5.10 that is already installed, you'll install bioperl with
no effort.
Bye
Emanuele

On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu> wrote:

>
> In advance, any advice would be grealy appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL
> cannot be located.
>
> The environment variables are as follows:
>
>
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>
>
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
>
> Here is the error output:
>
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>  Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>  Expected in: dynamic lookup
>
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>  Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>  Expected in: dynamic lookup
>
> Trace/BPT trap
>
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
>
> Thank you, Ryan
> --
> View this message in context:
> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From ryan_bogard at hms.harvard.edu  Mon Nov 16 08:43:19 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 05:43:19 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <26372079.post@talk.nabble.com>


The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if I
will have the same issues, but it's worth a shot as I have little on my
computer and reinstalling to start over wouldn't be too difficult. What
method did you use to install bioperl? I used fink and I am not sure the
available stable version is the one I need. I will install from the command
line this time around, and let you know how it turns out.

Thank you!


Emanuele Osimo wrote:
> 
> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process
> everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.
> I suggest you, as I did, to format your pc and reinstall 10.6 from
> scratch.
> Then youl'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with
> perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele
> 
> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu>
> wrote:
> 
>>
>> In advance, any advice would be grealy appreciated! I have installed
>> bioperl-588pm via fink but I am having difficulties calling the modules
>> in
>> script. The following is added to .profile (bash):
>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>>
>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>> Bio::PERL
>> cannot be located.
>>
>> The environment variables are as follows:
>>
>>
>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>>
>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>>
>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>>
>>
>> This is the perl script I'm attempting to run:
>> #!/sw/bin/perl5.8.8
>> use strict;
>> use Bio::Perl;
>> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>>
>> Here is the error output:
>>
>> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>>  Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>  Expected in: dynamic lookup
>>
>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>>  Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>  Expected in: dynamic lookup
>>
>> Trace/BPT trap
>>
>> I have looked through many forum postings and attempted the solutions
>> offered in those instances, but none seem to work in my case. I'm not
>> sure
>> if it's because I have perl 5.10.0 installed while attempting to call
>> bioperl 5.8.8; however, others seem to have it working just fine.
>>
>> Thank you, Ryan

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26372079.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From maj at fortinbras.us  Mon Nov 16 08:48:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 16 Nov 2009 08:48:17 -0500
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26372079.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
Message-ID: <8D822081B13F49C2A37677D3A47F38B4@NewLife>

Ryan,
I'm not a mac person, but Koen has said (see 
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you 
want.
cheers
Mark


From cjfields at illinois.edu  Mon Nov 16 10:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:00:09 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <49681E01-E95D-4FC6-AE42-6E57ED43AAA2@illinois.edu>

On Nov 16, 2009, at 1:04 AM, Emanuele Osimo wrote:

> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.

> I suggest that you, as I did, format your pc and reinstall 10.6 from scratch.
> Then you'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only one to work on 10.6), and, working with perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele

Just starting from scratch isn't always the best solution (though it is the cleanest).  In this case I don't think anything you mention applies, as the errors report conflicting symbols.  My guess is conflicting perl builds, probably between your system 5.10.0 (Snow Leopard) and your fink-installed perl 5.8.8 (they are binary incompatible).  Also, remember that Snow Leopard is primarily 64-bit, so it might be best to work out whether your fink is attempting to compile 64-bit or 32-bit binaries.

In this case, I would just uninstall the fink-based perl and either use the system one (snow leopard = 5.10.0), or roll your own and install 5.10.1 locally or in /usr/local.  Do NOT replace the system one, as that will likely break your OS.
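
A quick way to see which perl build a script really runs under, and whether its @INC mixes in directories from another perl version, might look like the sketch below. The helper names perl_build_summary and foreign_inc_dirs are made up for this illustration, and the version-in-path check is only a heuristic based on the /sw/lib/perl5/5.8.8 style paths quoted in this thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Config;

# Facts about the interpreter that is actually executing this script.
sub perl_build_summary {
    return (
        interpreter => $^X,                     # path of the running perl binary
        version     => sprintf('%vd', $^V),     # e.g. 5.10.0
        archname    => $Config{archname},       # e.g. darwin-thread-multi-2level
        pointer     => 8 * $Config{ptrsize},    # pointer size in bits (32 or 64)
    );
}

# Heuristic: return the directories whose path contains a perl version
# number (x.y.z as a whole path component) different from $version.
sub foreign_inc_dirs {
    my ($version, @dirs) = @_;
    return grep { m{/(\d+\.\d+\.\d+)(?:/|$)} && $1 ne $version } @dirs;
}

my %build = perl_build_summary();
printf "%-12s %s\n", $_, $build{$_} for sort keys %build;

for my $dir (foreign_inc_dirs($build{version}, @INC)) {
    print "warning: \@INC has a directory for another perl version: $dir\n";
}

# eval traps an ordinary "Can't locate Bio/Perl.pm" failure. A dyld abort
# like the one reported in this thread kills the process and cannot be trapped.
my $loaded = eval { require Bio::Perl; 1 };
print $loaded
    ? "Bio::Perl loaded from $INC{'Bio/Perl.pm'}\n"
    : "Bio::Perl could not be loaded by this perl\n";
```

Running it once with each interpreter (for example the system perl and the one named in the script's #! line) shows whether PERL5LIB is feeding one perl the compiled modules of the other.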

In my experience, and not to bash on fink or MacPorts, I never had much luck with their perl installs.  Unless I plan on only using fink or MacPorts for my OS (not likely in my case), I find they tend to cause problems in the long term unless one uses them to install packages with very few dependencies, and even then you need to make sure fink is configured to compile the correct binary.  For instance, they're fairly good for gd, libxml2, etc., but beyond that one may get into issues with odd, version-specific dependencies with some packages, such as relying on perl 5.8.8 (but not perl 5.10.x), db42 (instead of db44), etc.  I've ended up in the past with 2-3 different perl versions, Berkeley DB versions, etc.

chris



From cjfields at illinois.edu  Mon Nov 16 10:01:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:01:01 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <8D822081B13F49C2A37677D3A47F38B4@NewLife>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
Message-ID: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>

Actually, why not just install via CPAN?  Any particular reason?

chris

On Nov 16, 2009, at 7:48 AM, Mark A. Jensen wrote:

> Ryan,
> I'm not a mac person, but Koen has said (see http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
> to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you want.
> cheers
> Mark


From Kevin.M.Brown at asu.edu  Mon Nov 16 10:49:13 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 08:49:13 -0700
Subject: [Bioperl-l] Bio::Graphics::Panel question
In-Reply-To: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
References: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B40663EDB9@EX02.asurite.ad.asu.edu>

To really be able to tell whether this is a bug, I (and probably the real
devs) would need to see that part of your code and the Blast file that
shows the issue, as the cause could be your color-choice callback rather
than the blast object (e.g. your color picker is missing an option that the
data comes in with and so returns a blank value).

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Xiaoyu Liang
Sent: Friday, November 13, 2009 1:36 PM
To: Bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Bio::Graphics::Panel question

Hi,

I'm using Bio::Graphics to parse the blast result and generate images.
But sometimes, in the middle of the output image, a hit's color is
white, even though I set it to another color. I attached a picture here
as an example. This doesn't occur all the time; usually it works well.
I'm wondering whether I did something wrong, or whether it depends on the
blast result?

Thank you,
Xiaoyu


From ryan_bogard at hms.harvard.edu  Mon Nov 16 11:57:16 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 08:57:16 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
	<58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
Message-ID: <26375418.post@talk.nabble.com>


I read that posting by Koen and used the unstable tree after the first
attempt; however, the errors still persisted. I just finished a fresh
install and I will just follow Mr. Fields' advice and use CPAN.
Thank you all for the help!


Chris Fields-5 wrote:
> 
> Actually, why not just install via CPAN?  Any particular reason?
> 
> chris

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26375418.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From krishna.aneesh at gmail.com  Mon Nov 16 02:00:15 2009
From: krishna.aneesh at gmail.com (Aneesh K)
Date: Mon, 16 Nov 2009 12:30:15 +0530
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
Message-ID: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>

Hi,

I just started to use the Bioperl modules. They're really useful and interesting.
Now I am stuck on "Tree objects and phylogenetic trees".
I couldn't find any documentation/examples about reading/parsing phylip tree
files.

Please tell me where I can find some sample code for this.

Waiting for your reply.

Thanks
Aneesh.K
Mob. 09646181517

From David.Messina at sbc.su.se  Mon Nov 16 12:33:36 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 16 Nov 2009 18:33:36 +0100
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
	<D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
Message-ID: <B0AEE42A-A40A-4BB9-9A1C-98381CBB4CA9@sbc.su.se>

Hi everyone,

I just committed support for parsing codeml 4.3a (August 2009) to bioperl-live. I added new tests and all PAML-related tests pass, but please report any problems you have to the list.

Note that I haven't tested the other PAML 4.3a executables to see if there are format changes with those. If you get the chance to try any and it doesn't work, let me know and I'll try to add support for them.

(Note that these changes are only to the PAML parsing code; Bio::Tools::Run already appears to handle 4.3a just fine.)


Dave


From jason at bioperl.org  Mon Nov 16 12:34:57 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Nov 2009 09:34:57 -0800
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <D1D4E0B9-4741-4D45-84B6-6BB57B6E2B1E@bioperl.org>

Is this at all helpful for your question?
http://www.bioperl.org/wiki/HOWTO:Trees

The trees are in 'newick' (New Hampshire) format, though; I don't think
there is a phylip format for trees.
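
As a minimal sketch of reading such a tree with Bio::TreeIO: the example tree string and the helper name leaf_lengths below are invented for illustration, and the tree is read from an in-memory filehandle only so that the snippet is self-contained (pass -file => 'yourfile' instead of -fh for a tree file on disk).

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Parse a Newick string and return a hash of leaf id => branch length.
sub leaf_lengths {
    my ($newick) = @_;

    # Open a read-only filehandle on the string itself.
    open my $fh, '<', \$newick or die "Cannot open in-memory handle: $!";
    my $treeio = Bio::TreeIO->new(-format => 'newick', -fh => $fh);

    my %length;
    while (my $tree = $treeio->next_tree) {
        for my $leaf ($tree->get_leaf_nodes) {
            $length{ $leaf->id } = $leaf->branch_length;
        }
    }
    return %length;
}

# Hypothetical three-taxon tree in Newick format.
my %length = leaf_lengths('((A:0.1,B:0.2):0.05,C:0.3);');
printf "%s\t%s\n", $_, $length{$_} for sort keys %length;
```

Programs in the PHYLIP package write their trees in Newick format, so the same -format => 'newick' setting should apply to those tree files as well.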

-jason
On Nov 15, 2009, at 11:00 PM, Aneesh K wrote:

> Hi,
>
> I just started to use Bioperl modules. It's really useful and  
> interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing  
> phylip tree
> files.
>
> Please tell me from where I can get some sample codes for this.
>
> Waiting for your reply.
>
> Thanks
> Aneesh.K
> Mob. 09646181517

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From roy.chaudhuri at gmail.com  Mon Nov 16 12:31:49 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 16 Nov 2009 17:31:49 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <4B018C85.6020801@gmail.com>

Hi Aneesh,

See the Bioperl trees howto:
http://www.bioperl.org/wiki/HOWTO:Trees

Roy.

Aneesh K wrote:
> Hi,
> 
> I just started to use Bioperl modules. It's really useful and interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing phylip tree
> files.
> 
> Please tell me from where I can get some sample codes for this.
> 
> Waiting for your reply.
> 
> Thanks
> Aneesh.K
> Mob. 09646181517


-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

From Kevin.M.Brown at asu.edu  Mon Nov 16 13:22:07 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 11:22:07 -0700
Subject: [Bioperl-l] FW:  Bio::Graphics::Panel question
Message-ID: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>

Please keep your responses on the list for more timely help.
 

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University 

 
________________________________

From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com] 
Sent: Monday, November 16, 2009 9:34 AM
To: Kevin Brown
Subject: Re: [Bioperl-l] Bio::Graphics::Panel question


Hi Kevin, 

Thank you for your quick response. I attached the BLAST .out file here.
The following is the relevant part of my code. I have an array keeping the
color for each hit; I printed out the array, and nothing is missing.

my $track = $panel->add_track(
                              -glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                $feature = shift;
                                #print "--".$feature."\n";
                                return unless
$feature->has_tag('description');
                                my ($description) =
$feature->each_tag_value('description');
                                my ($id) = $feature->display_name;
                                my @records= split(/\|/,$description);
                                my $score = $feature->score;
                                #print $id.":".$score."\n";
                                if($score >=200){
                                        push (@color_array,1);
                                }elsif($score >=80){
                                        push (@color_array,2);
                                }elsif($score >=50){
                                        push (@color_array,3);
                                }elsif($score >= 40){
                                        push (@color_array,4);
                                }else{
                                        push (@color_array,5);
                                }
                                
                                if($type == 1){
                                        "Species:Arabidopsis TF Family:$records[1] Score=$score";
                                }elsif($type == 2){
                                        if(scalar(@records)==5){
                                                "Species:$records[1] TF Family:$records[2] Accepted Name:$records[3] Score=$score";
                                        }else{
                                                "Species:$records[1] TF Family:$records[2] Score=$score";
                                        }
                                }else{
                                        "";
                                }
                               },
                               -bgcolor => sub{
                                        return unless $feature->has_tag('description');
                                        if($color_array[$index] == 1 ){
                                                $color = 'red';
                                        }
                                        if($color_array[$index]== 2){
                                                $color = 'orange';
                                        }
                                        if($color_array[$index]== 3){
                                                $color = 'green';
                                        }
                                        if($color_array[$index]== 4){
                                                $color = 'blue';
                                        }
                                        if($color_array[$index]== 5){
                                                $color = 'black';
                                        }
                                        #if ($index == 20){
                                        #        $color = 'black';
                                        #}
                                        #print $index."--".$color_array[$index]."\n";
                                        $index++;
                                        
                                        #print $feature."\n";
                                        #print $feature->display_name."\n";
                                        return $color;
                               },
                             );


Best regards,
Xiaoyu


On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
wrote:


	To really be able to tell if this was a bug, I (and probably the real
	devs) would need to see that part of your code and the Blast file that
	is having this issue, as it could be your callback for color choice vs
	the blast object (e.g. your color picker is missing an option that the
	data comes in with and so returns with a blank value).
	

	-----Original Message-----
	From: bioperl-l-bounces at lists.open-bio.org
	[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
Xiaoyu Liang
	Sent: Friday, November 13, 2009 1:36 PM
	To: Bioperl-l at lists.open-bio.org
	Subject: [Bioperl-l] Bio::Graphics::Panel question
	
	Hi,
	
	I'm using Bio::Graphics to parse the blast result and generate images.
	But sometimes, in the middle of the output image, a hit's color is
	white, even though I set it to another color. I attached a picture here
	as an example. This doesn't occur all the time; usually it works well.
	I'm wondering whether I did something wrong, or whether it depends on
	the blast result?
	
	Thank you,
	Xiaoyu
	
	
	_______________________________________________
	Bioperl-l mailing list
	Bioperl-l at lists.open-bio.org
	http://lists.open-bio.org/mailman/listinfo/bioperl-l
	

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1258388779.out
Type: application/octet-stream
Size: 32599 bytes
Desc: 1258388779.out
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091116/cb23e40d/attachment-0001.obj>

From paolo.pavan at gmail.com  Mon Nov 16 14:06:06 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 16 Nov 2009 20:06:06 +0100
Subject: [Bioperl-l] bioperl-ext installation issue
Message-ID: <56be91b60911161106w69e20fd9k133a465e8d4f8a3f@mail.gmail.com>

Hi everybody,
I have problems installing the bioperl-ext package, any help is much
appreciated.
1)

   - I start trying with cpan i /bioperl-ext/ the only resource available is
   /B/BI/BIRNEY/bioperl-ext-1.4 (is it ok?)
   - I install Inline::MakeMaker and Inline::C then
   - i/BIRNEY/bioperl-ext-1.4/ fails because I don't have the staden package

2) I try to install io_lib-1.8.10.tar as suggested by the README (
ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/io_lib/), installation fails after:
...
gcc -g -O2 -o makeSCF makeSCF.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o extract_seq.o `test -f extract_seq.c || echo './'`extract_seq.c
/bin/sh ../libtool --mode=link gcc  -g -O2   -o extract_seq  extract_seq.o
../read/libread.la
gcc -g -O2 -o extract_seq extract_seq.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o index_tar.o `test -f index_tar.c || echo './'`index_tar.c
index_tar.c: In function 'main':
index_tar.c:12: error: two or more data types in declaration specifiers
make[2]: *** [index_tar.o] Error 1
make[2]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10/progs'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10'
make: *** [all-recursive-am] Error 2

3) I give up staden, because I actually need pSW, and try to install from
Makefile.PL in Bio/Ext/Align but installation fails after:
...
Align.xs:18: warning: 'not_here' defined but not used
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
gcc  -shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic Align.o  -o
../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a    \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local
symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory
`/home/root/.cpan/sources/authors/id/B/BI/BIRNEY/bioperl-ext-1.4/Bio/Ext/Align'
make: *** [subdirs] Error 2

I have also made some other attempts, such as a force install of
Bio::Ext::Align, without success, but I'm sure I am missing something
trivial that I can't spot.
Can someone help me?

Thank you,
Paolo


From lincoln.stein at gmail.com  Mon Nov 16 15:08:20 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 16 Nov 2009 15:08:20 -0500
Subject: [Bioperl-l] FW: Bio::Graphics::Panel question
In-Reply-To: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
Message-ID: <6dce9a0b0911161208q2f826d83s319184f0cacca097@mail.gmail.com>

Hi,

I think you should modify your color selection code as follows:


                                       if($color_array[$index] == 1 ){
                                               $color = 'red';
                                       }
                                       elsif($color_array[$index]== 2){
                                               $color = 'orange';
                                       }
                                       elsif($color_array[$index]== 3){
                                               $color = 'green';
                                       }
                                       elsif($color_array[$index]== 4){
                                               $color = 'blue';
                                       }
                                       elsif($color_array[$index]== 5){
                                               $color = 'black';
                                       }
                                       else { die "unexpected color array value $color_array[$index]" }
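
A related approach, sketched here under assumptions rather than taken from
the posted script: Bio::Graphics passes the feature to each option callback
as its first argument, so the color can be computed directly from the
feature's score instead of from a shared @color_array and $index that the
two callbacks must keep in step. The helper name color_for_score is
hypothetical, the fallback for an undefined score is an assumption, and the
thresholds are the ones used in the original -description callback.

```perl
use strict;
use warnings;

# Hypothetical helper: map a BLAST score to a color name, using the
# thresholds from the original -description callback.
sub color_for_score {
    my ($score) = @_;
    return 'black' unless defined $score;    # assumption: fall back to black
    return 'red'    if $score >= 200;
    return 'orange' if $score >= 80;
    return 'green'  if $score >= 50;
    return 'blue'   if $score >= 40;
    return 'black';
}

# Stateless callback: no @color_array or $index to keep in sync.
my $bgcolor_cb = sub {
    my $feature = shift;
    return color_for_score($feature->score);
};
```

The callback would then be passed to add_track as -bgcolor => $bgcolor_cb.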

Lincoln

On Mon, Nov 16, 2009 at 1:22 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:

> Please keep your responses on the list for more timely help.
>
>
> Kevin Brown
> Center for Innovations in Medicine
> Biodesign Institute
> Arizona State University
>
>
>
> ________________________________
>
> From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com]
> Sent: Monday, November 16, 2009 9:34 AM
> To: Kevin Brown
> Subject: Re: [Bioperl-l] Bio::Graphics::Panel question
>
>
> Hi Kevin,
>
> Thank you for your quick response. I attached the BLAST .out file here.
> The following is the relevant part of my code. I keep an array holding the
> color for each hit; I printed the array out, and no entries are missing.
>
> my $track = $panel->add_track(
>                              -glyph       => 'graded_segments',
>                              -label       => 1,
>                              -connector   => 'dashed',
>                              -font2color  => 'red',
>                              -sort_order  => 'high_score',
>                              -description => sub {
>                                $feature = shift;
>                                #print "--".$feature."\n";
>                                return unless
> $feature->has_tag('description');
>                                my ($description) =
> $feature->each_tag_value('description');
>                                my ($id) = $feature->display_name;
>                                my @records= split(/\|/,$description);
>                                my $score = $feature->score;
>                                #print $id.":".$score."\n";
>                                if($score >=200){
>                                        push (@color_array,1);
>                                }elsif($score >=80){
>                                        push (@color_array,2);
>                                }elsif($score >=50){
>                                        push (@color_array,3);
>                                }elsif($score >= 40){
>                                        push (@color_array,4);
>                                }else{
>                                        push (@color_array,5);
>                                }
>
>                                if($type == 1){
>                                        "Species:Arabidopsis TF
> Family:$records[1] Score=$score";
>                                }elsif($type == 2){
>                                        if(scalar(@records)==5){
>                                                "Species:$records[1] TF
> Family:$records[2] Accepted Name:$records[3] Score=$score";
>                                        }else{
>                                                "Species:$records[1] TF
> Family:$records[2] Score=$score";
>                                        }
>                                }else{
>                                        "";
>                                }
>                               },
>                               -bgcolor => sub{
>                                        return unless
> $feature->has_tag('description');
>                                        if($color_array[$index] == 1 ){
>                                                $color = 'red';
>                                        }
>                                        if($color_array[$index]== 2){
>                                                $color = 'orange';
>                                        }
>                                        if($color_array[$index]== 3){
>                                                $color = 'green';
>                                        }
>                                        if($color_array[$index]== 4){
>                                                $color = 'blue';
>                                        }
>                                        if($color_array[$index]== 5){
>                                                $color = 'black';
>                                        }
>                                        #if ($index == 20){
>                                        #        $color = 'black';
>                                        #}
>                                        #print
> $index."--".$color_array[$index]."\n";
>                                        $index++;
>
>                                        #print $feature."\n";
>                                        #print
> $feature->display_name."\n";
>                                        return $color;
>                               },
>                             );
>
>
> Best regards,
> Xiaoyu
>
>
> On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
> wrote:
>
>
>        To really be able to tell if this was a bug, I (and probably the
> real
>        devs) would need to see that part of your code and the Blast
> file that
>        is having this issue as it could be your callback for color
> choice vs
>        the blast object (e.g. your color picker is missing an option
> that the
>        data comes in with and so returns with a blank value).
>
>
>        -----Original Message-----
>        From: bioperl-l-bounces at lists.open-bio.org
>        [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Xiaoyu Liang
>        Sent: Friday, November 13, 2009 1:36 PM
>        To: Bioperl-l at lists.open-bio.org
>        Subject: [Bioperl-l] Bio::Graphics::Panel question
>
>        Hi,
>
>        I'm using Bio::Graphics to parse the blast result and generate
> images.
>        But, sometimes, in the middle of the output image, the hit's
> color is
>        white, eventhough I set it to other colors. I attached the
> picture here
>        for an example. This doesn't occur all the time, usually, it
> works well.
>        I'm wondering if I did something wrong? or depends on the blast
> result?
>
>        Thank you,
>        Xiaoyu
>
>
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>

From ryan_bogard at hms.harvard.edu  Mon Nov 16 16:44:25 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 13:44:25 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <26379710.post@talk.nabble.com>


Thank you all for your help! I was able to get bioperl working via manual
download and install. It was a combination of permissions issues and X86_64
vs. X86_32 compatibility issues. Using fink to download and install seems to
have given me a combination of 32 and 64 associated files (I probably did
something wrong in config). 


rbogard wrote:
> 
> In advance, any advice would be greatly appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
> 
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL cannot be located.
> 
> The environment variables are as follows:
> 
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
> 
> 
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> Here is the error output:
> 
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> Trace/BPT trap
> 
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
> 
> Thank you, Ryan 
> 

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26379710.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From jay at jays.net  Mon Nov 16 17:02:10 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 16 Nov 2009 16:02:10 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
	<2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
Message-ID: <60ADD3A9-D38B-4A39-A5CE-C8118DEC1242@jays.net>

On Nov 10, 2009, at 12:50 PM, Jason Stajich wrote:
> You might also look at what mygenbank does:
> http://homepage.mac.com/iankorf/mygenbank.html

It appears, perhaps, that BioSQL can provide *foo* searching like so:

http://www.biosql.org/wiki/Schema_Overview#TAXON.2C_TAXON_NAME

 SELECT DISTINCT include.ncbi_taxon_id FROM taxon
    INNER JOIN taxon AS include ON
      (include.left_value BETWEEN taxon.left_value
        AND taxon.right_value)
 WHERE taxon.taxon_id IN
   (SELECT taxon_id FROM taxon_name
    WHERE name LIKE '%fungi%')

So I think we're going to chase that for a while.
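
To make the nested-set logic behind that query concrete, here is a small
pure-Perl sketch. The taxa and their left_value/right_value numbers are
invented for illustration and are not real BioSQL rows; only the column
names come from the schema. The sub name taxon_ids_under is hypothetical,
and the case-insensitive match is an assumption (whether LIKE ignores case
depends on the database).

```perl
use strict;
use warnings;

# Invented example rows: each taxon spans [left_value, right_value], and a
# descendant's left_value falls inside its ancestor's span.
my @taxon = (
    { ncbi_taxon_id => 1, name => 'root',          left_value => 1, right_value => 10 },
    { ncbi_taxon_id => 2, name => 'Fungi',         left_value => 2, right_value => 7  },
    { ncbi_taxon_id => 3, name => 'Ascomycota',    left_value => 3, right_value => 4  },
    { ncbi_taxon_id => 4, name => 'Basidiomycota', left_value => 5, right_value => 6  },
    { ncbi_taxon_id => 5, name => 'Metazoa',       left_value => 8, right_value => 9  },
);

# Return the sorted ids of every taxon whose left_value lies within the span
# of a taxon whose name contains $pattern (the matching taxon included).
sub taxon_ids_under {
    my ($pattern) = @_;
    my %seen;
    for my $parent (grep { $_->{name} =~ /\Q$pattern\E/i } @taxon) {
        for my $t (@taxon) {
            $seen{ $t->{ncbi_taxon_id} } = 1
                if $t->{left_value} >= $parent->{left_value}
                && $t->{left_value} <= $parent->{right_value};
        }
    }
    return sort { $a <=> $b } keys %seen;
}

print join(',', taxon_ids_under('fungi')), "\n";    # prints 2,3,4
```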

I didn't see a *foo* search in MyGenBank?

Thanks,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

From roy.chaudhuri at gmail.com  Tue Nov 17 06:24:07 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 17 Nov 2009 11:24:07 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
	<4B018C85.6020801@gmail.com>
	<9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
Message-ID: <4B0287D7.5050702@gmail.com>

Hi Aneesh,

Please keep your replies on the mailing list, that way someone else can 
respond, which would be particularly useful in this case since I know 
nothing about MapIO.

Roy.

Aneesh K wrote:
> Thanks for your reply. 
> 
> I would like to know about "Genetic Maps" also. I would like to 
> use MapIO object. 
> But I'm not aware about genetic maps and the mapmaker format. 
> 
> Please tell me from where I can get some examples for mapmaker format 
> and some example scripts to use MapIO object. 
> 
> Hoping your reply.
> 
> Aneesh.K
> Mob. 09646181517
> 
> 
> 
> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
> <mailto:roy.chaudhuri at gmail.com>> wrote:
> 
>     Hi Aneesh,
> 
>     See the Bioperl trees howto:
>     http://www.bioperl.org/wiki/HOWTO:Trees
> 
>     Roy.
> 
> 
>     Aneesh K wrote:
> 
>         Hi,
> 
>         I just started to use Bioperl modules. It's really useful and
>         interesting.
>         Now I have in stuck with "Tree objects and phylogenetic trees".
>         I couldn't get any documentation/examples about reading/parsing
>         phylip tree
>         files.
> 
>         Please tell me from where I can get some sample codes for this.
> 
>         Waiting for your reply.
> 
>         Thanks
>         Aneesh.K
>         Mob. 09646181517
> 
> 
> 
>     -- 
>     Dr. Roy Chaudhuri
>     Department of Veterinary Medicine
>     University of Cambridge, U.K.
> 
> 


From maj at fortinbras.us  Tue Nov 17 07:50:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 17 Nov 2009 07:50:06 -0500
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <4B0287D7.5050702@gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com><4B018C85.6020801@gmail.com><9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
	<4B0287D7.5050702@gmail.com>
Message-ID: <394F62D51F15405BBCF8BB50DA0FF336@NewLife>

Aneesh, 
Have a look in the t/Map directory of the BioPerl distribution. These
are test scripts that are also examples of usage. The t/data directory
will contain the datafiles that the tests use; these will provide example data.
cheers 
Mark 
----- Original Message ----- 
From: "Roy Chaudhuri" <roy.chaudhuri at gmail.com>
To: "Aneesh K" <krishna.aneesh at gmail.com>; <bioperl-l at bioperl.org>
Sent: Tuesday, November 17, 2009 6:24 AM
Subject: Re: [Bioperl-l] Regarding Bio::TreeIO Object


> Hi Aneesh,
> 
> Please keep your replies on the mailing list, that way someone else can 
> respond, which would be particularly useful in this case since I know 
> nothing about MapIO.
> 
> Roy.
> 
> Aneesh K wrote:
>> Thanks for your reply. 
>> 
>> I would like to know about "Genetic Maps" also. I would like to 
>> use MapIO object. 
>> But I'm not aware about genetic maps and the mapmaker format. 
>> 
>> Please tell me from where I can get some examples for mapmaker format 
>> and some example scripts to use MapIO object. 
>> 
>> Hoping your reply.
>> 
>> Aneesh.K
>> Mob. 09646181517
>> 
>> 
>> 
>> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
>> <mailto:roy.chaudhuri at gmail.com>> wrote:
>> 
>>     Hi Aneesh,
>> 
>>     See the Bioperl trees howto:
>>     http://www.bioperl.org/wiki/HOWTO:Trees
>> 
>>     Roy.
>> 
>> 
>>     Aneesh K wrote:
>> 
>>         Hi,
>> 
>>         I just started to use Bioperl modules. It's really useful and
>>         interesting.
>>         Now I have in stuck with "Tree objects and phylogenetic trees".
>>         I couldn't get any documentation/examples about reading/parsing
>>         phylip tree
>>         files.
>> 
>>         Please tell me from where I can get some sample codes for this.
>> 
>>         Waiting for your reply.
>> 
>>         Thanks
>>         Aneesh.K
>>         Mob. 09646181517
>> 
>> 
>> 
>>     -- 
>>     Dr. Roy Chaudhuri
>>     Department of Veterinary Medicine
>>     University of Cambridge, U.K.
>> 
>> 
> 
> 
>

From veronica.xiaoyu at gmail.com  Wed Nov 18 12:18:33 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Wed, 18 Nov 2009 12:18:33 -0500
Subject: [Bioperl-l] how to visualize multiple sequences alignments
Message-ID: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>

Hi,

I'm wondering whether there are any modules that can be used to visualize
multiple sequence alignments, such as the output of ClustalW?

Thank you very much,
Xiaoyu

From jason at bioperl.org  Wed Nov 18 13:23:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Nov 2009 10:23:05 -0800
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
Message-ID: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>

Try Jalview: http://www.jalview.org/

On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:

> Hi,
>
> I'm wondering Is there any modules that can be used for visualizing  
> multiple
> sequences alignments? like the result from ClustalW?
>
> Thank you very much,
> Xiaoyu

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From andrew.j.grimm at gmail.com  Wed Nov 18 21:52:31 2009
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Thu, 19 Nov 2009 13:52:31 +1100
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
Message-ID: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>

Caution: read the whole email before visiting the bioperl wiki

I was doing some bioinformatics-related searching using google, and
one of the hits was to the bio dot perl dot org wiki (the FAQ in
particular).

When I did that, I was redirected to a ferdax dot com web site (a
typo-squatting of fedex?).

Some people reckon that ferdax hacks web sites and redirects google
hits from the victim web site to their own web site. For example, this
thread at google's webmaster central
http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
(it's talking about zencart, but presumably they've since found other
victims)

Just going to the website without using google may not trigger the redirect.

Apologies if this is a false alarm, but I don't think it is.

I won't be in contact between Friday and Monday Australian time (I'll
be at railscamp 6 in Melbourne), so I won't be able to answer any
replies.

Thanks,

Andrew Grimm

From maj at fortinbras.us  Wed Nov 18 22:14:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 18 Nov 2009 22:14:44 -0500
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
Message-ID: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>

Andrew-- thanks!! We're on it.
MAJ
----- Original Message ----- 
From: "Andrew Grimm" <andrew.j.grimm at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 18, 2009 9:52 PM
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?


> Caution: read the whole email before visiting the bioperl wiki
>
> I was doing some bioinformatics-related searching using google, and
> one of the hits was to the bio dot perl dot org wiki (the FAQ in
> particular).
>
> When I did that, I was redirected to a ferdax dot com web site (a
> typo-squatting of fedex?).
>
> Some people reckon that ferdax hacks web sites and redirects google
> hits from the victim web site to their own web site. For example, this
> thread at google's webmaster central
> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
> (it's talking about zencart, but presumably they've since found other
> victims)
>
> Just going to the website without using google may not trigger the redirect.
>
> Apologies if this is a false alarm, but I don't think it is.
>
> I won't be in contact between Friday and Monday Australian time (I'll
> be at railscamp 6 in Melbourne), so I won't be able to answer any
> replies.
>
> Thanks,
>
> Andrew Grimm
>
> 


From sandipan.chowdhury at physiology.wisc.edu  Thu Nov 19 01:49:45 2009
From: sandipan.chowdhury at physiology.wisc.edu (Sandipan Chowdhury)
Date: Thu, 19 Nov 2009 00:49:45 -0600
Subject: [Bioperl-l] accessing EMBL database
Message-ID: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>

Hi,
 
I have 3 questions, all related to the retrieval of sequences from online databases.
 
(1) I have been trying to download a protein sequence from the EMBL database and trying to write the sequence into a text file, as a string. I am using the following code: 
 
use Bio::DB::EMBL;
open b,">","s.txt";
$em_obj = Bio::DB::EMBL->new;
  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
  $s_str = $seq_obj->seq;
  print b "$s_str\n";
close b;
 
The script is not working and gives the message:
"MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl"
 
I am not sure what this means. A similar version of the script works for the Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way around this so that I can download the embl sequence?
 
(2) Also, is there any way I can download sequences from DDBJ (the DNA Data Bank of Japan)?
 
(3) Can GI numbers be used to retrieve the sequences? If so, how?
 
Answers to these questions would be greatly appreciated. I am very new to Perl/Bioperl and am not really familiar with the advanced programming features, so I would need your help to find my way out of this situation.
 
Many Thanks
Sandipan
 

From maj at fortinbras.us  Thu Nov 19 08:10:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 08:10:07 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>

Sandipan-- That id (CAB95729) returns "No entries" from EMBL.
I would agree that the error message is not really informative.
The module documentation warns:

      # remember that EMBL_ID does not equal GenBank_ID!
so I would check that.
MAJ
----- Original Message ----- 
From: "Sandipan Chowdhury" <sandipan.chowdhury at physiology.wisc.edu>
To: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 1:49 AM
Subject: [Bioperl-l] accessing EMBL database


> Hi,
>
> I have 3 questions all related to the retreival of sequences from online 
> databases.
>
> (1) I have been trying to download a protein sequence from the EMBL database 
> and trying to write the sequence into a text file, as a string. I am using the 
> following code:
>
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>  $s_str = $seq_obj->seq;
>  print b "$s_str\n";
> close b;
>
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>
> I am not sure what this means. A similar version of the script works for the 
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way 
> around this so that I can download the embl sequence?
>
> (2) Also, is there anyway I can download sequences from DDBJ (database of 
> Japan)?
>
> (3) Can GI numbers be used to retreive the sequences? If so then how?
>
> Answers to these questions would be greatly appreciated. I am very new to 
> Perl/Bioperl and am not really familiar with the advanced programming 
> features, so I would need to your help to find my way out of this situation.
>
> Many Thanks
> Sandipan
>
>
>
> 


From hrh at fmi.ch  Thu Nov 19 08:23:29 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 19 Nov 2009 14:23:29 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <C72B0561.5887%hrh@fmi.ch>


Sandipan


> I have 3 questions all related to the retrieval of sequences from online
> databases.
>  
> (1) I have been trying to download a protein sequence from the EMBL database
> and trying to write the sequence into a text file, as a string. I am using the
> following code: 
>  
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>   $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>   $s_str = $seq_obj->seq;
>   print b "$s_str\n";
> close b;
>  
> The script is not working and gives the message:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>  
> I am not sure what this means. A similar version of the script works for the
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
> around this so that I can download the embl sequence?

"CAB95729" is a protein sequence, i.e. a translation of the CDS of
'AJ277028.1'.

As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, i.e. the
nucleotide sequences.


> (2) Also, is there any way I can download sequences from DDBJ (database of
> Japan)?

Unless it is for network/speed reasons, why do you want to download data from
DDBJ? It contains the same data as GenBank and EMBL. Those three databases
exchange their data on a daily basis.

> (3) Can GI numbers be used to retrieve the sequences? If so then how?

Have you looked at Bio::DB::EUtilities? See the 'HOWTOs' page in the
Bioperl Wiki.


Regards, Hans


> Answers to these questions would be greatly appreciated. I am very new to
> Perl/Bioperl and am not really familiar with the advanced programming
> features, so I would need your help to find my way out of this situation.
>  
> Many Thanks
> Sandipan


From cjfields at illinois.edu  Thu Nov 19 08:47:16 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 07:47:16 -0600
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <C72B0561.5887%hrh@fmi.ch>
References: <C72B0561.5887%hrh@fmi.ch>
Message-ID: <95D416ED-7630-40A1-ABA5-A3C3525D25B1@illinois.edu>


To add to that, if you want the protein sequences as a Bio::Seq you can use Bio::DB::GenPept (Bio::DB::EUtilities will retrieve raw data only).
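
A minimal sketch of that suggestion, assuming BioPerl is installed and NCBI is reachable. The helper names fetch_protein_seq and write_seq_string and the file name s.txt are illustrative only; the BioPerl calls used are Bio::DB::GenPept->new, get_Seq_by_acc and seq.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): fetch a protein record from
# GenPept by accession and return it as a sequence object.
sub fetch_protein_seq {
    my ($acc) = @_;
    require Bio::DB::GenPept;            # loaded lazily; needs BioPerl installed
    my $db = Bio::DB::GenPept->new;
    return $db->get_Seq_by_acc($acc);    # performs a network request to NCBI
}

# Hypothetical helper: write the bare sequence string to a text file,
# using a lexical filehandle and checking that open/close succeed.
sub write_seq_string {
    my ($seq_obj, $file) = @_;
    open my $fh, '>', $file or die "Cannot open $file: $!";
    print {$fh} $seq_obj->seq, "\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

# Usage (commented out because it needs network access):
# my $seq = fetch_protein_seq('CAB95729');
# write_seq_string($seq, 's.txt');
```

Checking the return value of open is not required, but it turns a silent failure to create the output file into a readable message.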

chris


From David.Messina at sbc.su.se  Thu Nov 19 09:04:55 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 Nov 2009 15:04:55 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
Message-ID: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>

> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave


From maj at fortinbras.us  Thu Nov 19 09:17:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 09:17:05 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
Message-ID: <FADF827A6CE34C959062F2D93849E15A@NewLife>

I'm inclined to agree. Lots of responses to questions here that begin
"Well, as the error message said, you need to check...", which means
people tend towards "I broke it! Write the list!". I do find it hairy when
my errors are way down in the object tree.


From rtbio.2009 at gmail.com  Thu Nov 19 09:55:27 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Thu, 19 Nov 2009 15:55:27 +0100
Subject: [Bioperl-l] Remote blast
Message-ID: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>

Hello everybody,

I have a problem. I would like to use remote BLAST to find sequences
matching an input sequence.

For example, I would like to find sequences which match a Trypanosoma brucei
sequence.

I want the output to contain only Trypanosoma brucei sequences matching my
query. When I tried to use RemoteBlast against the nr database, I got sequences
from different organisms such as E. coli, Pseudomonas, etc.

Could you please tell me how this can be solved?

My code is as follows.

use Bio::Tools::Run::RemoteBlast;
  use strict;
  my $prog = 'blastn';
  my $db   = 'nr';
  my $e_val= '1e-10';
 my $organism= 'Trypanosoma Brucei';

  my @params = ( '-prog' => $prog,
         '-data' => $db,
         '-expect' => $e_val,
         '-readmethod' => 'SearchIO',
         '-Organism'   => $organism );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

  #change a parameter
  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]'

  #remove a parameter
  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

  my $v = 1;
  #$v is just to turn on and off the messages

  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
'-organism' => 'Trypanosoma Brucei' );

  while (my $input = $str->next_seq()){
    #Blast a sequence against a database:
   my $r = $factory->submit_blast($input);
    #my $r = $factory->submit_blast('amino.fa');

    print STDERR "waiting..." if( $v > 0 );
    while ( my @rids = $factory->each_rid ) {
      foreach my $rid ( @rids ) {
        my $rc = $factory->retrieve_blast($rid);
        if( !ref($rc) ) {
          if( $rc < 0 ) {
            $factory->remove_rid($rid);
          }
          print STDERR "." if ( $v > 0 );
         sleep 5;
        }
     else {
          my $result = $rc->next_result();
          #save the output
          my $filename = $result->query_name()."\.out";
          $factory->save_output($filename);
          $factory->remove_rid($rid);
          print "\nQuery Name: ", $result->query_name(), "\n";
          while ( my $hit = $result->next_hit ) {
            next unless ( $v > 0);
            print "\thit name is ", $hit->name, "\n";
            while( my $hsp = $hit->next_hsp ) {
              print "\t\tscore is ", $hsp->score, "\n";
            }
          }
        }
      }
    }
  }

My input sequence is

>ref|NC_009512.1|:385-1902
GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA

Please mail me regarding any queries.

Regards,
Roopa.

From cjfields at illinois.edu  Thu Nov 19 10:30:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 09:30:34 -0600
Subject: [Bioperl-l] verbosity and error stack, was  accessing EMBL database
In-Reply-To: <FADF827A6CE34C959062F2D93849E15A@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
	<FADF827A6CE34C959062F2D93849E15A@NewLife>
Message-ID: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>

Mark, Dave,

This could be based on verbose(). 

          Level      w     t     d    st
verbose   < 0        -     +     -    -/+
verbose     0        +     +     -    -/+
verbose     1        +     +     +    +/+
verbose   > 1        +* -> +     +    +/+
* converts to throw()
w = warn
t = throw
d = debug
st = stack trace

warn() is set up that way now, you don't get a stack trace unless verbose() is > 0.  throw() could be the same; would be a simple fix, really.

My only problem with the current state of things is (I think we've delved down this path before) verbosity level is tied to exception strictness as seen above, and they're really two separate concepts, at least to me.  Verbosity of 1 or more doesn't necessarily mean I want an elevated level of strictness along with it.  For instance, one might want very strict exceptions w/o the noise, or (conversely) lots of debugging output but no warnings. 

(aside: another small nit, but I haven't exactly liked that the global level of strictness is designated by a env. variable with DEBUG in the name, but that's just me).

I've been thinking it would be nice to have simple separate verbose/strict switches (this is the way it's implemented in Biome).  This would allow some finer grained control over output:

          Level      d    st
verbose     0        -    -
verbose     1        +    +
Default = BIOPERLDEBUG || 0 # current situation

          Level      w     t
strict      -1       -     +
strict      0        +     +
strict      1        +* -> +
* converts to throw()
Default = BIOPERLSTRICT || 0

We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
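
The two tables above could be sketched roughly as follows. This is an illustration only, not the Bio::Root::Root or Biome implementation: the class name My::Reporter is hypothetical, and BIOPERLSTRICT is the environment variable proposed in this message, not an existing one.

```perl
use strict;
use warnings;

package My::Reporter;    # hypothetical class, not BioPerl's Bio::Root::Root

use Carp ();

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Each switch has its own default; BIOPERLSTRICT is only a proposal.
    $self->{verbose} = defined $args{verbose} ? $args{verbose} : ($ENV{BIOPERLDEBUG}  || 0);
    $self->{strict}  = defined $args{strict}  ? $args{strict}  : ($ENV{BIOPERLSTRICT} || 0);
    return $self;
}

# Format a message; the stack trace is appended only when verbose > 0.
sub _format {
    my ($self, $label, $msg) = @_;
    my $text = "$label: $msg\n";
    $text .= Carp::longmess('STACK') if $self->{verbose} > 0;
    return $text;
}

sub throw {
    my ($self, $msg) = @_;
    die $self->_format('ERROR', $msg);
}

sub warn {
    my ($self, $msg) = @_;
    return if $self->{strict} < 0;                 # strict -1: warnings suppressed
    $self->throw($msg) if $self->{strict} > 0;     # strict  1: warning converts to throw()
    print STDERR $self->_format('WARNING', $msg);  # strict  0: plain warning
    return;
}

sub debug {
    my ($self, $msg) = @_;
    print STDERR "DEBUG: $msg\n" if $self->{verbose} > 0;
    return;
}

package main;

# Strict exceptions without the noise: the warning becomes an exception,
# and no stack trace is attached because verbose is 0.
my $reporter = My::Reporter->new(verbose => 0, strict => 1);
eval { $reporter->warn('EMBL stream with no ID. Not embl in my book') };
print "caught $@" if $@;
```

Keeping the two settings in separate fields is what lets "lots of debugging output but no warnings" be expressed as verbose => 1, strict => -1.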

chris



From roy.chaudhuri at gmail.com  Thu Nov 19 11:10:28 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Thu, 19 Nov 2009 16:10:28 +0000
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
Message-ID: <4B056DF4.2030502@gmail.com>

Hi Roopa,

I think that the -Organism parameter that you specify for 
Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to 
it in the documentation:
http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm

You have the correct approach in your code - limiting the search to the 
Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. 
If you uncomment the line (and add a semicolon afterwards), the program 
runs correctly, but no hits are reported below your threshold e-value. 
If you change the value of $e_val to 10 then some T.brucei hits are 
reported.
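
A sketch of the change described above, using only the parameters and the HEADER assignment that already appear in the quoted script. The wrapper name make_tbrucei_factory is made up for illustration, and the code assumes BioPerl is installed and NCBI is reachable.

```perl
use strict;
use warnings;

# Hypothetical wrapper (name is illustrative): build a RemoteBlast factory
# restricted to T. brucei, with the e-value passed in by the caller.
sub make_tbrucei_factory {
    my ($e_val) = @_;
    require Bio::Tools::Run::RemoteBlast;    # needs BioPerl installed
    my $factory = Bio::Tools::Run::RemoteBlast->new(
        '-prog'       => 'blastn',
        '-data'       => 'nr',
        '-expect'     => $e_val,
        '-readmethod' => 'SearchIO',
    );
    # The previously commented-out line, now active and ending in a semicolon.
    no warnings 'once';
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]';
    return $factory;
}

# Usage (needs network access to NCBI):
# my $factory = make_tbrucei_factory(10);    # relaxed e-value threshold
```

The unsupported -Organism parameter is simply left out, since the Entrez query does the filtering.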

Roy.



From clements at nescent.org  Thu Nov 19 12:46:32 2009
From: clements at nescent.org (Dave Clements)
Date: Thu, 19 Nov 2009 18:46:32 +0100
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
	<FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
Message-ID: <f135c01c0911190946t7488718brfed76b975f6d2b2@mail.gmail.com>

Hi Xiaoyu,

I would also take a look at GBrowse_syn, a perl based solution built with
the GBrowse genome browser framework.

See http://gmod.org/wiki/GBrowse_syn.

Cheers,

Dave C.

On Wed, Nov 18, 2009 at 7:23 PM, Jason Stajich <jason at bioperl.org> wrote:

> try jalview http://www.jalview.org/
>
>
> On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:
>
>  Hi,
>>
>> I'm wondering whether there are any modules that can be used for visualizing
>> multiple sequence alignments, like the result from ClustalW?
>>
>> Thank you very much,
>> Xiaoyu
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>


-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/January_2010_GMOD_Meeting

From maj at fortinbras.us  Thu Nov 19 18:37:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 18:37:05 -0500
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
Message-ID: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>

I like this verbose/strict separability a lot. Should we go for it?


From michael.watson at bbsrc.ac.uk  Fri Nov 20 05:07:10 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 20 Nov 2009 10:07:10 +0000
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>

Hello

I was just wondering if anyone had had time to look into this?

I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937

Thanks
Mick

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
Sent: 27 October 2009 09:01
To: 'Jason Stajich'
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output

Hi Jason

They both print 0 also.

A bug report it is

Mick

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: 26 October 2009 18:46
To: michael watson (IAH-C)
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output


Is this -m9 -d 0 output or standard default?  I think the strand is  
parsed in the HSP parsing.

Can you double check what $hsp->query->strand and $hsp->hit->strand  
prints?

A full example report as a bug request will be next step if that  
doesn't resolve.

-jason
On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:

> Dear all
>
> Where does this go?  Perhaps I am doing something wrong.
>
> Fasta35 output puts the strand in the hit list at the top:
>
> cluster_99033:3                                (  23) [r]  115 37.9 0.0011
> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963 0.963   27
>
> The [r] stands for reverse and the [f] stands for forward.
>
> There is also the text "rev-comp" after the hit line further down.
>
> However, when I parse fasta35 output using SearchIO and output the  
> strand of the HSP:
>
> print $hsp->strand('hit'), ",";
> print $hsp->strand('query'), "\n";
>
> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
> for "I don't know which strand it's on").
>
> So the information is there, but it's not getting parsed.   
> Alternatively, I've missed something and will feel a bit foolish.
>
> Currently using BioPerl 1.6.0
>
> Thanks
> Mick
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From David.Messina at sbc.su.se  Fri Nov 20 05:15:11 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 11:15:11 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
Message-ID: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>

Chris, I took a look at how you implemented this in Biome -- very nice!


> I like this verbose/strict separability a lot. Should we go for it?

Me too. So yes, I think so.


> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.


Perhaps this is a job for Log::Log4perl or Log::Dispatch?
http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm


That might be overkill, though.

Dave


From roychu at gmail.com  Fri Nov 20 05:21:54 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 02:21:54 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
Message-ID: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>

Hi,

Does anyone use dreamhost as a web hosting service?  I'm just curious
if anyone has had any luck installing the module as their daemon seems
to kill my process whenever I try to install it.  Dreamhost tech
support attributes it to either exceeding the allocated memory cache
or exceeding the processing time.  I tried to nice the process, but
that didn't help for me.  Any luck or experience in resolving this
would be much appreciated.  I suppose my next attempt would be to try
installing it directly and hope I don't need root...

Thanks,
Roy

From s.denaxas at gmail.com  Fri Nov 20 05:27:42 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Fri, 20 Nov 2009 11:27:42 +0100
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <bba689ec0911200227g1a8d717elce0daebf6a96c6aa@mail.gmail.com>

Hello,

Normally you don't need to be root:
http://sial.org/howto/perl/life-with-cpan/non-root/
It is kind of disturbing that their tech support cannot give you a straight
answer on why they are killing the process.

Good luck
Spiros

On Fri, Nov 20, 2009 at 11:21 AM, Chu, Roy <roychu at gmail.com> wrote:

> I suppose my next attempt would be to try
> installing it directly and hope I don't need root...
>
> Thanks,
> Roy
>


From charles-listes+bioperl at plessy.org  Fri Nov 20 05:44:45 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Fri, 20 Nov 2009 19:44:45 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <20091120104445.GG31318@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
> 
> Does anyone use dreamhost as a web hosting service?  I'm just curious
> if anyone has had any luck installing the module as their daemon seems
> to kill my process whenever I try to install it.  Dreamhost tech
> support attributes it to either exceeding the allocated memory cache
> or exceeding the processing time.  I tried to nice the process, but
> that didn't help for me.  Any luck or experience in resolving this
> would be much appreciated.  I suppose my next attempt would be to try
> installing it directly and hope I don't need root...

Dear Roy,

DreamHost uses Debian, so you can suggest that they install the Debian package.
If you are in contact with the tech service, do not hesitate to tell them to
contact me if they are interested in a backport of the 1.6.0 package. For
version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

PS: if you propose installing BioPerl as a feature in the Dreamhost panel, I
will vote for it :)

Have a nice day,

--  
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan

From cjfields at illinois.edu  Fri Nov 20 07:51:39 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 06:51:39 -0600
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
	<8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <E9D5435B-07D6-46A9-AA84-C9667FA0CEDE@illinois.edu>

Mick,

Short answer, no.  It was in the queue to be fixed at some point in 1.6.x, but that queue is quite long.  I'm pushing it into the queue specifically for 1.6.2, so it should be addressed soon.
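
In the meantime, a possible stopgap (a rough sketch only, not something in BioPerl) is to read the strand flag directly from the hit list at the top of the fasta35 report, as described in the original post. The helper name strand_from_hit_line and the regular expression are assumptions based on the two example hit-list lines quoted below; [f] is mapped to 1 and [r] to -1, following the usual BioPerl strand convention.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): extract the hit name and strand
# from one line of the fasta35 hit list, e.g.
#   cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963
# Returns (name, strand) with strand 1 for [f] and -1 for [r],
# or an empty list if the line does not look like a hit-list entry.
sub strand_from_hit_line {
    my ($line) = @_;
    if ($line =~ /^(\S+).*?\(\s*\d+\)\s+\[([fr])\]/) {
        return ($1, $2 eq 'f' ? 1 : -1);
    }
    return;
}

# When a report file is named on the command line, print name/strand pairs.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while (my $line = <$fh>) {
        my ($name, $strand) = strand_from_hit_line($line) or next;
        print "$name\t$strand\n";
    }
    close $fh;
}
```

This only recovers the strand per hit, not per HSP, so it is no substitute for a fix in the parser itself.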

chris

On Nov 20, 2009, at 4:07 AM, michael watson (IAH-C) wrote:

> Hello
> 
> I was just wondering if anyone had had time to look into this?
> 
> I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937
> 
> Thanks
> Mick
> 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
> Sent: 27 October 2009 09:01
> To: 'Jason Stajich'
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> Hi Jason
> 
> They both print 0 also.
> 
> A bug report it is
> 
> Mick
> 
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
> Sent: 26 October 2009 18:46
> To: michael watson (IAH-C)
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> 
> Is this -m9 -d 0 output or standard default?  I think the strand is  
> parsed in the HSP parsing.
> 
> Can you double check what $hsp->query->strand and $hsp->hit->strand  
> prints?
> 
> A full example report as a bug request will be next step if that  
> doesn't resolve.
> 
> -jason
> On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:
> 
>> Dear all
>> 
>> Where does this go?  Perhaps I am doing something wrong.
>> 
>> Fasta35 output puts the strand in the hit list at the top:
>> 
>> cluster_99033:3                                (  23) [r]  115 37.9   
>> 0.0011
>> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
>> 0.963   27
>> 
>> The [r] stands for reverse and the [f] stands for forward.
>> 
>> There is also the text "rev-comp" after the hit line further down.
>> 
>> However, when I parse fasta35 output using SearchIO and output the  
>> strand of the HSP:
>> 
>> print $hsp->strand('hit'), ",";
>> print $hsp->strand('query'), "\n";
>> 
>> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
>> for "I don't know which strand it's on").
>> 
>> So the information is there, but it's not getting parsed.   
>> Alternatively, I've missed something and will feel a bit foolish.
>> 
>> Currently using BioPerl 1.6.0
>> 
>> Thanks
>> Mick
>> 
>> 
>> 
> 
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> 
> 


From cjfields at illinois.edu  Fri Nov 20 08:00:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 07:00:45 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <20091120104445.GG31318@kunpuu.plessy.org>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
Message-ID: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>


On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:

> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>> 
>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>> if anyone has had any luck installing the module as their daemon seems
>> to kill my process whenever I try to install it.  Dreamhost tech
>> support attributes it to either exceeding the allocated memory cache
>> or exceeding the processing time.  I tried to nice the process, but
>> that didn't help for me.  Any luck or experience in resolving this
>> would be much appreciated.  I suppose my next attempt would be to try
>> installing it directly and hope I don't need root...
> 
> Dear Roy,
> 
> DreamHost uses Debian, so you can suggest that they install the Debian package.
> If you are in contact with the tech service, do not hesitate to tell them to
> contact me if they are interested in a backport of the 1.6.0 package. For
> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

Any reason why this is so?  We specify compatibility back to 5.6.1.

Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.  

A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.

> PS: if you propose installing BioPerl as a feature in the Dreamhost panel, I
> will vote for it :)
> 
> Have a nice day,
> 
> --  
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan

chris

From rtbio.2009 at gmail.com  Fri Nov 20 10:52:09 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Fri, 20 Nov 2009 16:52:09 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
Message-ID: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>

Hello everybody,

I have tried to use remote BLAST on Trypanosoma brucei sequences and got
some hits, but I am unable to retrieve the complete sequences of the hits.
That is, I am unable to parse the BLAST output file to get the complete
sequences of the hits. Here is my code.

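#!/usr/bin/perl -w
use strict;
use Bio::SearchIO;

# Usage: <script> <blast_report> <min_fraction_identical>
my $blast_report = Bio::SearchIO->new('-format' => 'blast',
                                      '-file'   => $ARGV[0]);
my $result = $blast_report->next_result;
my $level  = $ARGV[1];

my (@arr, @arr1, @arr2);
while( my $hit = $result->next_hit) {
    print $hit->name, "\n";
    push(@arr1, $hit->name);
    while( my $hsp = $hit->next_hsp()) {
        if ($hsp->frac_identical() >= $level) {
            #print $hsp->hit_string, "\n";
            push(@arr, $hsp->hit_string);   # aligned part of the hit only
        }
    }
}
# Split each hit name (e.g. ref|XM_822286.1|) on the literal '|'.
# The pipe must be escaped; an unescaped /|/ splits into single characters.
foreach my $name (@arr1) {
    push(@arr2, split(/\|/, $name));
    #print "$name\n";
}
#my $t = @arr2;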

Here, I am trying to use the BLAST output file to get the complete sequence
of each hit, but I only get part of it.

Input (excerpt from the BLAST report Tb09.211.2410.out):

             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  661   TTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCC
720

Query  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780

Query  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840

Query  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900

Query  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960

Query  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005
             |||||||||||||||||||||||||||||||||||||||||||||
Sbjct  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005

>ref|XM_822286.1| Trypanosoma brucei TREU927 protein kinase A catalytic
subunit
isoform 2 (Tb09.211.2360) partial mRNA
Length=1011

 Score = 1622 bits (1798),  Expect = 0.0
 Identities = 944/974 (96%), Gaps = 0/974 (0%)
 Strand=Plus/Plus

Query  32    TGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
91
             |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  38    TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
97

Query  92    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
151
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  98    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
157

Query  152   ATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGA
211
             |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Sbjct  158   ATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGA
217

Query  212   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
271
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  218   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
277

Query  272   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
331
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  278   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
337

Query  332   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
391
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  338   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
397

Query  392   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
451
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  398   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
457

Query  452   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
511
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  458   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
517

Query  512   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
571
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  518   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
577

Query  572   TAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGT
631
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

It continues like this.

The output I got is:
ATGACGACAACTCCCACTGGTGATGGCCAACTGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCCAATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCTCCTCCACTAACCCCTTCGCAACAGG
TTGCATTCCGTGGTTTTTAG

TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGTTCAAATTCCCCAATTGGTTTGACTCCCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATCACGCTCCCATTCCTGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGGGATAAGCGGTTGCCCCCGTTAGCACCATCACAACAATTGGAGTTCCGTGGGTTTTAG
GGATGATGACCGATTGTACCTCCTCCTCGAGTATGTGGTGGGTGGCGAGCTGT

TCTCCCACCTCCGGAAGGCGGGAAAATTCCCTAATGATGTAGCCAAGTTCTACTCCGCAGAAGTGGTTTTGGCGTTTGAATATATTCATGAGTGCGGCATCGTATACCGTGACTTGAAGCCAGAAAATGTGCTTTTGGACAAGCAGGGAAACATTAAGATTACGGACTTTGGGTTCGCGAAACGCGTTAGGGACAGAACGTACACGCTATGTGGGACTCCAGAGTATCTTGCGCCGGAGATAATCCAAAGTAAAGGTCACGATCGGGCTGTGGATTGGTGGACACTCGGAATTCTTCTCTATGAGATGCTTGTCGGTTATCCTCCTTTTTTCGACGAGAGTCCTTTTAGAACATACGAAAAAATTTTAGAGGGGAAACTTCAGTTTCCAAAGTGGGTGGAGATGCGGGCGAAGGACCTCATAAAGAGTTTTTTAACAATTGAACCAACGAAACG

That is, it only gives the regions where it found the best alignments,
i.e., the aligned parts of the best hits.

I want the complete sequences, i.e., the sequences corresponding to the
accession numbers
XM_822292.1
XM_822286.1
XM_822694.1

The database used in remote BLAST was RefSeq (refseq_rna), and the organism
was Trypanosoma brucei.

Can anyone please help me solve this problem?

Regards,
Roopa.
On Fri, Nov 20, 2009 at 12:30 PM, Roopa Raghuveer <rtbio.2009 at gmail.com>wrote:

>
> Hello Roy,
>
> Thanks a lot for your reply. My code is working for my sequence now.
>
> Thanks a lot.
>
> Regards,
> Roopa.
>
> On Thu, Nov 19, 2009 at 5:10 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com>wrote:
>
>> Hi Roopa,
>>
>> I think that the -Organism parameter that you specify for
>> Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to it
>> in the documentation:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm
>>
>> You have the correct approach in your code - limiting the search to the
>> Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. If
>> you uncomment the line (and add a semicolon afterwards), the program runs
>> correctly, but no hits are reported below your threshold e-value. If you
>> change the value of $e_val to 10 then some T.brucei hits are reported.
>>
>> Roy.
>>
>> Roopa Raghuveer wrote:
>>
>>> Hello everybody,
>>>
>>> I have a problem. I would like to use remote BLAST to find sequences
>>> matching an input sequence.
>>>
>>> For example, I would like to search for sequences that match a
>>> Trypanosoma brucei sequence.
>>>
>>> I want the output to contain only Trypanosoma brucei sequences matching
>>> my query. When I tried to use remote BLAST against the nr database, I got
>>> sequences from different organisms such as E. coli, Pseudomonas, etc.
>>>
>>> Could you please tell me how this can be solved?
>>>
>>> My code is as follows.
>>>
>>> use Bio::Tools::Run::RemoteBlast;
>>>  use strict;
>>>  my $prog = 'blastn';
>>>  my $db   = 'nr';
>>>  my $e_val= '1e-10';
>>>  my $organism= 'Trypanosoma Brucei';
>>>
>>>  my @params = ( '-prog' => $prog,
>>>         '-data' => $db,
>>>         '-expect' => $e_val,
>>>         '-readmethod' => 'SearchIO',
>>>         '-Organism'   => $organism );
>>>
>>>  my $factory = Bio::Tools::Run::RemoteBlast->
>>> new(@params);
>>>
>>>  #change a paramter
>>>  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
>>> brucei[ORGN]'
>>>
>>>  #remove a parameter
>>>  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>
>>>  my $v = 1;
>>>  #$v is just to turn on and off the messages
>>>
>>>  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
>>> '-organism' => 'Trypanosoma Brucei' );
>>>
>>>  while (my $input = $str->next_seq()){
>>>    #Blast a sequence against a database:
>>>   my $r = $factory->submit_blast($input);
>>>    #my $r = $factory->submit_blast('amino.fa');
>>>
>>>    print STDERR "waiting..." if( $v > 0 );
>>>    while ( my @rids = $factory->each_rid ) {
>>>      foreach my $rid ( @rids ) {
>>>        my $rc = $factory->retrieve_blast($rid);
>>>        if( !ref($rc) ) {
>>>          if( $rc < 0 ) {
>>>            $factory->remove_rid($rid);
>>>          }
>>>          print STDERR "." if ( $v > 0 );
>>>         sleep 5;
>>>        }
>>>     else {
>>>          my $result = $rc->next_result();
>>>          #save the output
>>>          my $filename = $result->query_name()."\.out";
>>>          $factory->save_output($filename);
>>>          $factory->remove_rid($rid);
>>>          print "\nQuery Name: ", $result->query_name(), "\n";
>>>          while ( my $hit = $result->next_hit ) {
>>>            next unless ( $v > 0);
>>>            print "\thit name is ", $hit->name, "\n";
>>>            while( my $hsp = $hit->next_hsp ) {
>>>              print "\t\tscore is ", $hsp->score, "\n";
>>>            }
>>>          }
>>>        }
>>>      }
>>>    }
>>>  }
>>>
>>> My input sequence is
>>>
>>>  ref|NC_009512.1|:385-1902
>>>>
>>> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
>>> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
>>> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
>>> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
>>> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
>>> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
>>> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
>>> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
>>> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
>>> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
>>> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
>>> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
>>> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
>>> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
>>> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
>>> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
>>> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
>>> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
>>> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
>>> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
>>> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
>>> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
>>>
>>> Please mail me regarding any queries.
>>>
>>> Regards,
>>> Roopa.
>>>
>>
>>
>


From mauricio at open-bio.org  Fri Nov 20 11:15:22 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Fri, 20 Nov 2009 10:15:22 -0600
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
	<7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
Message-ID: <4B06C09A.8060708@open-bio.org>

All OBF wikis and blogs have been upgraded and cleaned from the hack. 
Thanks for the heads up!

Mauricio.

Mark A. Jensen wrote:
> Andrew-- thanks!! We're on it.
> MAJ
> ----- Original Message ----- From: "Andrew Grimm" 
> <andrew.j.grimm at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, November 18, 2009 9:52 PM
> Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
> 
> 
>> Caution: read the whole email before visiting the bioperl wiki
>>
>> I was doing some bioinformatics-related searching using google, and
>> one of the hits was to the bio dot perl dot org wiki (the FAQ in
>> particular).
>>
>> When I did that, I was redirected to a ferdax dot com web site (a
>> typo-squatting of fedex?).
>>
>> Some people reckon that ferdax hacks web sites and redirects google
>> hits from the victim web site to their own web site. For example, this
>> thread at google's webmaster central
>> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all 
>>
>> (it's talking about zencart, but presumably they've since found other
>> victims)
>>
>> Just going to the website without using google may not trigger the 
>> redirect.
>>
>> Apologies if this is a false alarm, but I don't think it is.
>>
>> I won't be in contact between Friday and Monday Australian time (I'll
>> be at railscamp 6 in Melbourne), so I won't be able to answer any
>> replies.
>>
>> Thanks,
>>
>> Andrew Grimm
>>
>>
> 
> 

From David.Messina at sbc.su.se  Fri Nov 20 11:39:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 17:39:53 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
	<c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
Message-ID: <7ECF627D-3DBF-4575-89CF-FA6348C88E8E@sbc.su.se>

Hi Roopa,

As far as I know, a BLAST report never contains the complete sequences of the hits. If it includes any part of the hit's sequence, it will be the part that matches the query.

You'll have to use the hit's ID or accession to get its complete sequence from somewhere else. You can use Bio::DB::GenBank to do that, for example.

See
http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
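
As a rough illustration of that approach, the sketch below reads a saved BLAST report, pulls an accession out of each hit name, and asks GenBank for the full record. It is only a minimal sketch under some assumptions: the helper accession_from_hit_name and its pattern for what an accession looks like are made up for this example, hit names are assumed to look like ref|XM_822286.1|, and the script needs BioPerl plus network access to run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pick the accession(.version) out of a BLAST hit name
# such as "ref|XM_822286.1|". The accession pattern is an assumption; if no
# field matches it, the first non-empty field is returned unchanged.
sub accession_from_hit_name {
    my ($name) = @_;
    my @fields = grep { length } split /\|/, $name;
    for my $field (@fields) {
        return $field if $field =~ /^[A-Z]+_?\d+(?:\.\d+)?$/;
    }
    return $fields[0];
}

# Usage: perl fetch_hits.pl report.out > hits.fa
if (@ARGV) {
    require Bio::SearchIO;
    require Bio::SeqIO;
    require Bio::DB::GenBank;

    my $report = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);
    my $gb     = Bio::DB::GenBank->new;
    my $out    = Bio::SeqIO->new(-format => 'fasta', -fh => \*STDOUT);

    while (my $result = $report->next_result) {
        while (my $hit = $result->next_hit) {
            my $acc = accession_from_hit_name($hit->name);
            # Versioned accessions (XM_822286.1) go through get_Seq_by_version,
            # bare accessions through get_Seq_by_acc.
            my $seq = eval {
                $acc =~ /\.\d+$/ ? $gb->get_Seq_by_version($acc)
                                 : $gb->get_Seq_by_acc($acc);
            };
            if ($seq) { $out->write_seq($seq) }
            else      { warn "Could not fetch $acc\n" }
        }
    }
}
```

Each fetched record is written as FASTA to standard output, so the complete hit sequences end up in one file.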


Dave


From alessandra.bilardi at gmail.com  Fri Nov 20 12:44:18 2009
From: alessandra.bilardi at gmail.com (Alessandra)
Date: Fri, 20 Nov 2009 18:44:18 +0100
Subject: [Bioperl-l] Bio::DB::EUtilities question
Message-ID: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>

Hi all,

I'm testing Bio::DB::EUtilities, the web agent that interacts with and
retrieves data from NCBI's eUtils. My Perl script works, but only if I
call the get_Response function fewer than ~450 times; otherwise I get
this error message:

------------- EXCEPTION -------------
MSG: Response Error
Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
STACK Bio::DB::GenericWebAgent::get_Response
/usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
STACK toplevel ./wget4gbk.pl:77
-------------------------------------

wget4gbk.pl lines 76-77 are:
my $req = Bio::DB::EUtilities->new(-db => 'genome', -eutil =>
'esummary', -retmode => $mode, -rettype => $type, -id => $id);
my $entry = $req->get_Response;

I have run the Perl script more than ten times, and the error appears at a
random point somewhere in the range of 300-600 requests. If I use another
system to request the data, I can make ~10000 requests without errors. Do I
need to set particular parameters on the EUtilities object?

Can you help me with this random exception?
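
One thing that might be worth trying, whatever the underlying cause turns out to be, is to pause between requests and retry a failed request a few times before giving up, since NCBI asks E-utilities users to limit how often they send requests. The sketch below is only an illustration of that idea: with_retries is a hypothetical helper, not part of BioPerl, the retry count and delays are arbitrary, and the -retmode and -rettype arguments from the script above are left out for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): call $code, retrying up to
# $max_tries times and sleeping a little longer after each failure.
# Returns whatever $code returns; rethrows the last error if every try fails.
sub with_retries {
    my ($code, $max_tries, $base_delay) = @_;
    my $error;
    for my $try (1 .. $max_tries) {
        my $result;
        my $ok = eval { $result = $code->(); 1 };
        return $result if $ok;
        $error = $@;
        sleep $base_delay * $try if $try < $max_tries;
    }
    die $error;
}

# Usage: perl fetch_summaries.pl ID [ID ...]   (needs BioPerl and network access)
if (@ARGV) {
    require Bio::DB::EUtilities;
    for my $id (@ARGV) {
        my $response = with_retries(sub {
            my $req = Bio::DB::EUtilities->new(-db    => 'genome',
                                               -eutil => 'esummary',
                                               -id    => $id);
            return $req->get_Response;
        }, 3, 5);
        print $response->content, "\n";
        sleep 1;    # pause between requests so the server is not flooded
    }
}
```

If the "No route to host" error comes from the local network rather than from NCBI, retrying will only hide it, so it is worth logging how often the retries actually fire.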

Best,

-- 
 Alessandra Bilardi, Ph. D.
----
 CRIBI, University of Padova, Italy
 http://www.linkedin.com/in/bilardi
----

From maj at fortinbras.us  Fri Nov 20 13:42:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 13:42:38 -0500
Subject: [Bioperl-l] gravatars on the wiki
Message-ID: <94431678F3764E8C9A49EA4D2FCD0DBD@NewLife>

Hi all, 
You can now reveal your Gravatar (http://www.gravatar.com) on the wiki, by including 
the following markup on the page:

 <winterPreWiki>
 {{#gravatar|youremail -at- yourplace -dot- tld}}
 </winterPreWiki>

You can do the antispam measure above, or use a regular email. Invalid emails throw an error.
http://bioperl.org/wiki/Gravatars 
Happy coding, 
MAJ

From roychu at gmail.com  Fri Nov 20 15:23:21 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 12:23:21 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>

"sounds very much like you process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  As for asking their admins to
install it, it seems to be a "they'll try to get to it, but don't hold
your breath" situation.

Hmm, I tried a few other things; installing 1.4.0 posed no problems.
I'm not a Perl guru, so I tried increasing the build cache size from
the default of 10 MB, hoping that might be the problem. I can't see
how, though, since I can't imagine the package sizes differ by that
much between versions (honestly, I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem, I guess after
the 'make' step, while listing the modules:
BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
and it typically gets killed here: '> Killed'

Next, I tried 1.6.0, then I get this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
Something EQUALLY appreciated would be a recommendation of a decent
enough hosting service where someone has had success installing
BioPerl.  I'd try to set up my Mac's web sharing feature and then
set things up locally, but I haven't yet been able to get port
forwarding working properly on the Apple AirPort Extreme, which is
perplexing.  Next, I might just try to install via the Build.PL
script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  A heads-up about any false
assumptions on my part would be greatly appreciated.

Thanks in advance,
Roy

On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>
>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>
>>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>>> if anyone has had any luck installing the module as their daemon seems
>>> to kill my process whenever I try to install it.  Dreamhost tech
>>> support attributes it to either exceeding the allocated memory cache
>>> or exceeding the processing time.  I tried to nice the process, but
>>> that didn't help for me.  Any luck or experience in resolving this
>>> would be much appreciated.  I suppose my next attempt would be to try
>>> installing it directly and hope I don't need root...
>>
>> Dear Roy,
>>
>> DreamHost uses Debian, so you can suggest that they install the Debian package.
>> If you are in contact with the tech service, do not hesitate to tell them to
>> contact me if they are interested in a backport of the 1.6.0 package. For
>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>
> Any reason why this is so?  We specify compatibility back to 5.6.1.
>
> Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.
>
> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue.  This one may require a ranty blog post.
>
>> PS: if you propose to install BioPerl as a feature in the Dreamhost panel, I
>> will vote for it :)
>>
>> Have a nice day,
>>
>> --
>> Charles Plessy
>> Debian Med packaging team,
>> http://www.debian.org/devel/debian-med
>> Tsurumi, Kanagawa, Japan
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at illinois.edu  Fri Nov 20 15:40:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 14:40:24 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <1D1B0987-3309-4281-BCE0-2737E4F0D0B1@illinois.edu>

BioPerl is pure perl.  If you believe all dependencies are installed, just unpack the dist to a specific directory and point PERL5LIB at it (for bash):

export PERL5LIB=/home/USER/bioperl/bioperl-live

Note that if you plan on doing the same for other bioperl-related modules (e.g., bioperl-db) you'll need to append 'lib' to the path, as they use a generic Module::Build now.

export PERL5LIB=/home/USER/bioperl/bioperl-db/lib

You can also add a 'use lib' directive in your scripts.  More at the following link:

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#USING_MODULES_NOT_INSTALLED_IN_THE_STANDARD_LOCATION
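
As a rough illustration of the 'use lib' alternative, a script might start like the sketch below.  The paths are the placeholder ones from the export lines above, and module_available is a hypothetical helper added here only to check that the directories were picked up; it is not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder paths from the message above; adjust to the real unpack locations.
use lib '/home/USER/bioperl/bioperl-live';
use lib '/home/USER/bioperl/bioperl-db/lib';

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Report whether one module from each distribution is visible.
for my $module ('Bio::Seq', 'Bio::DB::BioDB') {
    printf "%-16s %s\n", $module,
        module_available($module) ? 'found' : 'not found';
}
```

If both lines print 'not found', the directories are most likely wrong or the tarballs were unpacked somewhere else.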

chris



From charles-listes+bioperl at plessy.org  Fri Nov 20 20:07:23 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Sat, 21 Nov 2009 10:07:23 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <20091121010723.GA7786@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 07:00:45AM -0600, Chris Fields wrote:
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
> > 
> > DreamHost uses Debian, so you can suggest that they install the Debian
> > package.  If you are in contact with the tech service, do not hesitate to
> > tell them to contact me if they are interested in a backport of the 1.6.0
> > package. For version 1.6.1, it may be more difficult as it depends on perl
> > 5.10.1.
> 
> Any reason why this is so?  We specify compatibility back to 5.6.1.

Dear Chris,

You make a good point: although for building we need either to depend on perl
5.10.1 or to package ExtUtils::Manifest separately, the resulting bioperl
package does not depend on such a recent version. Therefore, there is no need
for a backport, and the latest Debian package can be installed on a Debian
stable (5.0/Lenny) system. I just checked the Dreamhost machine to which I
happen to have access, "waratahs", and it seems to be older, but it may still
be worth asking the admins (with the big drawback that they would have to be
asked for each update).

Have a nice week-end,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan

From robert.bradbury at gmail.com  Fri Nov 20 20:40:14 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 20 Nov 2009 20:40:14 -0500
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
Message-ID: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>

I run a Linux system which is in a gradual process of evolution from the
default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
Google's Chromium (IMO, perhaps the best so far).  Chromium creates a
process per tab/URL, so one can effectively track what each is doing.  It
also lets one track the machine usage of these processes (through the
Developer > Task manager [shift-escape keyboard] option), which, though
expensive in terms of overhead, allows one to find offending windows (in
terms of memory or CPU use).

My processor recently jumped from its typical 700 MHz to 1.4 GHz range
(using the Linux Ondemand scheduler, which saves ~20 W at the wall outlet;
I've measured it) to the full-tilt 2.8 GHz the CPU is capable of.  Looking
at the Chrome task manager, I was not surprised to find the NY Times high on
the list (they are pushing content, esp. using JavaScript), but much to my
dismay the Jalview and Howto:Trees:Bioperl pages appeared to be high on the
list as well.  Now I am forced to ask myself *why* sites which are simply
distributing static information are eating up CPU on my machine!  This is a
fundamental flaw in the architecture of the sites, where there should be a
conscious effort to minimize user CPU use (or avoid JavaScript entirely).

This would not be a problem if I were using Firefox, as I can easily use
NoScript to block JavaScript from non-approved sites.  But it raises the
question of when one should allow JavaScript to run (one would "normally"
approve academic sites by default) when even the academic sites are abusing
my CPU.  There needs to be much greater awareness, on the part of both
software distributors and software consumers, that it is *MY* CPU and *MY*
electricity and *MY* contribution to global warming.  Developers and
distributors should not be sucking down those resources without first
asking "May I?" and giving me the option of saying "No, you may not."
There is enough we can do productively (running low-homology BLAST
searches) without engaging in endless wheel spinning of JavaScript or
looped GIFs.

Robert

From maj at fortinbras.us  Fri Nov 20 23:17:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:17:12 -0500
Subject: [Bioperl-l] ohlohers
Message-ID: <C003FAD20636489DBFB2D34F5955C68D@NewLife>

You can now add your Ohloh widgets and increase your carbon footprint with the less crufty:

 <winterPreWiki>
 {{#ohloh|acct_id|TYPE}}
 </winterPreWiki>

where TYPE is [Detailed|Rank|Tiny]. Taint checks aplenty.
MAJ

From maj at fortinbras.us  Fri Nov 20 23:33:02 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:33:02 -0500
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com><20091120104445.GG31318@kunpuu.plessy.org><ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <9ECC66C2F23F47469AF0F07E3F9307FC@NewLife>

Maybe 'nightmarehost' is more appropriate. I've had no problems on AWS,
but this may not be exactly what you need. MAJ


From cjfields at illinois.edu  Fri Nov 20 23:38:23 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 22:38:23 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
Message-ID: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>

Robert, 

Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in general) do not use JS, unless there is a specific addition I'm unaware of.  Now, the site wiki was recently 'parasited' for redirects, which may be the culprit, but this is now fixed.  Can you at least retest to see if this persists?

Anyone else know about this?

chris



From sdavis2 at mail.nih.gov  Sat Nov 21 00:11:34 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 20 Nov 2009 21:11:34 -0800
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
Message-ID: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>

On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Robert,
>
> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
> general) do not use JS, unless there is a specific addition I'm unaware of.
>  Now, the site wiki was recently 'parasited' for redirects, which may be the
> culprit, but this is now fixed.  Can you at least retest to see if this
> persists?
>
> Anyone else know about this?
>
>
The page in question does include JavaScript, judging from the source.
This is a function of using MediaWiki, though, I believe, and not something
specific to that page.

Sean

From cjfields at illinois.edu  Sat Nov 21 00:20:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 23:20:37 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
	<264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
Message-ID: <A7AC3865-3C9A-4C6E-85B5-349240C40680@illinois.edu>

On Nov 20, 2009, at 11:11 PM, Sean Davis wrote:

> On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:
> 
>> Robert,
>> 
>> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
>> general) do not use JS, unless there is a specific addition I'm unaware of.
>> Now, the site wiki was recently 'parasited' for redirects, which may be the
>> culprit, but this is now fixed.  Can you at least retest to see if this
>> persists?
>> 
>> Anyone else know about this?
>> 
>> 
> The page in question does include javascript, it appears from the source.
> This is a function of using mediawiki, though, I believe and not something
> specific to that page.
> 
> Sean

</sound of my hand slapping my forehead>

Sean, thanks for pointing that out.

chris

From robert.bradbury at gmail.com  Sat Nov 21 13:26:05 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Sat, 21 Nov 2009 13:26:05 -0500
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
Message-ID: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>

It sounds like NCBI may be counting the frequency of requests, how much
data they send, or something similar.  Are you delaying between fetches?
The code I've seen typically sleeps for a few seconds each time around a
loop.  You might try longer delays between fetches and see if that gets you
any more data.

Alternatively, perhaps the libraries aren't reusing the TCP/IP connection
properly.  Is there a difference between the amount of memory on the
machines?  Have you watched the size of the process to see if it grows over
time?  I think the bug which prevented me from fetching a not-so-large
genome a few months ago (eating up 3GB of memory in the process) has
not been resolved.  If so, that could be your problem.

Robert

On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
<alessandra.bilardi at gmail.com> wrote:
>
>
> I'm testing Bio::DB::EUtilities, the web agent which interacts with and
> retrieves data from NCBI's eUtils. My perl script works, but only if I
> call the get_Response function fewer than ~450 times; otherwise I get
> this error message:
>
> ------------- EXCEPTION -------------
> MSG: Response Error
> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
> STACK Bio::DB::GenericWebAgent::get_Response
> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
> STACK toplevel ./wget4gbk.pl:77
>

From cjfields at illinois.edu  Sat Nov 21 14:19:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 13:19:24 -0600
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
	<deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
Message-ID: <837CE7E7-E625-4285-AD54-06FD168C0DF3@illinois.edu>

NCBI has specific rules about the repeated queries to its servers:

http://eutils.ncbi.nlm.nih.gov/#UserSystemRequirements

According to that, if you are making over 100 requests at peak times you will run into problems (they'll probably temp-block your IP), even though the required delay between requests is much shorter now (the limit is 3 requests/second, whereas a year or two ago it was one request every 3 seconds).  In general it's best to run something like this during off-hours.

Enforcing a limit on the number of server requests is one specific part of Bio::DB::EUtilities that hasn't been added yet, but it is tentatively planned.
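
Until something like that exists, a script can pace its own requests.  The following is only a minimal sketch, not anything from Bio::DB::EUtilities: run_throttled is a hypothetical helper, the jobs are stand-in closures rather than real eUtils calls, and the 0.34 second interval is simply chosen to stay under the 3 requests/second figure mentioned above.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper: run each job in turn, keeping at least
# $min_interval seconds between the start of consecutive jobs.
sub run_throttled {
    my ($min_interval, @jobs) = @_;
    my @results;
    my $last_start;
    for my $job (@jobs) {
        if (defined $last_start) {
            my $wait = $min_interval - (time() - $last_start);
            sleep($wait) if $wait > 0;
        }
        $last_start = time();
        push @results, $job->();
    }
    return @results;
}

# Stand-in jobs; in a real script each closure would issue one request.
my @jobs = map { my $n = $_; sub { "response $n" } } 1 .. 4;
my @out  = run_throttled(0.34, @jobs);
print "$_\n" for @out;
```

Pacing by start time rather than sleeping a fixed amount after each job means slow responses do not add extra delay on top of the interval.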

chris



From cjfields at illinois.edu  Sat Nov 21 21:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 20:58:37 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
Message-ID: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>

Jason and I were recently interviewed (Wednesday!) about BioPerl for FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and Kirsten Sanford.  The interview is now available online, so get your favorite flavor (MP3, podcast) here:

http://twit.tv/floss96

Enjoy!

chris and jason

From adsj at novozymes.com  Sun Nov 22 07:37:40 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Sun, 22 Nov 2009 13:37:40 +0100
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu> (Chris
	Fields's message of "Sat, 21 Nov 2009 20:58:37 -0600")
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
Message-ID: <87aaye91m3.fsf@topper.koldfront.dk>

On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:

> Jason and I were recently interviewed (Wednesday!) about BioPerl for
> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
> Kirsten Sanford.

Great!

How about linking to it on bioperl.org?


  :-),

   Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From cjfields at illinois.edu  Sun Nov 22 15:30:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 22 Nov 2009 14:30:01 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <87aaye91m3.fsf@topper.koldfront.dk>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
	<87aaye91m3.fsf@topper.koldfront.dk>
Message-ID: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
> 
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
> 
> Great!
> 
> How about linking to it on bioperl.org?
> 
> 
>  :-),
> 
>   Adam
> 
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main page.  

Since this is the second such interview (Jason did one a few years back for PerlCast), I'm thinking we need a media page of some sort.

chris

From maj at fortinbras.us  Sun Nov 22 15:48:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 22 Nov 2009 15:48:39 -0500
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu><87aaye91m3.fsf@topper.koldfront.dk>
	<2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
Message-ID: <247658CC6D9A4529B281F4482BD3E4BD@NewLife>

We do have http://www.bioperl.org/wiki/Category:BioPerl_Media --


From jardim.rodrigo at gmail.com  Sun Nov 22 11:06:40 2009
From: jardim.rodrigo at gmail.com (Rodrigo Jardim)
Date: Sun, 22 Nov 2009 14:06:40 -0200
Subject: [Bioperl-l] Problems with Genbank Proteins File
Message-ID: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>

I am having a problem parsing a GenBank protein file. I think this is because
the file has a different order of fields. For example:

In most general genbank files:
========================
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi

In genbank protein files:
===================
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.

When I try to parse it, BioPerl aborts with an error message.

Any ideas?

Thanks all,

-- 
Atc,
Rodrigo Jardim
jardim.rodrigo at gmail.com

From biopython at maubp.freeserve.co.uk  Mon Nov 23 12:36:36 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 23 Nov 2009 17:36:36 +0000
Subject: [Bioperl-l] Problems with Genbank Proteins File
In-Reply-To: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
References: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
Message-ID: <320fb6e00911230936ofb9d897rbd45abb73a361250@mail.gmail.com>

On Sun, Nov 22, 2009 at 4:06 PM, Rodrigo Jardim
<jardim.rodrigo at gmail.com> wrote:
> I am having a problem parsing a GenBank protein file. I think this is because
> the file has a different order of fields. For example:
>
> ...
>
> When I try to parse it, BioPerl aborts with an error message.
>
> Any ideas?

There are some important bits of information missing - what is the error
message, and what version of BioPerl are you using?

Peter
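
Both pieces of information can be collected with a few lines of Perl.
This is only a sketch: 'proteins.gp' is a placeholder file name and
try_parse_genbank is a made-up helper, not a BioPerl function.
Bio::SeqIO is loaded inside the eval so that a missing install is
reported in the same way as a parse error.

```perl
use strict;
use warnings;

# Try to read every record in a GenBank file. Returns the error text,
# or an empty string if the whole file parsed.
sub try_parse_genbank {
    my ($file) = @_;
    my $ok = eval {
        require Bio::SeqIO;
        my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
        1 while $in->next_seq;    # walk through all records
        1;
    };
    return $ok ? '' : ($@ || 'unknown error');
}

# Report the installed BioPerl version, if any.
my $version = eval { require Bio::Root::Version; $Bio::Root::Version::VERSION };
print 'BioPerl version: ', (defined $version ? $version : 'not found'), "\n";

# 'proteins.gp' is a placeholder; substitute the file that fails.
my $error  = try_parse_genbank('proteins.gp');
my $report = $error ? "Parse failed:\n$error\n" : "Parsed without error\n";
print $report;
```

Pasting the printed version and the full error text into a reply is
usually enough for someone on the list to reproduce the problem.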

From maj at fortinbras.us  Mon Nov 23 12:58:46 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 23 Nov 2009 12:58:46 -0500
Subject: [Bioperl-l] building samtools/Bio::DB::Sam on cygwin
Message-ID: <FD03906C0D074E1B8AFDB89A283E9FAB@NewLife>

Hi All--

I've had some hard-won success installing samtools and Lincoln's
Bio::DB::Sam under cygwin; thought some on the list would be able to
use my notes. (Yes, Jason, I'm working on Bio::Tools::Run::BWA...)


(To get the current samtools, ping
http://sourceforge.net/projects/samtools/files/samtools/0.1.7/samtools-0.1.7a.tar.bz2/download
)

* Getting samtools to make from scratch in cygwin

The following diff details the changes to the samtools Makefile I made
by hand. The key points are

-D_WIN32

and the additional variable LFLAGS and its interpolations. To get the
linker to see

libgcc libstdc++

I needed to add symlinks from /lib to the correct files in
/lib/gcc/i386-pc-cygwin/4.3.2/. Your gcc version may differ.


--- ../old/samtools-0.1.7a/Makefile 2009-11-16 10:13:43.000000000 -0500
+++ Makefile 2009-11-23 12:14:18.529000000 -0500
@@ -1,16 +1,18 @@
 CC=   gcc
 CFLAGS=  -g -Wall -O2 #-m64 #-arch ppc
-DFLAGS=  -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1
+LFLAGS=         -lws2_32 -lgcc -lcygwin -lbz2 -lz -lstdc++
+DFLAGS=  -D_WIN32 -D_FILE_OFFSET_BITS=64 -D_CURSES_LIB=1
 LOBJS=  bgzf.o kstring.o bam_aux.o bam.o bam_import.o sam.o bam_index.o \
    bam_pileup.o bam_lpileup.o bam_md.o glf.o razf.o faidx.o knetfile.o \
    bam_sort.o sam_header.o
 AOBJS=  bam_tview.o bam_maqcns.o bam_plcmd.o sam_view.o \
    bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o \
    bamtk.o kaln.o

@@ -36,13 +38,13 @@
   $(AR) -cru $@ $(LOBJS)
 
 samtools:lib $(AOBJS)
-  $(CC) $(CFLAGS) -o $@ $(AOBJS) -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam
+  $(CC) $(CFLAGS) -o $@ $(AOBJS) -Xlinker --enable-auto-import -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam $(LFLAGS)
 
 razip:razip.o razf.o knetfile.o
-  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz
+  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz -lm -lws2_32
 
 bgzip:bgzip.o bgzf.o
-  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz
+  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz -lm -lws2_32
 
 razip.o:razf.h
 bam.o:bam.h razf.h bam_endian.h kstring.h sam_header.h

* Getting Bio::DB::Sam to compile and install

Bio::DB::Sam requires not the samtools.exe, but the bam library
created during the samtools build, as well as all the samtools header
files. Create a symlink in /lib to libbam.a in the build directory (or
copy libbam.a up to /lib), and create symlinks or copy *.h into
/usr/include. Then in cygwin bash shell

$ cpan
cpan> install Bio::DB::Sam

should fly. 

Hope someone finds this useful. These mods led me to a successful
Bio::DB::Sam install--have not yet checked original code based on
Bio::DB::Sam. If they don't work for you, reply to the list.

cheers, 
MAJ 


From jcline at ieee.org  Mon Nov 23 14:13:26 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 23 Nov 2009 13:13:26 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
References: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
Message-ID: <4B0ADED6.8040901@ieee.org>

Dreamhost has terrible reliability.  I have stats going back years on a
standard Dreamhost hosting account (non-dedicated server), and on some
days the web server doesn't respond.  Dreamhost service is OK for a
hobby blog; however, it is definitely *not* suitable for anything real.
Add in latency, arbitrary account limits/restrictions, etc., and it is a
bad idea to host a project there.  Although some users apparently get
lucky with server allocation and end up on a "good server", the provider
can change this at any time as well.  I think, more typically, account
users don't notice, since most are simple bloggers.

Here's a data snip that illustrates the problem with a typical dreamhost
account:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2008-08-05     91.40     0.000     0.528     0.528     2.257     1.619
2008-08-04     89.13     0.002     0.301     0.301     1.302     0.971
2008-08-03     94.62     0.000     0.567     0.567     1.506     0.913
2008-08-02    100.00     0.000     0.335     0.335     1.475     1.079
2008-08-01    100.00     0.000     0.310     0.310     1.587     0.825
2008-07-31     93.55     0.023     0.386     0.386     1.280     0.759
2008-07-30    100.00     0.000     0.345     0.345     1.373     0.860
2008-07-29    100.00     0.000     0.358     0.358     1.335     0.757
2008-07-28    100.00     0.000     0.327     0.327     1.462     0.896
2008-07-27    100.00     0.000     0.292     0.292     1.410     0.966
2008-07-26    100.00     0.000     0.283     0.283     1.280     0.815
2008-07-25    100.00     0.000     0.297     0.297     1.231     0.853
2008-07-24    100.00     0.000     0.362     0.362     1.258     0.699
2008-07-23    100.00     0.000     0.339     0.339     1.270     0.785

----------------------------------------------------------------------
minimum        89.13     0.000     0.283     0.283     1.231     0.699
maximum       100.00     0.023     0.567     0.567     2.257     1.619
average        97.76     0.002     0.359     0.359     1.430     0.914
----------------------------------------------------------------------


Or this month:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2009-11-11    100.00     0.011     0.097     0.097     1.260     1.638
2009-11-10    100.00     0.008     0.094     0.094     1.285     1.647
2009-11-09    100.00     0.008     0.094     0.094     1.494     1.872
2009-11-08    100.00     0.015     0.101     0.101     1.509     1.894
2009-11-07    100.00     0.006     0.092     0.092     1.453     1.831
2009-11-06    100.00     0.011     0.097     0.097     1.500     1.882
2009-11-05     97.80     0.012     0.097     0.097     1.445     1.806
2009-11-04    100.00     0.010     0.096     0.096     1.235     1.605
2009-11-03     95.65     0.007     0.093     0.093     1.266     1.612
2009-11-02    100.00     0.010     0.096     0.096     1.267     1.637
2009-11-01    100.00     0.007     0.093     0.093     1.311     1.692
2009-10-31    100.00     0.009     0.095     0.095     1.225     1.594
2009-10-30    100.00     0.009     0.095     0.095     1.364     1.739
2009-10-29    100.00     0.017     0.103     0.103     1.121     1.505

----------------------------------------------------------------------
minimum        95.65     0.006     0.092     0.092     1.121     1.505
maximum       100.00     0.017     0.103     0.103     1.509     1.894
average        99.53     0.010     0.096     0.096     1.338     1.711
----------------------------------------------------------------------
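
The summary rows can be rechecked from the daily uptime column. The
sketch below copies the uptime values from the two tables and
recomputes minimum, maximum and average; the downtime estimate assumes
each percentage is the share of that day on which the server answered,
which the tables themselves do not state.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Daily uptime percentages copied from the two tables (14 days each).
my @uptime_2008 = (91.40, 89.13, 94.62, 100.00, 100.00, 93.55, (100.00) x 8);
my @uptime_2009 = ((100.00) x 6, 97.80, 100.00, 95.65, (100.00) x 5);

# Summarize a list of numbers as (minimum, maximum, mean).
sub summarize {
    my @values = @_;
    return (min(@values), max(@values), sum(@values) / @values);
}

my ($min_2008, $max_2008, $mean_2008) = summarize(@uptime_2008);
my ($min_2009, $max_2009, $mean_2009) = summarize(@uptime_2009);

for my $row ([2008, $min_2008, $max_2008, $mean_2008, scalar @uptime_2008],
             [2009, $min_2009, $max_2009, $mean_2009, scalar @uptime_2009]) {
    my ($year, $min, $max, $mean, $days) = @$row;
    # Assumed reading: downtime hours = (100 - mean uptime)% of the period.
    my $down_hours = (100 - $mean) / 100 * $days * 24;
    printf "%d: minimum %.2f  maximum %.2f  average %.2f  (~%.1f h down)\n",
        $year, $min, $max, $mean, $down_hours;
}
```

This reproduces the 97.76 and 99.53 averages shown in the tables.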


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From cjfields at illinois.edu  Mon Nov 23 22:19:02 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 23 Nov 2009 21:19:02 -0600
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
Message-ID: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>

Okay, so I think it's feasible to add this into trunk.  I like the idea of optionally having a log class; if someone comes up with a nice way of adding it in, I would be for it.

chris

On Nov 20, 2009, at 4:15 AM, Dave Messina wrote:

> Chris, I took a look at how you implemented this in Biome -- very nice!
> 
> 
>> I like this verbose/strict separability a lot. Should we go for it?
> 
> Me too. So yes, I think so.
> 
> 
>> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
> 
> 
> Perhaps this is a job for Log::Log4perl or Log::Dispatch?
> http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
> http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm
> 
> 
> That might be overkill, though.
> 
> Dave
> 


From David.Messina at sbc.su.se  Tue Nov 24 11:18:22 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 24 Nov 2009 17:18:22 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
	<167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
Message-ID: <3FD2086D-062F-4706-9DC8-2A53224C4913@sbc.su.se>

> I like the idea of optionally having a log class, if someone comes up with a nice way of adding it in I would be for it.

My suggestion of the logging modules was actually to handle the various levels of verbose output -- I think both of the ones I mentioned "log" to STDERR by default.

But of course a nice side effect of using such a logging module is that it would allow optional logging to a file, too.

Dave
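
To make the verbose/strict separation concrete, here is a rough sketch
in plain Perl. Everything in it is hypothetical: Sketch::Reporter, its
level numbers and its "sink" argument are invented for illustration and
are not the Biome, BioPerl, Log::Log4perl or Log::Dispatch interfaces.
Verbosity decides what gets printed, strictness decides whether a
warning becomes an exception, and the sink defaults to STDERR but could
just as well append to a file.

```perl
use strict;
use warnings;

package Sketch::Reporter;    # hypothetical name, not a BioPerl or Biome class

# verbose: 0 = quiet, 1 = warnings, 2 = warnings and debug messages
# strict:  true = warnings are thrown as exceptions
# sink:    code ref that receives each message (defaults to STDERR)
sub new {
    my ($class, %args) = @_;
    my $self = {
        verbose => defined $args{verbose} ? $args{verbose} : 1,
        strict  => $args{strict} ? 1 : 0,
        sink    => $args{sink} || sub { print STDERR @_ },
    };
    return bless $self, $class;
}

sub debug {
    my ($self, $msg) = @_;
    $self->{sink}->("DEBUG: $msg\n") if $self->{verbose} >= 2;
    return;
}

sub warning {
    my ($self, $msg) = @_;
    die "EXCEPTION: $msg\n" if $self->{strict};                   # strictness decides
    $self->{sink}->("WARNING: $msg\n") if $self->{verbose} >= 1;  # verbosity decides
    return;
}

package main;

# A chatty, non-strict object that collects its messages in an array.
my @log;
my $chatty = Sketch::Reporter->new(verbose => 2, sink => sub { push @log, @_ });
$chatty->debug('opening file');
$chatty->warning('odd record');

# A silent but strict object: prints nothing, yet still throws.
my $quiet_strict = Sketch::Reporter->new(verbose => 0, strict => 1);
my $died = !eval { $quiet_strict->warning('odd record'); 1 };

print @log;
print "strict object threw: $@" if $died;
```

A real logging module would replace the sink and the level checks; the
point here is only that the two settings do not need to share one number.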


From paolo.pavan at gmail.com  Tue Nov 24 14:28:09 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Tue, 24 Nov 2009 20:28:09 +0100
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
Message-ID: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>

Dear all,
I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
As documented in the POD, the run(@seqs) method returns the cap3 report file,
while I expected it to return a Bio::Assembly object, consistent with other
Bio::Tools::Run classes.
I worked around this by getting the location and names of the temp output
files from the factory object (admittedly by accessing a private property)
and reading them via the Assembly::IO system.
I was just wondering what the properly designed way to do this job is.

Thank you for lighting the way!
Paolo

From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:04:31 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:04:31 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>

Is there any way to pass a filename to Bio::DB::Fasta for the location of where to write the directory.index?
It's writing in the same dir as the fasta but I'd rather have it write in /tmp as it's part of a web app.

Thanx,

Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:21:52 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:21:52 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>

That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended up putting the filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
> 
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
> 
> 
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell


From maj at fortinbras.us  Tue Nov 24 17:18:51 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 17:18:51 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
Message-ID: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>

The code (method index_dir() ) seems to expect all the fasta files to be 
contained in that directory. Looks hairy; what about creating symlinks to your 
fasta files in a /tmp subdir and calling new() with that subdir?
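
A minimal sketch of that symlink idea, under some assumptions:
stage_fasta is a made-up helper, the tiny test.faa file is invented
data, and the indexing step only runs if Bio::DB::Fasta can be loaded.
The scratch directory is removed at exit here (CLEANUP => 1); a web app
would more likely want a fixed subdirectory of /tmp so the index
survives between requests.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;
use File::Temp qw(tempdir);

# Symlink the given FASTA files into a fresh scratch directory and return
# its path, so the index is written there instead of next to the data.
sub stage_fasta {
    my @fasta_files = @_;
    my $dir = tempdir('fasta_index_XXXXXX', TMPDIR => 1, CLEANUP => 1);
    for my $file (@fasta_files) {
        my $link = File::Spec->catfile($dir, basename($file));
        symlink(File::Spec->rel2abs($file), $link)
            or die "cannot link $file into $dir: $!";
    }
    return $dir;
}

# Small made-up FASTA file standing in for the real data.
my $data_dir = tempdir(CLEANUP => 1);
my $fasta    = File::Spec->catfile($data_dir, 'test.faa');
open my $out, '>', $fasta or die "cannot write $fasta: $!";
print {$out} ">p1\nMKVLAA\n";
close $out;

my $index_dir = stage_fasta($fasta);

# Index the staged directory only if Bio::DB::Fasta is installed.
if (eval { require Bio::DB::Fasta; 1 }) {
    my $db = Bio::DB::Fasta->new($index_dir);
    print "p1: ", $db->seq('p1'), "\n";
}
else {
    print "Bio::DB::Fasta not available; files staged in $index_dir\n";
}
```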
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'bioperl-l'" <bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:04 PM
Subject: [Bioperl-l] Bio::DB::Fasta


> Is there any way to pass a filename to Bio::DB::Fasta for the location of 
> where to write the directory.index?
> It's writing in the same dir as the fasta but I'd rather have it write in /tmp 
> as it's part of a web app.
>
> Thanx,
>
> Russell


From florent.angly at gmail.com  Tue Nov 24 17:54:48 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 24 Nov 2009 14:54:48 -0800
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
In-Reply-To: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
References: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
Message-ID: <4B0C6438.8070405@gmail.com>

Hi Paolo,

It turns out that there is no standard for what is passed to the
Bio::Tools::Run wrappers and what they return. I noticed the
inconsistency between the assembly wrappers recently while implementing
support for a new wrapper. I implemented initial support for additional de
novo assembly programs in BioPerl (454 Newbler and Minimo) a couple of
weeks ago, and Mark Jensen added support for Maq, a program that
assembles reads against a reference. In the process, all the assembly
wrappers were changed to take the same type of input data (a FASTA
file or an array reference of sequence objects) and return one of
the following:
    * a Bio::Assembly::Scaffold object (the default), or
    * a Bio::Assembly::IO object, or
    * the name of a file for the output of the assembler
Use the out_type method to set which output you want, e.g.:
    $factory->out_type('Bio::Assembly::IO');
or
    $factory->out_type('cap3_results.ace');
You'll have to use the code in the bioperl-run Subversion repository if you
want to use these new features.

Cheers,

Florent


Paolo Pavan wrote:
> Dear all,
> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
> As documented in the POD, the run(@seqs) method returns the cap3 report file,
> while I expected it to return a Bio::Assembly object, consistent with other
> Bio::Tools::Run classes.
> I worked around this by getting the location and names of the temp output
> files from the factory object (admittedly by accessing a private property)
> and reading them via the Assembly::IO system.
> I was just wondering what the properly designed way to do this job is.
>
> Thank you for lighting the way!
> Paolo


From roychu at gmail.com  Tue Nov 24 18:00:58 2009
From: roychu at gmail.com (Roy)
Date: Tue, 24 Nov 2009 15:00:58 -0800
Subject: [Bioperl-l] Remote Blast - same script but different results
Message-ID: <4d7f3e450911241500y7df305acq1d03819ea1ec7d3e@mail.gmail.com>

Hi bioperl community,

I've tried searching the old lists to see if this topic has been
covered, and perhaps this question arises from my own lack of
familiarity with BLAST, but the Perl script below gives me different
results from remote BLAST on different runs: I either get hits or no
hits at all.  I'll call the script once and get no hits, then call it
again with the same parameters and get several hits, the same ones I
got on earlier successful runs.  I use a subroutine to parse the BLAST
report information, and a boolean to indicate whether results were
returned or not.  Any insight into what I may have missed would be
appreciated.  Short question: is this behavior typical?  My
understanding of how BLAST works is that it shouldn't be...


Thanks in advance,
Roy

#!/usr/bin/perl -w

use strict;
use warnings;
use Carp;
use Bio::Perl;
use CGI;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;
use Bio::Restriction::Analysis;
use Bio::Tools::Run::RemoteBlast;

use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::LocatableSeq;

my $five_seqobj = Bio::Seq->new(
		-seq		=>	'ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGCCAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCGAGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG',
		-display_id	=>	'genomic_a',
		-alphabet 	=>	'dna',
	);
my $three_seqobj = Bio::Seq->new(
		-seq		=>	'GTGAGTGCGCGGCCGCTCTGCGGGCGCAGAGGGAGCGGGAGGGAGCCGGCGGCACGAGGTTGGCCGGGGCAGCCTGGGCCTAGGCCAGAGGGAGGGCAGCCACAGGGTCCAGGGCGAGTGGGGGGATTGGACCAGCTGGCGGCCCCTGCAGGCTCAGGATGGGGGGCGCGGGATGGAGGGGCTGAGGAGGGGGTCTCCGGAGCCTGCCTC',
		-display_id	=>	'genomic_b',
		-alphabet 	=>	'dna',
	);

my @params = (
'-program' => 'blastn',
'-database' => 'refseq_genomic',
'-expect' => '10',
'-readmethod' => 'blastxml'
);
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$Bio::Tools::Run::RemoteBlast::HEADER{'PERC_IDENT'} = 75;
$Bio::Tools::Run::RemoteBlast::HEADER{'FORMAT_TYPE'} = 'XML';
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'HITLIST_SIZE'} = 100; # Put: limit number of hits

my $factory_a = Bio::Tools::Run::RemoteBlast->new(@params);
$factory_a->retrieve_parameter('FORMAT_TYPE', 'XML');

my $hits_a;
my $hits_b;

my $r;
my $bool_hit;
print "Submitting BLAST query - 5' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$r = $factory_a->submit_blast($five_seqobj);
# fetch_blast_report() returns a list, so capture both values
($bool_hit, $hits_a) = fetch_blast_report($factory_a);
unless ($bool_hit) {
	print "\nNo hits\n";
	print "Re-submitting BLAST query - 5' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_a->submit_blast($five_seqobj);
	($bool_hit, $hits_a) = fetch_blast_report($factory_a);
	if ($bool_hit == 0) { print "No hits\n"; }
	sleep 5;
}

my $factory_b = Bio::Tools::Run::RemoteBlast->new(@params);
print "\n--------------------------------------------------\n\n";
print "Submitting BLAST query - 3' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$r = $factory_b->submit_blast($three_seqobj);
# fetch_blast_report() returns a list, so capture both values
($bool_hit, $hits_b) = fetch_blast_report($factory_b);
unless ($bool_hit) {
	print " No hits\n";
	print "Re-submitting BLAST query - 3' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_b->submit_blast($three_seqobj);
	($bool_hit, $hits_b) = fetch_blast_report($factory_b);
	if ($bool_hit == 0) { print " No hits\n"; }
	sleep 5;
}

print "\nbye\n\n";

print "$hits_a\n$hits_b\n";

exit;

sub fetch_blast_report {
	my ($factory) = @_;
	my $v = 1;
	my $bool_hit = 0;
	my $hits = '';
	
	print STDERR "waiting...";
	while (my @rids = $factory->each_rid) {
		foreach my $rid (@rids) {
			print STDERR ".";
			my $rc = $factory->retrieve_blast($rid);
			# retrieves blast report from remote blast queue,
			# returns -1 on error, 0 on 'job not finished', Bio::SearchIO object
			# args, remote blast id (rid)
			if (!ref($rc)) {
				# if not empty string, ref EXPR returns a non-empty string if EXPR is a reference
				if ($rc < 0) {
					$factory->remove_rid($rid);
				}
				print STDERR "." if ($v > 0);
				# is this printing out as multiple dots? when and why?
				sleep 5;
			} else {
				$bool_hit = 1;
				my $result = $rc->next_result();
				unless ($result->num_hits > 0) {
					$bool_hit = 0;
				}
				# returns: Bio::Search::Result::ResultI object
				$factory->remove_rid($rid);
				print "\ndatabase:\t", $result->database_name,"\n";
				print "query name:\t", $result->query_name,"\n";
				print "query length\t", $result->query_length,"\n";
				print "num hits\t", $result->num_hits,"\n";
				if ($result->num_hits) {
					# $result->hits returns an array of hits
					# $results->no_hits_found, boolean vs $#{@hits} ie. filtering\
					while (my $hit = $result->next_hit) {
					
					print "\nhit name:\t", $hit->name,"\n";	
					print "description:\t", $hit->description,"\n";	
					print "locus:\t", $hit->locus,"\n";	
					print "algorithm: ", $hit->algorithm, "\thit length: ", $hit->length,
						"\thit ranking: ", $hit->rank, "\n";
					while (my $hsp = $hit->next_hsp) {
						print "evalue: ", $hsp->evalue, "\tscore: ", $hsp->score,
							"\tpercent_id: ", $hsp->percent_identity, "\n";
						print "query_start: ", $hsp->query->start, "\tquery_end: ", $hsp->query->end;
						print "\tquery_length: ", $hsp->query->length,
							"\tquery_strand: ", $hsp->strand('query'), "\n";
						print "subject_start: ", $hsp->subject->start, "\tsubject_end: ", $hsp->subject->end;
						print "\tsubject_length: ", $hsp->subject->length,
							"\tsubject_strand: ", $hsp->strand('subject'), "\n\n";
						my $aln = $hsp->get_aln;
						if ($aln->is_flush) {
							foreach my $seq ($aln->each_seq) {
								print $seq->seq,"\n";
							}
							print $aln->gap_line, "\n";
							print $aln->consensus_string(95), "\n\n";
						}

						$hits .= $hit->name."\t".$hsp->subject->start."\t".$hsp->subject->end."\t".$hsp->strand('subject')."\n";
					}
				}		
			}
		}
	}
	return ($bool_hit, $hits);
}
}
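
One reading of the script as posted, offered as a guess rather than a
diagnosis: the return statement in fetch_blast_report sits inside the
while loop over $factory->each_rid, so the subroutine gives up after a
single pass. If the job is still running at that moment,
retrieve_blast returns 0 and the caller sees "no hits"; if the server
happens to be fast, the hits come back. The sketch below reproduces
that pattern in plain Perl. make_fake_queue and poll are hypothetical
stand-ins, not part of Bio::Tools::Run::RemoteBlast, and the sleeps are
left out so it runs instantly.

```perl
use strict;
use warnings;

# Stand-in for $factory->retrieve_blast($rid): answers 0 ("not finished")
# for the first $polls_until_ready calls, then returns a result reference.
sub make_fake_queue {
    my ($polls_until_ready) = @_;
    return sub {
        return $polls_until_ready-- > 0 ? 0 : +{ num_hits => 3 };
    };
}

# Poll one pending job. With $return_early set, return after the first
# pass over the pending list (the pattern in the posted subroutine);
# otherwise keep polling until the job has been retrieved.
sub poll {
    my ($retrieve, $return_early) = @_;
    my %pending = (RID1 => 1);
    my $found   = 0;
    while (my @rids = keys %pending) {
        for my $rid (@rids) {
            my $rc = $retrieve->($rid);
            next unless ref $rc;              # job not finished yet
            $found = 1 if $rc->{num_hits} > 0;
            delete $pending{$rid};
        }
        return $found if $return_early;       # return inside the while loop
    }
    return $found;                            # return after the while loop
}

print "slow server, early return:   ", poll(make_fake_queue(2), 1), "\n";
print "slow server, return at end:  ", poll(make_fake_queue(2), 0), "\n";
print "fast server, early return:   ", poll(make_fake_queue(0), 1), "\n";
```

If this reading is right, moving the return below the closing brace of
the while loop should make the runs consistent.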

From maj at fortinbras.us  Tue Nov 24 23:12:13 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 23:12:13 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
Message-ID: <3ECFA0236D1B467181EE63C8C6BE7E1F@NewLife>

I seem to be able to do
$db = Bio::DB::Fasta->new("$tmp/test.faa");
without a problem- something in the mixing of named and unnamed parameters?
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'bioperl-l'" 
<bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:21 PM
Subject: RE: [Bioperl-l] Bio::DB::Fasta


That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended up putting the 
filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
>
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
>
>
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell


From maj at fortinbras.us  Wed Nov 25 12:25:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 12:25:30 -0500
Subject: [Bioperl-l] question for all regarding a sam-based Bio::Assembly::IO
Message-ID: <1E72D5B0A190448FA27545DB5B68638D@NewLife>

Short-readers, 

I'm working on an Assembly::IO class for sam alignments.
I'm currently making a decision about handling multiple reference sequences:
would you prefer that next_assembly() return an assembly that covers all reference
sequences, or that next_assembly iterates over each reference sequence?
(Or both?)

thanks for your input-
MAJ

From timbourine81 at gmail.com  Wed Nov 25 12:40:52 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:40:52 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
	new file
Message-ID: <4B0D6C24.2080308@gmail.com>

Dear bioperl users,

I am a real newbie and have a (maybe very trivial) question.

I searched the mailing list archive and many howtos but I have not found
a concrete answer to my problem. So hopefully you can help me :)

Background: I use the latest BioPerl version (installed two weeks
ago).
When I use Bio::Tools::Run::StandAloneBlast to BLAST one FASTA file
containing several sequences, I get a BLAST output with many queries,
each having several hits / sbjcts.

My problem is how to parse *all* hits of *one* query into a single new
file, and to do this for every query in my BLAST output file.

Or is it better the other way round: first make FASTA files with a
single sequence each and BLAST each file? But how can I automate that
using BioPerl?

I tried Bio::SearchIO but could only parse all queries and their
respective hits into one file...
I think iteration is also necessary here, but I do not really know how
to do that with Bio::SearchIO.
Or do I have to use the module Bio::Index::Blast?

I can index a file (see below), but I have no idea what comes next...

###How I index a file...

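#!/usr/bin/perl
use strict;
use warnings;
use Bio::Index::Fasta;

# Use SDBM_File as the index backend
$ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";

my $file_name = "8_to_BLAST_two_seq_index.fasta";
my $id        = "48882";    # sequence ID, not used yet

# Build the index file next to the FASTA file
my $inx = Bio::Index::Fasta->new(-filename   => $file_name . ".idx",
                                 -write_flag => 1);
$inx->make_index($file_name);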


Hopefully, you can give me at least hints what to look for.

A big THANKS in advance!

Cheers,

Tim

From timbourine81 at gmail.com  Wed Nov 25 12:53:34 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:53:34 +0100
Subject: [Bioperl-l] How to parse different (fasta) files
Message-ID: <4B0D6F1E.8@gmail.com>

Hey everybody,

another question from me...if you do not mind :)

My situation is like this: I have parsed a standalone BLAST output using
SearchIO, keeping only the hit names. Now I have a second fasta file with
the same sequences as in the BLAST database, but including an alignment
(meaning "." and "-"). (Unfortunately, there is no way to make a BLAST
database from fasta files that include the alignment.)
My intention is now to take the names of the hit sequences (BLAST output),
get the corresponding aligned sequences (fasta file incl. alignment),
and put them in a new file.

Is anybody out there who has tried that before?

Again, I am an absolute greenhorn in using (Bio)perl. Maybe it is very
simple :D

Looking forward to your answer.

All the best,

Tim
-- 
Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999
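
A possible way to approach the lookup described in this message, sketched in plain Perl so that it runs without BioPerl installed. The sequence names, the inline FASTA text and the list of hit names are all made-up placeholders; in a real script the hit names would come from the SearchIO parsing step and the FASTA records could equally be read with Bio::SeqIO. This is only one way to do it, not an established recipe from the list.

```perl
use strict;
use warnings;

# Hypothetical aligned FASTA text; in practice this would be read from a file.
my $aligned_fasta = <<'FASTA';
>seq1 first
AC-GT..A
>seq2 second
A--GTT.A
>seq3 third
ACCGT..A
FASTA

# Hit names as they might come out of the parsed BLAST report (assumed values).
my @hit_names = ( 'seq3', 'seq1' );

# Index the records by the first word of each header line.
sub index_fasta {
    my ($text) = @_;
    my ( %seq_for, $id );
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^>(\S+)/ ) {
            $id = $1;
            $seq_for{$id} = '';
        }
        elsif ( defined $id ) {
            $seq_for{$id} .= $line;    # keeps "." and "-" untouched
        }
    }
    return \%seq_for;
}

my $seq_for = index_fasta($aligned_fasta);

# Collect the aligned sequence of each hit, skipping names that are absent.
my $output = '';
for my $name (@hit_names) {
    next unless exists $seq_for->{$name};
    $output .= ">$name\n$seq_for->{$name}\n";
}
print $output;
```

The hash lookup keeps the order of the hit list, so the new file follows the BLAST ranking rather than the order of the alignment file.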

From maj at fortinbras.us  Wed Nov 25 13:20:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 13:20:03 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
	innew file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>

hey Tim--

Sounds like you need to go about collecting your queries inside out:

my %hits_by_query;
for ($result->hits) {
  push @{$hits_by_query{$hit->name}} $hit;
}

I believe now each hash element, keyed by the query name, will contain
an arrayref to the set of hits assoc with that query.
From here, I believe

use Bio::Search::Result::BlastResult;
use Bio::SearchIO;

foreach my $qid ( keys %hits_by_query ) {
  my $result = Bio::Search::Result::BlastResult->new();
  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
  $blio->write_result($result);
}

will do what you want.

hope this helps -
Mark
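
As written, the first snippet above needs a comma after the array argument of push and a named loop variable before it will compile. The grouping idea itself can be sketched in plain Perl as below. The (query, hit) pairs are invented placeholders for what a next_result / next_hit loop over a Bio::SearchIO report might yield, and keying on the query name rather than the hit name is an assumption about the intent.

```perl
use strict;
use warnings;

# Hypothetical (query, hit) pairs, standing in for what a
# next_result / next_hit loop over a parsed report would yield.
my @pairs = (
    [ 'query1', 'hitA' ],
    [ 'query1', 'hitB' ],
    [ 'query2', 'hitC' ],
);

# Group hit names under the query they belong to.
sub group_hits_by_query {
    my (@records) = @_;
    my %hits_by_query;
    for my $rec (@records) {
        my ( $query, $hit ) = @$rec;
        push @{ $hits_by_query{$query} }, $hit;    # note the comma
    }
    return \%hits_by_query;
}

my $grouped = group_hits_by_query(@pairs);

# One output file per query could be opened here instead of printing.
for my $query ( sort keys %$grouped ) {
    print "$query: @{ $grouped->{$query} }\n";
}
```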



From Russell.Smithies at agresearch.co.nz  Wed Nov 25 14:07:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 26 Nov 2009 08:07:26 +1300
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
 in new file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085701@exchsth.agresearch.co.nz>

Hi Tim,
Here's some code for a job I'm working on at the moment that contains all the bits you'll probably need.
It's extracting 2 species-specific databases from nr (based on tax ids), doing a blast, then parsing the results and creating a substitution matrix. I was initially using Bio::DB::Eutilities to query and retrieve sequences but I kept getting errors and time-outs from NCBI when pulling back large numbers of sequences.
It should give you a rough idea of how to run Bio::Tools::Run::StandAloneBlast, Bio::DB::Fasta and Bio::SearchIO.

Email me direct if you want further explanation as it's not well commented ;-)

Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz

=======================================

#!/usr/local/bin/perl

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::DB::Fasta;

use Storable;

# Parameters: <query> <subject> <number or percentage of searches>
# Percentage can be specified as either 20p, 20P or 20%
# So for 20% of rice sequences blasted against oil palm:
#    4530 51953 20p   (4530=rice,51953=oil_palm, 20p=20%)
# Or for 20 searches:
#      4530 51953 20
#
my ( $q, $s, $c ) = @ARGV;

my $nr = "/data/databases/flatfile/illuminati_blastdata/nr";
my $tax_file = "/data/anonftp/pub/mirror/taxonomy/gi_taxid_prot.dmp.gz";
my $tmp = "/tmp/tax";


my %stats      = ();
my $total_subs = 0;

my $min_hsp_len      = 0;
my $min_hsp_identity = 0;
my $num_searches     = $c || 10;
my $blast_e          = '1e-6';
my $count            = 0;

# check if all the fasta and blast files exist
# if not, extract new fasta and re-formatdb the database
foreach my $t ( $q, $s ) {
  foreach ( map { "$tmp/$t.$_" } qw(faa list phr pin psq) ) {
    unless ( -e $_ ) {
      print "Creating database for $t\n";
      &create_database($t);
      last;
    }
  }
}

my @params = (
               -database => "$tmp/$q",
               -program  => 'blastp',
               -e        => $blast_e,
               -outfile  => "$tmp/blast.out",
               -v        => '1',
               -b        => '1'
);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params) or die $!;

# load the query sequences into a db
# makes it easier to randomly access them
my $db = Bio::DB::Fasta->new( "$tmp", -glob => "$s.faa", -reindex => 1 );

my @ids      = $db->ids;
my $id_count = scalar @ids;    # number of sequences ($#ids would be the last index)
die "No sequences\n" unless $id_count;

# if a percentage is requested, calculate
# the required number of searches
if ( $num_searches =~ m/(\d+)[pP%]/ ) {
  $num_searches = int( ( $1 / 100 ) * $id_count );
  warn
"Searching random $1 percent ($num_searches) of $id_count sequences from taxid $q\n";
}

my $summary_file = "$tmp/".$$."_summary.txt";
open( OUT, ">", $summary_file ) or die $!;
print OUT
"#Summary of $num_searches random blast searches from taxid $q against taxid $s.\n";
print OUT "#Parameters used were:\n";
print OUT "#blast_e: $blast_e\n";
print OUT "#min_hsp_len: $min_hsp_len\n";
print OUT "#min_hsp_identity: $min_hsp_identity\n";
print OUT "\n";

while ( my $seq = $db->get_Seq_by_id( $ids[ rand @ids ] ) ) {
  next unless $seq;

  warn "Processing ", $seq->id, "\n";
  eval {
    my $blast_report = $factory->blastall($seq);
    sleep 5;
  };

  my $blast_in = new Bio::SearchIO( -format => "blast", -file => "$tmp/blast.out" );

  while ( my $result = $blast_in->next_result ) {
    if ( $result->num_hits <= 0 ) {
      warn "No hits for ", $result->query_accession, "\n";
      print OUT "No hits for ", $result->query_accession, "\n";
      next;
    }
    $count++;
    while ( my $hit = $result->next_hit ) {
      while ( my $hsp = $hit->next_hsp ) {
        warn sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );
        print OUT sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );

        # http://www.bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods
        if ( $hsp->length('total') > $min_hsp_len ) {
          if ( $hsp->percent_identity >= $min_hsp_identity ) {
            my @query_string = split '', $hsp->query_string;
            my @homol_string = split '', $hsp->homology_string;
            my @hit_string   = split '', $hsp->hit_string;
            for ( my $i = 0; $i <= $#query_string; $i++ ) {
              next unless $homol_string[$i] =~ /\+/;
              $stats{ $query_string[$i] }{ $hit_string[$i] }++;
              $total_subs++;
            }
          }
        }
      }
    }
  }
  unlink "$tmp/blast.out" if -e "$tmp/blast.out";
  last if $count >= $num_searches;
}


# create summary frequency list
my %summary = ();
for my $query ( keys %stats ) {
  for my $hit ( keys %{ $stats{$query} } ) {
    $summary{"$query->$hit"} =
      sprintf( "%6f", $stats{$query}{$hit} / $total_subs );
  }
}

print OUT "\n";

# sort by descending frequencies and print to summary file
foreach my $k ( sort { $summary{$b} <=> $summary{$a} } keys %summary ) {
  print OUT "$k\t", $summary{$k}, "\n" unless $k =~ /TOTAL/;
}

print OUT "\n\n";

# print substitution matrix
my $i     = 0;
my @prots = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $sep   = "\t";

print OUT sprintf( "%7s %s", $_, $sep ) foreach ( "       ", @prots );
print OUT "\n";

foreach my $x (@prots) {
  print OUT sprintf( "%7s|%s", $prots[ $i++ ], $sep );
  foreach my $y (@prots) {
    my $val =
      defined( $stats{$x}{$y} )
      ? sprintf( "%0.6f", $stats{$x}{$y} / $total_subs )
      : "--------";
    print OUT sprintf( "%s%s", $val, $sep );
  }
  print OUT "\n";
}
close OUT;


open(IN, $summary_file) or die $!;
print $_ while(<IN>);
close IN;


# extract sequences from nr database based on taxid.
sub create_database {
  my $txid      = shift;
  my %hash      = ();
  my $gi_stored = "/tmp/gi.dat";

  if ( -e $gi_stored ) {
    %hash = %{ retrieve($gi_stored) };
  }
  else {
    open( TXID, "zcat $tax_file | " ) or die $!;
    while (<TXID>) {
      chomp;
      my ( $gi, $tx ) = split( "\t", $_ );
      push( @{ $hash{$tx} }, $gi );
    }
    close TXID;

    store( \%hash, $gi_stored );
  }

  my $txlist = "$tmp/$txid.list";
  my $txseq  = "$tmp/$txid.faa";
	
	die "No sequences found for taxid $txid\n" unless $hash{$txid} && @{ $hash{$txid} };
	my $num_seqs =  scalar( @{ $hash{$txid} });
	warn "Found $num_seqs sequences for taxid $txid in $tax_file\n";

  open OUT, ">", $txlist or die $!;
  print OUT "$_\n" foreach ( @{ $hash{$txid} } );
  close OUT;

  my $cmd = "fastacmd -d $nr -i $txlist -t T -o $txseq 2>/dev/null";
  system $cmd;

  my $count = `grep -c '>' $txseq`;
  $count =~ s/\n//;
	warn "Could only extract $count sequences from $nr\n";

  $cmd = "formatdb -p T -i $tmp/$txid.faa -n $tmp/$txid -l $tmp/formatdb.log";
  system $cmd;

  $cmd = "fastacmd -d $tmp/$txid -I";
  system $cmd;

  warn "Check the formatdb.log for any errors\n";
}


=======================================




From maj at fortinbras.us  Wed Nov 25 14:21:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 14:21:27 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
Message-ID: <815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>

whoops: change the following line:
my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );

to

my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );

(I always forget that...)
MAJ
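
The correction relies on the same convention as Perl's own open: a leading ">" asks for the file to be created or truncated for writing, while a bare name is opened for reading. As far as I understand, BioPerl hands the -file value to open in that spirit, but that is an assumption here. A minimal sketch with core Perl only (the query id and file contents are placeholders):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $qid  = 'query1';               # hypothetical query id
my $path = "$dir/$qid.bls";

# '>' means "create or truncate for writing".
open( my $out, '>', $path ) or die "Cannot write $path: $!";
print {$out} "placeholder report for $qid\n";
close $out or die "Cannot close $path: $!";

# '<' (reading) is what a bare file name gets by default.
open( my $in, '<', $path ) or die "Cannot read $path: $!";
my $line = <$in>;
close $in;
print $line;
```

Inside a double-quoted string the dot needs no backslash, so ">$qid.bls" is equivalent to the form given above.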



From alden.huang at gmail.com  Thu Nov 26 05:54:30 2009
From: alden.huang at gmail.com (Alden Huang)
Date: Thu, 26 Nov 2009 02:54:30 -0800
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
Message-ID: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>

Hey rob,

Sorting Intolerant from Tolerant
http://sift.jcvi.org/

~alden

...a bit late, I know; I just read your post now while cleaning the inbox

On Fri, Nov 6, 2009 at 9:35 AM, Robert Bradbury
<robert.bradbury at gmail.com> wrote:
> Is there a function in the library (or has someone written one) that can
> take a genbank entry and determine which mutations are harmful?
>
> It would be used to produce a table summary of:
> GENE          # SNP      # BadSNP
>
> One kind of gets this from NCBI if you lookup in the "GENE" db a gene name
> and then go to the "GeneView" om dbSNP page it has the information I want
> but largely in a graphical format while I simply want numbers I can dump
> into a spreadsheet.
>
> I don't think it would be hard, fetch the gene, run through the features for
> the SNP database, figure out whether they are good or bad SNPs, accumulate
> the statistics and dump it.  I think the functions available are flexible
> enough to do it but I can't believe nobody has already done it.  It could be
> a bit more complex in that one could do an analysis to see if the mutations
> are in a conserved domain or mutations that code for Cysteine or Methionine
> (or other potentially "critical" amino acids) but since "critical" is in the
> eye of the beholder there would have to be some kind of callback to a
> scoring function.
>
> Thanks,
> Robert


From robert.bradbury at gmail.com  Thu Nov 26 06:27:50 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 06:27:50 -0500
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
	<9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
Message-ID: <deaa866a0911260327j5b57d16erfcbe5b996e1a6e64@mail.gmail.com>

On Thu, Nov 26, 2009 at 5:54 AM, Alden Huang <alden.huang at gmail.com> wrote:
>
> Sorting Intolerant from Tolerant
> http://sift.jcvi.org/
>
>
Ah yes, thank you very much.  This looks very much like a tool that can be
adapted for various uses.

Robert

From jason at bioperl.org  Thu Nov 26 12:16:17 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Nov 2009 09:16:17 -0800
Subject: [Bioperl-l] question about a Bio::Tree::Tree method
In-Reply-To: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
References: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
Message-ID: <14F4B8C9-A1F4-436B-813F-50E139932D3D@bioperl.org>

Emilio - please ask your questions on the list - many people there can  
help answer questions.

get_nodes returns all the nodes in the tree, the options specify the  
order they are returned in.  Depending on your question the order  
probably won't matter so you can just call it without any arguments  
like in the examples and the HOWTO.

The documentation for the method says:
  Title   : get_nodes
  Usage   : my @nodes = $tree->get_nodes()
  Function: Return list of Bio::Tree::NodeI objects
  Returns : array of Bio::Tree::NodeI objects
  Args    : (named values) hash with one value
            order => 'b|breadth' first order or 'd|depth' first order

So you can provide no arguments and get the default (breadth-first I  
believe) or you can specify
-order => 'd'
or
-order => 'depth'

to get the nodes in depth-first order.
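
To illustrate what the two orders mean, here is a small sketch in plain Perl on a made-up nested-hash tree. It does not use Bio::Tree::Tree and is not its implementation; the node names and the pre-order flavour of the depth-first walk are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical tree: each node is { name => ..., children => [...] }.
my $tree = {
    name     => 'root',
    children => [
        {
            name     => 'A',
            children => [
                { name => 'A1', children => [] },
                { name => 'A2', children => [] },
            ],
        },
        {
            name     => 'B',
            children => [ { name => 'B1', children => [] } ],
        },
    ],
};

# Breadth-first: visit level by level using a queue.
sub breadth_first {
    my ($root) = @_;
    my @queue = ($root);
    my @names;
    while ( my $node = shift @queue ) {
        push @names, $node->{name};
        push @queue, @{ $node->{children} };
    }
    return @names;
}

# Depth-first (pre-order): visit a node, then each subtree in full.
sub depth_first {
    my ($node) = @_;
    return ( $node->{name}, map { depth_first($_) } @{ $node->{children} } );
}

my @bfs = breadth_first($tree);
my @dfs = depth_first($tree);
print "breadth: @bfs\n";
print "depth:   @dfs\n";
```

Both walks return the same set of nodes, which is why the order rarely matters when every node is processed independently.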

-jason
On Nov 26, 2009, at 7:19 AM, miglio83 at libero.it wrote:

> Hi Jason,
> I'm Emilio Siena, a PhD student of the University of Perugia.
> I have
> a question about the method "get_nodes" of the  "Bio::Tree::Tree"  
> class.
> In
> particular I didn't understand which type of arguments it accepts  
> and in which
> format an argument should be given.
>
> Thank you in advance!
>
> Emilio

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Thu Nov 26 12:40:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 26 Nov 2009 12:40:45 -0500
Subject: [Bioperl-l] Bio::Assembly::IO::sam is alpha
Message-ID: <599F8BABCD2848EFA98FB24A4419674E@NewLife>

in bioperl-live/trunk with plenty pod; bravehearts can (please!) test on .bam files
cheers, MAJ

From mauricio at open-bio.org  Thu Nov 26 16:45:43 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Thu, 26 Nov 2009 15:45:43 -0600
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <4B0EF707.6080202@open-bio.org>

Hi Jonathan,

Any chance it can be webcasted? I'm sure it would attract a lot of 
remote attendees ;)

Regards,
Mauricio.


Jonathan Warren wrote:
> We are considering running a Distributed Annotation System workshop here 
> at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If 
> you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last years (1st 
> day for beginners, 2nd for both beginners and advanced users, 3rd day 
> for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what 
> you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 

From robert.bradbury at gmail.com  Thu Nov 26 21:06:40 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 21:06:40 -0500
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
Message-ID: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>

I'm currently running near my process limit and running sequence fetches
from swissprot (I've also had this happen with getting gi's from NCBI) and
am running out of processes about halfway through the set I'm trying to
fetch [1].

Now, is there someplace in the bioperl documentation that documents where
one is supposed to wait() for defunct processes after each sequence fetch?
I'm encountering the problem both when the sequence fetches succeed and
when they fail.

Thanks in advance.
Robert

1. This is due to a bug in chromium's use of flash that involves it leaving
many defunct processes that are uncollected and therefore counting towards
one's "process limit".
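
For reference, the general Perl idiom for collecting defunct children is sketched below with core modules only. Whether BioPerl's fetch code forks at all is not established in this message, so the loop uses a trivial child as a stand-in for one fetch; the counters are just for illustration.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my $launched = 0;
my $reaped   = 0;

for my $job ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        exit 0;    # child: stand-in for one unit of work
    }
    $launched++;

    # Collect any children that have already finished, without blocking.
    while ( ( my $kid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $reaped++;
    }
}

# Before exiting, block until the remaining children are collected.
while ( ( my $kid = waitpid( -1, 0 ) ) > 0 ) {
    $reaped++;
}

print "launched $launched, reaped $reaped\n";
```

Calling the non-blocking form after each fetch keeps the number of defunct entries small, which matters when the process limit is already nearly exhausted.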

From kanzure at gmail.com  Thu Nov 26 21:12:46 2009
From: kanzure at gmail.com (Bryan Bishop)
Date: Thu, 26 Nov 2009 20:12:46 -0600
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
In-Reply-To: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
References: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
Message-ID: <55ad6af70911261812q583277d5l71df0d66e756f617@mail.gmail.com>

On Thu, Nov 26, 2009 at 8:06 PM, Robert Bradbury wrote:
> I'm currently running near my process limit and running sequence fetches
> from swissprot (I've also had this happen with getting gi's from NCBI) and
> am running out of processes about halfway through the set I'm trying to
> fetch [1].

Hey Robert, sorry for the off-topic question, but I was wondering if
you're the same Robert Bradbury from the extropy-chat list. Hi?

- Bryan
http://heybryan.org/
1 512 203 0507

From paolo.pavan at gmail.com  Fri Nov 27 06:35:03 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Fri, 27 Nov 2009 12:35:03 +0100
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
	Bio::Tools::Run::Cap3 usage question)
Message-ID: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>

Dear Florent,
Thank you for your kind answer and for your efforts spent in this module.
Since you are working on these topics, I would like to seize the opportunity
and ask you about some doubts I have, if you agree, of course
:-)
Some time ago I tried to work with BioPerl, loading the data from an ACE
file produced by Newbler; I needed to extract part of a contig as
an alignment of reads, and I thought I could do it with a slice() method, since I
saw that Bio::Assembly::Contig implements the Bio::AlignI interface. Unfortunately, I
realized that this interface is inherited but not implemented.
I tried to hack it by adding a slice method that acts on a
Bio::Alignment created from the array of LocatableSeqs representing the
reads.

This is the question:
If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig class
stores read information in:
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{
     _align_clipping:READ_NAME}
     _aligned_coord:READ_NAME}
     _quality_clipping:READ_NAME}

Each of these three features (_align_clipping, _aligned_coord,
_quality_clipping) contains a Bio::SeqFeature::Generic. Which of them is
most suitable for the purpose described above, the slice method?
One more question, if you will forgive the length, which follows from the
previous one: the purpose of these three features per read is not entirely
clear to me; can you explain it?

Thank you very much for your time.
Bye bye,
Paolo


2009/11/24 Florent Angly <florent.angly at gmail.com>

> Hi Paolo,
>
> It turns out that there is no standard for what is to be passed to the
> Bio::Tools::Run wrappers and returned by them. I noticed the inconsistency
> between the assembly wrappers recently while implementing support for a new
> wrapper. I implemented initial support for additional de novo assembly
> programs in BioPerl (454 Newbler and Minimo) a couple of weeks ago and Mark
> Jensen added support for Maq, a program that assembles reads against a
> reference. In the process, all the assembly wrappers were changed to take
> the same type of input data (a FASTA sequence or an array reference of
> sequence objects) and return one of the following:
>   * a Bio::Assembly::Scaffold object (the default), or
>   * a Bio::Assembly::IO object, or
>   * the name of a file for the output of the assembler
> Use the out_type method to set up which output you want, e.g.:
>   $factory->out_type('Bio::Assembly::IO');
> or
>   $factory->out_type('cap3_results.ace');
> You'll have to use the code in the bioperl-run subversion if you want to
> use these new features.
>
> Cheers,
>
> Florent
>
>
>
>
> Paolo Pavan wrote:
>
>> Dear,
>> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
>> As documented in the pod, the run(@seqs) method returns the cap3 report
>> file
>> while I expect to return a Bio::Assembly object, consistently with other
>> Bio::Tools::Run classes.
>> However, I went around this by getting from the factory object the
>> location
>> and the names of the temp output files (actually accessing a private
>> property, although) and reading them via the Assembly::IO system.
>> I was just wondering what the proper, intended way to do this job is.
>>
>> Thank you for lighting the way!
>> Paolo
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>
>

From jw12 at sanger.ac.uk  Thu Nov 26 09:57:35 2009
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 26 Nov 2009 14:57:35 +0000
Subject: [Bioperl-l] DAS workshop 7th-9th April 2010
Message-ID: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>

We are considering running a Distributed Annotation System workshop  
here at the Sanger/EBI in the UK subject to decent demand.
The workshop will be held from Wednesday 7th-Friday 9th April 2010. If  
you would be interested in attending either to present or just take part
then please email me jw12 at sanger.ac.uk

The format of the workshop is likely to be similar to last years (1st  
day for beginners, 2nd for both beginners and advanced users, 3rd day  
for advanced), information for which can be found here:
http://www.dasregistry.org/course.jsp

If you would like to present then please send a short summary of what  
you would like to talk about.

Thanks

Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

From timbourine81 at googlemail.com  Thu Nov 26 11:02:30 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Thu, 26 Nov 2009 17:02:30 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <4B0EA44D.2050507@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
Message-ID: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>

Oops, sent too early...

Hey Mark,

Thanks for the answer. But I am still struggling, especially with where to
put your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; if there is no comma, there was the
### error: Scalar found where operator expected at
### 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
###        (Missing operator before  $hit?)
### Useless use of push with no values at
### 12_BLAST_two_sequence_each_query_one_file.PL line 7.
### syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
### "} $hit"
### BEGIN not safe after errors--compilation aborted at
### 12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
### here, every time this error appears: Name "main::result" used only once:
### possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
### error: Can't call method "hits" on an undefined value at
### 12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Do I need to use a module? Are the commands here in the right
### position? Errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

### Where are the files stored, and what are they named? Sorry, but I
### cannot figure that out :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

> Hey Mark,
>
> thanks for the answer
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sounds like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits associated with that query.
> >> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have a (maybe very trivial) question.
> >>>
> >>> I searched the mailing list archive and many HOWTOs but have not found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest BioPerl version (installed two weeks
> >>> ago).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one FASTA file
> >>> containing several sequences, I get a BLAST output with many queries,
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file, and to do this for every query in my BLAST output file.
> >>>
> >>> Or is it better the other way round: first make FASTA files with only
> >>> a single sequence each and then BLAST each file? But how can I automate
> >>> that using BioPerl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits into one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that in Bio::SearchIO.
> >>> Or do I have to use Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >
>
> Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From rtbio.2009 at gmail.com  Sat Nov 28 02:53:43 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Sat, 28 Nov 2009 08:53:43 +0100
Subject: [Bioperl-l] Linking of two cgi scripts
Message-ID: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>

hello everyone,

I have a small question.

I would like to link two CGI scripts, as follows.

I have an input sequence entered in a text area, e.g.:

  >gi|at442323|...
  ATGCCCCCTTGGAACCAAAAAAA....

I would like to compare this with the query sequences. These query
sequences would come from a BLAST script in the module blast.pm.
So once I enter the input sequence and request BLAST using the submit
button, my request should go to a program which performs the BLAST search.
After this, the sequences obtained from BLAST have to be returned to a
program, Roopa.pm, which compares the input sequence with the sequences
obtained from BLAST.

But I am unable to provide this link between the CGI scripts (i.e., one
script to run BLAST, the other to compare the sequences and send the
results to the browser).

Could anyone help me in this regard?

Regards,
Roopa.

From s.denaxas at gmail.com  Sat Nov 28 05:56:15 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Sat, 28 Nov 2009 10:56:15 +0000
Subject: [Bioperl-l] Linking of two cgi scripts
In-Reply-To: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
References: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
Message-ID: <bba689ec0911280256u602b8f9dpffe9483189c56536@mail.gmail.com>

Hello,

Why do they both have to be CGI scripts? Can't all the processing
happen server side, i.e. both the BLAST search and the comparison of
the returned results?

If that is strictly a requirement, you could:

a) get input from the user in script A, i.e. the input sequence
b) do an HTTP request from script A to the other script B using LWP::UserAgent
c) get the results from script B and pass them on to the comparison module
d) return the results to the user

This will be clunky, so either do everything in one go or
consider AJAX.
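
Steps b) and c) could look roughly like the sketch below. The form field
name "sequence", the URL used when calling it, and the two helper names are
invented for illustration; only the HTTP::Request::Common and LWP::UserAgent
calls are real.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Build the POST request that script A would send to script B.
# The field name "sequence" is a hypothetical choice for this sketch.
sub build_blast_request {
    my ($url, $sequence) = @_;
    return POST $url, [ sequence => $sequence ];
}

# Send the request and return the body of script B's reply,
# dying with the HTTP status line if the call fails.
sub fetch_blast_hits {
    my ($url, $sequence) = @_;
    require LWP::UserAgent;    # loaded only when a request is really sent
    my $ua       = LWP::UserAgent->new( timeout => 60 );
    my $response = $ua->request( build_blast_request( $url, $sequence ) );
    die 'BLAST script failed: ' . $response->status_line . "\n"
        unless $response->is_success;
    return $response->decoded_content;
}
```

Script A would then hand the string returned by fetch_blast_hits() to the
comparison module and print the outcome to the browser.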

hope this helps
Spiros


From maj at fortinbras.us  Sat Nov 28 11:23:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 11:23:53 -0500
Subject: [Bioperl-l] Run wrappers for BWA and Samtools
Message-ID: <7F56A6EEEB0E4EE291D5340F27DF7D3A@NewLife>

Hi All, 

Run wrappers for the bwa assembler and the samtools suite
are now available as beta in the bioperl-run/trunk. The bwa 
wrapper allows you to run a canned assembly pipeline, or 
to execute individual bwa components. The assembly pipeline
can return a Bio::Assembly::Scaffold object via the new 
Bio::Assembly::IO::sam module in bioperl-live/trunk
(this requires lstein's Bio::DB::Sam, from CPAN). Details at

http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_BWA

and, of course, in the pod. 

Cheers, 
MAJ

From maj at fortinbras.us  Sat Nov 28 21:55:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 21:55:42 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
Message-ID: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>

Hi Tim--
There's a bug in my code; should be
for my $hit ($result->hits) {
...
}
and you're right about the comma. My bad.

But I don't think you need this-- you're already looping over your
query sequences and doing blastn on each one. So in the middle of
your loop, you can simply write the blast result that you got:

my $blio = Bio::SearchIO->new( -file => 
">".$query->id.".bls", -format=>"blast" );
$blio->write_result($result);

and forget about the foreach my $qid loop entirely.

The files should show up in the directory from which you're
running the script.
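
Putting the pieces together, a minimal sketch of the per-query loop might
look like the following. It assumes each blastall call yields one result,
takes the FASTA file and database path as command-line placeholders, and
uses a made-up helper, report_name(), to keep characters such as '|' out of
the file names. It also passes an explicit TextResultWriter to
Bio::SearchIO, on the assumption that a readable text report is wanted; the
output is not byte-identical to the original BLAST report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a query id into a report file name. Characters that are awkward
# in file names are replaced with '_'. Hypothetical helper for this sketch.
sub report_name {
    my ($id) = @_;
    ( my $safe = $id ) =~ s/[^A-Za-z0-9_.-]/_/g;
    return "$safe.bls";
}

# Run blastn once per query sequence and write each result to its own file.
sub blast_each_query {
    my ( $fasta, $db ) = @_;

    # Loaded lazily so report_name() works without BioPerl installed.
    require Bio::SeqIO;
    require Bio::SearchIO;
    require Bio::SearchIO::Writer::TextResultWriter;
    require Bio::Tools::Run::StandAloneBlast;

    my $seq_in  = Bio::SeqIO->new( -file => $fasta, -format => 'fasta' );
    my $factory = Bio::Tools::Run::StandAloneBlast->new(
        'program'   => 'blastn',
        'database'  => $db,
        _READMETHOD => 'Blast',
    );

    while ( my $query = $seq_in->next_seq ) {
        my $report = $factory->blastall($query);      # a Bio::SearchIO object
        my $result = $report->next_result or next;    # skip empty reports
        my $blio   = Bio::SearchIO->new(
            -file   => '>' . report_name( $query->id ),
            -format => 'blast',
            -writer => Bio::SearchIO::Writer::TextResultWriter->new(),
        );
        $blio->write_result($result);
    }
    return;
}

# Usage: perl script.pl queries.fasta /path/to/blast_db
blast_each_query(@ARGV) if @ARGV == 2;
```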
cheers, MAJ



From maj at fortinbras.us  Sat Nov 28 22:32:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 22:32:42 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
Message-ID: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>

The HOWTOs appear to have a more restrictive copyright
than FDL-- in particular, the blurb at the bottom of the 
HOWTO page asks users to use the documents for personal 
use only. I'm for this; I think we should therefore have some 
explicit license for these that specifies this kind of restriction, 
and then express that on each howto and in BioPerl:Copyright.
Any thoughts on the right license and whether this is a good plan?
MAJ

From florent.angly at gmail.com  Sat Nov 28 22:47:45 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 28 Nov 2009 19:47:45 -0800
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
 Bio::Tools::Run::Cap3 usage question)
In-Reply-To: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
References: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
Message-ID: <4B11EEE1.8070907@gmail.com>

Hi Paolo,

The aligned reads of a contig are stored in
Bio::Assembly::Contig->{_elem}{READ_NAME}{_seq}. To implement a slice()
method, you could retrieve the reads using get_seq_ids(),
get_seq_by_name() or get_seq_by_pos(). To retrieve the position of an
aligned read in the contig, use get_seq_coord(), which returns a
Bio::SeqFeature::Generic object (from
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_aligned_coord:READ_NAME})
on which you can call the start() and end() methods.
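
As a rough illustration of that approach, the sketch below collects the
reads whose aligned coordinates overlap a window of the contig, which is
the first step a slice() would need. The helper names overlaps() and
reads_in_window() are invented here, and trimming each read to the window
is left out.

```perl
use strict;
use warnings;

# True if the closed intervals [s1,e1] and [s2,e2] share a position.
sub overlaps {
    my ( $s1, $e1, $s2, $e2 ) = @_;
    return $s1 <= $e2 && $s2 <= $e1;
}

# Return the reads of $contig (a Bio::Assembly::Contig) whose aligned
# coordinates overlap the contig positions $from..$to.
sub reads_in_window {
    my ( $contig, $from, $to ) = @_;
    my @reads;
    for my $id ( $contig->get_seq_ids ) {
        my $read  = $contig->get_seq_by_name($id);
        my $coord = $contig->get_seq_coord($read) or next;
        push @reads, $read
            if overlaps( $coord->start, $coord->end, $from, $to );
    }
    return @reads;
}

# Usage: perl script.pl assembly.ace 100 200
if ( @ARGV == 3 ) {
    my ( $ace, $from, $to ) = @ARGV;
    require Bio::Assembly::IO;    # needs BioPerl; loaded only when run
    my $io       = Bio::Assembly::IO->new( -file => $ace, -format => 'ace' );
    my $assembly = $io->next_assembly;
    for my $contig ( $assembly->all_contigs ) {
        print join( "\t", $contig->id, $_->id ), "\n"
            for reads_in_window( $contig, $from, $to );
    }
}
```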

I'm not entirely sure what
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_align_clipping:READ_NAME}
and {_quality_clipping:READ_NAME} are. I believe that they represent the
clear range of the read/contig.

Hope it helps,

Florent


Paolo Pavan wrote:
> Dear Florent,
> Thank you for your kind answer and for the effort you have put into
> this module.
> Since you are working on these topics, I would like to take the
> opportunity to ask you a few questions, if you agree, of course :-)
> Some time ago I tried to work with BioPerl, loading the data from an
> ACE file produced by Newbler; I needed to extract part of the
> contig as an alignment of reads, and I thought I could do it with a
> slice() method, since I saw that Bio::Assembly::Contig implements the
> Bio::AlignI interface. Unfortunately, I realized that this interface is
> inherited but not implemented.
> I tried to hack it by adding a slice method which would act on a
> Bio::Alignment created from the array of LocatableSeqs representing
> the reads.
>
> This is the question:
> If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig
> class stores read information in:
> Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{
>      _align_clipping:READ_NAME}
>      _aligned_coord:READ_NAME}
>      _quality_clipping:READ_NAME}
>
> Each of these 3 features (_align_clipping, _aligned_coord,
> _quality_clipping) contains a Bio::SeqFeature::Generic. Which of them
> is most suitable for the purpose described above, the slice method?
> One more question, if you will excuse the length, follows from the
> previous one: the purpose of these 3 features per read is not entirely
> clear to me. Can you explain it?
>
> Thank you very much for the time you spend on this.
> Bye bye,
> Paolo


From bimber at wisc.edu  Sun Nov 29 00:31:25 2009
From: bimber at wisc.edu (Ben Bimber)
Date: Sat, 28 Nov 2009 23:31:25 -0600
Subject: [Bioperl-l] using bioperl to compare sequences
Message-ID: <9f985cdc0911282131l350bc525gd9ad4717c101ac63@mail.gmail.com>

Hello,

I have a couple of years of programming experience, but am reasonably new
to Perl and extremely new to BioPerl.  I have been reading through the
BioPerl documentation and am trying to understand the best way to
approach a particular problem.  I'm hoping someone could offer some
tips and point me in the right direction.  If someone has solved this
sort of problem before, I'd prefer not to reinvent things.  Here's
what I'm trying to do:

- Our lab generates mRNA sequence data, consisting of alleles of a given
  gene or genes.
- I want to compare each of these sequences against a reference using
  BLAST or clustalw (I will need the ability to choose at run time).
- Take the result of this alignment, then record the positions that differ
  between the experimental sequence and the reference sequence (SNPs).
- Translate the corresponding AA change(s) associated with each SNP.
  There can be overlapping ORFs.

I see that BioPerl has modules for BLAST and clustal.  I've also been
looking at the modules under Variation.  I haven't fully wrapped my
head around them, but they look to be what I'd use for SNP detection.

Has anyone written code to perform similar things and, if so, would
you be willing to share specific examples?  Anything concrete showing
exactly how these modules operate would be extremely helpful.

Thanks in advance for any tips or help.

From jason at bioperl.org  Sun Nov 29 10:54:53 2009
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 29 Nov 2009 07:54:53 -0800
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
Message-ID: <897A8DB4-AF29-4601-A1E5-9A04D9D8C151@bioperl.org>

or
while( my $hit = $result->next_hit ) {
}
On Nov 28, 2009, at 6:55 PM, Mark A. Jensen wrote:

> Hi Tim--
> There's a bug in my code; should be
> for my $hit ($result->hits) {
> ...
> }
> and you're right about the comma. My bad.
>
> But I don't think you need this-- you're already looping over your
> query sequences and doing blastn on each one. So in the middle of
> your loop, you can simply write the blast result that you got:
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", - 
> format=>"blast" );
> $blio->write_result($result);
>
> and forget about the foreach my $qid loop entirely.
>
> The files should show up in the directory from which you're
> running the script.
> cheers, MAJ
>
>> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
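
A note for readers on the ">" fix quoted above: as far as I know, the -file argument in Bio::SearchIO follows the same convention as Perl's own open(), where a leading ">" asks for the file to be created for writing and a bare name is opened for reading. The sketch below shows that convention with core Perl only. The temporary directory and the "query1" id are made-up values for illustration, and no BioPerl calls are involved.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical values, used only for this illustration.
my $dir  = tempdir( CLEANUP => 1 );
my $qid  = 'query1';
my $path = File::Spec->catfile( $dir, "$qid.bls" );

# Read mode on a file that does not exist yet fails (open returns false).
my $read_ok = open( my $in, '<', $path ) ? 1 : 0;

# Write mode ('>') creates the file, which is what the corrected line asks for.
open( my $out, '>', $path ) or die "cannot write $path: $!";
print {$out} "placeholder report for $qid\n";
close($out) or die "close failed: $!";

my $exists = -e $path ? 1 : 0;
print "read before write: $read_ok, file exists after write: $exists\n";
```

Without the write marker the first open simply fails, which matches the symptom of the uncorrected line when the .bls file has not been created yet.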


From suzi at berkeleybop.org  Sun Nov 29 23:03:09 2009
From: suzi at berkeleybop.org (Suzanna Lewis)
Date: Sun, 29 Nov 2009 20:03:09 -0800
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <3AD3C819-4BAA-4D90-B141-9611F48C5CAD@ berkeleybop.org>

I/we (Gregg) would be interested in attending. We'd present an update on the collaborative, web-based version of Apollo. We will be working with Ian Holmes and Mitch Skinner using JBrowse for basic display.

-S


On Nov 26, 2009, at 6:57 AM, Jonathan Warren wrote:

> We are considering running a Distributed Annotation System workshop here at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last year's (1st day for beginners, 2nd for both beginners and advanced users, 3rd day for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
> 


From maj at fortinbras.us  Mon Nov 30 09:31:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 09:31:27 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
Message-ID: <513F1C824EF84974993A76F0CC719CDF@NewLife>

Well, it has a history, Jason's point. So the question could
be: "is this still a valid issue"? A while back, a user on the wiki,
with natural and good intentions, removed the authorship and revision
info from a couple of the HOWTOs; it is more wiki-like,
after all. But Chris had some objections to that, which I
seconded, mainly on the basis of the special status that
seems implied by the copyright note on the HOWTO
page. I also think that the nature of the howto is somewhat
different from other info on the site -- that developers themselves
put a lot of time into explaining how to use their modules, and
that in this world where devs get paid by recognition, it is a reasonable
thing to allow this extra horn-tooting. Now, that is a policy
that could be completely separable from the issue of copyright.
However, devs may also get paid by using their materials in teaching
seminars. The dilemma would be that people who like to use the
wiki are people who like to share, and so it feels unnatural to
withhold from the community the materials they develop,  but
people who like to share also like to eat and wear shoes...
so I'm interested in everyone's thoughts about it.
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" 
<jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
Sent: Monday, November 30, 2009 9:16 AM
Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki


> Mark,
>
> Let me ask you a question, and don't take this question as an implicit 
> criticism of your suggestion, it is not. Why would you want this more 
> restrictive copyright?
>
> Brian O.
>
> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>
>> The HOWTOs appear to have a more restrictive copyright
>> than FDL-- in particular, the blurb at the bottom of the
>> HOWTO page asks users to use the documents for personal
>> use only. I'm for this; I think we should therefore have some
>> explicit license for these that specifies this kind of restriction,
>> and then express that on each howto and in BioPerl:Copyright.
>> Any thoughts on the right license and whether this is a good plan?
>> MAJ
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> 


From bosborne11 at verizon.net  Mon Nov 30 10:15:32 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 10:15:32 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <513F1C824EF84974993A76F0CC719CDF@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
	<513F1C824EF84974993A76F0CC719CDF@NewLife>
Message-ID: <54671455-A02C-4139-8C39-AC17B50D5CE6@verizon.net>

Mark,

I have no objection to a more restrictive copyright, and I also have  
no objection to using FDL, or things like it.

Brian O.

On Nov 30, 2009, at 9:31 AM, Mark A. Jensen wrote:

> Well, it has a history, Jason's point. So the question could
> be: "is this still a valid issue"? A while back, a user on the wiki,
> with natural and good intentions, removed the authorship and revision
> info from a couple of the HOWTOs; it is more wiki-like,
> after all. But Chris had some objections to that, which I
> seconded, mainly on the basis of the special status that
> seems implied by the copyright note on the HOWTO
> page. I also think that the nature of the howto is somewhat
> different from other info on the site -- that developers themselves
> put a lot of time into explaining how to use their modules, and
> that in this world where devs get paid by recognition, it is a reasonable
> thing to allow this extra horn-tooting. Now, that is a policy
> that could be completely separable from the issue of copyright.
> However, devs may also get paid by using their materials in teaching
> seminars. The dilemma would be that people who like to use the
> wiki are people who like to share, and so it feels unnatural to
> withhold from the community the materials they develop,  but
> people who like to share also like to eat and wear shoes...
> so I'm interested in everyone's thoughts about it.
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" <jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
> Sent: Monday, November 30, 2009 9:16 AM
> Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
>
>
>> Mark,
>>
>> Let me ask you a question, and don't take this question as an  
>> implicit criticism of your suggestion, it is not. Why would you  
>> want this more restrictive copyright?
>>
>> Brian O.
>>
>> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>>
>>> The HOWTOs appear to have a more restrictive copyright
>>> than FDL-- in particular, the blurb at the bottom of the
>>> HOWTO page asks users to use the documents for personal
>>> use only. I'm for this; I think we should therefore have some
>>> explicit license for these that specifies this kind of restriction,
>>> and then express that on each howto and in BioPerl:Copyright.
>>> Any thoughts on the right license and whether this is a good plan?
>>> MAJ
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>


From bosborne11 at verizon.net  Mon Nov 30 09:16:07 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 09:16:07 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
Message-ID: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>

Mark,

Let me ask you a question, and don't take this question as an implicit  
criticism of your suggestion, it is not. Why would you want this more  
restrictive copyright?

Brian O.

On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:

> The HOWTOs appear to have a more restrictive copyright
> than FDL-- in particular, the blurb at the bottom of the
> HOWTO page asks users to use the documents for personal
> use only. I'm for this; I think we should therefore have some
> explicit license for these that specifies this kind of restriction,
> and then express that on each howto and in BioPerl:Copyright.
> Any thoughts on the right license and whether this is a good plan?
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Mon Nov 30 12:41:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 12:41:44 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
	<c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
Message-ID: <8C288FEF9CEB4055B0CDD19267FBA26C@NewLife>

thanks Tim! corrected (I hope) in r16432... 
MAJ
  ----- Original Message ----- 
  From: Tim Koehler 
  To: Smithies, Russell 
  Cc: Mark A. Jensen ; bioperl-l at lists.open-bio.org 
  Sent: Monday, November 30, 2009 12:23 PM
  Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


  Hello everybody,

  thanks a lot for the overwhelming answers! All of these code samples are different flavors, and they all worked.

  For me the added code works the best. But I think I found a bug in ...Bio/SearchIO/blast.pm. 
  There the DEFAULT_BLAST_... variable is set to Bio::Search::Writer::HitTableWriter instead of Bio::SearchIO::Writer::HitTableWriter. I also changed this variable to HTMLResultWriter and others.

  So again: THANKS for the support!

  Cheers, 
  Tim

  #!/usr/bin/perl -w

  use strict;
  use Bio::Tools::Run::StandAloneBlast;
  use Bio::SeqIO;
  use Bio::SearchIO;

  ### add here the writer you want
  use Bio::SearchIO::Writer::HitTableWriter;
  use Bio::Search::Result::BlastResult;

  use Data::Dumper;

  my $Seq_in = Bio::SeqIO->new( -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
                                -format => "fasta" );

  while ( my $query = $Seq_in->next_seq() ) {
    warn "Processing ",$query->id, "\n";
    my $factory =
      Bio::Tools::Run::StandAloneBlast->new(
                   program  => "blastn",
                   database => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
                   _READMETHOD => "Blast"
      );

    my $blast_report = $factory->blastall($query);
    sleep 5;

    # just write the result we got for this query into a
    # new blast-formatted file...named after the id of the query seq...
    my $result = $blast_report->next_result;
    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
    $blio->write_result($result);

    # below, just looking at the current blast result
    ### this does not appear in the output files
    while ( my $result = $blast_report->next_result ) {
      ## $result is a Bio::Search::Result::ResultI compliant object
      while ( my $hit = $result->next_hit ) {
        ## $hit is a Bio::Search::Hit::HitI compliant object
        while ( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if ( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 75 ) {
              print "Query= ", $result->query_name,
                "Hit= ",        $hit->name,
                "Length= ",     $hsp->length('total'),
                "Percent_id= ", $hsp->percent_identity,
                "Subject=",     $hsp->hit_string, "\n";
            }
          }
        }
      }
    }
  }

   
  On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

    Changed it to a generic result and added a writer and it seems to work:

      foreach my $qid ( keys %hits_by_query ) {
        warn "qid = $qid\n";
        my $res = Bio::Search::Result::GenericResult->new(-algorithm => "blastn") or die $!;
        # print Dumper $res;
        foreach my $h ( @{ $hits_by_query{$qid} } ){
          warn "adding hit ", $h->name, "\n";
          $res->add_hit($h) if defined($h);
        }
        my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();
        my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file => ">$qid\.bls\.html", -format => "blast" ) or die $!;
        $blio->write_result($res);
      }


    From: Mark A. Jensen [mailto:maj at fortinbras.us] 
    Sent: Monday, 30 November 2009 10:19 a.m.
    To: Smithies, Russell; 'Tim Koehler'


    Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


    My thought here was that since Tim's already going one at a time thru
    his queries, my scrap was not really necessary:

    use strict;
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SeqIO;
    use Bio::SearchIO;
    use Bio::Search::Result::BlastResult;

    use Data::Dumper;

    my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
                                  -format => "fasta" );

    while ( my $query = $Seq_in->next_seq() ) {
      warn "Processing ",$query->id, "\n";
      my $factory =
        Bio::Tools::Run::StandAloneBlast->new(
                     program  => "blastn",
                     database => "/data/databases/flatfile/illuminati_blastdata/nt",
                     _READMETHOD => "Blast"
        );

      my $blast_report = $factory->blastall($query);
      sleep 5;

      # just write the result we got for this query into a
      # new blast-formatted file...named after the id of the query seq...
      my $result = $blast_report->next_result;
      my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
      $blio->write_result($result);

      # below, just looking at the current blast result
      while ( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object
        while ( my $hit = $result->next_hit ) {
          ## $hit is a Bio::Search::Hit::HitI compliant object
          while ( my $hsp = $hit->next_hsp ) {
            ## $hsp is a Bio::Search::HSP::HSPI compliant object
            if ( $hsp->length('total') > 50 ) {
              if ( $hsp->percent_identity >= 75 ) {
                print "Query= ", $result->query_name,
                  "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
                  "Subject=",     $hsp->hit_string, "\n";
              }
            }
          }
        }
      }
    }

      ----- Original Message -----
      From: Smithies, Russell
      To: 'Tim Koehler' ; 'maj at fortinbras.us'
      Sent: Sunday, November 29, 2009 3:58 PM
      Subject: RE: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hi Tim

      With various people writing the "howtos" and other docs, the examples are bound to have differing names for the variables used but as long as you're consistent, it should all fit together.


      I think I've almost got your code working, just getting errors from Bio::Search::Result::BlastResult, which I'm not entirely sure how to use. Perhaps Mark can get this bit going?


      --Russell

      ===============================


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;
      use Bio::Search::Result::BlastResult;

      use Data::Dumper;

      my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
                                    -format => "fasta" );

      while ( my $query = $Seq_in->next_seq() ) {
        warn "Processing ",$query->id, "\n";
        my $factory =
          Bio::Tools::Run::StandAloneBlast->new(
                       program  => "blastn",
                       database => "/data/databases/flatfile/illuminati_blastdata/nt",
                       _READMETHOD => "Blast"
          );

        my $blast_report = $factory->blastall($query);
        sleep 5;

        my %hits_by_query;

        while ( my $result = $blast_report->next_result ) {
          foreach my $hit ( $result->hits ) {
            warn "Pushed a hit for ",$hit->name, "\n";
            push( @{ $hits_by_query{ $hit->name } }, $hit );
          }
        }

        foreach my $qid ( keys %hits_by_query ) {
          warn "qid = $qid\n";
          my $res = Bio::Search::Result::BlastResult->new() or die $!;
          print Dumper $res;
          foreach my $h ( @{ $hits_by_query{$qid} } ){
            warn "adding hit ", $h->name, "\n";
            $res->add_hit($h) if defined($h);
          }
          my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format => "blast" ) or die $!;
          $blio->write_result($res);
        }

        while ( my $result = $blast_report->next_result ) {
          ## $result is a Bio::Search::Result::ResultI compliant object
          while ( my $hit = $result->next_hit ) {
            ## $hit is a Bio::Search::Hit::HitI compliant object
            while ( my $hsp = $hit->next_hsp ) {
              ## $hsp is a Bio::Search::HSP::HSPI compliant object
              if ( $hsp->length('total') > 50 ) {
                if ( $hsp->percent_identity >= 75 ) {
                  print "Query= ", $result->query_name,
                    "Hit= ",        $hit->name,
                    "Length= ",     $hsp->length('total'),
                    "Percent_id= ", $hsp->percent_identity,
                    "Subject=",     $hsp->hit_string, "\n";
                }
              }
            }
          }
        }
      }

      ===============================


      From: Tim Koehler [mailto:timbourine81 at googlemail.com] 
      Sent: Friday, 27 November 2009 10:24 p.m.
      To: Smithies, Russell; maj at fortinbras.us
      Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hey guys,

      please, do not get me wrong that I wanted to put the workload on you. So far I only found the HowTo's but in there in some way the language changed with time (e.g. $in to $Seq_in) or some things I simply could not find.
      Now I got a tip where else to search: the scrapbook and deobfuscator.

      I immediately will have a look at that.

      This is the first time for me touching linux / perl commands; that's why I thought after several days of trial and many errors ;) asking the mailinglist.

      I was very happy about your fast answers!

      Cheers and a nice weekend,

      Tim

      On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com> wrote:

      oops, sent too early...

      Hey Mark,

      thanks for the answer. But I am still struggling, especially where to put in your code.

      Here is the code I have, so far:

      #!/usr/bin/perl -w

      ### should I put your code here as push is a perl command?


      my %hits_by_query;
      for ($result->hits) {

      ### I inserted a comma after name}}; if there is no comma, there was the error: Scalar found where operator expected at 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
      ###        (Missing operator before  $hit?)
      ###Useless use of push with no values at 12_BLAST_two_sequence_each_query_one_file.PL line 7.
      ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
      ###BEGIN not safe after errors--compilation aborted at 12_BLAST_two_sequence_each_query_one_file.PL line 13.


       push @{$hits_by_query{$hit->name}}, $hit;

      ###here, every time this error appears: Name "main::result" used only once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
      ###error: Can't call method "hits" on an undefined value at 12_BLAST_two_sequence_each_query_one_file.PL line 5.


      }


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;

      my $Seq_in = Bio::SeqIO->new (
      -file => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
      -format => 'fasta'
      );
      while (my $query = $Seq_in->next_seq()) {


      my $factory = Bio::Tools::Run::StandAloneBlast->new(

      'program' => 'blastn',
      'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
      _READMETHOD => "Blast"
      );

      my $blast_report = $factory->blastall($query);

      ### Should I need to use a module? are the commands here at the right position? errors, e.g., Global symbol "$hit" requires explicit package name
      #my %hits_by_query;
      #for ($result->hits) {
      ### inserted comma after name}}
      # push @{$hits_by_query{$hit->name}}, $hit;
      #}


      foreach my $qid ( keys %hits_by_query ) {
       my $result = Bio::Search::Result::BlastResult->new();
       $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
       my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
       $blio->write_result($result);
      } 

      ###where are the files stored? what is their name. Sorry, but I cannot get behind that :(

      while( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object


        while( my $hit = $result->next_hit ) {

         ## $hit is a Bio::Search::Hit::HitI compliant object


         while( my $hsp = $hit->next_hsp ) {

          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
           if ( $hsp->percent_identity >= 75 ) {
           print  "Query= ",        $result->query_name,
              "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
              "Subject=",        $hsp->hit_string,"\n";
           }
          }
         }
        }
      }
      }

      Again, a big thanks in advance :)

      All the best,

      Tim

      On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

      Hey Mark,

      thanks for the answer


      On 25.11.2009 20:21, Mark A. Jensen wrote:
      > whoops: change the following line:
      > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >
      > to
      >
      > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
      >
      > (I always forget that...)
      > MAJ
      >
      > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
      > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
      > Sent: Wednesday, November 25, 2009 1:20 PM
      > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
      > queryinnew file
      >
      >
      >> hey Tim--
      >>
      >> Sounds like you need to go about collecting your queries inside out:
      >>
      >> my %hits_by_query;
      >> for ($result->hits) {
      >>  push @{$hits_by_query{$hit->name}} $hit;
      >> }
      >>
      >> I believe now each hash element, keyed by the query name, will contain
      >> an arrayref to the set of hits assoc with that query.
      >> From here, I believe
      >>
      >> use Bio::Search::Result::BlastResult;
      >> use Bio::SearchIO;
      >>
      >> foreach my $qid ( keys %hits_by_query ) {
      >>  my $result = Bio::Search::Result::BlastResult->new();
      >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
      >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >>  $blio->write_result($result);
      >> }
      >>
      >> will do what you want.
      >>
      >> hope this helps -
      >> Mark
      >>
      >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
      >> To: <bioperl-l at lists.open-bio.org>
      >> Sent: Wednesday, November 25, 2009 12:40 PM
      >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
      >> query innew file
      >>
      >>
      >>> Dear bioperl users,
      >>>
      >>> I am a real newbie and have - maybe a very trivial - question.
      >>>
      >>> I searched the mailing list archive and many howtos but I have not found
      >>> a concrete answer to my problem. So hopefully you can help me :)
      >>>
      >>> Background: I use the latest Bioperl version (installed it two weeks
      >>> before).
      >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
      >>> including different sequences, I get a BLAST output with many queries
      >>> each having several hits / sbjcts.
      >>>
      >>> My problem is how to parse *all* hits of *one* query into a single new
      >>> file. And this for all the queries I have in my BLAST output file.
      >>>
      >>> Or is it better the other way round; first to make fasta files with only
      >>> single sequences inside and BLAST each file? But how can I automize that
      >>> using Bioperl?
      >>>
      >>> I tried Bio::SearchIO but can only parse all queries and their
      >>> respective hits in only one file...
      >>> I think iteration is also necessary here, but I do not really know how
      >>> to include that into Bio::SearchIO.
      >>> Or do I have to use Module:Bio::Index::Blast?
      >>>
      >>> I can index a file (see below), but I have no idea what comes next...
      >>>
      >>> ###How I index a file...
      >>>
      >>> #!/usr/bin/perl -w
      >>>
      >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
      >>>
      >>> use Bio::Index::Fasta;
      >>>
      >>>
      >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
      >>> $id = "48882";
      >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
      >>> -write_flag => 1);
      >>> $inx->make_index($file_name);
      >>>
      >>>
      >>> Hopefully, you can give me at least hints what to look for.
      >>>
      >>> A big THANKS in advance!
      >>>
      >>> Cheers,
      >>>
      >>> Tim
      >>> _______________________________________________
      >>> Bioperl-l mailing list
      >>> Bioperl-l at lists.open-bio.org
      >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>>
      >>>
      >>
      >> _______________________________________________
      >> Bioperl-l mailing list
      >> Bioperl-l at lists.open-bio.org
      >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>
      >>
      >

      Tim Köhler
      MPI for Terrestrial Microbiology
      Karl-von-Frisch-Straße
      D-35043 Marburg / Germany

      Email: koehlerd at mpi-marburg.mpg.de
      Phone: +49 6421 178-740
      Fax:   +49 6421 178-999


--------------------------------------------------------------------------

      Attention: The information contained in this message and/or attachments from AgResearch Limited is intended only for the persons or entities to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipients is prohibited by AgResearch Limited. If you have received this message in error, please notify the sender immediately.


--------------------------------------------------------------------------
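
For anyone skimming the long quote above, the grouping idiom it keeps returning to (push onto an array reference stored in a hash, one key per query) can be tried without BioPerl. This is a minimal sketch, assuming plain strings as stand-ins for the hit objects; the query and hit names are invented for illustration. Note the comma after the closing braces in the push line, which was the missing piece in the first version posted to the list.

```perl
use strict;
use warnings;

# Hypothetical stand-in records: [ query name, hit name ].
my @pairs = (
    [ 'q1', 'hitA' ],
    [ 'q2', 'hitB' ],
    [ 'q1', 'hitC' ],
);

my %hits_by_query;
for my $pair (@pairs) {
    my ( $query, $hit ) = @$pair;
    # Autovivification creates the array reference on first use of each key.
    push @{ $hits_by_query{$query} }, $hit;
}

# Print each query with the hits collected for it.
for my $query ( sort keys %hits_by_query ) {
    print "$query: @{ $hits_by_query{$query} }\n";
}
```

Each key ends up holding an array reference, so a later loop over the keys can write one output file per key, as the scripts in the thread do.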


From timbourine81 at googlemail.com  Mon Nov 30 12:23:58 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Mon, 30 Nov 2009 18:23:58 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
Message-ID: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>

Hello everybody,

thanks a lot for the overwhelming number of answers! All of these code
variants are different flavors, and they all worked.

For me, the code appended below works best. But I think I found a bug in
...Bio/SearchIO/blast.pm.
There the DEFAULT_BLAST_... variable is set to
Bio::Search::Writer::HitTableWriter instead of
Bio::SearchIO::Writer::HitTableWriter. I also changed this variable to
HTMLResultWriter
and others.

So again: THANKS for the support!

Cheers,
Tim

#!/usr/bin/perl -w

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;

### add here the writer you want
use Bio::SearchIO::Writer::HitTableWriter;
use Bio::Search::Result::BlastResult;

use Data::Dumper;

my $Seq_in = Bio::SeqIO->new(
    -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
    -format => "fasta"
);

while ( my $query = $Seq_in->next_seq() ) {
  warn "Processing ", $query->id, "\n";

  my $factory = Bio::Tools::Run::StandAloneBlast->new(
    program     => "blastn",
    database    => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
    _READMETHOD => "Blast"
  );

  my $blast_report = $factory->blastall($query);
  sleep 5;

  # just write the result we got for this query into a
  # new blast-formatted file...named after the id of the query seq...
  my $result = $blast_report->next_result;
  my $blio = Bio::SearchIO->new( -file   => ">" . $query->id . ".bls",
                                 -format => "blast" ) or die $!;
  $blio->write_result($result);

  # below, just looking at the current blast result
  ### this does not appear in the output files
  while ( my $result = $blast_report->next_result ) {
    ## $result is a Bio::Search::Result::ResultI compliant object
    while ( my $hit = $result->next_hit ) {
      ## $hit is a Bio::Search::Hit::HitI compliant object
      while ( my $hsp = $hit->next_hsp ) {
        ## $hsp is a Bio::Search::HSP::HSPI compliant object
        if ( $hsp->length('total') > 50 ) {
          if ( $hsp->percent_identity >= 75 ) {
            print "Query= ", $result->query_name,
              "Hit= ",        $hit->name,
              "Length= ",     $hsp->length('total'),
              "Percent_id= ", $hsp->percent_identity,
              "Subject=",     $hsp->hit_string, "\n";
          }
        }
      }
    }
  }
}
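
A note on the comment "this does not appear in the output files" in the
script above. One possible cause, assuming next_result hands out each result
only once (my understanding of how Bio::SearchIO iterates, not something
confirmed in this thread), is that the single result was already fetched for
write_result, so the later while loop finds nothing left. The sketch below
mimics that with a plain closure. make_iterator and collect_remaining are
made-up helper names, and no BioPerl is involved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a report object: make_iterator() returns a
# closure that hands out each item once and then returns undef.
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

# collect_remaining() drains whatever the iterator still holds.
sub collect_remaining {
    my ($next) = @_;
    my @seen;
    while ( defined( my $item = $next->() ) ) {
        push @seen, $item;
    }
    return @seen;
}

# One query per BLAST run means one result in the report.
my $next_result = make_iterator('result_for_query_1');

# Fetching the result up front (as done before write_result) uses it up ...
my $first = $next_result->();

# ... so a later loop over the same iterator has nothing left to visit.
my @left = collect_remaining($next_result);

print "fetched first: $first\n";
print "left for the loop: ", scalar(@left), "\n";
```

If that is what happens here, running the hit and HSP loops on the $result
that was already fetched, rather than calling next_result a second time,
should bring the screening output back.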


On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

>  Changed it to a generic result and added a writer and it seems to work:
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>     warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::GenericResult->new(-algorithm =>
> "blastn") or die $!;
>
>    # print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();
>
>     my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file =>
> ">$qid\.bls\.html", -format => "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>
>
> *From:* Mark A. Jensen [mailto:maj at fortinbras.us]
> *Sent:* Monday, 30 November 2009 10:19 a.m.
> *To:* Smithies, Russell; 'Tim Koehler'
>
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> My thought here was that since Tim's already going one at a time thru
>
> his queries, my scrap was not really necessary:
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>   # just write the result we got for this query into a
>
>    #new blast-formatted file...named after the id of the query seq...
>
>  my $result = $blast_report->next_result;
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format =>
> "blast" ) or die $!;
>
>   $blio->write_result($result);
>
>
>
>   # below, just looking at the current blast result
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
>  ----- Original Message -----
>
> *From:* Smithies, Russell <Russell.Smithies at agresearch.co.nz>
>
> *To:* 'Tim Koehler' <timbourine81 at googlemail.com> ; 'maj at fortinbras.us'
>
> *Sent:* Sunday, November 29, 2009 3:58 PM
>
> *Subject:* RE: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hi Tim
>
> With various people writing the "howtos" and other docs, the examples are
> bound to have differing names for the variables used but as long as you're
> consistent, it should all fit together.
>
>
>
> I think I've almost got your code working, just getting errors from
> Bio::Search::Result::BlastResult  which I'm not entirely sure how to use.
> Perhaps Mark can get this bit going?
>
>
>
> --Russell
>
> ===============================
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>
>
>   my %hits_by_query;
>
>
>
>        while ( my $result = $blast_report->next_result ) {
>
>          foreach my $hit ( $result->hits ) {
>
>                      warn "Pushed a hit for ",$hit->name, "\n";
>
>            push( @{ $hits_by_query{ $hit->name } }, $hit );
>
>          }
>
>        }
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>               warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::BlastResult->new() or die $!;
>
>     print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format =>
> "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
> ===============================
>
>
>
> *From:* Tim Koehler [mailto:timbourine81 at googlemail.com]
> *Sent:* Friday, 27 November 2009 10:24 p.m.
> *To:* Smithies, Russell; maj at fortinbras.us
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hey guys,
>
> please do not get me wrong: I did not want to put the workload on you. So
> far I had only found the HowTos, but the naming in them changed over time
> (e.g. $in to $Seq_in), and some things I simply could not find.
> Now I got a tip where else to search: the scrapbook and deobfuscator.
>
> I will have a look at that immediately.
>
> This is the first time I am touching linux / perl commands; that's why,
> after several days of trial and many errors ;), I thought of asking the
> mailing list.
>
> I was very happy about your fast answers!
>
> Cheers and a nice weekend,
>
> Tim
>
> On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com>
> wrote:
>
> ups, sent too early...
>
> Hey Mark,
>
> thanks for the answer. But I am still struggling, especially where to put
> in your code.
>
> Here is the code I have so far:
>
> #!/usr/bin/perl -w
>
> ### should I put your code here as push is a perl command?
>
>
> my %hits_by_query;
> for ($result->hits) {
>
> ### I inserted a comma after name}}; if there is no comma, there was the
> error: Scalar found where operator expected at
> 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
> ###        (Missing operator before  $hit?)
> ###Useless use of push with no values at
> 12_BLAST_two_sequence_each_query_one_file.PL line 7.
> ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7,
> near "} $hit"
> ###BEGIN not safe after errors--compilation aborted at
> 12_BLAST_two_sequence_each_query_one_file.PL line 13.
>
>
>  push @{$hits_by_query{$hit->name}}, $hit;
>
> ###here, every time this error appears: Name "main::result" used only
> once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> ###error: Can't call method "hits" on an undefined value at
> 12_BLAST_two_sequence_each_query_one_file.PL line 5.
>
>
> }
>
>
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
> my $Seq_in = Bio::SeqIO->new (
> -file =>
> "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
> -format => 'fasta'
> );
> while (my $query = $Seq_in->next_seq()) {
>
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(
>
> 'program' => 'blastn',
> 'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
> _READMETHOD => "Blast"
> );
>
> my $blast_report = $factory->blastall($query);
>
> ### Should I need to use a module? are the commands here at the right
> position? errors, e.g., Global symbol "$hit" requires explicit package name
> #my %hits_by_query;
> #for ($result->hits) {
> ### inserted comma after name}}
> # push @{$hits_by_query{$hit->name}}, $hit;
> #}
>
>
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> ###where are the files stored? what is their name. Sorry, but I cannot get
> behind that :(
>
> while( my $result = $blast_report->next_result ) {
>   ## $result is a Bio::Search::Result::ResultI compliant object
>
>
>   while( my $hit = $result->next_hit ) {
>
>    ## $hit is a Bio::Search::Hit::HitI compliant object
>
>
>    while( my $hsp = $hit->next_hsp ) {
>
>     ## $hsp is a Bio::Search::HSP::HSPI compliant object
>     if( $hsp->length('total') > 50 ) {
>      if ( $hsp->percent_identity >= 75 ) {
>      print  "Query= ",        $result->query_name,
>         "Hit= ",        $hit->name,
>             "Length= ",     $hsp->length('total'),
>             "Percent_id= ", $hsp->percent_identity,
>         "Subject=",        $hsp->hit_string,"\n";
>      }
>     }
>    }
>   }
> }
> }
>
> Again, a big thanks in advance :)
>
> All the best,
>
> Tim
>
> On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:
>
> Hey Mark,
>
> thanks for the answer
>
>
>
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sounds like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast'
> );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not
> found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round; first to make fasta files with
> only
> >>> single sequences inside and BLAST each file? But how can I automize
> that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>>
> >>>
> >>
> >>
> >>
> >
>
> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
>
>
>
>
>
>
>
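
The grouping idea quoted in this thread (a hash whose values are array refs,
filled with push) can be tried out without BioPerl. This is only a sketch
under assumptions: group_by_key is a made-up helper, and plain strings stand
in for the query names and hit objects. It also shows the comma that push
needs between the array and the value, which is what the syntax error
reported above was about.

```perl
use strict;
use warnings;

# Group [key, value] pairs into a hash of array refs, keyed by name.
sub group_by_key {
    my @pairs = @_;
    my %grouped;
    for my $pair (@pairs) {
        my ( $key, $value ) = @$pair;

        # Note the comma between the array and the value being pushed.
        push @{ $grouped{$key} }, $value;
    }
    return \%grouped;
}

# Strings stand in for query names and hits (illustration only).
my $by_query = group_by_key(
    [ 'query1', 'hitA' ],
    [ 'query2', 'hitB' ],
    [ 'query1', 'hitC' ],
);

for my $qid ( sort keys %$by_query ) {
    print "$qid: @{ $by_query->{$qid} }\n";
}
```

With real result objects, the key would presumably be $result->query_name
when the goal is one file per query; keying on $hit->name groups by hit
instead.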


From maj at fortinbras.us  Sun Nov  1 23:47:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 1 Nov 2009 23:47:15 -0500
Subject: [Bioperl-l] annotations
Message-ID: <5150801225E0484D95DC51B2D00AE519@NewLife>

I'm cogitating on features and annotations. For a RichSeq, one gets the set of annotations by

$seq->annotation->get_Annotations

while getting features by 

$seq->get_Features

Is there a reason not to have a method in SeqI 

sub get_Annotations { shift->annotation->get_Annotations }

to allow a user to do what seems natural from a user's perspective, viz. $seq->get_Annotations? I imagine this might save hundreds of hours of frustration, integrated over all newbies.
MAJ


From cjfields at illinois.edu  Mon Nov  2 08:08:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Nov 2009 07:08:54 -0600
Subject: [Bioperl-l] annotations
In-Reply-To: <5150801225E0484D95DC51B2D00AE519@NewLife>
References: <5150801225E0484D95DC51B2D00AE519@NewLife>
Message-ID: <6920A9E1-D221-4CF8-9866-0ADBDB254C19@illinois.edu>

On Nov 1, 2009, at 10:47 PM, Mark A. Jensen wrote:

> I'm cogitating on features and annotations. For a RichSeq, one gets  
> the set of annotations by
>
> $seq->annotation->get_Annotations
>
> while getting features by
>
> $seq->get_Features
>
> Is there a reason not to have a method in SeqI
>
> sub get_Annotations { shift->annotation->get_Annotations }
>
> to allow a user to do what seems natural from a user's perspective,  
> viz. $seq->get_Annotations? I imagine this might save hundreds of  
> hours of frustration, integrated over all newbies.
> MAJ

One could add the methods to delegate to annotation() (that's  
essentially what I'm planning on doing for Biome).

chris
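
For readers who want to see the delegation idea from this thread in
isolation, here is a minimal sketch. The Toy::Seq and
Toy::AnnotationCollection classes are made-up stand-ins, not the BioPerl
classes; only the shape of the forwarding method follows the one-liner
proposed above, with arguments passed through as well.

```perl
use strict;
use warnings;

# Toy stand-ins with made-up names; these are not the BioPerl classes.
package Toy::AnnotationCollection;

sub new {
    my ( $class, @annotations ) = @_;
    return bless { list => [@annotations] }, $class;
}

sub get_Annotations {
    my ($self) = @_;
    return @{ $self->{list} };
}

package Toy::Seq;

sub new {
    my ( $class, $collection ) = @_;
    return bless { annotation => $collection }, $class;
}

# Accessor for the annotation collection held by the sequence.
sub annotation {
    my ($self) = @_;
    return $self->{annotation};
}

# The convenience method: forward the call, and any arguments, to the collection.
sub get_Annotations {
    my $self = shift;
    return $self->annotation->get_Annotations(@_);
}

package main;

my $seq = Toy::Seq->new( Toy::AnnotationCollection->new( 'comment', 'reference' ) );
print join( ', ', $seq->get_Annotations ), "\n";
```

Both call styles keep working side by side, so existing code that goes
through annotation() would not need to change.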


From kiekyon.huang at gmail.com  Tue Nov  3 10:14:39 2009
From: kiekyon.huang at gmail.com (Kie Kyon Huang)
Date: Tue, 3 Nov 2009 23:14:39 +0800
Subject: [Bioperl-l] render_blast problem
Message-ID: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>

Hi,

I was trying to follow the HOWTO:Graphics at
http://www.bioperl.org/wiki/HOWTO:Graphics

When running the command line in cygwin

$ perl render_blast1.pl data1.txt | display -

I get the following error line,

bash: display: command not found

I also tried

$ perl render_blast1.pl data1.txt > data1.png

however, I was unable to open the data1.png file using Microsoft
Office Picture Manager or windows Photo Gallery

Thanks

Huang


From biopython at maubp.freeserve.co.uk  Tue Nov  3 10:45:37 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 15:45:37 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
Message-ID: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>

On Tue, Nov 3, 2009 at 3:14 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
> Hi,
>
> I was trying to follow the HOWTO:Graphics at
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> When running the command line in cygwin
>
> $ perl render_blast1.pl data1.txt | display -
>
> I get the following error line,
>
> bash: display: command not found

That makes sense on Windows, since display is a Unix
command line tool.

> I also tried
>
> $ perl render_blast1.pl data1.txt > data1.png

Based on the wiki, I think that ought to have worked.

> however, I was unable to open the data1.png file using Microsoft
> Office Picture Manager or windows Photo Gallery

Did you do this step?:
>> Important!  If you are on a Windows platform, you need to put
>> STDOUT into binary mode so that the PNG file does not go
>> through Window's carriage return/linefeed transformations.
>> Before the final print statement, put the statement
>> binmode(STDOUT). This advice also applies to certain older
>> versions of RedHat, which ship with a patched (and possibly
>> broken) version of Perl.

(BioPerl devs - couldn't that be added to the default
render_blast1.pl script with an if statement checking for
Windows?)

Peter
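
A minimal sketch of the platform check suggested above, under assumptions:
needs_binmode is a made-up helper name, and the list of $^O values is my
guess at the platforms where text-mode output rewrites newlines. In the
HOWTO script this would sit before the final print of the image data.

```perl
use strict;
use warnings;

# Return 1 on platforms where text-mode output may rewrite "\n" as CRLF.
# The list of $^O values is an assumption; adjust it for your setup.
sub needs_binmode {
    my ($os) = @_;
    return $os =~ /^(?:MSWin32|cygwin|dos|os2)$/ ? 1 : 0;
}

# Switch STDOUT to binary mode before any image bytes are printed.
binmode(STDOUT) if needs_binmode($^O);

# The final print statement of the script (the PNG data) would follow here.
```

Calling binmode(STDOUT) unconditionally is also an option, since it changes
nothing about newline handling on Unix-like systems.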


From biopython at maubp.freeserve.co.uk  Tue Nov  3 11:04:59 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 16:04:59 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
	<a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
Message-ID: <320fb6e00911030804r62e50da6w373bbb61e9823f28@mail.gmail.com>

Mailing list CC'd - solved :)

On Tue, Nov 3, 2009 at 3:55 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
>
> OK, that fixed it.
> I sometimes forget what platform I am on.
> Thanks

Great.

Peter


From amackey at virginia.edu  Tue Nov  3 12:09:00 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Tue, 3 Nov 2009 12:09:00 -0500
Subject: [Bioperl-l] svn errors?
Message-ID: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>

[ajm6q at lc4 bioperl-live]$ svn update
svn: Decompression of svndiff data failed


I'll admit to not having svn updated in a while; a clean, anonymous svn co
failed with the same message:

[...]
A    bioperl-live/Bio/Structure/StructureI.pm
A    bioperl-live/Bio/Structure/IO
svn: Decompression of svndiff data failed

-Aaron

P.S. I used this command: svn co svn://
code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live


From cjfields at illinois.edu  Tue Nov  3 12:17:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:17:10 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <8C5FC42D-F957-45AC-9AAC-876ACC9D77E0@illinois.edu>

Aaron,

Yep, this was reported to support (a couple of users on #bioperl  
reported the same problem).  Chris D. is looking into it.

I'm wondering if it's worth setting up a second mirror to github for  
this purpose.

chris

On Nov 3, 2009, at 11:09 AM, Aaron Mackey wrote:

> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
>
>
> I'll admit to not having svn updated in awhile; A clean, anonymous  
> svn co
> failed with the same message:
>
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
>
> -Aaron
>
> P.S. I used this command: svn co svn://
> code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live


From cjfields at illinois.edu  Tue Nov  3 12:19:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:19:56 -0600
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
Message-ID: <8336341C-C7B4-4740-A7C3-E2DE5FDAF651@illinois.edu>


On Nov 3, 2009, at 9:45 AM, Peter wrote:

> ...
> Did you do this step?:
>>> Important!  If you are on a Windows platform, you need to put
>>> STDOUT into binary mode so that the PNG file does not go
>>> through Window's carriage return/linefeed transformations.
>>> Before the final print statement, put the statement
>>> binmode(STDOUT). This advice also applies to certain older
>>> versions of RedHat, which ship with a patched (and possibly
>>> broken) version of Perl.
>
> (BioPerl devs - couldn't that be added to the default
> render_blast1.pl script with an if statement checking for
> Windows?)
>
> Peter

Yes, that should be added.  I'll work on it.

chris


From mauricio at open-bio.org  Tue Nov  3 12:20:52 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Tue, 03 Nov 2009 11:20:52 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <4AF06674.30506@open-bio.org>

Hi Aaron,

This was reported a few days ago. Chris Dagdigian is working today on a 
fix for it.

Mauricio.

Aaron Mackey wrote:
> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
> 
> 
> I'll admit to not having svn updated in awhile; A clean, anonymous svn co
> failed with the same message:
> 
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
> 
> -Aaron
> 
> P.S. I used this command: svn co svn://
> code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live
> 


From rachitasharma at gmail.com  Tue Nov  3 17:12:11 2009
From: rachitasharma at gmail.com (Rachita Sharma)
Date: Tue, 3 Nov 2009 14:12:11 -0800
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>

I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = Bio::SearchIO->new(
    -format => 'blast',
    -file   => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt"
);

while ( my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit ) {
        $sth->execute( $result->query_name, $hit->name, $hit->significance );
        print "Query executed!\n";
    }
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw
/usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


From cjfields at illinois.edu  Tue Nov  3 22:42:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 21:42:55 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
References: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
Message-ID: <DD8E7843-7181-45AD-95B1-FD877D0A5D4E@illinois.edu>

Rachita,

You'll have to give us more to go on than this.  The best thing to do  
is file a bug report and attach an example PSI-BLAST report and code  
that causes the problem.  The $sth->execute(...) is a bit odd, but  
that shouldn't cause the error in question.

Also, make sure to stipulate the OS, version of BioPerl, and perl  
version.

chris
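
The details asked for above can be collected with a few lines of Perl. This
is a sketch, not an official reporting tool: environment_report is a made-up
helper, and it assumes the BioPerl version can be read from the
Bio::Root::Version package when BioPerl is installed.

```perl
use strict;
use warnings;

# Collect the details a bug report usually needs: OS, perl and BioPerl versions.
sub environment_report {
    my %info = (
        os   => $^O,
        perl => sprintf( '%vd', $^V ),
    );

    # Bio::Root::Version is only loadable when BioPerl is installed.
    if ( eval { require Bio::Root::Version; 1 } ) {
        my $version = Bio::Root::Version->VERSION;
        $info{bioperl} = defined $version ? $version : 'unknown';
    }
    else {
        $info{bioperl} = 'not found';
    }
    return \%info;
}

my $report = environment_report();
print "$_: $report->{$_}\n" for sort keys %$report;
```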

On Nov 3, 2009, at 4:12 PM, Rachita Sharma wrote:

> I am having trouble parsing PSI-BLAST results. Please help.
>
> The code is:
> my $in = new Bio::SearchIO(        -format => 'blast',
>                                -file =>
> "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");
>
>
> while( my $result = $in->next_result ) {
> while( my $hit = $result->next_hit ) {
>
> $sth->execute($result->query_name, $hit->name, $hit->significance);
> print "Query executed!\n";
>
> }
> }
>
> The error is:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: no data for midline  ***** No hits found ******
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
> STACK: Bio::SearchIO::blast::next_result
> /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
> STACK: BSubVCpsiRblast.pl:92
> -----------------------------------------------------------


From alexl at users.sourceforge.net  Wed Nov  4 02:30:21 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Wed, 04 Nov 2009 02:30:21 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
Message-ID: <msd43yycfm.fsf@allele2.localdomain>

Does the version of ExtUtils::Manifest really need to be strictly
greater than or equal to 1.52?

Currently this blocks me updating the Fedora package of BioPerl to
1.6.1, because the version of perl that Fedora ships is on 1.51 and
hence the build fails with:

Checking prerequisites...
 - ERROR: ExtUtils::Manifest (1.51_01) is installed, but we need version >= 1.52

Full logs are here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1787483
http://koji.fedoraproject.org/koji/getfile?taskID=1787483&name=build.log

This is true even with the version of Perl in rawhide/F-12 etc.
(ExtUtils::Manifest is in the base perl package).

If it really is necessary, I would like to be armed with a good argument
why it needs to be updated, since the Perl package maintainer would have
to update the entire Perl package simply to get a more recent version of
one small subpackage.

Regards,
Alex
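
To check what the prerequisite test is comparing, the installed version and
the comparison itself can be reproduced with core modules. A small sketch;
meets_minimum is a made-up helper, and the two version strings are the ones
from the error message above.

```perl
use strict;
use warnings;
use version;
use ExtUtils::Manifest ();

# Compare two version strings the way a ">=" prerequisite would.
sub meets_minimum {
    my ( $installed, $required ) = @_;
    return version->parse($installed) >= version->parse($required) ? 1 : 0;
}

# Version of ExtUtils::Manifest that this perl actually loads.
print "installed ExtUtils::Manifest: ", ExtUtils::Manifest->VERSION, "\n";

# The pair reported by the failed build.
print "1.51_01 against 1.52: ",
    ( meets_minimum( '1.51_01', '1.52' ) ? "ok" : "too old" ), "\n";
```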


From jluis.lavin at unavarra.es  Wed Nov  4 03:43:35 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 09:43:35 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in a
 single list query
Message-ID: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>


Hello all,

I'm a newbie who is having a terrible time trying to retrieve a list of
multiple sequences from NCBI and write them to a single file in FASTA
format.
The code I've written seems to read my list and retrieve the sequences, but
it seems to overwrite them, so that I only get the last sequence on the list.
I've been told to ask the people on this mailing list for help, since you
may have come across this problem too, or at least will know how to solve
it...

Here is my code, which basically consists of a STDIN prompt for the name of
the list, which is read into an array, and a loop that takes each entry
(stopping when the list ends), retrieves the corresponding sequence, and
writes that sequence to a FASTA file. I only get one sequence back,
although it seems to perform the retrieval for each of the
sequences on the list...


#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
print "Enter your list name:";
my $archivo=<STDIN>;
chomp $archivo;
die ("Can't open input\n") unless (open(INFILE, $archivo));
my @lista = <INFILE>;
foreach my $seq (@lista) {
    if ($seq eq '') {
        die ("empty list")
        }
    else {
my $db = new Bio::DB::GenPept("-format" => "Fasta");
my $seqobj = $db->get_Seq_by_acc($seq);
my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;


An example list of sequences can be this one:

YP_003107578.1
YP_003106103.1
YP_003106552.1
YP_003106560.1
YP_003107053.1
YP_003107450.1
YP_003108000.1
YP_003105023.1
YP_003105264.1

Thanks in advance for your help ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From e.osimo at gmail.com  Wed Nov  4 04:54:52 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Wed, 4 Nov 2009 10:54:52 +0100
Subject: [Bioperl-l] Bio::Graphics and picture format
Message-ID: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>

Hello everyone,
do you know if it is possible to generate an image with Bio::Graphics in a
vector format? Is there a list of available formats?
Thanks
Emanuele


From David.Messina at sbc.su.se  Wed Nov  4 04:52:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 10:52:53 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>

>
> The code I've written seems to read my list and retrieve the sequences, but
> it kind of overwrites them, so that I only get the last sequence on the list.
>

With this line

my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta", -format =>
'fasta');


you are opening the filehandle for the output file inside your loop, so each
time it is writing over the previous file with an empty file. Then, you
write a single sequence to that file with this line

$out->write_seq($seqobj);


So when you are done, you just have the last sequence in the output file.

If you move the opening of the output filehandle outside the loop (it needs
to be done only once), then it should work as you expect.

Also, I notice the newline characters are not being removed from your
sequence IDs  (actually I'm a little surprised that the sequences are being
retrieved). Just to be safe, you may want to add the line

chomp @lista;


after

my @lista = <INFILE>;
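
The effect described above can be reproduced without BioPerl or network
access. The following is only a minimal sketch using plain Perl filehandles;
the three IDs and the temporary file names are placeholders chosen for the
illustration. In the original script, the equivalent change would be to move
the Bio::SeqIO constructor above the foreach loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Placeholder IDs standing in for the accession list (assumption).
my @ids = qw(YP_003107578.1 YP_003106103.1 YP_003106552.1);
my $dir = tempdir(CLEANUP => 1);

# Count the lines in a file.
sub count_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't read $path: $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

# Variant 1: open with '>' inside the loop. Every open truncates the
# file, so only the record written last survives.
my $inside = "$dir/inside.txt";
foreach my $id (@ids) {
    open(my $out, '>', $inside) or die "Can't write $inside: $!";
    print {$out} "$id\n";
    close $out;
}

# Variant 2: open once before the loop and reuse the same handle.
my $outside = "$dir/outside.txt";
open(my $out, '>', $outside) or die "Can't write $outside: $!";
foreach my $id (@ids) {
    print {$out} "$id\n";
}
close $out;

printf "opened inside the loop:  %d line(s)\n", count_lines($inside);
printf "opened outside the loop: %d line(s)\n", count_lines($outside);
```

The first count is 1 and the second is 3, which matches the "only the last
sequence" symptom and the fix.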


Dave


From jluis.lavin at unavarra.es  Wed Nov  4 05:14:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:14:40 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
Message-ID: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>

Thank you very, very much, Dave.
I've had a really frustrating time trying to find out what I was doing
wrong; it was so frustrating that I was about to quit BioPerl.
Now I can try to focus on BLAST parsing for my comparative genomic analysis.

You're great on this mailing list, because you give fast and neat advice
on all the questions asked here by newbies like me ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From hrh at fmi.ch  Wed Nov  4 05:05:17 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Wed, 04 Nov 2009 11:05:17 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <C717106D.54F2%hrh@fmi.ch>

Hi

try

my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
                                     ^

this way you no longer overwrite your existing file, but append the next
sequence.
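
One thing to keep in mind with append mode is that it also appends across
runs, so running the script twice leaves every record in the file twice. The
sketch below shows this with plain Perl filehandles; the IDs, the file name
and the run_once helper are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Placeholder IDs and output file (assumptions for the illustration).
my @ids  = qw(YP_003107578.1 YP_003106103.1 YP_003106552.1);
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/appended.txt";

# Count the lines in a file.
sub count_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't read $path: $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

# One "run" of the script: reopen in append mode for every record.
sub run_once {
    foreach my $id (@ids) {
        open(my $out, '>>', $file) or die "Can't append to $file: $!";
        print {$out} "$id\n";
        close $out;
    }
}

run_once();
my $after_first = count_lines($file);    # all three records are kept
run_once();
my $after_second = count_lines($file);   # the same records, now twice

print "after first run:  $after_first line(s)\n";
print "after second run: $after_second line(s)\n";
```

Deleting the output file before the loop (for example with unlink) is one
way to keep repeated runs from accumulating duplicates.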

Regards, Hans




From jluis.lavin at unavarra.es  Wed Nov  4 05:25:38 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:25:38 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 asingle list query
In-Reply-To: <C717106D.54F2%hrh@fmi.ch>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<C717106D.54F2%hrh@fmi.ch>
Message-ID: <1834.130.206.164.153.1257330338.squirrel@webmail.unavarra.es>

Thank you very much for your answer, Hans!!!
It works perfectly, and it's also a neat and fast solution, like Dave's.

Blessings to you all ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From scott at scottcain.net  Wed Nov  4 08:26:02 2009
From: scott at scottcain.net (Scott Cain)
Date: Wed, 4 Nov 2009 08:26:02 -0500
Subject: [Bioperl-l] Bio::Graphics and picture format
In-Reply-To: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
References: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
Message-ID: <0FB17FBC-16BE-4A9F-AC75-983D3B4ECE7D@scottcain.net>

Hi Emanuele,

It is possible to use GD::SVG instead of GD to generate SVG graphics.   
To use it, you provide an argument of "-image_class  GD::SVG" to the  
constructor of Bio::Graphics::Panel.  See the perldoc of  
Bio::Graphics::Panel for more info.
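
A minimal sketch of that setup, assuming Bio::Graphics and GD::SVG are both
installed. The single 1000 bp "ruler" feature, the panel dimensions and the
glyph options are placeholders chosen only to have something to draw.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Graphics;
use Bio::SeqFeature::Generic;

# Placeholder feature spanning the whole panel (assumption).
my $ruler = Bio::SeqFeature::Generic->new(-start => 1, -end => 1000);

# -image_class selects the drawing backend; GD::SVG replaces GD here.
my $panel = Bio::Graphics::Panel->new(
    -length      => 1000,
    -width       => 800,
    -pad_left    => 10,
    -pad_right   => 10,
    -image_class => 'GD::SVG',
);

# Draw the feature as a double-headed arrow with tick marks.
$panel->add_track(
    $ruler,
    -glyph   => 'arrow',
    -tick    => 2,
    -fgcolor => 'black',
    -double  => 1,
);

# With GD::SVG as the image class, the underlying image object
# renders itself as SVG text instead of PNG data.
my $svg = $panel->gd->svg;
print $svg;
```

Redirecting the output to a file (for example panel.svg) gives an image
that can be opened in a browser or a vector graphics editor.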

Scott


On Nov 4, 2009, at 4:54 AM, Emanuele Osimo wrote:

> Hello everyone,
> do you know if it is possible to generate an image with  
> Bio::Graphics in a
> vector format? Is there a list of available formats?
> Thanks
> Emanuele
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From b3sn7 at UNB.ca  Tue Nov  3 12:30:24 2009
From: b3sn7 at UNB.ca (Sharma, Rachita)
Date: Tue,  3 Nov 2009 13:30:24 -0400
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <1257269424.4af068b045434@webmail.unb.ca>


I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(	-format => 'blast',
				-file => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");


while( my $result = $in->next_result ) {
while( my $hit = $result->next_hit ) {

$sth->execute($result->query_name, $hit->name, $hit->significance);
print "Query executed!\n";  

}
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


*******************************
Rachita Sharma
Research Assistant (PhD Student)
University of New Brunswick, NB, CANADA
email: Rachita.Sharma at unb.ca
Phone no: 503-895-3619
*******************************


From cjfields at illinois.edu  Wed Nov  4 08:53:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:53:35 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <msd43yycfm.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
Message-ID: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>

Alex,

Not sure why ExtUtils::Manifest can't be bundled as a separate perl  
package alone.  It is part of perl core but it's also available on  
CPAN separately from perl itself:

http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

This is the commit message for that BTW.  This allows spaces in file  
names for the MANIFEST.  v1.52 is a bug fix and is required.

http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
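
To see which version a given perl actually ships, something like the
following can be run on the build host. It is only a sketch: the 1.52
threshold is taken from the error message quoted in this thread, and the
comparison relies on Perl's built-in VERSION method rather than on whatever
the BioPerl build script does internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use ExtUtils::Manifest;

# Minimum version reported by the failing prerequisite check.
my $required  = '1.52';
my $installed = ExtUtils::Manifest->VERSION;

# Calling VERSION with an argument dies if the installed module is
# older than the requested version, so eval turns that into a flag.
my $ok = eval { ExtUtils::Manifest->VERSION($required); 1 } ? 1 : 0;

print "ExtUtils::Manifest $installed installed, need >= $required: ",
      ($ok ? "ok" : "too old"), "\n";
```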

chris



From cjfields at illinois.edu  Wed Nov  4 08:55:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:55:34 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <1257269424.4af068b045434@webmail.unb.ca>
References: <1257269424.4af068b045434@webmail.unb.ca>
Message-ID: <70E34111-4E70-463D-86EE-06926EA57073@illinois.edu>

Rachita,

Asked and answered yesterday.  Please submit as a bug.

chris



From David.Messina at sbc.su.se  Wed Nov  4 09:11:43 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 15:11:43 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es> 
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com> 
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>

Aw shucks, José, glad I could be of help. There are plenty of people who
answer questions around here, but my timezone sometimes gives me an
advantage for the European ones. :)


Dave


From daniel.gaston at gmail.com  Wed Nov  4 09:45:04 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 10:45:04 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040645j1b28e727p5d7bf47a04db160b@mail.gmail.com>

Hi Everyone,

I have recently been playing around with SwissProt format flatfiles and want
to extract sequences based on subcellular localization. I notice in going
through the code for swiss.pm and swissdriver.pm that in both (more so in
swissdriver.pm) there are several steps where organelle information based on
the OG line could be extracted and added to data structure but isn't. It
seems that in both cases the OG line is being added in to the generic
lumping of data from the OC, OS, and OX lines in order to extract species
names and taxonomy information but getting rid of everything else. Is there
a particular reason for this or just a simple oversight? On the surface at
least it looks like a relatively simple modification to make although I
admit that I am not terribly adept at manipulating these SeqIO
datastructures.

Thanks for your time,

Dan


From daniel.gaston at gmail.com  Wed Nov  4 12:12:10 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 13:12:10 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040912pfd2483fwe44cd098beed73c7@mail.gmail.com>

Sorry folks, it appears I was just being a bonehead and didn't look closely
enough into the Bio::Annotation and Bio::Species objects that store all of this
data.

Dan



From jluis.lavin at unavarra.es  Thu Nov  5 10:28:23 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:28:23 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
Message-ID: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>


Hello to all,

I'm trying to write a script to retrieve a list of sequences from a local
FASTA file (for example, a FASTA archive where all the protein models of an
organism are stored). I would use this file as a kind of "local
database" (sorry if I'm mixing up a few concepts...).
I've been reading the BioPerl HOWTOs and I came across the
Bio::Index::Fasta tool.
If I didn't misunderstand what I read (which is quite possible given my low
level of programming), this indexing tool should do the job.
I wrote a couple of scripts based on the documentation I read about this
tool, but I don't seem to be able to create the index file to be used
later (to retrieve the sequences from).
- First of all, I want to ask the people in this forum whether
Bio::Index::Fasta is the right tool to choose for this task.
- Then I'll beg you to take a look at my scripts, because I can't seem to
catch the bug...

Best wishes to you all and thanks in advance ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 10:39:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:39:05 -0500
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <A28922858F64480ABD8A6696E269023C@NewLife>

José -- It looks like this is a good solution to your problem. Please send your
script so we can look at it.
cheers Mark


From jluis.lavin at unavarra.es  Thu Nov  5 10:46:36 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:46:36 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct
 use]
Message-ID: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>


---------------------------- Original message ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Date:    Thu, 5 November 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts: the first one creates the index and
the second one retrieves the sequence list from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2)And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not a total scum...

Thanks in advance ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN




From maj at fortinbras.us  Thu Nov  5 10:37:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:37:53 -0500
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI ina
	single list query
In-Reply-To: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
	<628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
Message-ID: <49075FDFF6764EE48E932D95EB994221@NewLife>

True, Dave, you compete only with crazed east coast core developers who're doing 
"just one more thing" at 2am....


From hrh at fmi.ch  Thu Nov  5 11:02:48 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 05 Nov 2009 17:02:48 +0100
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <C718B5B8.5561%hrh@fmi.ch>


Jluis

> -Then I'll beg you to take a look at my scripts, because I don't seem to
> catch the bug...

you haven't attached/included any scripts, have you?


Anyway, have you considered using BLAST indices (created with the additional
flag "-o") together with the tool 'fastacmd' (which is also included in the
NCBI BLAST binaries) as a simple (and very fast) alternative for fetching
sequences?


Regards, Hans


From maj at fortinbras.us  Thu Nov  5 11:02:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:02:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
Message-ID: <1984ED07F36C446284B25F617964B6C6@NewLife>

Hey José,
The first thing that jumps out is the index file name. Looks
like you create it as
PC9.fasta.idx
But you read it as
PC9.fasta
Not an unusual mistake. Do
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and see if it works.
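
For reference, here is a self-contained sketch of the two steps with the
index and FASTA names kept apart. It assumes BioPerl with a working DBM
backend is installed; the two-record toy FASTA file, the IDs seq1 and seq2,
and the temporary directory are invented for the illustration. A few other
details in the posted scripts may also need attention: the name read from
STDIN is never chomped, make_index() is handed a freshly declared (and so
undefined) variable, <lCS.txt> is a file glob that yields the file name
rather than the file's lines, and Bio::SeqIO is never loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Bio::Index::Fasta;
use Bio::SeqIO;

my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/PC9.fasta";   # toy stand-in for the real FASTA file
my $index = "$fasta.idx";       # the index lives in its own file

# Write a two-record toy FASTA file (placeholder sequences).
open(my $fh, '>', $fasta) or die "Can't write $fasta: $!";
print {$fh} ">seq1\nMKTAYIAKQR\n>seq2\nMSDNELLQKA\n";
close $fh;

# Step 1: build the index from the FASTA file.
my $build = Bio::Index::Fasta->new(-filename => $index, -write_flag => 1);
$build->make_index($fasta);
undef $build;                   # release the index before reopening it

# Step 2: reopen the index (not the FASTA file) for reading.
my $inx = Bio::Index::Fasta->new(-filename => $index);

# Open the output once, before the loop, so nothing is overwritten.
my $out = Bio::SeqIO->new(-file   => ">$dir/extracted_seqs.fasta",
                          -format => 'fasta');

# In a real script these IDs would be read from the list file and chomped.
my @ids     = qw(seq2 seq1);
my $written = 0;
foreach my $id (@ids) {
    my $seqobj = $inx->fetch($id);
    unless ($seqobj) {
        warn "ID '$id' not found in index\n";
        next;
    }
    $out->write_seq($seqobj);
    $written++;
}
$out->close;

print "wrote $written sequence(s)\n";
```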
MAJ
----- Original Message ----- 
From: <jluis.lavin at unavarra.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, November 05, 2009 10:46 AM
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct 
use]


---------------------------- Original message ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Date:    Thu, 5 November 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts: the first one creates the index and
the second one retrieves the sequence list from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2) And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not total rubbish...

Thanks in advance ;)


On Thu, November 5, 2009, 16:39, Mark A. Jensen wrote:
> José -- It looks like this is a good solution to your problem. Please send
> your script so we can look at it-
> cheers Mark
> ----- Original Message -----
> From: <jluis.lavin at unavarra.es>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Thursday, November 05, 2009 10:28 AM
> Subject: [Bioperl-l] A question about iBio::Index: and its correct use
>
>
>
> Hello to all,
>
> I'm trying to write a script to retrieve a list of sequences from a local
> FASTA file (for example a FASTA archive where all the protein models of an
> organism are stored). I would use this file as a kind of "local
> database" (sorry if I am mixing up a few concepts...)
> I've been reading the BioPerl HOWTOs and I came across the
> Bio::Index::Fasta tool.
> If I didn't misunderstand what I read (which is quite possible given my low
> level of programming), this indexing tool should do the job.
> I wrote a couple of scripts based on the documentation I read about this
> tool, but I don't seem to be able to create the index file to be used
> later (to retrieve the sequences from).
> -First of all, I want to ask the people in this forum whether
> Bio::Index::Fasta is the right one to choose for this task.
> -Then I'll beg you to take a look at my scripts, because I can't seem to
> catch the bug...
>
> Best wishes to you all and thanks in advance ;)
>
> --
> José Luis Lavín Trueba, PhD
>
> Dpto. de Producción Agraria
> Grupo de Genética y Microbiología
> Universidad Pública de Navarra
> 31006 Pamplona
> Navarra
> SPAIN
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN



From jluis.lavin at unavarra.es  Thu Nov  5 11:21:57 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 17:21:57 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
 correct use]
In-Reply-To: <1984ED07F36C446284B25F617964B6C6@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
Message-ID: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>

Thank you very much Mark, that's a good point :$
I guess your correction refers to the second script, doesn't it?

If so, there is still a problem with the first script: it doesn't
create the PC9.fasta.idx file; instead it creates two files named:
-PC9.fasta.idx.pag
-PC9.fasta.idx.dir

which clearly seem to be related to some kind of indexing process...but,
unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
find it anywhere...
Forgive me if I'm talking nonsense...

Thank you very much again for your help ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 11:39:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:39:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
	<2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
Message-ID: <A1ACC4B552514872B77208248B31977C@NewLife>

Yes, these are files created by SDBM, Perl's internal db manager. You should
be able to open the index by simply
$inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and the dbm will know what to do--
cheers MAJ
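
A small core-Perl sketch of why no file named exactly PC9.fasta.idx shows up, assuming (as stated above) that the index is written through SDBM. Tying a hash to a base name with SDBM_File leaves a .dir and a .pag file on disk, and the same base name is what gets passed when the index is opened again. The key and value stored here are invented for the demonstration and are not what Bio::Index stores.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory so nothing is left behind.
my $dir  = tempdir( CLEANUP => 1 );
my $base = File::Spec->catfile( $dir, 'PC9.fasta.idx' );

# Tie a hash to the base name; SDBM adds its own file extensions.
tie my %index, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $base: $!";
$index{seq1} = 'dummy record';    # invented entry, for illustration only
untie %index;

# List what was actually created on disk.
opendir my $dh, $dir or die "Cannot read $dir: $!";
my @created = sort grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

print "$_\n" for @created;
```

On a typical Unix-like system the listing shows the two files José found, and no plain PC9.fasta.idx.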


From jluis.lavin at unavarra.es  Thu Nov  5 12:48:12 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 18:48:12 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <C718B5B8.5561%hrh@fmi.ch>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
	<C718B5B8.5561%hrh@fmi.ch>
Message-ID: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>

Thanks a lot for your help Hans,
It's a little too hard for me to understand this awesome information you've
just given me and turn it into a script...I hope I can use it in the near
future anyway ;)
The issue here is that the sequences I'm indexing are neither generated by
NCBI nor stored there...although I believe you're just referring to the tool
itself and not to retrieval from NCBI.

Thanks again, you're all great at giving advice to newbies like me ;)

Best wishes to you all



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From florent.angly at gmail.com  Thu Nov  5 13:00:19 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 05 Nov 2009 10:00:19 -0800
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<C718B5B8.5561%hrh@fmi.ch>
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
Message-ID: <4AF312B3.9060009@gmail.com>

Hans-Rudolf was talking about a way to retrieve sequences from a BLAST 
database. If you use BLAST locally, then your database is local too.
More info here: 
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
Florent




From valiente at lsi.upc.edu  Fri Nov  6 03:06:48 2009
From: valiente at lsi.upc.edu (valiente at lsi.upc.edu)
Date: Fri, 6 Nov 2009 09:06:48 +0100 (CET)
Subject: [Bioperl-l] Bio::SeqIO::genbank.pm
Message-ID: <45737.147.83.59.225.1257494808.squirrel@webmail.lsi.upc.edu>


There is a line in Bio::SeqIO::genbank.pm that converts the data in the
classification lines into a classification array by splitting only on ';'
or '.', so that a classification that is 2 or more words will still get
matched,

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;\.]+/, $class_lines;

but this will break organism names that have a dot inside, such as
"Salmonella enterica subsp. enterica serovar Typhimurium", which is now
being broken into "Salmonella enterica subsp" and "enterica serovar
Typhimurium".

Changing [;\.] to [;] solves this issue,

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;]+/, $class_lines;

Does anybody want to further test it before I commit this change?

Thanks,
Gabriel
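
For anyone who wants to try the two variants side by side, here is a standalone sketch that wraps the quoted map/split expression in a helper. The name split_classification and the input string are made up for the comparison; only the regular expressions come from the message above.

```perl
use strict;
use warnings;

# Split a classification string the way the quoted line does, with the
# separator character class passed in so both variants can be compared.
sub split_classification {
    my ( $class_lines, $separators ) = @_;
    return map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_ }
        split /(?<!subgen)$separators+/, $class_lines;
}

# Illustrative input only, built around the organism name from the message.
my $lines = 'Bacteria; Proteobacteria; '
          . 'Salmonella enterica subsp. enterica serovar Typhimurium';

my @old = split_classification( $lines, qr/[;.]/ );    # current behaviour
my @new = split_classification( $lines, qr/[;]/ );     # proposed change

print 'old: ', join( ' | ', @old ), "\n";
print 'new: ', join( ' | ', @new ), "\n";
```

With the current character class the name is cut after "subsp"; with the proposed one it stays in a single element.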


From jluis.lavin at unavarra.es  Fri Nov  6 03:44:45 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Fri, 6 Nov 2009 09:44:45 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <4AF312B3.9060009@gmail.com>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
	<C718B5B8.5561%hrh@fmi.ch>
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
	<4AF312B3.9060009@gmail.com>
Message-ID: <1222.130.206.164.153.1257497085.squirrel@webmail.unavarra.es>

Thank you for the info Florent!
I'll try to read all the information at the link you provided and try to
figure out how to make it work and whether it is worthwhile for me. I mean, I
work with several sequence files that come from multiple databases (JGI, BROAD,
Genolevures or NCBI). The protein IDs from each of those databases are
different from NCBI's. Maybe it would be easier to write a script that
lets me enter a FASTA file with all the protein models of a single
organism, parse it and then extract the sequences in a given list (using
the "ID style" of the particular database) than to create a BLAST index for
each organism I need to work with...Did I explain the issue correctly?
Anyway, since I don't know anything about this tool that Hans and you pointed
me to, I can easily be wrong...
Thank you for showing me the local BLAST index tool; I'll read the
documentation carefully and study all its possibilities.

Best wishes

JL



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Fri Nov  6 07:45:01 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 07:45:01 -0500
Subject: [Bioperl-l] Bioperl
In-Reply-To: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
References: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
Message-ID: <AE7A03CA8F45495C9F8D940AC0EC6D69@NewLife>

Hi Resmi-
You should look at http://bioperl.org/ under "Installation" for 
information on getting and installing BioPerl. An introduction 
to working with trees in BioPerl is at this link:
http://www.bioperl.org/wiki/HOWTO:Trees
cheers, 
Mark

----- Original Message ----- 
  From: Resmi S. 
  To: maj at fortinbras.us 
  Sent: Friday, November 06, 2009 7:27 AM
  Subject: Bioperl


  Respected Sir,
  I am Resmi S, studying II MSc Bioinformatics. I am now doing my project on phylogenetic tree construction using BioPerl. I am not very familiar with the BioPerl modules, so could you please send me the names of the BioPerl modules needed for my project? I also need to know where I can get these modules. If that is CPAN, then please send me the location or link. I kindly request you to send me the details soon.

  Yours Sincerely,
     Resmi S,
     II MSc Bioinformatics,
     School of Biotechnology,
     Amrita Vishwa Vidyapeetham,
      Email : amm08bi019 at students.amrita.ac.in



From robert.bradbury at gmail.com  Fri Nov  6 12:35:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 6 Nov 2009 12:35:22 -0500
Subject: [Bioperl-l] Function that determines serious mutations
Message-ID: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>

Is there a function in the library (or has someone written one) that can
take a genbank entry and determine which mutations are harmful?

It would be used to produce a table summary of:
  GENE          # SNP      # BadSNP

One kind of gets this from NCBI: if you look up a gene name in the "GENE" db
and then go to the "GeneView" on the dbSNP page, it has the information I want,
but largely in a graphical format, while I simply want numbers I can dump
into a spreadsheet.

I don't think it would be hard: fetch the gene, run through the features for
the SNP database, figure out whether they are good or bad SNPs, accumulate
the statistics and dump them.  I think the functions available are flexible
enough to do it, but I can't believe nobody has already done it.  It could be
a bit more complex, in that one could do an analysis to see if the mutations
are in a conserved domain or are mutations that code for Cysteine or Methionine
(or other potentially "critical" amino acids), but since "critical" is in the
eye of the beholder there would have to be some kind of callback to a
scoring function.
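
The accumulate-and-score idea can be sketched in plain Perl as follows. This is not an existing BioPerl function: tally_snps, the record layout and the sample records are all hypothetical, and real records would have to be extracted from the variation features of a GenBank entry first. The scoring rule is passed in as a callback, since what counts as "critical" is left to the caller.

```perl
use strict;
use warnings;

# Tally SNPs per gene; $is_bad is the caller-supplied scoring callback.
sub tally_snps {
    my ( $snps, $is_bad ) = @_;
    my %table;
    for my $snp (@$snps) {
        my $row = $table{ $snp->{gene} } ||= { snp => 0, bad => 0 };
        $row->{snp}++;
        $row->{bad}++ if $is_bad->($snp);
    }
    return \%table;
}

# Made-up records for illustration only.
my @snps = (
    { gene => 'GENE_A', ref_aa => 'C', alt_aa => 'S' },
    { gene => 'GENE_A', ref_aa => 'L', alt_aa => 'L' },
    { gene => 'GENE_B', ref_aa => 'M', alt_aa => 'V' },
);

# One possible notion of "critical": a change that replaces Cys or Met.
my $loses_cys_or_met = sub {
    my ($snp) = @_;
    return $snp->{ref_aa} ne $snp->{alt_aa} && $snp->{ref_aa} =~ /^[CM]$/;
};

my $table = tally_snps( \@snps, $loses_cys_or_met );

# Print the GENE / # SNP / # BadSNP summary table.
printf "%-10s %5s %8s\n", 'GENE', '# SNP', '# BadSNP';
printf "%-10s %5d %8d\n", $_, @{ $table->{$_} }{qw(snp bad)}
    for sort keys %$table;
```

A different callback (for example one that checks conserved domains) could be swapped in without touching the tallying code.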

Thanks,
Robert


From nevoband at igb.uiuc.edu  Fri Nov  6 15:58:05 2009
From: nevoband at igb.uiuc.edu (kleenix)
Date: Fri, 6 Nov 2009 12:58:05 -0800 (PST)
Subject: [Bioperl-l]  StandAloneBlast Unallowed parameter
Message-ID: <26230896.post@talk.nabble.com>


I'm not sure if I'm doing this wrong. I am trying to use the -m parameter of
blastall through the StandAloneBlast BioPerl class.
When I add 'm'=>0 to @params I get an "Unallowed parameter" error.
Am I adding the parameter wrong? I'm using StandAloneBlast version 1.51.

Thanks

-Nevo
-- 
View this message in context: http://old.nabble.com/StandAloneBlast-Unallowed-parameter-tp26230896p26230896.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From veronica.xiaoyu at gmail.com  Fri Nov  6 17:25:04 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 6 Nov 2009 17:25:04 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change the
	description's name of each hit?
Message-ID: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>

Hi,

I'm using Bio::SearchIO::Writer::HTMLResultWriter to help me convert BLAST
output files into HTML.

Does anybody know how to parse and change the description of each hit?

Calling $hit->description returns a hit's description, but it does not seem
to allow the description to be modified.

Thank you very much,
Xiaoyu


From maj at fortinbras.us  Fri Nov  6 19:40:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 19:40:17 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change
	thedescription's name of each hit?
In-Reply-To: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
References: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
Message-ID: <11592B31D9924FA7A8638D90AE4A3F4A@NewLife>

Xiaoyu-
That method should work to change the description; are you doing

$hit->description('This is my new description');

This method returns the old description when you change the value:

$hit->description('old');
$str = $hit->description('new'); # $str eq 'old'
$str = $hit->description;            # $str eq 'new'

MAJ
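
The get/set behaviour described above can be illustrated with a toy class. ToyHit is a hypothetical stand-in written for this note, not the BioPerl hit class; it only mimics the convention that the combined getter/setter hands back the value held before the call.

```perl
use strict;
use warnings;

package ToyHit;    # hypothetical stand-in, not the BioPerl hit class

sub new {
    my ( $class, %args ) = @_;
    return bless { description => $args{description} }, $class;
}

# Combined getter/setter: always returns the value held before the call.
sub description {
    my ( $self, @new ) = @_;
    my $previous = $self->{description};
    $self->{description} = $new[0] if @new;
    return $previous;
}

package main;

my $hit      = ToyHit->new( description => 'old' );
my $returned = $hit->description('new');    # sets; returns previous value
my $current  = $hit->description;           # plain get
print "returned=$returned current=$current\n";
```

So a return value that still shows the old text right after a set call does not mean the change failed; a second, argument-free call shows the stored value.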



From Daniel.Lang at biologie.uni-freiburg.de  Sun Nov  8 09:50:48 2009
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Sun, 08 Nov 2009 15:50:48 +0100
Subject: [Bioperl-l] arguments to call back functions in GBrowse2
Message-ID: <4AF6DAC8.8070204@biologie.uni-freiburg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Lincoln,

a while back (May 29, 2009; 09:08pm) you replied to an even older thread
("Re: Access the parent of a Bio::DB::SeqFeature within a gbrowse config
callback function").

I missed your reply and did not follow it up back then, sorry!

I'm currently facing the same issue again with gbrowse2. I have a
callback function for "balloon click". Following your last reply I
expected 5 arguments, but I am getting only three: $feature,$panel,$track.

In principle, I am using the latest releases/checkouts...
Which modules do I need to look at/update for this functionality?

Furthermore, is there a possibility to share global variables between
gbrowse2 and slaves? Should this work via init_code?
Should modules initialized in a conf be in the scope of a slave?

If not can I introduce modules via the slave config files, or do I need
to alter the slave scripts?


Thanks, again!

Cheers,
Daniel


PS: gbrowse2 rocks!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkr22sUACgkQmJnbCpJAG3A2MgCdG61bNRGMFVWExagzMFejKMjO
FiUAn16nQNemDGSy8nJBS5dUHQMnDgrP
=ODxn
-----END PGP SIGNATURE-----


From maj at fortinbras.us  Sun Nov  8 11:09:43 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:09:43 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
Message-ID: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>

Hi All- 
Any plans in the works for a _possibly_fastq sequence guesser?
MAJ


From maj at fortinbras.us  Sun Nov  8 11:20:55 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:20:55 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
In-Reply-To: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
References: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
Message-ID: <E2407ED235C24BFF9A03377416109318@NewLife>

Never mind; got it covered-- MAJ


From saikari78 at gmail.com  Mon Nov  9 10:47:10 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 15:47:10 +0000
Subject: [Bioperl-l] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090747p6702c62fibd7e8310d3a72dae@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From saikari78 at gmail.com  Mon Nov  9 11:05:57 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:05:57 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From cjfields at illinois.edu  Mon Nov  9 11:27:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 10:27:10 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
Message-ID: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>

On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:

> Hi,
>
> I'm using Bioperl to retrieve records from PubChem.
> I'm trying to find a way-but have been unsuccessful- to retrieve  
> from a
> compound record, the reference to the protein(s) that can synthesize  
> the
> compound.
> Thanks very much.
>
> saikari

The BioPerl script below returns the GIs for proteins that correspond
to the substance passed on the command line; invoke it using 'perl
pc_substance.pl substance_requested'.  It probably needs more fiddling
to catch everything, but it should get you started.

For other bits and pieces (such as how to retrieve the raw sequence  
files), please see the EUtilities HOWTO:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

----------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $substance = shift;

my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                      -db => 'pcsubstance',
                                      -term => $substance,
                                      -usehistory => 'y');

my $hist = $eutil->next_History || die;

$eutil->reset_parameters(-eutil => 'elink',
                        -history => $hist,
                        -db      => 'protein',
                        -dbfrom  => 'pcsubstance',
                        -retmax  => 1000);

say join(',',$eutil->get_ids);


From saikari78 at gmail.com  Mon Nov  9 11:41:20 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:41:20 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
Message-ID: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>

Fabulous! Huge help.
saikari



From gc11song at gmail.com  Mon Nov  9 13:08:48 2009
From: gc11song at gmail.com (Guangchun Song)
Date: Mon, 9 Nov 2009 12:08:48 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
Message-ID: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>

Hello,

I'm a new BioPerl user.  I'm working on a project to determine the
status of all putative SNPs (non-synonymous vs. synonymous) and to
predict the translational effect of non-synonymous mutations as benign
or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
translate it to a protein sequence for the SNPs that are in a gene's coding
region.  Could someone tell me how to do it?

Thanks,

-Guangchun Song


From robert.bradbury at gmail.com  Mon Nov  9 16:15:33 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 9 Nov 2009 16:15:33 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
Message-ID: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>

On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>
> I'm a new BioPerl user.  I'm working on a project to determine the
> status of all putative SNPs (non-synonymous vs. synonymous) and to
> predict the translational effect of non-synonymous mutations as benign
> or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
> translate it to a protein sequence for the SNPs that are in a gene's coding
> region.  Could someone tell me how to do it?
>
>
I too would like to know if this information is available.  I've recently
been working with the dbSNP results from NCBI, but they display the results
in a graphical format rather than as data that one can play with and ask
questions of, like "What is the most disease-causing gene in the Human
Genome?" or "What are the critical proteins damaged by gene defects in the
Human Genome?" ... "In terms of premature deaths, extended health care
requirements, loss of quality of life, etc.?"

The same types of questions can be applied to the dog and cat genomes, where
there is emotional value, or the cow, horse, pig, etc. genomes, where there is
economic value.

The value of BioPerl would increase significantly if there were
functionality that would allow easy access to "these mutations may have
negative/positive impact" (which means you need a function that qualifies
mutations by degree) and allow for impact to be subjectively determined
(implying there must be some callback function to provide a user
quality/impact rating).

For example:
   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
@critical_domain, $callback)
Where $callback could "rate" differences about the protein and position and
the "type of interest" (e.g. metal binding amino acids, structural changing
amino acids, critical catalysis amino acids, etc.).

A default callback would be based on some evolving definition of "critical"
changes which result in human disease for example.
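
As a rough sketch of the kind of interface meant here, assuming two
pre-aligned protein strings of equal length: the name protein_compare, its
named arguments and the default rating rule are all hypothetical, and this
is not an existing BioPerl function.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical interface: compare two aligned protein strings of equal
# length and rate every difference through a user-supplied callback.
sub protein_compare {
    my (%arg) = @_;
    my ($mine, $ref) = @arg{qw(seq ref)};
    die "sequences must have equal length\n"
        unless length($mine) == length($ref);

    my %critical = map { $_ => 1 } @{ $arg{critical_pos} || [] };
    # Default rating (an assumption): 2 for a critical position, 1 otherwise.
    my $rate = $arg{callback} || sub { $_[0]{critical} ? 2 : 1 };

    my @differences;
    for my $i (0 .. length($ref) - 1) {
        my ($r, $m) = (substr($ref, $i, 1), substr($mine, $i, 1));
        next if $r eq $m;
        my %diff = (pos => $i + 1, ref => $r, alt => $m,
                    critical => $critical{$i + 1} ? 1 : 0);
        $diff{score} = $rate->(\%diff);    # the callback sees the whole record
        push @differences, \%diff;
    }
    return @differences;
}

my @diffs = protein_compare(
    seq          => 'MKTHYC',
    ref          => 'MKTAYC',
    critical_pos => [4],                   # 1-based positions of interest
);
printf "%s%d%s score=%d\n", @{$_}{qw(ref pos alt score)} for @diffs;
```

A caller interested in, say, metal-binding residues would pass its own
callback and return whatever score suits that "type of interest".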

This is a "required" capability to be able to determine things like the
"adaptability" of a species -- those with the fewest critical mutation points
may have better adaptability to mutation-increasing circumstances.

Please pardon any errors in Perl syntax/usage; it's been a while since I've
written Perl and I'd really rather be coding in C.

Robert


From maj at fortinbras.us  Mon Nov  9 16:56:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 9 Nov 2009 16:56:24 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA
	sequencesaround novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <3ED3D387B5DE4248A218D42882369925@NewLife>

I agree that BioPerl would significantly increase in value with
such a module; in fact, the BioTeam would probably buy us out.
My opinion is that the entire GWAS enterprise is the search for
such a callback function, for humans anyway. For those engaged
in this quest, if BioPerl doesn't provide a Maserati, it at least provides
good Italian-made (among other) parts.
MAJ


From alexl at users.sourceforge.net  Mon Nov  9 18:44:07 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Mon, 09 Nov 2009 18:44:07 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu> (Chris
	Fields's message of "Wed, 4 Nov 2009 07:53:35 -0600")
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
Message-ID: <nmocnbuuuw.fsf@allele2.localdomain>

>>>>> Chris Fields  writes:

> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
> perl package alone.  It is part of perl core but it's also available
> on CPAN separately from perl itself:

> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

Hi Chris,

Yes, in principle it would be possible to have this split out as a
separate package (currently it's a "subpackage" under the main perl
package). Unfortunately, that's just not the way it's currently done in
Fedora (probably because it's part of the core set and they like to
update all relevant packages in one step), and I have little control over
that.

As I suspected, the perl maintainer is not at all enthusiastic about
updating the whole of perl just for that package (except in rawhide,
which would mean that bioperl 1.6.1 would not be available until F-13,
about 6 months from now).  See:

http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1

Obviously I am not happy with this situation either, because it will
freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
recommend any temporary workarounds in the meantime?

> This is the commit message for that BTW.  This allows spaces in file
> names for the MANIFEST.  v1.52 is a bug fix and is required.

> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

Perhaps I could create a patch that renamed files with spaces in them to
ones with no spaces and then rename them again upon installation.

Can you point me to which files are the problematic ones that triggered
the dependency for 1.52?  Perhaps I can figure a workaround.

Meanwhile I will press the maintainer of perl in Fedora to perhaps
reconsider his position (e.g. if another update for perl is going out
for another reason, like a security update, perhaps he could roll in the
1.52 update at the same time).

Cheers,
Alex


From cjfields at illinois.edu  Mon Nov  9 19:50:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 18:50:00 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <nmocnbuuuw.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
	<nmocnbuuuw.fsf@allele2.localdomain>
Message-ID: <29EA2398-F60B-48F2-AFE7-39A44011C451@illinois.edu>

On Nov 9, 2009, at 5:44 PM, Alex Lancaster wrote:

>>>>>> Chris Fields  writes:
>
>> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
>> perl package alone.  It is part of perl core but it's also available
>> on CPAN separately from perl itself:
>
>> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm
>
> Hi Chris,
>
> Yes, in principle it would be possible to have this split out as a
> separate package (currently it's a "subpackage" under the main perl
> package), unfortunately that's just not the way it's currently done in
> Fedora (probably because it's part of the core set and they like to
> update all relevant packages in one step) and I have little control  
> over
> that.
>
> As I suspected, the perl maintainer is not at all enthusiastic for
> updating the whole of perl just for that package (except for rawhide
> which would mean that bioperl 1.6.1 would not be available until F-13,
> about 6 months from now).  See:
>
> http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1
>
> Obviously I am not happy with this situation either, because it will
> freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
> recommend any temporary workarounds in the meantime?

Well, if you don't absolutely require the MANIFEST for the final  
package you can forego the requirement.  The file in question that  
triggered the requirement is a data file used only for testing:

t/data/test 2.txt

>> This is the commit message for that BTW.  This allows spaces in file
>> names for the MANIFEST.  v1.52 is a bug fix and is required.
>
>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
>
> Perhaps I could create a patch that renamed files with spaces in  
> them to
> ones with no spaces and then rename them again upon installation.
>
> Can you point me to which files are the problematic ones that  
> triggered
> the dependency for 1.52?  Perhaps I can figure a workaround.
>
> Meanwhile I will press the maintainer of perl in Fedora to perhaps
> reconsider his position (e.g. if another update for perl is going out
> for another reason, like a security update, perhaps he could roll in  
> the
> 1.52 update at the same time).
>
> Cheers,
> Alex

I would point out that this is a fairly significant bug fix for  
ExtUtils::Manifest.  A newer point release of perl is now available  
(5.10.1) that contains the fix and has a fix for a performance  
regression that popped up in 5.10.0.
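
To see whether a given perl already ships a new enough ExtUtils::Manifest,
a check along these lines should do. It is only a sketch; the helper name
manifest_is_at_least is made up for illustration, and it relies on nothing
beyond the standard VERSION method and the 1.52 minimum mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use ExtUtils::Manifest;

# Returns 1 if the installed ExtUtils::Manifest is at least $min, else 0.
# VERSION($min) dies when the installed version is too old, so trap that.
sub manifest_is_at_least {
    my $min = shift;
    return eval { ExtUtils::Manifest->VERSION($min); 1 } ? 1 : 0;
}

printf "ExtUtils::Manifest %s: %s\n",
    ExtUtils::Manifest->VERSION,
    manifest_is_at_least('1.52')
        ? 'new enough for file names with spaces'
        : 'older than 1.52';
```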

chris


From jay at jays.net  Mon Nov  9 19:05:51 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 9 Nov 2009 18:05:51 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
Message-ID: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>

Many thanks to Ewan Birney et al. for Bio::Index::*

I can throw away my awful grep-based index-by-accession stuff.   :)

Any chance someone has also written an organism-based index mechanism?
Something like...

while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
    print $seq->display_id . "\n";
}

Thanks,

j


From cjfields at illinois.edu  Mon Nov  9 22:55:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 21:55:01 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
Message-ID: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>

On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:

> Many thanks to Ewan Birney et al. for Bio::Index::*
>
> I can throw away my awful grep-based index-by-accession stuff.   :)
>
> Any chance someone has also written an organism-based index
> mechanism? Something like...
>
> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>   print $seq->display_id . "\n";
> }
>
> Thanks,
>
> j

It should work via id_parser(); from Bio::Index::GenBank:

    $inx->id_parser(\&get_id);
    # make the index
    $inx->make_index($file_name);

    # here is where the retrieval key is specified
    sub get_id {
       my $line = shift;
       $line =~ /clone="(\S+)"/;
       $1;
    }

Change the code ref to deal with the line you want and parse the name
out.  Caveat: this may not be absolutely perfect (it only passes in one
line at a time, and some species lines will wrap).  I'm also not sure how
this would work in cases where multiple sequences from the same
species are present.

The other option is to preparse everything and tie a hash to store a  
species->UID map, then use that along with your Bio::Index index to  
grab what you need.
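
A minimal sketch of that second option follows. The helper names
add_to_map and ids_for are made up for illustration, the map is a plain
in-memory hash (it could be tie()d to a DBM file for large data sets), and
it assumes the Bio::Index::GenBank index can be queried by accession
number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Record one sequence id under its species name. Ids are kept as a
# space-separated string so the hash could also be tied to a DBM file.
sub add_to_map {
    my ($map, $species, $id) = @_;
    $map->{$species} = defined $map->{$species} ? "$map->{$species} $id" : $id;
}

# Return the ids of every species whose name matches $pattern.
sub ids_for {
    my ($map, $pattern) = @_;
    return map { split ' ', $map->{$_} } grep { /$pattern/i } sort keys %$map;
}

# Only runs when called as: script.pl sequences.gb sequences.idx
if (@ARGV == 2) {
    require Bio::SeqIO;
    require Bio::Index::GenBank;
    my ($gb_file, $index_file) = @ARGV;

    # Pre-parse the flat file once to build the species -> ids map.
    my %species2id;
    my $in = Bio::SeqIO->new(-file => $gb_file, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        my $sp = $seq->species or next;      # skip records without a species
        add_to_map(\%species2id, $sp->binomial, $seq->accession_number);
    }

    # Use the existing index to pull out the matching records.
    my $inx = Bio::Index::GenBank->new(-filename => $index_file);
    for my $id (ids_for(\%species2id, 'Xanthomonas')) {
        my $seq = $inx->fetch($id) or next;
        print $seq->display_id, "\n";
    }
}
```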

chris


From cjfields at illinois.edu  Mon Nov  9 23:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 22:58:32 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <435BA1A8-2CCB-4D7A-8909-84F8135C439F@illinois.edu>

On Nov 9, 2009, at 3:15 PM, Robert Bradbury wrote:

> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com>  
> wrote:
>>
>> I'm a new BioPerl user.  I'm working on a project to determine the
>> status of all putative SNPs (non-synonymous vs. synonymous) and to
>> predict the translational effect of non-synonymous mutations as benign
>> or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
>> translate it to a protein sequence for the SNPs that are in a gene's
>> coding region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've  
> recently
> been working with the dbSNP results from NCBI but they display the  
> results
> in a graphical format rather than data that one can play with and ask
> questions of like "What is the most disease causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects  
> in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat  
> genomes where
> there is emotional value or the cow, horse, pig, etc. genomes where  
> there is
> economic value?
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may  
> have
> negative/positive impact" (which means you need a function that  
> qualifies
> mutations by degree) and allow for impact to be subjectively  
> determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene,  
> @critical_aa,
> @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and  
> position and
> the "type of interest" (e.g. metal binding amino acids, structural  
> changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of  
> "critical"
> changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like  
> the
> "adaptability" of a species -- those with fewest critical mutation  
> points
> may have better adaptability to mutation increasing circumstances.
>
> Please pardon any errors in perl syntax/usage its been a while since  
> I've
> written perl and I'd really rather be coding in C.
>
> Robert

I will say that most of the information from the SNP database is  
available in various formats (see following link under 'Retrieval  
Types'):

http://www.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html

You can access this information, as well as the full XML, using  
something like the following script.

chris

------------------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $term = shift;
my $eutil  = Bio::DB::EUtilities->new(-eutil    => 'esearch',
                                       -db       => 'snp',
                                       -term     => $term,
                                       -usehistory => 'y',
                                       -retmax   => 100);

my $hist = $eutil->next_History || die "No history returned";

# for SNP XML, change retmode to 'xml'
$eutil->set_parameters(-eutil   => 'efetch',
                        -history => $hist,
                        -retmode => 'text',
                        -rettype => 'flt');

# dumps to STDOUT
say $eutil->get_Response->content;
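
For the original question of deciding whether a coding SNP is synonymous,
a minimal sketch is given below. It assumes the coding sequence is already
in hand, in frame and on the coding strand, and that the SNP position is
given relative to it. The names classify_snp and bioperl_translate are
made up for illustration; the translation itself goes through Bio::Seq's
translate method using the standard codon table.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Default translator: BioPerl's standard-table translation of a DNA string.
sub bioperl_translate {
    require Bio::Seq;
    return Bio::Seq->new(-seq => shift, -alphabet => 'dna')->translate->seq;
}

# Classify a single-base change in a coding sequence.
#   $cds       in-frame coding DNA, starting at the first base of a codon
#   $pos       1-based position of the SNP within $cds
#   $allele    the alternative base
#   $translate optional code ref mapping a codon to an amino acid
sub classify_snp {
    my ($cds, $pos, $allele, $translate) = @_;
    $translate ||= \&bioperl_translate;
    die "position $pos is outside the coding sequence\n"
        if $pos < 1 || $pos > length($cds);

    my $offset    = 3 * int(($pos - 1) / 3);   # 0-based start of the codon
    my $ref_codon = substr($cds, $offset, 3);
    my $alt_codon = $ref_codon;
    substr($alt_codon, ($pos - 1) % 3, 1) = $allele;

    my ($ref_aa, $alt_aa) = map { $translate->($_) } $ref_codon, $alt_codon;
    my $type = $ref_aa eq $alt_aa ? 'synonymous' : 'non-synonymous';
    return ($type, $ref_aa, $alt_aa);
}

# Only runs when called as: script.pl CDS POSITION ALLELE
if (@ARGV == 3) {
    my ($type, $ref_aa, $alt_aa) = classify_snp(@ARGV);
    print "$type ($ref_aa -> $alt_aa)\n";
}
```

Predicting whether a non-synonymous change is benign or damaging is a
separate problem that this sketch does not attempt.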


From jluis.lavin at unavarra.es  Tue Nov 10 05:43:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Tue, 10 Nov 2009 11:43:40 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
 itscorrect use]
In-Reply-To: <A1ACC4B552514872B77208248B31977C@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
Message-ID: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>

Hello again,

I tried what Mark told me, modifying the line of code he pointed out, but there's
still a problem that I believe must be due to the sequence names.
The sequence headers in my FASTA file have this format:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1

The part to the right of the pipe changes depending on the program used to
create the gene model, for example:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1
>PleosPC9_1_123413|genemark.2731_g
>PleosPC9_1_52065|e_gw1.3.64.1

So I guess I need to parse my ids somehow so that the program detects only
the first part of the FASTA header (the "protein name") and doesn't get
confused by the other side of the pipe...

This is the corrected code I wrote following Mark's indications, but I
still don't have any idea about the parsing issue...

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
#LCS.txt is my sequences list
@ARGV = <LCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

Thanks in advance

P.S. Might there be a faster way of extracting those sequences using plain Perl?
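
One possible way to handle the header parsing described above is an
id_parser callback that keeps only the text before the first pipe, set
before the index is built. The sketch below assumes that LCS.txt holds one
protein name per line. It reads that file line by line, since in Perl
<LCS.txt> is treated as a file glob and yields the file name rather than
its contents. The callback name get_id and the command-line handling are
only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Index key: the part of the FASTA header between '>' and the first '|'.
sub get_id {
    my $header = shift;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : ();
}

# Only runs when called as: script.pl PC9.fasta LCS.txt
if (@ARGV == 2) {
    require Bio::Index::Fasta;
    require Bio::SeqIO;
    my ($fasta, $list) = @ARGV;

    # Build the index with the custom key parser.
    my $inx = Bio::Index::Fasta->new(-filename   => "$fasta.idx",
                                     -write_flag => 1);
    $inx->id_parser(\&get_id);
    $inx->make_index(File::Spec->rel2abs($fasta));

    # Open the output once, then fetch each listed id.
    my $out = Bio::SeqIO->new(-file   => '>index_extracted.fasta',
                              -format => 'fasta');
    open my $ids, '<', $list or die "Cannot open $list: $!";
    while (my $id = <$ids>) {
        $id =~ s/^\s+|\s+$//g;      # strip newline and stray whitespace
        $id =~ s/\|.*//;            # tolerate full headers in the list
        next unless length $id;
        my $seq = $inx->fetch($id);
        if ($seq) { $out->write_seq($seq) }
        else      { warn "No sequence found for '$id'\n" }
    }
    close $ids;
}
```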


On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
> Yes, these are files created by the SDBM, Perl's internal db manager. You
> should
> be able to
> open the index by simply
> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
> and the dbm will know what to do--
> cheers MAJ
> ----- Original Message -----
> From: <jluis.lavin at unavarra.es>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
> Sent: Thursday, November 05, 2009 11:21 AM
> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
> correct
> use]
>
>
>> Thank you very much Mark, that's a good point :$
>> I guess your correction refers to the second script, doesn't it?
>>
>> If so, there is still a problem with the first script: it doesn't
>> create the PC9.fasta.idx file; instead it creates two files named:
>> -PC9.fasta.idx.pag
>> -PC9.fasta.idx.dir
>>
>> which seem to be clearly related to some kind of indexing
>> process... but,
>> unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
>> find it anywhere...
>> Forgive me if I'm talking nonsense...
>>
>> Thank you very much again for your help ;)
>>
>>
>> On Thu, November 5, 2009, 17:02, Mark A. Jensen wrote:
>>> Hey José,
>>> The first thing that jumps out is the index file name. Looks
>>> like you create it as
>>> PC9.fasta.idx
>>> But you read it as
>>> PC9.fasta
>>> Not an unusual mistake. Do
>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>> and see if it works.
>>> MAJ
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 10:46 AM
>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
>>> correct
>>> use]
>>>
>>>
>>>
>>>
>>> ---------------------------- Original message
>>> ----------------------------
>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct
>>> use
>>> From:    jluis.lavin at unavarra.es
>>> Date:    Thu, November 5, 2009, 16:46
>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>> --------------------------------------------------------------------------
>>>
>>> Hi Mark,
>>>
>>> I've actually got two scripts: the first one creates the index and
>>> the second one retrieves the sequence list from the indexed file.
>>>
>>> 1)Here is the Index creation script:
>>>
>>> #!/c:/Perl -w
>>> use strict;
>>> use Bio::Index::Fasta;
>>> use strict;
>>>
>>> print "Enter file for indexing: \n";
>>> my $Index_File_Name = <STDIN>;
>>> my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
>>>     -write_flag => 1);
>>> $inx->make_index(my $File_Name);
>>>
>>> 2)And here is the sequence retrieval script:
>>>
>>> #!/c:/Perl -w
>>> use Bio::Index::Fasta;
>>> use strict;
>>> #PC9.fasta is my genomic file
>>> my $Index_File_Name ="PC9.fasta";
>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>> #LCS.txt is my sequences list
>>> @ARGV = <lCS.txt>;
>>> foreach  my $id (@ARGV) {
>>> if ($id eq ''){
>>> die ("empty list")
>>> }
>>> else {
>>> my $seqobj = $inx->fetch($id);
>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>> -format => 'fasta');
>>> $out->write_seq($seqobj);
>>> }
>>> }
>>> exit;
>>> }
>>>
>>> I hope this code is not a total scum...
>>>
>>> Thanks in advance ;)
>>>
>>>
>>>
>>> On Thu, November 5, 2009, 16:39, Mark A. Jensen wrote:
>>>> José -- It looks like this is a good solution to your problem. Please
>>>> send
>>>> your
>>>> script so we can look at it-
>>>> cheers Mark
>>>> ----- Original Message -----
>>>> From: <jluis.lavin at unavarra.es>
>>>> To: <bioperl-l at lists.open-bio.org>
>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>> Subject: [Bioperl-l] A question about iBio::Index: and its correct use
>>>>
>>>>
>>>>
>>>> Hello to all,
>>>>
>>>> I'm trying to write a script to retrieve a list of sequences from a
>>>> local FASTA file (for example, a FASTA archive where all the protein
>>>> models of an organism are stored). I would use this file as a kind of
>>>> "local database" (sorry if I am mixing up a few concepts...)
>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>> Bio::Index::Fasta tool.
>>>> If I didn't misunderstand what I read (which could easily happen given
>>>> my low level of programming), this indexing tool should do the job.
>>>> I wrote a couple of scripts based on the documentation I read about
>>>> this tool, but I don't seem to be able to create the index file to be
>>>> used later (to retrieve the sequences from).
>>>> -First of all, I want to ask the people in this forum if
>>>> Bio::Index::Fasta is the right one to choose for this task.
>>>> -Then I'd ask you to take a look at my scripts, because I can't seem
>>>> to catch the bug...
>>>>
>>>> Best wishes to you all and thanks in advance ;)
>>>>
>>>> --
>>>> José Luis Lavín Trueba, PhD
>>>>
>>>> Dpto. de Producción Agraria
>>>> Grupo de Genética y Microbiología
>>>> Universidad Pública de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Dr. José Luis Lavín Trueba
>>>
>>> Dpto. de Producción Agraria
>>> Grupo de Genética y Microbiología
>>> Universidad Pública de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Dr. José Luis Lavín Trueba
>>
>> Dpto. de Producción Agraria
>> Grupo de Genética y Microbiología
>> Universidad Pública de Navarra
>> 31006 Pamplona
>> Navarra
>> SPAIN
>>
>>
>>
>>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From saikari78 at gmail.com  Tue Nov 10 06:41:11 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Tue, 10 Nov 2009 11:41:11 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
Message-ID: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>

Thanks again very much for your help and the script.
I've been trying it; however, I fail to find any protein record linked to a
record in the pcsubstance database.
Do you think that is because no links have been defined between the two
databases, or am I just unlucky and no link exists for the
particular records I'm testing?
Thanks again

saikari



From heyne at informatik.uni-freiburg.de  Tue Nov 10 07:55:06 2009
From: heyne at informatik.uni-freiburg.de (Steffen Heyne)
Date: Tue, 10 Nov 2009 13:55:06 +0100
Subject: [Bioperl-l] problem with alignments and sequence locations
Message-ID: <4AF962AA.7060908@informatik.uni-freiburg.de>

Hi,

I'm using Bioperl for my research and it is very useful! Thank you!

Currently I have a problem with the location tags of sequences. I read in 
seed alignments from Rfam (in Stockholm format, but I think it is similar 
for other formats).

If the location is like:

AB194432.1/908-846

the start/end values are changed to

$seq->start = 846
$seq->end = 908

and therefore the new location (e.g.$seq->get_nse) is:

AB194432.1/846-908

The $seq->strand tag is correctly set to -1 in this case, but if the 
alignment is written out again (clustal, stockholm,...) this strand info 
is lost and the sequences have this "wrong" location. But this 
information is important in respect to the sequence accession number.

Is there a way to set the location back to the original one, or is this 
behavior desired? Any manual setting with $seq->start($val) failed due 
to automatic checking.

I'm using bioperl 1.6.1

Thanks!

steffen


-- 
---
Steffen Heyne, Dipl.-Bioinf.
Lehrstuhl für Bioinformatik
Institut für Informatik
Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany

Tel: (+49) 761 203 8239
Fax: (+49) 761 203 7462
Mail: heyne at informatik.uni-freiburg.de


From cjfields at illinois.edu  Tue Nov 10 08:58:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 07:58:52 -0600
Subject: [Bioperl-l] problem with alignments and sequence locations
In-Reply-To: <4AF962AA.7060908@informatik.uni-freiburg.de>
References: <4AF962AA.7060908@informatik.uni-freiburg.de>
Message-ID: <DF72C01A-410F-4391-B33E-4884D7CB859E@illinois.edu>

On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with the location tags of sequences. I read
> in seed alignments from Rfam (in Stockholm format, but I think it is
> similar for other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g.$seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the  
> alignment is written out again (clustal, stockholm,...) this strand  
> info is lost and the sequences have this "wrong" location. But this  
> information is important in respect to the sequence accession number.
>
> Is there a way to set the location back to the original one, or is
> this behavior desired? Any manual setting with $seq->start($val)
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We recently discussed amending the NSE format
due to this (the subject came up over the last few months or so); it's
fallen through the cracks.  Fortunately it is very easy to fix (the
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in?  It will change  
output for only those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
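
The round trip described above can be sketched in plain Perl. This is only an illustration of the idea, not the LocatableSeq code; the helper names parse_nse and format_nse are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split "name/start-end" into its parts, recording
# strand -1 when the coordinates are written high-to-low.
sub parse_nse {
    my ($nse) = @_;
    my ($name, $from, $to) = $nse =~ m{^(.+)/(\d+)-(\d+)$}
        or die "cannot parse NSE string '$nse'\n";
    my $strand = $from > $to ? -1 : 1;
    my ($start, $end) = $from <= $to ? ($from, $to) : ($to, $from);
    return { name => $name, start => $start, end => $end, strand => $strand };
}

# Hypothetical helper: write the coordinates back out, swapping them for
# strand -1 so that the original orientation survives a round trip.
sub format_nse {
    my ($loc) = @_;
    my ($from, $to) = $loc->{strand} < 0
        ? ($loc->{end}, $loc->{start})
        : ($loc->{start}, $loc->{end});
    return "$loc->{name}/$from-$to";
}

for my $nse ('AB194432.1/908-846', 'AB194432.1/846-908') {
    my $loc = parse_nse($nse);
    printf "%s => start=%d end=%d strand=%d => %s\n",
        $nse, @{$loc}{qw(start end strand)}, format_nse($loc);
}
```

Keeping start <= end internally and consulting the strand only when printing means the stored coordinates stay normalized while the written name still matches the accession-relative orientation.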

chris


From cjfields at illinois.edu  Tue Nov 10 09:05:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 08:05:51 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
	<a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
Message-ID: <738F6320-B87A-4541-B9FA-20273ABA96B9@illinois.edu>

On Nov 10, 2009, at 5:41 AM, saikari keitele wrote:

> Thanks again very much for your help and the script.
> I've been trying it; however, I fail to find any protein record
> linked to a record in the pcsubstance database.
> Do you think that it is because no links have been defined between
> the 2 databases, or that I am just unlucky and that no link exists
> for the particular records I'm testing?
> Thanks again
>
> saikari

It's probably that no links have been defined.  I have found similar  
problems in the past with pubchem, in that not all substances have  
proteins associated with them.  Most proteins linked to are those with  
a deposited structure.

There are a few other databases to check out; KEGG and the BioCyc dbs
(like EcoCyc) come to mind.  I don't think we have a generic remote
query engine set up for any of those, unfortunately (unless there is
one I'm unaware of), but I know BioCyc comes with its own set of
tools (including perl- and java-based query tools) and can be set up
locally, which is likely much faster and more in line with what you
need.

chris

...


From jason at bioperl.org  Tue Nov 10 13:47:02 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:47:02 -0800
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
	itscorrect use]
In-Reply-To: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
	<3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
Message-ID: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>

Page 44 has the custom ID info or look at documentation for  
Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if  
you read the perldoc for the module.

  http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf

Don't re-open SeqIO each time; just do it once at the beginning,
outside of the loop, and then call write_seq within the loop.

One nuance of OO versus procedural programming is that some outside
state information can persist in an object, but conceptually you want
to open a filehandle once and just keep writing to it.
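
Both points can be sketched in plain Perl without BioPerl installed. The function name leading_id is made up for the example, and an in-memory filehandle stands in for the output file, so treat this as an illustration rather than a finished script.

```perl
use strict;
use warnings;

# Hypothetical ID parser: keep only the text between '>' and the first '|'
# (or the first whitespace when the header has no pipe).
sub leading_id {
    my ($header) = @_;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : undef;
}

my @headers = (
    '>PleosPC9_1_103820|fgenesh1_pg.3_#_1',
    '>PleosPC9_1_123413|genemark.2731_g',
    '>PleosPC9_1_52065|e_gw1.3.64.1',
);

# Open the output once, before the loop, and reuse the handle for every record.
open my $out, '>', \my $buffer or die "cannot open in-memory handle: $!";
for my $header (@headers) {
    my $id = leading_id($header);
    print {$out} "$id\n";
}
close $out;
print $buffer;
```

A code reference such as \&leading_id is the kind of thing id_parser() in Bio::Index::Fasta and the -makeid argument of Bio::DB::Fasta expect, going by the perldoc for those modules; it is worth checking against the installed version before relying on it.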

-jason
On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:

> Hello again,
>
> I tried what Mark told me, modifying the line of code he pointed out, but
> there's still a problem that I believe must be due to the sequence names.
> The sequence headers in my FASTA file have this format:
>
>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>
> The part to the right of the pipe changes depending on the program used to
> create the gene model, for example:
>
>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>> PleosPC9_1_123413|genemark.2731_g
>> PleosPC9_1_52065|e_gw1.3.64.1
>
> So I guess I need to parse my IDs somehow so that the program detects only
> the first part of the FASTA header (the "protein name") and does not get
> confused by the other side of the pipe...
>
> This is the corrected code I wrote following Mark's indications, but I
> still don't have any idea about the parsing issue...
>
> #!/c:/Perl -w
> use Bio::Index::Fasta;
> use strict;
> #PC9.fasta is my genomic file
> my $Index_File_Name ="PC9.fasta";
> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
> #LCS.txt is my sequences list
> @ARGV = <LCS.txt>;
> foreach  my $id (@ARGV) {
> if ($id eq ''){
> die ("empty list")
> }
> else {
> my $seqobj = $inx->fetch($id);
> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
> -format => 'fasta');
> $out->write_seq($seqobj);
> }
> }
> exit;
> }
>
> Thanks in advance
>
> P.S. Might there be a faster way of extracting those sequences using
> plain Perl?
>
>
>
>
> On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
>> Yes, these are files created by SDBM, Perl's internal db manager. You
>> should be able to open the index by simply
>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> and the dbm will know what to do--
>> cheers MAJ
>> ----- Original Message -----
>> From: <jluis.lavin at unavarra.es>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>> Sent: Thursday, November 05, 2009 11:21 AM
>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index:  
>> and its
>> correct
>> use]
>>
>>
>>> Thank you very much Mark, that's a good point :$
>>> I guess your correction refers to the second script, doesn't it?
>>>
>>> If so, there is still a problem with the first script: it doesn't
>>> create the PC9.fasta.idx file; instead it creates two files named:
>>> -PC9.fasta.idx.pag
>>> -PC9.fasta.idx.dir
>>>
>>> which seem to be clearly related to some kind of indexing process...but,
>>> unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
>>> find it anywhere...
>>> Forgive me if I'm talking nonsense...
>>>
>>> Thank you very much again for your help ;)
>>>
>>>
>>> On Thu, November 5, 2009, 17:02, Mark A. Jensen wrote:
>>>> Hey José,
>>>> The first thing that jumps out is the index file name. Looks
>>>> like you create it as
>>>> PC9.fasta.idx
>>>> But you read it as
>>>> PC9.fasta
>>>> Not an unusual mistake. Do
>>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>>> and see if it works.
>>>> MAJ
>>>> ----- Original Message -----
>>>> From: <jluis.lavin at unavarra.es>
>>>> To: <bioperl-l at lists.open-bio.org>
>>>> Sent: Thursday, November 05, 2009 10:46 AM
>>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and  
>>>> its
>>>> correct
>>>> use]
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------- Original message ----------------------------
>>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
>>>> From:    jluis.lavin at unavarra.es
>>>> Date:    Thu, November 5, 2009, 16:46
>>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>>> --------------------------------------------------------------------------
>>>>
>>>> Hi Mark,
>>>>
>>>> I've actually got two scripts: the first one creates the index and
>>>> the second one retrieves the sequence list from the indexed file.
>>>>
>>>> 1)Here is the Index creation script:
>>>>
>>>> #!/c:/Perl -w
>>>> use strict;
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>>
>>>> print "Enter file for indexing: \n";
>>>> my $Index_File_Name = <STDIN>;
>>>> my $inx = Bio::Index::Fasta->new(-filename =>  
>>>> $Index_File_Name.".idx",
>>>>    -write_flag => 1);
>>>> $inx->make_index(my $File_Name);
>>>>
>>>> 2)And here is the sequence retrieval script:
>>>>
>>>> #!/c:/Perl -w
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>> #PC9.fasta is my genomic file
>>>> my $Index_File_Name ="PC9.fasta";
>>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>>> #LCS.txt is my sequences list
>>>> @ARGV = <lCS.txt>;
>>>> foreach  my $id (@ARGV) {
>>>> if ($id eq ''){
>>>> die ("empty list")
>>>> }
>>>> else {
>>>> my $seqobj = $inx->fetch($id);
>>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>>> -format => 'fasta');
>>>> $out->write_seq($seqobj);
>>>> }
>>>> }
>>>> exit;
>>>> }
>>>>
>>>> I hope this code is not a total scum...
>>>>
>>>> Thanks in advance ;)
>>>>
>>>>
>>>>
>>>> On Thu, November 5, 2009, 16:39, Mark A. Jensen wrote:
>>>>> José -- It looks like this is a good solution to your problem. Please
>>>>> send your script so we can look at it-
>>>>> cheers Mark
>>>>> ----- Original Message -----
>>>>> From: <jluis.lavin at unavarra.es>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>>> Subject: [Bioperl-l] A question about iBio::Index: and its  
>>>>> correct use
>>>>>
>>>>>
>>>>>
>>>>> Hello to all,
>>>>>
>>>>> I'm trying to write a script to retrieve a list of sequences from a
>>>>> local FASTA file (for example a FASTA archive where all the protein
>>>>> models of an organism are stored). I would use this file as some kind
>>>>> of "local database" (sorry if I confuse a few concepts...)
>>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>>> Bio::Index::Fasta tool.
>>>>> If I didn't misunderstand what I read (which is easy given my low
>>>>> level of programming), this indexing tool should do the job.
>>>>> I wrote a couple of scripts based on the documentation I read about
>>>>> this tool, but I don't seem to be able to create the index file to be
>>>>> used later (to retrieve the sequences from).
>>>>> -First of all, I want to ask the people in this forum whether
>>>>> Bio::Index::Fasta is the right choice for this task.
>>>>> -Then I'll beg you to take a look at my scripts, because I can't seem
>>>>> to catch the bug...
>>>>>
>>>>> Best wishes to you all and thanks in advance ;)
>>>>>
>>>>> --
>>>>> José Luis Lavín Trueba, PhD
>>>>>
>>>>> Dpto. de Producción Agraria
>>>>> Grupo de Genética y Microbiología
>>>>> Universidad Pública de Navarra
>>>>> 31006 Pamplona
>>>>> Navarra
>>>>> SPAIN
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Dr. José Luis Lavín Trueba
>>>>
>>>> Dpto. de Producción Agraria
>>>> Grupo de Genética y Microbiología
>>>> Universidad Pública de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Dr. José Luis Lavín Trueba
>>>
>>> Dpto. de Producción Agraria
>>> Grupo de Genética y Microbiología
>>> Universidad Pública de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
> -- 
> Dr. José Luis Lavín Trueba
>
> Dpto. de Producción Agraria
> Grupo de Genética y Microbiología
> Universidad Pública de Navarra
> 31006 Pamplona
> Navarra
> SPAIN
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Tue Nov 10 13:50:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:50:00 -0800
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>

You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html

On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:

> On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:
>
>> Many thanks to Ewan Birney et al. for Bio::Index::*
>>
>> I can throw away my awful grep based index-by-accession stuff.   :)
>>
>> Any chance someone has also written an organism based index  
>> mechanism? Something like...
>>
>> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>>  print $seq->display_id . "\n";
>> }
>>
>> Thanks,
>>
>> j
>
> It should work via id_parser(); from Bio::Index::GenBank:
>
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
>
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }
>
> Change the code ref to deal with the line you want and parse the name
> out.  Caveat: this may not be absolutely perfect (it only passes in
> a line at a time, and some species lines will wrap).  I'm also not sure
> how this would work in cases where multiple sequences from the same
> species are present.
>
> The other option is to preparse everything and tie a hash to store a  
> species->UID map, then use that along with your Bio::Index index to  
> grab what you need.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
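
The species-to-ID map suggested in the quoted reply could be sketched along these lines. This is a plain-Perl illustration under stated assumptions: the function name organism_map is made up, the three records are fabricated placeholders (not real GenBank entries), and ORGANISM lines that wrap are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: scan GenBank flat-file text from a filehandle and
# return a hash ref mapping organism name => array ref of accessions.
sub organism_map {
    my ($fh) = @_;
    my (%map, $accession);
    while (my $line = <$fh>) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $accession = $1;
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.+?)\s*$/ && defined $accession) {
            push @{ $map{$1} }, $accession;
        }
        elsif ($line =~ m{^//}) {
            undef $accession;    # end of record
        }
    }
    return \%map;
}

# Made-up records, trimmed to the lines the parser looks at.
my $genbank = <<'END';
LOCUS       FAKE0001
ACCESSION   FAKE0001
SOURCE      Xanthomonas campestris
  ORGANISM  Xanthomonas campestris
//
LOCUS       FAKE0002
ACCESSION   FAKE0002
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
//
LOCUS       FAKE0003
ACCESSION   FAKE0003
SOURCE      Xanthomonas oryzae
  ORGANISM  Xanthomonas oryzae
//
END

open my $fh, '<', \$genbank or die "cannot open in-memory handle: $!";
my $map = organism_map($fh);
close $fh;

# List the accessions for every organism whose name matches the pattern.
for my $organism (sort grep { /Xanthomonas/ } keys %$map) {
    print "$organism: @{ $map->{$organism} }\n";
}
```

The accessions collected this way could then be passed one at a time to the fetch method of an existing Bio::Index::GenBank index; to persist the map between runs, the hash could be tied to a DBM file instead of being rebuilt each time.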

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jluis.lavin at unavarra.es  Wed Nov 11 10:01:18 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 11 Nov 2009 16:01:18 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
 anditscorrect use]
In-Reply-To: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.sq
	uirrel@webmail.unavarra.es><A1ACC4B552514872B77208248B31977C@NewLife><3471.
	130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
	<E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
Message-ID: <2979.130.206.164.153.1257951678.squirrel@webmail.unavarra.es>

Hi once again,
I have modified the script following the instructions Jason gave me (at
least what I understood of them; remember this is my first time trying to
learn a programming language... and I'm not the smartest guy in the class,
hehe), but it seems I didn't fix the problem...
Here's the new code I wrote:

#!/c:/Perl -w
	use strict;
        use Bio::Index::Fasta;
	use Bio::DB::Fasta;
	use Bio::SeqIO;
	use IO::File;

# assign files to scalars
my $index_file = 'PC91.fasta';
my $id_list = 'LCS2.txt';

# open index file
my $db = Bio::DB::Fasta->new($index_file) or die;

# open the id list
my $in = IO::File->new($id_list) or die;

# open FASTA to write
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');

# retrieve ids loop
foreach my $id ($in) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = my $inx->fetch($id);
$out->write_seq($seqobj);
}
}

# parse fasta headers
sub my_makeid {
my $id = shift;
if ( $id =~ /^>[^:]+:(\S+)/ ) {
return $1;
} elsif ($id =~ /^>(\S+)/) {
return $1;
} else {
warn("cannot parse ID for $id\n");
}
}
exit;

Would anyone please take a look at it?

Thanks in advance ;)
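
Two things in the script above look likely to matter: "foreach my $id ($in)" loops once over the IO::File object itself rather than over the lines of the list, and "my $inx->fetch($id)" declares a fresh, undefined variable instead of using $db. A minimal plain-Perl sketch of the reading loop follows; read_ids is a made-up name, and a small hash with invented sequences stands in for the index so the example runs without BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: read one ID per line, strip line endings and
# surrounding whitespace, and skip blank lines.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # also removes "\r\n" from Windows files
        next if $line eq '';
        push @ids, $line;
    }
    return @ids;
}

# Stand-in for the index lookup (made-up sequences); a real script would
# ask the index or database object for each ID here instead.
my %fake_index = (
    PleosPC9_1_103820 => 'MSEQONE',
    PleosPC9_1_52065  => 'MSEQTWO',
);

# In-memory stand-in for the ID list file.
my $list = "PleosPC9_1_103820\r\n\nPleosPC9_1_52065\nPleosPC9_1_999\n";
open my $in, '<', \$list or die "cannot open in-memory handle: $!";
my @ids = read_ids($in);
close $in;
die "empty list\n" unless @ids;

for my $id (@ids) {
    my $seq = $fake_index{$id};
    if (defined $seq) {
        print ">$id\n$seq\n";
    }
    else {
        warn "no entry for '$id'\n";    # report and carry on
    }
}
```

With Bio::DB::Fasta, the lookup inside the loop would presumably be $db->get_Seq_by_id($id), followed by $out->write_seq on the result when it is defined; the my_makeid routine only takes effect if it is handed to the constructor (the -makeid argument, per the module's perldoc).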




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Wed Nov 11 18:48:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 11 Nov 2009 18:48:33 -0500
Subject: [Bioperl-l] Maq assembly wrapper ready for beta testing
Message-ID: <4057E5A862B845EA8BB153888075590C@NewLife>

Hi All-

New modules are available in the core and in bioperl-run for
working with Heng Li's short read assembler "maq"
(http://maq.sourceforge.net/maq-man.shtml). Bio::Tools::Run::Maq
allows a quick assembly call with a canned maq pipeline, and also
allows individual maq commands to be called separately.
It uses Bio::Assembly::IO::maq  (a read-only module) to deliver
a Bio::Assembly::Scaffold from maq output. 

If you're interested, see
http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_maq
and update your core and bioperl-run. The code inherits from Florent's
excellent new Bio::Tools::Run::AssemblerBase -- kudos to him!!

Tests are in bioperl-run/trunk/t/Maq.t; see them for myriad examples.
Please send me any bugs you find.
MAJ


From clarsen at vecna.com  Thu Nov 12 12:22:26 2009
From: clarsen at vecna.com (Chris Larsen)
Date: Thu, 12 Nov 2009 12:22:26 -0500
Subject: [Bioperl-l] Polyproteins, ribo slippage,
	and mat_peptide in  viruses?
In-Reply-To: <320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
References: <B0218AEF-3CEB-4E06-B8DF-7B302D024797@vecna.com>
	<320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
Message-ID: <7BBAE077-4D76-46C2-BF66-363F5A017278@vecna.com>

All,

This is a short follow-up to the earlier thread on computing mature
peptide sequences for viruses. The topic has gone underwater for the
time being while we solve some problems with source data. The
biopython effort and the contributors on this list have given good
guidance, and we now have scripts that function (thanks mostly to
pcock). However, the source data on which everything relies is
suspect:

   mat_peptide	15118..16914	<===
		/product="nsp13"	
		/note="helicase"

I can tell you the virus community does not want to rely heavily on
those position numbers. Furthermore, we have found fewer complete
source genomes for viruses than for bacteria, and more virus-to-virus
variation in the data fields annotated in the GBK file (Gene, CDS,
ORF, Protein, Polyprotein, mat_peptide, db_xref). In fact, the
community will have to come together significantly on how these
molecules are defined in public repositories before a mature scripting
effort becomes reliable, public, and well received. Because of the
variation in viruses, it's not even clear at this point what a 'gene'
is. I will let you know how we proceed when more sequence data has
been fully analyzed and we can think about turning any Perl-based
solution into a new viral protein module.

Thanks,

Chris

-- 

Christopher Larsen, Ph.D.
Sr. Scientist / Grants Manager
Vecna Technologies
6404 Ivy Lane #500
Greenbelt, MD 20770
Phone: (240) 965-4525
Fax: (240) 547-6133
240-737-4525


From David.Messina at sbc.su.se  Thu Nov 12 14:20:54 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 Nov 2009 20:20:54 +0100
Subject: [Bioperl-l] highest PAML version supported?
Message-ID: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>

Hi everyone,

What is the latest version of PAML (specifically codeml) that I can use with
bioperl-live and bioperl-run?

I looked around and couldn't find where (or if) this is documented.


With PAML version 4.3a against the current trunk of both -live and -run I
see this:
------------- EXCEPTION Bio::Root::NotImplemented -------------
MSG: Unknown format of PAML output did not see seqtype
STACK Bio::Tools::Phylo::PAML::_parse_summary
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
STACK Bio::Tools::Phylo::PAML::next_result
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
STACK toplevel ../bin/cluster_kaks:251
---------------------------------------------------------------

...which I suspect (but haven't confirmed) is due to a change in the file
format.


Dave


From jason at bioperl.org  Thu Nov 12 14:29:22 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 12 Nov 2009 11:29:22 -0800
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
Message-ID: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>

Probably 3.15 or so.

It really needs a maintainer!!!


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From scott at scottcain.net  Fri Nov 13 09:48:43 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 13 Nov 2009 09:48:43 -0500
Subject: [Bioperl-l] January GMOD meeting announcement
Message-ID: <4536f7700911130648j40eb2d82g2594adaccf476d73@mail.gmail.com>

Hello,

I am pleased to announce that the January GMOD meeting will be taking
place on January 14 and 15 in San Diego at the Best Western Seven Seas
(the same location as last year).  Please see this page for
registration information:

  http://gmod.org/wiki/January_2010_GMOD_Meeting

When you go to that page, please take a moment to add suggestions for
the agenda.  There is no registration fee for this meeting; however,
space is limited, so please register early.

The proprietors of the Best Western have given us an excellent room
rate, and extended it to the previous week, so that people attending
the GMOD meeting and the Plant and Animal Genome meeting before it may
stay at the Best Western the entire time.

Please direct follow-up questions to the gmod-devel mailing list:
https://lists.sourceforge.net/lists/listinfo/gmod-devel

Thanks and I look forward to seeing you in San Diego!
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From j.inoue at ucl.ac.uk  Sat Nov 14 14:20:29 2009
From: j.inoue at ucl.ac.uk (Jun Inoue)
Date: Sat, 14 Nov 2009 19:20:29 +0000
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
Message-ID: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>

Dear All,

I have just started learning BioPerl for phylogenetics.
I am using perl v5.10.0 on Mac OS X 10.5.8.
I would like to ask for a hint on how to calculate the branch lengths
from root to tip for all species in a tree in Newick format.

Please see the following web page, where I explain
what I want to do and show my simple script (not yet complete).
http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html

Thank you for your help.

Best,
Jun Inoue
http://www.geocities.jp/ancientfishtree/index_eng.html


From maj at fortinbras.us  Sat Nov 14 16:47:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 14 Nov 2009 16:47:37 -0500
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
In-Reply-To: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
References: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
Message-ID: <3BC179984D5E49868C4F12D181D82B8D@NewLife>

Hi Jun,

Some hints: incorporate

@leaves = $tree->get_leaf_nodes;

and

use Bio::Tree::TreeFunctionsI;
$distance = $tree->distance( $node_a, $node_b );
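
To make those hints concrete, here is a minimal sketch that sums the
branch lengths from the root to each tip. It is not taken from Jun's
script. The Newick string, the helper name root_to_tip_lengths, and
reading the tree from an in-memory filehandle are assumptions made for
illustration. It uses the -nodes form of distance() and assumes every
branch on the path carries a length.

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Hypothetical helper: map each leaf id to the summed branch length
# between the root node and that leaf.
sub root_to_tip_lengths {
    my ($tree) = @_;
    my $root = $tree->get_root_node;
    my %length;
    for my $leaf ( $tree->get_leaf_nodes ) {
        $length{ $leaf->id } = $tree->distance( -nodes => [ $root, $leaf ] );
    }
    return \%length;
}

# Illustrative tree only (an assumption, not Jun's data)
my $newick = "((A:1,B:2):3,C:4);\n";
open my $fh, '<', \$newick or die "Cannot open in-memory tree: $!";
my $tree = Bio::TreeIO->new( -fh => $fh, -format => 'newick' )->next_tree;

# Print one "tip <TAB> root-to-tip length" line per leaf
my $lengths = root_to_tip_lengths($tree);
printf "%s\t%g\n", $_, $lengths->{$_} for sort keys %$lengths;
```

For a tree stored in a file, -file => 'mytree.nwk' (a placeholder name)
would replace the -fh argument.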

cheers, Mark



From jay at jays.net  Sun Nov 15 20:23:38 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 15 Nov 2009 19:23:38 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <F8052B51-85FB-44B9-9254-9AD1E964FA7B@jays.net>

On Nov 9, 2009, at 9:55 PM, Chris Fields wrote:
> It should work via id_parser(); from Bio::Index::GenBank:
> 
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
> 
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }

This worked great for me today (tackling a different problem than the original).  Thanks!!
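
For the by-organism question in the subject line, a parser along the
following lines might be a starting point. This is only a sketch: the
sub name get_organism and the sample lines are made up for
illustration, and it assumes (as the $line variable in the quoted
snippet suggests) that the parser is handed one line of the file at a
time. Unlike the quoted get_id, it returns an empty list when a line
does not match, so a stale $1 from an earlier match is never handed
back. As far as I know an index key points at a single record, so
records that share an organism name would overwrite one another.

```perl
use strict;
use warnings;

# Hypothetical id parser: return the organism name found on an ORGANISM
# line, or an empty list for any other line.
sub get_organism {
    my ($line) = @_;
    return $line =~ /^\s*ORGANISM\s+(.+?)\s*$/ ? $1 : ();
}

# Made-up lines standing in for part of a GenBank record
my @lines = (
    "LOCUS       AB000001   1234 bp    DNA     linear   VRL 01-JAN-2000\n",
    "  ORGANISM  Tobacco mosaic virus\n",
    "            Viruses; Riboviria.\n",
);
my @keys = map { get_organism($_) } @lines;
print "$_\n" for @keys;

# With BioPerl installed, the parser would be registered as in the
# quoted snippet:
#   $inx->id_parser(\&get_organism);
#   $inx->make_index($file_name);
```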

j


From veronica.xiaoyu at gmail.com  Fri Nov 13 15:35:48 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 13 Nov 2009 15:35:48 -0500
Subject: [Bioperl-l] Bio::Graphics::Panel question
Message-ID: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>

Hi,

I'm using Bio::Graphics to parse BLAST results and generate images. But
sometimes, in the middle of the output image, a hit's color is white,
even though I set it to another color. I have attached a picture as an
example. This doesn't occur all the time; usually it works well. I'm
wondering whether I did something wrong, or whether it depends on the
BLAST result?

Thank you,
Xiaoyu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BLAST_problem.jpg
Type: image/jpeg
Size: 51888 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091113/57550aa9/attachment-0002.jpg>

From ryan_bogard at hms.harvard.edu  Sun Nov 15 22:30:22 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Sun, 15 Nov 2009 19:30:22 -0800 (PST)
Subject: [Bioperl-l]  Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
Message-ID: <26366421.post@talk.nabble.com>


Thanks in advance; any advice would be greatly appreciated! I have installed
bioperl-588pm via fink, but I am having difficulty calling the modules in a
script. The following is added to .profile (bash):
PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB

If I change this to /sw/lib/perl5, then I get an @INC error because
Bio::Perl cannot be located.

The environment variables are as follows:

MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
INFOPATH=/sw/share/info:/sw/info:/usr/share/info


This is the perl script I'm attempting to run:
#!/sw/bin/perl5.8.8
use strict;
use Bio::Perl;
my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

Here is the error output:

dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

dyld: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

Trace/BPT trap

I have looked through many forum postings and tried the solutions
offered there, but none seem to work in my case. I'm not sure whether
the cause is that I have perl 5.10.0 installed while trying to use a
BioPerl package built for perl 5.8.8; however, others seem to have it
working just fine.

Thank you, Ryan 
-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From e.osimo at gmail.com  Mon Nov 16 02:04:40 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 16 Nov 2009 08:04:40 +0100
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>

Hello Ryan,
unfortunately, if you upgraded to 10.6 without formatting, I have to tell
you that you'll be in big trouble with perl and with everything you
installed from the command line, because in the upgrade process everything
in the system folders (perl and bioperl among them) is erased without
being uninstalled, so you'll find a lot of folders with the same names
but no contents.
I suggest you do as I did: format your machine and reinstall 10.6 from
scratch. Then you'll be able to install mysql (I had to install
mysql-5.4.3-beta-osx10.5, the only one that works on 10.6), and, working
with the perl 5.10 that is already installed, you'll install bioperl with
no effort.
Bye
Emanuele



From ryan_bogard at hms.harvard.edu  Mon Nov 16 08:43:19 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 05:43:19 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <26372079.post@talk.nabble.com>


Mac OS X 10.6 was a fresh install on a new MacBook Pro. I'm not sure
whether I will have the same issues, but it's worth a shot, as I have
little on my computer and reinstalling to start over wouldn't be too
difficult. What method did you use to install BioPerl? I used fink, and
I am not sure the available stable version is the one I need. I will
install from the command line this time around and let you know how it
turns out.

Thank you!




From maj at fortinbras.us  Mon Nov 16 08:48:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 16 Nov 2009 08:48:17 -0500
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26372079.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
Message-ID: <8D822081B13F49C2A37677D3A47F38B4@NewLife>

Ryan,
I'm not a mac person, but Koen has said (see 
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you 
want.
cheers
Mark


From cjfields at illinois.edu  Mon Nov 16 10:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:00:09 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <49681E01-E95D-4FC6-AE42-6E57ED43AAA2@illinois.edu>

On Nov 16, 2009, at 1:04 AM, Emanuele Osimo wrote:

> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.

> I suggest you, as I did, to format your pc and reinstall 10.6 from scratch.
> Then youl'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele

Just starting from scratch isn't always the best solution (though it is the cleanest).  In this case I don't think anything you mention applies, as there are conflicting symbols being reported.  My guess is conflicting perl builds, probably between your system 5.10.0 (snow leopard) and your fink-installed perl 5.8.8 (they are binary incompatible).  Also, remember that snow leopard is primarily 64-bit, so it might be best to try working out whether your fink is attempting to compile 64- vs 32-bit.  

In this case, I would just uninstall the fink-based perl and either use the system one (snow leopard = 5.10.0), or roll your own and install 5.10.1 locally or in /usr/local.  Do NOT replace the system one, as that will likely break your OS.
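
One way to look for this kind of mismatch is to compare the interpreter
that is actually running with the library directories it has been told
to search. The following is a minimal sketch, not a definitive
diagnosis: the helper name foreign_version_dirs and its path heuristic
(a 5.x.y component in the directory name) are assumptions for
illustration and are not part of fink or BioPerl.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: return the directories whose path names a perl
# version (e.g. .../5.8.8) different from the given running version.
sub foreign_version_dirs {
    my ( $running, @dirs ) = @_;
    my @foreign;
    for my $dir (@dirs) {
        my ($ver) = $dir =~ m{/(5\.\d+\.\d+)(?:/|$)} or next;
        push @foreign, $dir if $ver ne $running;
    }
    return @foreign;
}

# Report on the interpreter running this script
my $running = sprintf "%vd", $^V;
print "perl binary : $^X\n";
print "perl version: $running\n";
print "archname    : $Config{archname}\n";

# Flag PERL5LIB entries that appear to belong to another perl version
my $perl5lib = defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '';
my @suspect  = foreign_version_dirs( $running, split /:/, $perl5lib );
print "PERL5LIB entry for another perl version: $_\n" for @suspect;
```

Running it once with each perl on the machine (the system one and the
fink one) shows which directories each interpreter would pick up.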

In my experience, and not to bash on fink or MacPorts, I never had much luck with their perl installs.  Unless I plan on only using fink or MacPorts for my OS (not likely in my case), I find they tend to cause problems in the long term unless one uses them to install packages with very few dependencies, and even then you need to make sure fink is configured to compile the correct binary.  For instance, they're fairly good for gd, libxml2, etc., but beyond that one may get into issues with odd, version-specific dependencies with some packages, such as relying on perl 5.8.8 (but not perl 5.10.x), db42 (instead of db44), etc.  I've ended up in the past with 2-3 different perl versions, Berkeley DB versions, etc.

chris



From cjfields at illinois.edu  Mon Nov 16 10:01:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:01:01 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <8D822081B13F49C2A37677D3A47F38B4@NewLife>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
Message-ID: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>

Actually, why not just install via CPAN?  Any particular reason?

chris

On Nov 16, 2009, at 7:48 AM, Mark A. Jensen wrote:

> Ryan,
> I'm not a mac person, but Koen has said (see http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
> to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you want.
> cheers
> Mark
> ----- Original Message ----- From: "rbogard" <ryan_bogard at hms.harvard.edu>
> To: <Bioperl-l at lists.open-bio.org>
> Sent: Monday, November 16, 2009 8:43 AM
> Subject: Re: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
> 
> 
>> 
>> The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if I
>> will have the same issues, but it's worth a shot as I have little on my
>> computer and reinstalling to start over wouldn't be too difficult. What
>> method did you use to install bioperl? I used fink and I am not sure the
>> available stable version is the one I need. I will install from the command
>> line this time around, and let you know how it turns out.
>> 
>> Thank you!
>> 
>> 
>> 
>> Emanuele Osimo wrote:
>>> 
>>> Hello Ryan,
>>> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
>>> you that you'll be in big trouble with perl and with everything you
>>> installed from the commandline... Because in the upgrade process
>>> everything
>>> in the system folders, perl and bioperl being some of these things, is
>>> erased without being uninstalled, so you'll find a lot of folders with the
>>> same name but no contents.
>>> I suggest that you format your pc, as I did, and reinstall 10.6 from
>>> scratch.
>>> Then you'll be able to install mysql (I had to install
>>> mysql-5.4.3-beta-osx10.5, the only one that works on 10.6), and, working with
>>> perl
>>> 5.10 that is already installed, you'll install bioperl with no effort.
>>> Bye
>>> Emanuele
>>> 
>>> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu>
>>> wrote:
>>> 
>>>> 
>>>> In advance, any advice would be greatly appreciated! I have installed
>>>> bioperl-588pm via fink but I am having difficulties calling the modules
>>>> in
>>>> script. The following is added to .profile (bash):
>>>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>>>> 
>>>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>>>> Bio::PERL
>>>> cannot be located.
>>>> 
>>>> The environment variables are as follows:
>>>> 
>>>> 
>>>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>>>> 
>>>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>>>> 
>>>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>>>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>>>> 
>>>> 
>>>> This is the perl script I'm attempting to run:
>>>> #!/sw/bin/perl5.8.8
>>>> use strict;
>>>> use Bio::Perl;
>>>> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>>>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>>>> 
>>>> Here is the error output:
>>>> 
>>>> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>>>> Referenced from:
>>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>> Expected in: dynamic lookup
>>>> 
>>>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>>>> Referenced from:
>>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>> Expected in: dynamic lookup
>>>> 
>>>> Trace/BPT trap
>>>> 
>>>> I have looked through many forum postings and attempted the solutions
>>>> offered in those instances, but none seem to work in my case. I'm not
>>>> sure
>>>> if it's because I have perl 5.10.0 installed while attempting to call
>>>> bioperl 5.8.8; however, others seem to have it working just fine.
>>>> 
>>>> Thank you, Ryan


From Kevin.M.Brown at asu.edu  Mon Nov 16 10:49:13 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 08:49:13 -0700
Subject: [Bioperl-l] Bio::Graphics::Panel question
In-Reply-To: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
References: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B40663EDB9@EX02.asurite.ad.asu.edu>

To tell whether this is a bug, I (and probably the real devs) would need
to see that part of your code and the Blast file that shows the problem.
It could be a mismatch between your color-choice callback and the blast
data (e.g. your color picker is missing an option that the data comes in
with, and so returns a blank value).

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Xiaoyu Liang
Sent: Friday, November 13, 2009 1:36 PM
To: Bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Bio::Graphics::Panel question

Hi,

I'm using Bio::Graphics to parse the blast result and generate images.
But sometimes, in the middle of the output image, a hit's color is
white, even though I set it to other colors. I attached a picture here
as an example. This doesn't occur all the time; usually it works well.
I'm wondering whether I did something wrong, or whether it depends on
the blast result.

Thank you,
Xiaoyu


From ryan_bogard at hms.harvard.edu  Mon Nov 16 11:57:16 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 08:57:16 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
	<58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
Message-ID: <26375418.post@talk.nabble.com>


I read that posting by Koen and used the unstable tree after the first
attempt; however, the errors still persisted. I just finished a fresh
install and I will just follow Mr. Fields' advice and use CPAN.
Thank you all for the help!


Chris Fields-5 wrote:
> 
> Actually, why not just install via CPAN?  Any particular reason?
> 
> chris
> 
> On Nov 16, 2009, at 7:48 AM, Mark A. Jensen wrote:
> 
>> Ryan,
>> I'm not a mac person, but Koen has said (see
>> http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
>> to use the unstable tree to get BioPerl 1.6.1, which is likely to be what
>> you want.
>> cheers
>> Mark
>> ----- Original Message ----- From: "rbogard"
>> <ryan_bogard at hms.harvard.edu>
>> To: <Bioperl-l at lists.open-bio.org>
>> Sent: Monday, November 16, 2009 8:43 AM
>> Subject: Re: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl
>> 5.10.0)
>> 
>> 
>>> 
>>> The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if
>>> I
>>> will have the same issues, but it's worth a shot as I have little on my
>>> computer and reinstalling to start over wouldn't be too difficult. What
>>> method did you use to install bioperl? I used fink and I am not sure the
>>> available stable version is the one I need. I will install from the
>>> command
>>> line this time around, and let you know how it turns out.
>>> 
>>> Thank you!
>>> 
>>> 
>>> 
>>> Emanuele Osimo wrote:
>>>> 
>>>> Hello Ryan,
>>>> unfortunately, if you upgraded to 10.6 without formatting, I have to
>>>> tell
>>>> you that you'll be in big trouble with perl and with everything you
>>>> installed from the commandline... Because in the upgrade process
>>>> everything
>>>> in the system folders, perl and bioperl being some of these things, is
>>>> erased without being uninstalled, so you'll find a lot of folders with
>>>> the
>>>> same name but no contents.
>>>> I suggest that you format your pc, as I did, and reinstall 10.6 from
>>>> scratch.
>>>> Then you'll be able to install mysql (I had to install
>>>> mysql-5.4.3-beta-osx10.5, the only one that works on 10.6), and, working with
>>>> perl
>>>> 5.10 that is already installed, you'll install bioperl with no effort.
>>>> Bye
>>>> Emanuele
>>>> 
>>>> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu>
>>>> wrote:
>>>> 
>>>>> 
>>>>> In advance, any advice would be greatly appreciated! I have installed
>>>>> bioperl-588pm via fink but I am having difficulties calling the
>>>>> modules
>>>>> in
>>>>> script. The following is added to .profile (bash):
>>>>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>>>>> 
>>>>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>>>>> Bio::PERL
>>>>> cannot be located.
>>>>> 
>>>>> The environment variables are as follows:
>>>>> 
>>>>> 
>>>>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>>>>> 
>>>>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>>>>> 
>>>>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>>>>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>>>>> 
>>>>> 
>>>>> This is the perl script I'm attempting to run:
>>>>> #!/sw/bin/perl5.8.8
>>>>> use strict;
>>>>> use Bio::Perl;
>>>>> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>>>>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>>>>> 
>>>>> Here is the error output:
>>>>> 
>>>>> dyld: lazy symbol binding failed: Symbol not found:
>>>>> _Perl_Tstack_sp_ptr
>>>>> Referenced from:
>>>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>>> Expected in: dynamic lookup
>>>>> 
>>>>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>>>>> Referenced from:
>>>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>>> Expected in: dynamic lookup
>>>>> 
>>>>> Trace/BPT trap
>>>>> 
>>>>> I have looked through many forum postings and attempted the solutions
>>>>> offered in those instances, but none seem to work in my case. I'm not
>>>>> sure
>>>>> if it's because I have perl 5.10.0 installed while attempting to call
>>>>> bioperl 5.8.8; however, others seem to have it working just fine.
>>>>> 
>>>>> Thank you, Ryan

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26375418.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From krishna.aneesh at gmail.com  Mon Nov 16 02:00:15 2009
From: krishna.aneesh at gmail.com (Aneesh K)
Date: Mon, 16 Nov 2009 12:30:15 +0530
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
Message-ID: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>

Hi,

I just started using the Bioperl modules. They are really useful and interesting.
Now I am stuck on "Tree objects and phylogenetic trees".
I couldn't find any documentation or examples about reading/parsing phylip tree
files.

Please tell me where I can find some sample code for this.

Waiting for your reply.

Thanks
Aneesh.K
Mob. 09646181517


From David.Messina at sbc.su.se  Mon Nov 16 12:33:36 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 16 Nov 2009 18:33:36 +0100
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
	<D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
Message-ID: <B0AEE42A-A40A-4BB9-9A1C-98381CBB4CA9@sbc.su.se>

Hi everyone,

I just committed support for parsing codeml 4.3a (August 2009) to bioperl-live. I added new tests and all PAML-related tests pass, but please report any problems you have to the list.

Note that I haven't tested the other PAML 4.3a executables to see if there are format changes with those. If you get the chance to try any and it doesn't work, let me know and I'll try to add support for them.

(Note that these changes are only to the PAML parsing code; Bio::Tools::Run already appears to handle 4.3a just fine.)


Dave


From jason at bioperl.org  Mon Nov 16 12:34:57 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Nov 2009 09:34:57 -0800
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <D1D4E0B9-4741-4D45-84B6-6BB57B6E2B1E@bioperl.org>

Is this at all helpful for your question?
http://www.bioperl.org/wiki/HOWTO:Trees

The trees are in 'newick' (New Hampshire) format, though; I don't think
there is a separate phylip format for trees.
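
For reference, a minimal sketch of reading a newick tree with Bio::TreeIO. The tree string and the %length_of hash are made up for illustration, and the tree is read from an in-memory filehandle only so that the example is self-contained; to read a real file, a -file argument naming it would presumably replace -fh.

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Illustrative newick string (not from the original post).
my $newick = "((A:0.1,B:0.2):0.05,C:0.3);\n";
open my $fh, '<', \$newick or die "Cannot open in-memory handle: $!";

# Parse the stream as newick; each call to next_tree returns one tree.
my $treeio = Bio::TreeIO->new(-fh => $fh, -format => 'newick');

# Collect the branch length of every leaf (tip) node, keyed by its id.
my %length_of;
while (my $tree = $treeio->next_tree) {
    for my $leaf ($tree->get_leaf_nodes) {
        $length_of{ $leaf->id } = $leaf->branch_length;
    }
}

printf "%s\t%s\n", $_, $length_of{$_} for sort keys %length_of;
```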

-jason
On Nov 15, 2009, at 11:00 PM, Aneesh K wrote:

> Hi,
>
> I just started to use Bioperl modules. It's really useful and  
> interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing  
> phylip tree
> files.
>
> Please tell me from where I can get some sample codes for this.
>
> Waiting for your reply.
>
> Thanks
> Aneesh.K
> Mob. 09646181517
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From roy.chaudhuri at gmail.com  Mon Nov 16 12:31:49 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 16 Nov 2009 17:31:49 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <4B018C85.6020801@gmail.com>

Hi Aneesh,

See the Bioperl trees howto:
http://www.bioperl.org/wiki/HOWTO:Trees

Roy.

Aneesh K wrote:
> Hi,
> 
> I just started to use Bioperl modules. It's really useful and interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing phylip tree
> files.
> 
> Please tell me from where I can get some sample codes for this.
> 
> Waiting for your reply.
> 
> Thanks
> Aneesh.K
> Mob. 09646181517


-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From Kevin.M.Brown at asu.edu  Mon Nov 16 13:22:07 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 11:22:07 -0700
Subject: [Bioperl-l] FW:  Bio::Graphics::Panel question
Message-ID: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>

Please keep your responses on the list for more timely help.
 

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University 

 
________________________________

From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com] 
Sent: Monday, November 16, 2009 9:34 AM
To: Kevin Brown
Subject: Re: [Bioperl-l] Bio::Graphics::Panel question


Hi Kevin, 

Thank you for your quick response. I attached the BLAST .out file here,
and my code follows. I have an array that keeps the color for each hit;
I printed the array out, and nothing is missing.

my $track = $panel->add_track(
    -glyph       => 'graded_segments',
    -label       => 1,
    -connector   => 'dashed',
    -font2color  => 'red',
    -sort_order  => 'high_score',
    -description => sub {
        $feature = shift;
        #print "--".$feature."\n";
        return unless $feature->has_tag('description');
        my ($description) = $feature->each_tag_value('description');
        my ($id) = $feature->display_name;
        my @records = split(/\|/, $description);
        my $score = $feature->score;
        #print $id.":".$score."\n";
        if ($score >= 200) {
            push(@color_array, 1);
        } elsif ($score >= 80) {
            push(@color_array, 2);
        } elsif ($score >= 50) {
            push(@color_array, 3);
        } elsif ($score >= 40) {
            push(@color_array, 4);
        } else {
            push(@color_array, 5);
        }

        if ($type == 1) {
            "Species:Arabidopsis TF Family:$records[1] Score=$score";
        } elsif ($type == 2) {
            if (scalar(@records) == 5) {
                "Species:$records[1] TF Family:$records[2] Accepted Name:$records[3] Score=$score";
            } else {
                "Species:$records[1] TF Family:$records[2] Score=$score";
            }
        } else {
            "";
        }
    },
    -bgcolor => sub {
        return unless $feature->has_tag('description');
        if ($color_array[$index] == 1) {
            $color = 'red';
        }
        if ($color_array[$index] == 2) {
            $color = 'orange';
        }
        if ($color_array[$index] == 3) {
            $color = 'green';
        }
        if ($color_array[$index] == 4) {
            $color = 'blue';
        }
        if ($color_array[$index] == 5) {
            $color = 'black';
        }
        #if ($index == 20){
        #        $color = 'black';
        #}
        #print $index."--".$color_array[$index]."\n";
        $index++;

        #print $feature."\n";
        #print $feature->display_name."\n";
        return $color;
    },
);


Best regards,
Xiaoyu


On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
wrote:


	To really be able to tell if this was a bug, I (and probably the real
	devs) would need to see that part of your code and the Blast file that
	is having this issue as it could be your callback for color choice vs
	the blast object (e.g. your color picker is missing an option that the
	data comes in with and so returns with a blank value).
	

	-----Original Message-----
	From: bioperl-l-bounces at lists.open-bio.org
	[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Xiaoyu Liang
	Sent: Friday, November 13, 2009 1:36 PM
	To: Bioperl-l at lists.open-bio.org
	Subject: [Bioperl-l] Bio::Graphics::Panel question
	
	Hi,
	
	I'm using Bio::Graphics to parse the blast result and generate images.
	But, sometimes, in the middle of the output image, the hit's color is
	white, eventhough I set it to other colors. I attached the picture here
	for an example. This doesn't occur all the time, usually, it works well.
	I'm wondering if I did something wrong? or depends on the blast result?
	
	Thank you,
	Xiaoyu
	
	

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1258388779.out
Type: application/octet-stream
Size: 32599 bytes
Desc: 1258388779.out
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091116/cb23e40d/attachment-0002.obj>

From paolo.pavan at gmail.com  Mon Nov 16 14:06:06 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 16 Nov 2009 20:06:06 +0100
Subject: [Bioperl-l] bioperl-ext installation issue
Message-ID: <56be91b60911161106w69e20fd9k133a465e8d4f8a3f@mail.gmail.com>

Hi everybody,
I have problems installing the bioperl-ext package, any help is much
appreciated.
1)

   - I started by searching in cpan with i /bioperl-ext/; the only
   distribution available is /B/BI/BIRNEY/bioperl-ext-1.4 (is that OK?)
   - I installed Inline::MakeMaker and Inline::C, then
   - i/BIRNEY/bioperl-ext-1.4/ fails because I don't have the staden package

2) I try to install io_lib-1.8.10.tar as suggested by the README (
ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/io_lib/), installation fails after:
...
gcc -g -O2 -o makeSCF makeSCF.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o extract_seq.o `test -f extract_seq.c || echo './'`extract_seq.c
/bin/sh ../libtool --mode=link gcc  -g -O2   -o extract_seq  extract_seq.o
../read/libread.la
gcc -g -O2 -o extract_seq extract_seq.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o index_tar.o `test -f index_tar.c || echo './'`index_tar.c
index_tar.c: In function 'main':
index_tar.c:12: error: two or more data types in declaration specifiers
make[2]: *** [index_tar.o] Error 1
make[2]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10/progs'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10'
make: *** [all-recursive-am] Error 2

3) I gave up on staden, because I actually only need pSW, and tried to install
from the Makefile.PL in Bio/Ext/Align, but installation fails after:
...
Align.xs:18: warning: 'not_here' defined but not used
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
gcc  -shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic Align.o  -o
../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a    \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local
symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory
`/home/root/.cpan/sources/authors/id/B/BI/BIRNEY/bioperl-ext-1.4/Bio/Ext/Align'
make: *** [subdirs] Error 2

I have also tried a few other things, such as force-installing Bio::Ext::Align,
without success, but I'm sure I'm missing something trivial that I can't spot.
Can someone help me?

Thank you,
Paolo


From lincoln.stein at gmail.com  Mon Nov 16 15:08:20 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 16 Nov 2009 15:08:20 -0500
Subject: [Bioperl-l] FW: Bio::Graphics::Panel question
In-Reply-To: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
Message-ID: <6dce9a0b0911161208q2f826d83s319184f0cacca097@mail.gmail.com>

Hi,

I think you should modify your color selection code as follows:


                                       if($color_array[$index] == 1 ){
                                               $color = 'red';
                                       }
                                       elsif($color_array[$index]== 2){
                                               $color = 'orange';
                                       }
                                       elsif($color_array[$index]== 3){
                                               $color = 'green';
                                       }
                                       elsif($color_array[$index]== 4){
                                               $color = 'blue';
                                       }
                                       elsif($color_array[$index]== 5){
                                               $color = 'black';
                                       }
                                       else { die "unexpected color array value $color_array[$index]" }
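
A minimal sketch of an alternative, assuming the white hits appear because the -description and -bgcolor callbacks are not called in lockstep, so that $index drifts out of step with @color_array: compute the color from the feature's own score inside -bgcolor, with no shared array or counter. The helper name score_to_color is hypothetical; the thresholds are copied from the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: map a hit score to a color name, using the same
# thresholds as the posted -description callback.
sub score_to_color {
    my ($score) = @_;
    $score = 0 unless defined $score;    # treat a missing score as lowest
    return $score >= 200 ? 'red'
         : $score >= 80  ? 'orange'
         : $score >= 50  ? 'green'
         : $score >= 40  ? 'blue'
         :                 'black';
}

# Assumed usage inside add_track; Bio::Graphics passes the feature as the
# first argument to an option callback:
#   -bgcolor => sub { my $feature = shift; score_to_color($feature->score) },

print score_to_color($_), "\n" for (250, 90, 55, 42, 10);
```

Because every call depends only on the feature it is given, the result does not depend on how often or in what order the callbacks run.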

Lincoln

On Mon, Nov 16, 2009 at 1:22 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:

> Please keep your responses on the list for more timely help.
>
>
> Kevin Brown
> Center for Innovations in Medicine
> Biodesign Institute
> Arizona State University
>
>
>
> ________________________________
>
> From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com]
> Sent: Monday, November 16, 2009 9:34 AM
> To: Kevin Brown
> Subject: Re: [Bioperl-l] Bio::Graphics::Panel question
>
>
> Hi Kevin,
>
> Thank you for ur quick response. I attached the BLAST .out file here.
> And the follow is my code part. I have an array keeping the color for
> each hit, and I printed it out the array, there is no missing.
>
> my $track = $panel->add_track(
>                              -glyph       => 'graded_segments',
>                              -label       => 1,
>                              -connector   => 'dashed',
>                              -font2color  => 'red',
>                              -sort_order  => 'high_score',
>                              -description => sub {
>                                $feature = shift;
>                                #print "--".$feature."\n";
>                                return unless
> $feature->has_tag('description');
>                                my ($description) =
> $feature->each_tag_value('description');
>                                my ($id) = $feature->display_name;
>                                my @records= split(/\|/,$description);
>                                my $score = $feature->score;
>                                #print $id.":".$score."\n";
>                                if($score >=200){
>                                        push (@color_array,1);
>                                }elsif($score >=80){
>                                        push (@color_array,2);
>                                }elsif($score >=50){
>                                        push (@color_array,3);
>                                }elsif($score >= 40){
>                                        push (@color_array,4);
>                                }else{
>                                        push (@color_array,5);
>                                }
>
>                                if($type == 1){
>                                        "Species:Arabidopsis TF
> Family:$records[1] Score=$score";
>                                }elsif($type == 2){
>                                        if(scalar(@records)==5){
>                                                "Species:$records[1] TF
> Family:$records[2] Accepted Name:$records[3] Score=$score";
>                                        }else{
>                                                "Species:$records[1] TF
> Family:$records[2] Score=$score";
>                                        }
>                                }else{
>                                        "";
>                                }
>                               },
>                               -bgcolor => sub{
>                                        return unless
> $feature->has_tag('description');
>                                        if($color_array[$index] == 1 ){
>                                                $color = 'red';
>                                        }
>                                        if($color_array[$index]== 2){
>                                                $color = 'orange';
>                                        }
>                                        if($color_array[$index]== 3){
>                                                $color = 'green';
>                                        }
>                                        if($color_array[$index]== 4){
>                                                $color = 'blue';
>                                        }
>                                        if($color_array[$index]== 5){
>                                                $color = 'black';
>                                        }
>                                        #if ($index == 20){
>                                        #        $color = 'black';
>                                        #}
>                                        #print
> $index."--".$color_array[$index]."\n";
>                                        $index++;
>
>                                        #print $feature."\n";
>                                        #print
> $feature->display_name."\n";
>                                        return $color;
>                               },
>                             );
>
>
> Best regrads,
> Xiaoyu
>
>
> On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
> wrote:
>
>
>        To really be able to tell if this was a bug, I (and probably the
> real
>        devs) would need to see that part of your code and the Blast
> file that
>        is having this issue as it could be your callback for color
> choice vs
>        the blast object (e.g. your color picker is missing an option
> that the
>        data comes in with and so returns with a blank value).
>
>
>        -----Original Message-----
>        From: bioperl-l-bounces at lists.open-bio.org
>        [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Xiaoyu Liang
>        Sent: Friday, November 13, 2009 1:36 PM
>        To: Bioperl-l at lists.open-bio.org
>        Subject: [Bioperl-l] Bio::Graphics::Panel question
>
>        Hi,
>
>        I'm using Bio::Graphics to parse the blast result and generate images.
>        But sometimes, in the middle of the output image, the hit's color is
>        white, even though I set it to other colors. I attached the picture here
>        as an example. This doesn't occur all the time; usually it works well.
>        I'm wondering if I did something wrong, or whether it depends on the
>        blast result?
>
>        Thank you,
>        Xiaoyu
>
>
>        _______________________________________________
>        Bioperl-l mailing list
>        Bioperl-l at lists.open-bio.org
>        http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
>
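
One way to rule out the case Kevin describes is to give the color callback an
explicit fallback, so that a code outside the handled set never yields an empty
color. A minimal sketch in plain Perl, assuming the codes 3, 4 and 5 from the
quoted callback; the %COLOR_FOR table, the pick_color name and the 'gray'
default are made up for illustration and are not part of Bio::Graphics:

```perl
use strict;
use warnings;

# Hypothetical lookup table; codes 3-5 mirror the quoted callback.
my %COLOR_FOR = ( 3 => 'green', 4 => 'blue', 5 => 'black' );

# Return a color name for a code, falling back to 'gray' when the code
# is undefined or not in the table (instead of returning nothing).
sub pick_color {
    my ($code) = @_;
    return 'gray' unless defined $code && exists $COLOR_FOR{$code};
    return $COLOR_FOR{$code};
}

print pick_color(4), "\n";   # blue
print pick_color(7), "\n";   # gray
```

Inside the color callback one would then return pick_color(...) for the code
that belongs to the feature at hand, so an unexpected value shows up as gray
rather than as a white glyph.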


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From ryan_bogard at hms.harvard.edu  Mon Nov 16 16:44:25 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 13:44:25 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <26379710.post@talk.nabble.com>


Thank you all for your help! I was able to get bioperl working via manual
download and install. It was a combination of permissions issues and X86_64
vs. X86_32 compatibility issues. Using fink to download and install seems to
have given me a combination of 32 and 64 associated files (I probably did
something wrong in config). 
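
For anyone hitting the same dyld error, a quick check is whether PERL5LIB points
at library directories built for a different perl than the one running the
script. A rough sketch, assuming fink-style paths that embed the version number
(as in the environment quoted below); mismatched_lib_dirs is a hypothetical
helper, not a BioPerl function:

```perl
use strict;
use warnings;

# Return the directories whose embedded perl version (e.g. ".../5.8.8/...")
# differs from the version of the interpreter given as first argument.
sub mismatched_lib_dirs {
    my ($running, @dirs) = @_;
    return grep {
        my ($v) = m{/(5\.\d+\.\d+)(?:/|$)};
        defined $v && $v ne $running;
    } @dirs;
}

my $running = sprintf '%vd', $^V;    # version of the perl running this script
print "running perl $running\n";
print "suspect: $_\n"
    for mismatched_lib_dirs($running, split /:/, $ENV{PERL5LIB} // '');
```

Any "suspect" line means compiled modules (such as IO.bundle) may be loaded
from a tree built for another perl.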


rbogard wrote:
> 
> In advance, any advice would be greatly appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
> 
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL cannot be located.
> 
> The environment variables are as follows:
> 
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
> 
> 
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> Here is the error output:
> 
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> Trace/BPT trap
> 
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
> 
> Thank you, Ryan 
> 

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26379710.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From jay at jays.net  Mon Nov 16 17:02:10 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 16 Nov 2009 16:02:10 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
	<2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
Message-ID: <60ADD3A9-D38B-4A39-A5CE-C8118DEC1242@jays.net>

On Nov 10, 2009, at 12:50 PM, Jason Stajich wrote:
> You might also look at what mygenbank does:
> http://homepage.mac.com/iankorf/mygenbank.html

It appears, perhaps, that BioSQL can provide *foo* searching like so:

http://www.biosql.org/wiki/Schema_Overview#TAXON.2C_TAXON_NAME

 SELECT DISTINCT include.ncbi_taxon_id FROM taxon
    INNER JOIN taxon AS include ON
      (include.left_value BETWEEN taxon.left_value
        AND taxon.right_value)
 WHERE taxon.taxon_id IN
   (SELECT taxon_id FROM taxon_name
    WHERE name LIKE '%fungi%')

So I think we're going to chase that for a while.
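
The BETWEEN clause works because the taxon table stores a nested-set numbering:
every descendant's left_value falls inside its ancestor's left/right interval.
A toy sketch of the same selection in plain Perl, with made-up left/right
numbers (not real BioSQL data) and a hypothetical taxa_under helper; it returns
the toy ids where the SQL returns ncbi_taxon_id:

```perl
use strict;
use warnings;

# Toy taxa with invented nested-set values.
my @taxon = (
    { id => 1, name => 'Eukaryota',     left => 1, right => 10 },
    { id => 2, name => 'Fungi',         left => 2, right => 7  },
    { id => 3, name => 'Ascomycota',    left => 3, right => 4  },
    { id => 4, name => 'Basidiomycota', left => 5, right => 6  },
    { id => 5, name => 'Metazoa',       left => 8, right => 9  },
);

# Ids of every taxon whose left value lies inside the interval of a
# taxon whose name matches the pattern (the matching taxon included).
sub taxa_under {
    my ($pattern) = @_;
    my %seen;
    for my $root ( grep { $_->{name} =~ /$pattern/i } @taxon ) {
        for my $t (@taxon) {
            $seen{ $t->{id} } = 1
                if $t->{left} >= $root->{left} && $t->{left} <= $root->{right};
        }
    }
    return sort { $a <=> $b } keys %seen;
}

print join( ',', taxa_under('fungi') ), "\n";   # 2,3,4
```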

I didn't see a *foo* search in MyGenBank?

Thanks,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From roy.chaudhuri at gmail.com  Tue Nov 17 06:24:07 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 17 Nov 2009 11:24:07 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
	<4B018C85.6020801@gmail.com>
	<9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
Message-ID: <4B0287D7.5050702@gmail.com>

Hi Aneesh,

Please keep your replies on the mailing list, that way someone else can 
respond, which would be particularly useful in this case since I know 
nothing about MapIO.

Roy.

Aneesh K wrote:
> Thanks for your reply. 
> 
> I would like to know about "Genetic Maps" also. I would like to 
> use the MapIO object. 
> But I'm not familiar with genetic maps or the mapmaker format. 
> 
> Please tell me where I can get some examples of the mapmaker format 
> and some example scripts that use the MapIO object. 
> 
> Hoping for your reply.
> 
> Aneesh.K
> Mob. 09646181517
> 
> 
> 
> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
> 
>     Hi Aneesh,
> 
>     See the Bioperl trees howto:
>     http://www.bioperl.org/wiki/HOWTO:Trees
> 
>     Roy.
> 
> 
>     Aneesh K wrote:
> 
>         Hi,
> 
>         I just started to use Bioperl modules. It's really useful and
>         interesting.
>         Now I am stuck on "Tree objects and phylogenetic trees".
>         I couldn't find any documentation/examples about reading/parsing
>         phylip tree
>         files.
> 
>         Please tell me where I can get some sample code for this.
> 
>         Waiting for your reply.
> 
>         Thanks
>         Aneesh.K
>         Mob. 09646181517
> 
> 
> 
>     -- 
>     Dr. Roy Chaudhuri
>     Department of Veterinary Medicine
>     University of Cambridge, U.K.
> 
> 


From maj at fortinbras.us  Tue Nov 17 07:50:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 17 Nov 2009 07:50:06 -0500
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <4B0287D7.5050702@gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com><4B018C85.6020801@gmail.com><9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
	<4B0287D7.5050702@gmail.com>
Message-ID: <394F62D51F15405BBCF8BB50DA0FF336@NewLife>

Aneesh, 
Have a look in the t/Map directory of the BioPerl distribution. These
are test scripts that are also examples of usage. The t/data directory
will contain the datafiles that the tests use; these will provide example data.
cheers 
Mark 
----- Original Message ----- 
From: "Roy Chaudhuri" <roy.chaudhuri at gmail.com>
To: "Aneesh K" <krishna.aneesh at gmail.com>; <bioperl-l at bioperl.org>
Sent: Tuesday, November 17, 2009 6:24 AM
Subject: Re: [Bioperl-l] Regarding Bio::TreeIO Object


> Hi Aneesh,
> 
> Please keep your replies on the mailing list, that way someone else can 
> respond, which would be particularly useful in this case since I know 
> nothing about MapIO.
> 
> Roy.
> 
> Aneesh K wrote:
>> Thanks for your reply. 
>> 
>> I would like to know about "Genetic Maps" also. I would like to 
>> use the MapIO object. 
>> But I'm not familiar with genetic maps or the mapmaker format. 
>> 
>> Please tell me where I can get some examples of the mapmaker format 
>> and some example scripts that use the MapIO object. 
>> 
>> Hoping for your reply.
>> 
>> Aneesh.K
>> Mob. 09646181517
>> 
>> 
>> 
>> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>> 
>>     Hi Aneesh,
>> 
>>     See the Bioperl trees howto:
>>     http://www.bioperl.org/wiki/HOWTO:Trees
>> 
>>     Roy.
>> 
>> 
>>     Aneesh K wrote:
>> 
>>         Hi,
>> 
>>         I just started to use Bioperl modules. It's really useful and
>>         interesting.
>>         Now I am stuck on "Tree objects and phylogenetic trees".
>>         I couldn't find any documentation/examples about reading/parsing
>>         phylip tree
>>         files.
>> 
>>         Please tell me where I can get some sample code for this.
>> 
>>         Waiting for your reply.
>> 
>>         Thanks
>>         Aneesh.K
>>         Mob. 09646181517
>> 
>> 
>> 
>>     -- 
>>     Dr. Roy Chaudhuri
>>     Department of Veterinary Medicine
>>     University of Cambridge, U.K.
>> 
>> 
> 
> 
>


From veronica.xiaoyu at gmail.com  Wed Nov 18 12:18:33 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Wed, 18 Nov 2009 12:18:33 -0500
Subject: [Bioperl-l] how to visualize multiple sequences alignments
Message-ID: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>

Hi,

I'm wondering, are there any modules that can be used for visualizing multiple
sequence alignments, like the result from ClustalW?

Thank you very much,
Xiaoyu


From jason at bioperl.org  Wed Nov 18 13:23:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Nov 2009 10:23:05 -0800
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
Message-ID: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>

try jalview http://www.jalview.org/

On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:

> Hi,
>
> I'm wondering Is there any modules that can be used for visualizing  
> multiple
> sequences alignments? like the result from ClustalW?
>
> Thank you very much,
> Xiaoyu

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From andrew.j.grimm at gmail.com  Wed Nov 18 21:52:31 2009
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Thu, 19 Nov 2009 13:52:31 +1100
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
Message-ID: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>

Caution: read the whole email before visiting the bioperl wiki

I was doing some bioinformatics-related searching using google, and
one of the hits was to the bio dot perl dot org wiki (the FAQ in
particular).

When I did that, I was redirected to a ferdax dot com web site (a
typo-squatting of fedex?).

Some people reckon that ferdax hacks web sites and redirects google
hits from the victim web site to their own web site. For example, this
thread at google's webmaster central
http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
(it's talking about zencart, but presumably they've since found other
victims)

Just going to the website without using google may not trigger the redirect.

Apologies if this is a false alarm, but I don't think it is.

I won't be in contact between Friday and Monday Australian time (I'll
be at railscamp 6 in Melbourne), so I won't be able to answer any
replies.

Thanks,

Andrew Grimm


From maj at fortinbras.us  Wed Nov 18 22:14:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 18 Nov 2009 22:14:44 -0500
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
Message-ID: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>

Andrew-- thanks!! We're on it.
MAJ
----- Original Message ----- 
From: "Andrew Grimm" <andrew.j.grimm at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 18, 2009 9:52 PM
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?


> Caution: read the whole email before visiting the bioperl wiki
>
> I was doing some bioinformatics-related searching using google, and
> one of the hits was to the bio dot perl dot org wiki (the FAQ in
> particular).
>
> When I did that, I was redirected to a ferdax dot com web site (a
> typo-squatting of fedex?).
>
> Some people reckon that ferdax hacks web sites and redirects google
> hits from the victim web site to their own web site. For example, this
> thread at google's webmaster central
> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
> (it's talking about zencart, but presumably they've since found other
> victims)
>
> Just going to the website without using google may not trigger the redirect.
>
> Apologies if this is a false alarm, but I don't think it is.
>
> I won't be in contact between Friday and Monday Australian time (I'll
> be at railscamp 6 in Melbourne), so I won't be able to answer any
> replies.
>
> Thanks,
>
> Andrew Grimm
>
> 


From sandipan.chowdhury at physiology.wisc.edu  Thu Nov 19 01:49:45 2009
From: sandipan.chowdhury at physiology.wisc.edu (Sandipan Chowdhury)
Date: Thu, 19 Nov 2009 00:49:45 -0600
Subject: [Bioperl-l] accessing EMBL database
Message-ID: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>

Hi,
 
I have 3 questions, all related to the retrieval of sequences from online databases.
 
(1) I have been trying to download a protein sequence from the EMBL database and trying to write the sequence into a text file, as a string. I am using the following code: 
 
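use strict;
use warnings;
use Bio::DB::EMBL;

# Write the retrieved sequence to s.txt as a plain string.
open my $out, '>', 's.txt' or die "Cannot open s.txt: $!";
my $em_obj  = Bio::DB::EMBL->new;
my $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
my $s_str   = $seq_obj->seq;
print {$out} "$s_str\n";
close $out;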
 
The script is not working and gives the message:
"MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl"
 
I am not sure what this means. A similar version of the script works for the Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way around this so that I can download the embl sequence?
 
(2) Also, is there any way I can download sequences from DDBJ (DNA Data Bank of Japan)?
 
(3) Can GI numbers be used to retrieve the sequences? If so, how?
 
Answers to these questions would be greatly appreciated. I am very new to Perl/Bioperl and am not really familiar with the advanced programming features, so I would need your help to find my way out of this situation.
 
Many Thanks
Sandipan
 

From maj at fortinbras.us  Thu Nov 19 08:10:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 08:10:07 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>

Sandipan-- That id (CAB95729) returns "No entries" from EMBL.
I would agree that the error message is not really informative.
The module documentation warns:

      # remember that EMBL_ID does not equal GenBank_ID!
so I would check that.
MAJ
----- Original Message ----- 
From: "Sandipan Chowdhury" <sandipan.chowdhury at physiology.wisc.edu>
To: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 1:49 AM
Subject: [Bioperl-l] accessing EMBL database


> Hi,
>
> I have 3 questions all related to the retreival of sequences from online 
> databases.
>
> (1) I have been trying to download a protein sequence from the EMBL database 
> and trying to write the sequence into a text file, as a string. I am using the 
> following code:
>
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>  $s_str = $seq_obj->seq;
>  print b "$s_str\n";
> close b;
>
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>
> I am not sure what this means. A similar version of the script works for the 
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way 
> around this so that I can download the embl sequence?
>
> (2) Also, is there anyway I can download sequences from DDBJ (database of 
> Japan)?
>
> (3) Can GI numbers be used to retreive the sequences? If so then how?
>
> Answers to these questions would be greatly appreciated. I am very new to 
> Perl/Bioperl and am not really familiar with the advanced programming 
> features, so I would need to your help to find my way out of this situation.
>
> Many Thanks
> Sandipan
>
>
>
> 


From hrh at fmi.ch  Thu Nov 19 08:23:29 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 19 Nov 2009 14:23:29 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <C72B0561.5887%hrh@fmi.ch>


Sandipan


> I have 3 questions all related to the retreival of sequences from online
> databases.
>  
> (1) I have been trying to download a protein sequence from the EMBL database
> and trying to write the sequence into a text file, as a string. I am using the
> following code: 
>  
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>   $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>   $s_str = $seq_obj->seq;
>   print b "$s_str\n";
> close b;
>  
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>  
> I am not sure what this means. A similar version of the script works for the
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
> around this so that I can download the embl sequence?

"CAB95729" is a protein sequence, i.e. a translation of the CDS of
'AJ277028.1'.

As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, i.e.
nucleotide sequences.
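
The difference is visible in the shape of the identifiers. A rough sketch,
assuming the classic INSDC layouts (nucleotide accessions of one letter plus
five digits or two letters plus six digits, protein IDs of three letters plus
five digits); newer, longer formats exist and are not handled here, and
accession_kind is a hypothetical helper, not a BioPerl function:

```perl
use strict;
use warnings;

# Guess whether an identifier looks like a nucleotide accession or a
# protein ID, based only on its letter/digit pattern.
sub accession_kind {
    my ($acc) = @_;
    ( my $base = $acc ) =~ s/\.\d+\z//;    # drop a version suffix such as ".1"
    return 'protein'    if $base =~ /\A[A-Z]{3}\d{5}\z/;
    return 'nucleotide' if $base =~ /\A(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\z/;
    return 'unknown';
}

print accession_kind('CAB95729'),   "\n";   # protein
print accession_kind('AJ277028.1'), "\n";   # nucleotide
```

A check like this before the database call would let a script report "this
looks like a protein ID" instead of failing inside the EMBL parser.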


> (2) Also, is there anyway I can download sequences from DDBJ (database of
> Japan)?

Unless it is for network or speed reasons, why do you want to download data
from DDBJ? It contains the same data as GenBank and EMBL. Those three databases
exchange their data on a daily basis.
  
> (3) Can GI numbers be used to retreive the sequences? If so then how?

Have you looked at Bio::DB::EUtilities? See the 'HOWTOs' page in the
Bioperl Wiki.


Regards, Hans


> Answers to these questions would be greatly appreciated. I am very new to
> Perl/Bioperl and am not really familiar with the advanced programming
> features, so I would need to your help to find my way out of this situation.
>  
> Many Thanks
> Sandipan
>  
> 


From cjfields at illinois.edu  Thu Nov 19 08:47:16 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 07:47:16 -0600
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <C72B0561.5887%hrh@fmi.ch>
References: <C72B0561.5887%hrh@fmi.ch>
Message-ID: <95D416ED-7630-40A1-ABA5-A3C3525D25B1@illinois.edu>


On Nov 19, 2009, at 7:23 AM, Hotz, Hans-Rudolf wrote:

> 
> Sandipan
> 
> 
>> I have 3 questions all related to the retreival of sequences from online
>> databases.
>> 
>> (1) I have been trying to download a protein sequence from the EMBL database
>> and trying to write the sequence into a text file, as a string. I am using the
>> following code: 
>> 
>> use Bio::DB::EMBL;
>> open b,">","s.txt";
>> $em_obj = Bio::DB::EMBL->new;
>>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>>  $s_str = $seq_obj->seq;
>>  print b "$s_str\n";
>> close b;
>> 
>> The script is not working and gives the messege:
>> "MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl"
>> 
>> I am not sure what this means. A similar version of the script works for the
>> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
>> around this so that I can download the embl sequence?
> 
> "CAB95729" is a protein sequence, ie a translation of the CDS of
> 'AJ277028.1'.
> 
> As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, ie the
> nucleotides sequence
> 
> 
> 
>> (2) Also, is there anyway I can download sequences from DDBJ (database of
>> Japan)?
> 
> Unless, for network/speed reason, why do you want to download data from
> DDBJ? It contains the same data as GenBank and EMBL. Those three databases
> exchange their data on a daily basis.
> 
>> (3) Can GI numbers be used to retreive the sequences? If so then how?
> 
> Have you looked at Bio::DB::Eutilities ? See the 'HOWTOs'  page in the
> Bioperl Wiki
> 
> 
> 
> Regards, Hans
> 
> 
> 
>> Answers to these questions would be greatly appreciated. I am very new to
>> Perl/Bioperl and am not really familiar with the advanced programming
>> features, so I would need to your help to find my way out of this situation.
>> 
>> Many Thanks
>> Sandipan

To add to that, if you want the protein sequences as a Bio::Seq you can use Bio::DB::GenPept (Bio::DB::EUtilities will retrieve raw data only).

chris


From David.Messina at sbc.su.se  Thu Nov 19 09:04:55 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 Nov 2009 15:04:55 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
Message-ID: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>

> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl
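
The two outputs above differ only in whether the STACK lines are printed. A toy
sketch of that switch in plain Perl; format_error and its $show_stack flag are
invented for illustration and are not how Bio::Root::Root builds its messages:

```perl
use strict;
use warnings;

# Build an error string: a one-line ERROR by default, or the MSG line
# followed by one STACK line per frame when $show_stack is true.
sub format_error {
    my ( $msg, $frames, $show_stack ) = @_;
    return "ERROR: $msg\n" unless $show_stack;
    return join '', "MSG: $msg\n", map( "STACK: $_\n", @$frames );
}

my @frames = ( 'Error::throw', 'trial2.pl' );
print format_error( 'EMBL stream with no ID. Not embl in my book', \@frames, 0 );
```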


Just a thought. This has probably been discussed before.
Dave


From maj at fortinbras.us  Thu Nov 19 09:17:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 09:17:05 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
Message-ID: <FADF827A6CE34C959062F2D93849E15A@NewLife>

I'm inclined to agree. Lots of responses to questions here that begin
"Well, as the error message said, you need to check...", which means
people tend towards "I broke it! Write the list!". I do find it hairy when
my errors are way down in the object tree.
----- Original Message ----- 
From: "Dave Messina" <David.Messina at sbc.su.se>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 9:04 AM
Subject: Re: [Bioperl-l] accessing EMBL database


> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with 
BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of 
complicated stuff, with colons and slashes and line numbers, spewing out at 
them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave


From rtbio.2009 at gmail.com  Thu Nov 19 09:55:27 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Thu, 19 Nov 2009 15:55:27 +0100
Subject: [Bioperl-l] Remote blast
Message-ID: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>

Hello everybody,

I have a problem. I would like to use remote BLAST to find sequences
matching an input sequence.

For example, I would like to find sequences that match a Trypanosoma brucei
sequence.

I want the output to contain only Trypanosoma brucei sequences matching my
query. When I tried to use RemoteBlast against the nr database, I got sequences
from different organisms such as E. coli, Pseudomonas, etc.

Could you please tell me how this can be solved?

My code is as follows.

use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;   # needed for Bio::SeqIO->new below

  my $prog = 'blastn';
  my $db   = 'nr';
  my $e_val= '1e-10';
  my $organism= 'Trypanosoma Brucei';

  my @params = ( '-prog' => $prog,
         '-data' => $db,
         '-expect' => $e_val,
         '-readmethod' => 'SearchIO',
         '-Organism'   => $organism );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

  #change a parameter
  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]';

  #remove a parameter
  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

  my $v = 1;
  #$v is just to turn on and off the messages

  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' , '-organism' => 'Trypanosoma Brucei' );

  while (my $input = $str->next_seq()){
    #Blast a sequence against a database:
   my $r = $factory->submit_blast($input);
    #my $r = $factory->submit_blast('amino.fa');

    print STDERR "waiting..." if( $v > 0 );
    while ( my @rids = $factory->each_rid ) {
      foreach my $rid ( @rids ) {
        my $rc = $factory->retrieve_blast($rid);
        if( !ref($rc) ) {
          if( $rc < 0 ) {
            $factory->remove_rid($rid);
          }
          print STDERR "." if ( $v > 0 );
         sleep 5;
        }
     else {
          my $result = $rc->next_result();
          #save the output
          my $filename = $result->query_name()."\.out";
          $factory->save_output($filename);
          $factory->remove_rid($rid);
          print "\nQuery Name: ", $result->query_name(), "\n";
          while ( my $hit = $result->next_hit ) {
            next unless ( $v > 0);
            print "\thit name is ", $hit->name, "\n";
            while( my $hsp = $hit->next_hsp ) {
              print "\t\tscore is ", $hsp->score, "\n";
            }
          }
        }
      }
    }
  }

My input sequence is

>ref|NC_009512.1|:385-1902
GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA

Please mail me regarding any queries.

Regards,
Roopa.


From cjfields at illinois.edu  Thu Nov 19 10:30:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 09:30:34 -0600
Subject: [Bioperl-l] verbosity and error stack, was  accessing EMBL database
In-Reply-To: <FADF827A6CE34C959062F2D93849E15A@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
	<FADF827A6CE34C959062F2D93849E15A@NewLife>
Message-ID: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>

Mark, Dave,

This could be based on verbose(). 

          Level      w     t     d    st
verbose   < 0        -     +     -    -/+
verbose     0        +     +     -    -/+
verbose     1        +     +     +    +/+
verbose   > 1        +* -> +     +    +/+
* converts to throw()
w = warn
t = throw
d = debug
st = stack trace

warn() is set up that way now, you don't get a stack trace unless verbose() is > 0.  throw() could be the same; would be a simple fix, really.
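
Read as code, the warn column of the table comes out roughly as follows. This
is a sketch of the behaviour described here, not the Bio::Root::Root source;
warn_action is a made-up name and it returns a label instead of printing
anything:

```perl
use strict;
use warnings;

# Map a verbosity level to what a warning does at that level.
sub warn_action {
    my ($verbose) = @_;
    return 'throw'      if $verbose > 1;     # warning is promoted to an exception
    return 'silent'     if $verbose < 0;     # warning is suppressed
    return 'warn+stack' if $verbose == 1;    # message plus stack trace
    return 'warn';                           # message only (level 0)
}

print "$_ => ", warn_action($_), "\n" for -1, 0, 1, 2;
```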

My only problem with the current state of things (I think we've been down this path before) is that verbosity level is tied to exception strictness, as seen above, and they're really two separate concepts, at least to me.  A verbosity of 1 or more doesn't necessarily mean I want an elevated level of strictness along with it.  For instance, one might want very strict exceptions w/o the noise, or (conversely) lots of debugging output but no warnings.

(aside: another small nit, but I've never exactly liked that the global level of strictness is designated by an env. variable with DEBUG in the name, but that's just me).

I've been thinking it would be nice to have simple, separate verbose/strict switches (this is the way it's implemented in Biome).  This would allow some finer-grained control over output:

          Level      d    st
verbose     0        -    -
verbose     1        +    +
Default = BIOPERLDEBUG || 0 # current situation

          Level      w     t
strict      -1       -     +
strict      0        +     +
strict      1        +* -> +
* converts to throw()
Default = BIOPERLSTRICT || 0

We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
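
A minimal sketch of what separate switches could look like, written as a standalone class rather than a patch to Bio::Root::Root. The package name My::Reporter, the -verbose/-strict constructor arguments, and the ERROR/WARNING/DEBUG labels are assumptions made for illustration; only the BIOPERLDEBUG and BIOPERLSTRICT defaults and the level tables come from the proposal above, and this is not necessarily how Biome does it.

```perl
use 5.010;
use strict;
use warnings;
use Carp ();

package My::Reporter;    # hypothetical name, not part of BioPerl

sub new {
    my ($class, %arg) = @_;
    return bless {
        # verbose controls debug output and stack traces only
        verbose => $arg{-verbose} // $ENV{BIOPERLDEBUG}  // 0,
        # strict controls how warnings are treated only
        strict  => $arg{-strict}  // $ENV{BIOPERLSTRICT} // 0,
    }, $class;
}

# Build the message text; the stack trace depends on verbose, not strict.
sub _format {
    my ($self, $label, $msg) = @_;
    my $text = "$label: $msg\n";
    $text .= Carp::longmess('STACK') if $self->{verbose} >= 1;
    return $text;
}

sub throw {
    my ($self, $msg) = @_;
    die $self->_format('ERROR', $msg);
}

sub warn {
    my ($self, $msg) = @_;
    return if $self->{strict} < 0;                 # strict -1: warnings off
    $self->throw($msg) if $self->{strict} >= 1;    # strict  1: warn becomes throw
    CORE::warn $self->_format('WARNING', $msg);    # strict  0: plain warning
    return;
}

sub debug {
    my ($self, $msg) = @_;
    print STDERR "DEBUG: $msg\n" if $self->{verbose} >= 1;
    return;
}

package main;

# Quiet but throwing: the message is reported without a stack trace.
my $r = My::Reporter->new(-verbose => 0, -strict => 0);
eval { $r->throw('EMBL stream with no ID') };
print $@;
```

With -verbose => 1 the same throw() would append a Carp stack trace, and with -strict => 1 a warn() would be promoted to a throw(), independently of each other.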

chris

On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:

> I'm inclined to agree. Lots of responses to questions here that begin
> "Well, as the error message said, you need to check...", which means
> people tend towards "I broke it! Write the list!". I do find it hairy when
> my errors are way down in the object tree.
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: <bioperl-l at bioperl.org>
> Sent: Thursday, November 19, 2009 9:04 AM
> Subject: Re: [Bioperl-l] accessing EMBL database
> 
> 
>> I would agree that the error message is not really informative.
> 
> Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.
> 
> I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.
> 
> Perhaps the stack dump should be turned off by default?
> 
> Wouldn't this:
> 
> ERROR: EMBL stream with no ID. Not embl in my book
> 
> 
> 
> Be a lot clearer than this?:
> 
> MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl
> 
> 
> 
> Just a thought. This has probably been discussed before.
> Dave
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Thu Nov 19 11:10:28 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Thu, 19 Nov 2009 16:10:28 +0000
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
Message-ID: <4B056DF4.2030502@gmail.com>

Hi Roopa,

I think that the -Organism parameter that you specify for 
Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to 
it in the documentation:
http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm

You have the correct approach in your code - limiting the search to the 
Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. 
If you uncomment the line (and add a semicolon afterwards), the program 
runs correctly, but no hits are reported below your threshold e-value. 
If you change the value of $e_val to 10 then some T.brucei hits are 
reported.

Roy.

Roopa Raghuveer wrote:
> Hello everybody,
> 
> I have a problem. I would like to use remote BLAST to find sequences
> matching an input sequence.
> 
> For example, I would like to find sequences that match a Trypanosoma brucei
> sequence.
> 
> I want the output to contain only Trypanosoma brucei sequences matching my
> query. When I tried to run RemoteBlast against the nr database, I got sequences
> from different organisms, such as E. coli, Pseudomonas, etc.
> 
> Could you please tell me how this can be solved?
> 
> My code is as follows.
> 
> use Bio::Tools::Run::RemoteBlast;
>   use strict;
>   my $prog = 'blastn';
>   my $db   = 'nr';
>   my $e_val= '1e-10';
>  my $organism= 'Trypanosoma Brucei';
> 
>   my @params = ( '-prog' => $prog,
>          '-data' => $db,
>          '-expect' => $e_val,
>          '-readmethod' => 'SearchIO',
>          '-Organism'   => $organism );
> 
>   my $factory = Bio::Tools::Run::RemoteBlast->
> new(@params);
> 
>   #change a paramter
>   #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
> brucei[ORGN]'
> 
>   #remove a parameter
>   #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> 
>   my $v = 1;
>   #$v is just to turn on and off the messages
> 
>   my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
> '-organism' => 'Trypanosoma Brucei' );
> 
>   while (my $input = $str->next_seq()){
>     #Blast a sequence against a database:
>    my $r = $factory->submit_blast($input);
>     #my $r = $factory->submit_blast('amino.fa');
> 
>     print STDERR "waiting..." if( $v > 0 );
>     while ( my @rids = $factory->each_rid ) {
>       foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         if( !ref($rc) ) {
>           if( $rc < 0 ) {
>             $factory->remove_rid($rid);
>           }
>           print STDERR "." if ( $v > 0 );
>          sleep 5;
>         }
>      else {
>           my $result = $rc->next_result();
>           #save the output
>           my $filename = $result->query_name()."\.out";
>           $factory->save_output($filename);
>           $factory->remove_rid($rid);
>           print "\nQuery Name: ", $result->query_name(), "\n";
>           while ( my $hit = $result->next_hit ) {
>             next unless ( $v > 0);
>             print "\thit name is ", $hit->name, "\n";
>             while( my $hsp = $hit->next_hsp ) {
>               print "\t\tscore is ", $hsp->score, "\n";
>             }
>           }
>         }
>       }
>     }
>   }
> 
> My input sequence is
> 
> >ref|NC_009512.1|:385-1902
> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
> 
> Please mail me regarding any queries.
> 
> Regards,
> Roopa.


From clements at nescent.org  Thu Nov 19 12:46:32 2009
From: clements at nescent.org (Dave Clements)
Date: Thu, 19 Nov 2009 18:46:32 +0100
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
	<FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
Message-ID: <f135c01c0911190946t7488718brfed76b975f6d2b2@mail.gmail.com>

Hi Xiaoyu,

I would also take a look at GBrowse_syn, a Perl-based solution built with
the GBrowse genome browser framework.

See http://gmod.org/wiki/GBrowse_syn.

Cheers,

Dave C.

On Wed, Nov 18, 2009 at 7:23 PM, Jason Stajich <jason at bioperl.org> wrote:

> try jalview http://www.jalview.org/
>
>
> On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:
>
>> Hi,
>>
>> I'm wondering whether there are any modules that can be used for visualizing
>> multiple sequence alignments, like the result from ClustalW?
>>
>> Thank you very much,
>> Xiaoyu
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>


-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/January_2010_GMOD_Meeting


From maj at fortinbras.us  Thu Nov 19 18:37:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 18:37:05 -0500
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
Message-ID: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>

I like this verbose/strict separability a lot. Should we go for it?
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 10:30 AM
Subject: [Bioperl-l] verbosity and error stack, was accessing EMBL database


> Mark, Dave,
>
> This could be based on verbose().
>
>          Level      w     t     d    st
> verbose   < 0        -     +     -    -/+
> verbose     0        +     +     -    -/+
> verbose     1        +     +     +    +/+
> verbose   > 1        +* -> +     +    +/+
> * converts to throw()
> w = warn
> t = throw
> d = debug
> st = stack trace
>
> warn() is set up that way now, you don't get a stack trace unless verbose() is 
>  > 0.  throw() could be the same; would be a simple fix, really.
>
> My only problem with the current state of things is (I think we've delved down 
> this path before) verbosity level is tied to exception strictness as seen 
> above, and they're really two separate concepts, at least to me.  Verbosity of 
> 1 or more doesn't necessarily mean I want an elevated level of strictness 
> along with it.  For instance, one might want very strict exceptions w/o the 
> noise, or (conversely) lots of debugging output but no warnings.
>
> (aside: another small nit, but I haven't exactly liked that the global level 
> of strictness is designated by a env. variable with DEBUG in the name, but 
> that's just me).
>
> I've been thinking it would be nice to have simple separate verbose/strict 
> switches (this is the way it's implemented in Biome).  This would allow some 
> finer grained control over output:
>
>          Level      d    st
> verbose     0        -    -
> verbose     1        +    +
> Default = BIOPERLDEBUG || 0 # current situation
>
>          Level      w     t
> strict      -1       -     +
> strict      0        +     +
> strict      1        +* -> +
> * converts to throw()
> Default = BIOPERLSTRICT || 0
>
> We could even allow finer-grained control of verbosity (states which cover all 
> combinations) w/o affecting strictness.
>
> chris
>
> On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:
>
>> I'm inclined to agree. Lots of responses to questions here that begin
>> "Well, as the error message said, you need to check...", which means
>> people tend towards "I broke it! Write the list!". I do find it hairy when
>> my errors are way down in the object tree.
>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <bioperl-l at bioperl.org>
>> Sent: Thursday, November 19, 2009 9:04 AM
>> Subject: Re: [Bioperl-l] accessing EMBL database
>>
>>
>>> I would agree that the error message is not really informative.
>>
>> Agreed that it could be better, but I wonder whether part of the problem with 
>> BioPerl error messages is the stack dump.
>>
>> I think a lot of eyes just glaze right over when they see a big wad of 
>> complicated stuff, with colons and slashes and line numbers, spewing out at 
>> them.
>>
>> Perhaps the stack dump should be turned off by default?
>>
>> Wouldn't this:
>>
>> ERROR: EMBL stream with no ID. Not embl in my book
>>
>>
>>
>> Be a lot clearer than this?:
>>
>> MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl
>>
>>
>>
>> Just a thought. This has probably been discussed before.
>> Dave
>>
>>
>>


From michael.watson at bbsrc.ac.uk  Fri Nov 20 05:07:10 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 20 Nov 2009 10:07:10 +0000
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>

Hello

I was just wondering if anyone had had time to look into this?

I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937

Thanks
Mick

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
Sent: 27 October 2009 09:01
To: 'Jason Stajich'
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output

Hi Jason

They both print 0 also.

A bug report it is

Mick

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: 26 October 2009 18:46
To: michael watson (IAH-C)
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output


Is this -m9 -d 0 output or standard default?  I think the strand is  
parsed in the HSP parsing.

Can you double check what $hsp->query->strand and $hsp->hit->strand  
prints?

A full example report filed as a bug will be the next step if that
doesn't resolve it.

-jason
On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:

> Dear all
>
> Where does this go?  Perhaps I am doing something wrong.
>
> Fasta35 output puts the strand in the hit list at the top:
>
> cluster_99033:3                                (  23) [r]  115 37.9   
> 0.0011
> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
> 0.963   27
>
> The [r] stands for reverse and the [f] stands for forward.
>
> There is also the text "rev-comp" after the hit line further down.
>
> However, when I parse fasta35 output using SearchIO and output the  
> strand of the HSP:
>
> print $hsp->strand('hit'), ",";
> print $hsp->strand('query'), "\n";
>
> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
> for "I don't know which strand it's on").
>
> So the information is there, but it's not getting parsed.   
> Alternatively, I've missed something and will feel a bit foolish.
>
> Currently using BioPerl 1.6.0
>
> Thanks
> Mick
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From David.Messina at sbc.su.se  Fri Nov 20 05:15:11 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 11:15:11 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
Message-ID: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>

Chris, I took a look at how you implemented this in Biome -- very nice!


> I like this verbose/strict separability a lot. Should we go for it?

Me too. So yes, I think so.


> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.


Perhaps this is a job for Log::Log4perl or Log::Dispatch?
http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm


That might be overkill, though.

Dave


From roychu at gmail.com  Fri Nov 20 05:21:54 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 02:21:54 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
Message-ID: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>

Hi,

Does anyone use dreamhost as a web hosting service?  I'm just curious
if anyone has had any luck installing the module as their daemon seems
to kill my process whenever I try to install it.  Dreamhost tech
support attributes it to either exceeding the allocated memory cache
or exceeding the processing time.  I tried to nice the process, but
that didn't help for me.  Any luck or experience in resolving this
would be much appreciated.  I suppose my next attempt would be to try
installing it directly and hope I don't need root...

Thanks,
Roy


From s.denaxas at gmail.com  Fri Nov 20 05:27:42 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Fri, 20 Nov 2009 11:27:42 +0100
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <bba689ec0911200227g1a8d717elce0daebf6a96c6aa@mail.gmail.com>

Hello,

Normally you don't need to be root:
http://sial.org/howto/perl/life-with-cpan/non-root/
It's kind of disturbing that their tech support cannot give you a straight
answer on why they are killing the process.

Good luck
Spiros

On Fri, Nov 20, 2009 at 11:21 AM, Chu, Roy <roychu at gmail.com> wrote:

> I suppose my next attempt would be to try
> installing it directly and hope I don't need root...
>
> Thanks,
> Roy
>


From charles-listes+bioperl at plessy.org  Fri Nov 20 05:44:45 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Fri, 20 Nov 2009 19:44:45 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <20091120104445.GG31318@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
> 
> Does anyone use dreamhost as a web hosting service?  I'm just curious
> if anyone has had any luck installing the module as their daemon seems
> to kill my process whenever I try to install it.  Dreamhost tech
> support attributes it to either exceeding the allocated memory cache
> or exceeding the processing time.  I tried to nice the process, but
> that didn't help for me.  Any luck or experience in resolving this
> would be much appreciated.  I suppose my next attempt would be to try
> installing it directly and hope I don't need root...

Dear Roy,

DreamHost uses Debian, so you can suggest that they install the Debian package.
If you are in contact with their tech support, do not hesitate to tell them to
contact me if they are interested in a backport of the 1.6.0 package. For
version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

PS: if you propose installing BioPerl as a feature in the DreamHost panel, I
will vote for it :)

Have a nice day,

--  
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Fri Nov 20 07:51:39 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 06:51:39 -0600
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
	<8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <E9D5435B-07D6-46A9-AA84-C9667FA0CEDE@illinois.edu>

Mick,

Short answer, no.  It was in the queue to be fixed at some point in 1.6.x, but that queue is quite long.  I'm pushing it into the queue specifically for 1.6.2, so it should be addressed soon.
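
In the meantime, the strand can be recovered from the hit-list lines themselves. The following is only a sketch of a workaround, not the eventual SearchIO fix: the helper name strand_from_hit_line and the regular expression are assumptions based on the two sample lines quoted below, and the mapping of [f]/[r] to 1/-1 follows the usual BioPerl strand convention.

```perl
use strict;
use warnings;

# Map the fasta35 hit-list flag to the 1 / -1 strand convention.
my %strand_for = ( f => 1, r => -1 );

# Hypothetical helper: returns (hit name, strand), or nothing if the
# line does not look like a hit-list line with a strand flag.
sub strand_from_hit_line {
    my ($line) = @_;
    # hit name, then "( length )", then the bracketed [f] or [r] flag
    return unless $line =~ /^(\S+)\s+\(\s*\d+\)\s+\[([fr])\]/;
    return ( $1, $strand_for{$2} );
}

# The two hit-list lines from the original report, rejoined onto one line each
my @lines = (
    'cluster_99033:3                                (  23) [r]  115 37.9 0.0011',
    'cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963 0.963   27',
);

for my $line (@lines) {
    my ($name, $strand) = strand_from_hit_line($line);
    print "$name\t$strand\n";
}
```

The hit name returned here can be used to look up the strand for the matching hit while iterating over the SearchIO result.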

chris

On Nov 20, 2009, at 4:07 AM, michael watson (IAH-C) wrote:

> Hello
> 
> I was just wondering if anyone had had time to look into this?
> 
> I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937
> 
> Thanks
> Mick
> 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
> Sent: 27 October 2009 09:01
> To: 'Jason Stajich'
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> Hi Jason
> 
> They both print 0 also.
> 
> A bug report it is
> 
> Mick
> 
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
> Sent: 26 October 2009 18:46
> To: michael watson (IAH-C)
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> 
> Is this -m9 -d 0 output or standard default?  I think the strand is  
> parsed in the HSP parsing.
> 
> Can you double check what $hsp->query->strand and $hsp->hit->strand  
> prints?
> 
> A full example report as a bug request will be next step if that  
> doesn't resolve.
> 
> -jason
> On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:
> 
>> Dear all
>> 
>> Where does this go?  Perhaps I am doing something wrong.
>> 
>> Fasta35 output puts the strand in the hit list at the top:
>> 
>> cluster_99033:3                                (  23) [r]  115 37.9   
>> 0.0011
>> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
>> 0.963   27
>> 
>> The [r] stands for reverse and the [f] stands for forward.
>> 
>> There is also the text "rev-comp" after the hit line further down.
>> 
>> However, when I parse fasta35 output using SearchIO and output the  
>> strand of the HSP:
>> 
>> print $hsp->strand('hit'), ",";
>> print $hsp->strand('query'), "\n";
>> 
>> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
>> for "I don't know which strand it's on").
>> 
>> So the information is there, but it's not getting parsed.   
>> Alternatively, I've missed something and will feel a bit foolish.
>> 
>> Currently using BioPerl 1.6.0
>> 
>> Thanks
>> Mick
>> 
>> 
>> 
> 
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> 
> 


From cjfields at illinois.edu  Fri Nov 20 08:00:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 07:00:45 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <20091120104445.GG31318@kunpuu.plessy.org>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
Message-ID: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>


On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:

> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>> 
>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>> if anyone has had any luck installing the module as their daemon seems
>> to kill my process whenever I try to install it.  Dreamhost tech
>> support attributes it to either exceeding the allocated memory cache
>> or exceeding the processing time.  I tried to nice the process, but
>> that didn't help for me.  Any luck or experience in resolving this
>> would be much appreciated.  I suppose my next attempt would be to try
>> installing it directly and hope I don't need root...
> 
> Dear Roy,
> 
> DreamHost uses Debian, so you can suggest that they install the Debian package.
> If you are in contact with their tech support, do not hesitate to tell them to
> contact me if they are interested in a backport of the 1.6.0 package. For
> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

Any reason why this is so?  We specify compatibility back to 5.6.1.

Alex mentioned the reliance on a specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.

A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.

> PS: if you propose installing BioPerl as a feature in the DreamHost panel, I
> will vote for it :)
> 
> Have a nice day,
> 
> --  
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan

chris


From rtbio.2009 at gmail.com  Fri Nov 20 10:52:09 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Fri, 20 Nov 2009 16:52:09 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
Message-ID: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>

Hello everybody,

I have tried to use remote BLAST on Trypanosoma brucei sequences and got
some hits, but I am unable to retrieve the complete sequences in which the
hits were found. That is, I am unable to parse the BLAST output file to get
the complete sequences of the hits. Here is my code.

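#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Usage: <script> <blast_report> <min_fraction_identical>
die "Usage: $0 <blast_report> <min_fraction_identical>\n" unless @ARGV == 2;

my $blast_report = Bio::SearchIO->new('-format' => 'blast',
                                      '-file'   => $ARGV[0]);
my $result = $blast_report->next_result;
my $level  = $ARGV[1];

my (@hit_names, @hit_strings, @name_fields);

while ( my $hit = $result->next_hit ) {
    print $hit->name, "\n";
    push @hit_names, $hit->name;
    while ( my $hsp = $hit->next_hsp ) {
        # hit_string is only the aligned part of the hit for this HSP,
        # not the complete sequence of the hit
        if ( $hsp->frac_identical >= $level ) {
            push @hit_strings, $hsp->hit_string;
        }
    }
}

# Split each hit name on the literal pipe character. The pipe has to be
# escaped: an unescaped | is regex alternation, and split then breaks
# the name into single characters.
foreach my $name (@hit_names) {
    push @name_fields, split /\|/, $name;
}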

Here, I am trying to use the BLAST output file to get the complete sequence
in which I found a hit, but I could not get the complete sequence.
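
The split on the hit name can be checked without BioPerl. A small sketch, assuming hit names of the form seen in the report excerpt below (ref|XM_822286.1|): an unescaped | in the pattern is regex alternation between two empty patterns, so it matches between every character, whereas \| matches a literal pipe.

```perl
use strict;
use warnings;

# Hit name in the form that appears in the report excerpt
my $name = 'ref|XM_822286.1|';

my @chars  = split /|/,  $name;   # unescaped: splits between every character
my @fields = split /\|/, $name;   # escaped: splits on the literal pipe

print scalar(@chars), " pieces without the backslash\n";
print "database:  $fields[0]\n";
print "accession: $fields[1]\n";
```

The accession obtained this way is the identifier one would use to fetch the full record from a sequence database, since the BLAST report itself only contains the aligned regions.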

Input (excerpt from the BLAST report):

             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  661   TTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCC
720

Query  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780

Query  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840

Query  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900

Query  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960

Query  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005
             |||||||||||||||||||||||||||||||||||||||||||||
Sbjct  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005

>ref|XM_822286.1| Trypanosoma brucei TREU927 protein kinase A catalytic subunit
isoform 2 (Tb09.211.2360) partial mRNA
Length=1011

 Score = 1622 bits (1798),  Expect = 0.0
 Identities = 944/974 (96%), Gaps = 0/974 (0%)
 Strand=Plus/Plus

Query  32    TGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
91
             |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  38    TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
97

Query  92    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
151
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  98    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
157

Query  152   ATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGA
211
             |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Sbjct  158   ATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGA
217

Query  212   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
271
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  218   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
277

Query  272   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
331
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  278   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
337

Query  332   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
391
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  338   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
397

Query  392   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
451
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  398   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
457

Query  452   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
511
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  458   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
517

Query  512   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
571
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  518   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
577

Query  572   TAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGT
631
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

It follows like this.

The output I got is
ATGACGACAACTCCCACTGGTGATGGCCAACTGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCCAATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCTCCTCCACTAACCCCTTCGCAACAGG
TTGCATTCCGTGGTTTTTAG

TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGTTCAAATTCCCCAATTGGTTTGACTCCCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATCACGCTCCCATTCCTGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGGGATAAGCGGTTGCCCCCGTTAGCACCATCACAACAATTGGAGTTCCGTGGGTTTTAG
GGATGATGACCGATTGTACCTCCTCCTCGAGTATGTGGTGGGTGGCGAGCTGT

TCTCCCACCTCCGGAAGGCGGGAAAATTCCCTAATGATGTAGCCAAGTTCTACTCCGCAGAAGTGGTTTTGGCGTTTGAATATATTCATGAGTGCGGCATCGTATACCGTGACTTGAAGCCAGAAAATGTGCTTTTGGACAAGCAGGGAAACATTAAGATTACGGACTTTGGGTTCGCGAAACGCGTTAGGGACAGAACGTACACGCTATGTGGGACTCCAGAGTATCTTGCGCCGGAGATAATCCAAAGTAAAGGTCACGATCGGGCTGTGGATTGGTGGACACTCGGAATTCTTCTCTATGAGATGCTTGTCGGTTATCCTCCTTTTTTCGACGAGAGTCCTTTTAGAACATACGAAAAAATTTTAGAGGGGAAACTTCAGTTTCCAAAGTGGGTGGAGATGCGGGCGAAGGACCTCATAAAGAGTTTTTTAACAATTGAACCAACGAAACG

i.e., it only gives the regions where it found the best alignments,
i.e., the best hits.

I want the complete sequence i.e., sequences corresponding to the accession
numbers
XM_822292.1
XM_822286.1
XM_822694.1

The database used in the remote BLAST was RefSeq (refseq_rna), and the
organism was Trypanosoma brucei.

Can anyone please help me solve this problem?

Regards,
Roopa.
On Fri, Nov 20, 2009 at 12:30 PM, Roopa Raghuveer <rtbio.2009 at gmail.com> wrote:

>
> Hello Roy,
>
> Thanks a lot for your reply. My code is working for my sequence now.
>
> Thanks a lot.
>
> Regards,
> Roopa.
>
> On Thu, Nov 19, 2009 at 5:10 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>
>> Hi Roopa,
>>
>> I think that the -Organism parameter that you specify for
>> Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to it
>> in the documentation:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm
>>
>> You have the correct approach in your code - limiting the search to the
>> Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. If
>> you uncomment the line (and add a semicolon afterwards), the program runs
>> correctly, but no hits are reported below your threshold e-value. If you
>> change the value of $e_val to 10 then some T.brucei hits are reported.
>>
>> Roy.
>>
>> Roopa Raghuveer wrote:
>>
>>> Hello everybody,
>>>
>>> I have a problem. I would like to use remote blast to find sequences
>>> matching for an input sequence.
>>>
>>> Ex:-I would like to search sequences which match Trypanosoma Brucei
>>> sequence.
>>>
>>> I want the output to be only Trypanosoma Brucei sequences matching with
>>> my
>>> query.When i tried to use remoteblast to nr database,I got sequences from
>>> different organisms like E.coli,Pseudomonas etc.,
>>>
>>> Could you please tell me how can this be solved...?
>>>
>>> My code is as follows.
>>>
>>> use Bio::Tools::Run::RemoteBlast;
>>>  use strict;
>>>  my $prog = 'blastn';
>>>  my $db   = 'nr';
>>>  my $e_val= '1e-10';
>>>  my $organism= 'Trypanosoma Brucei';
>>>
>>>  my @params = ( '-prog' => $prog,
>>>         '-data' => $db,
>>>         '-expect' => $e_val,
>>>         '-readmethod' => 'SearchIO',
>>>         '-Organism'   => $organism );
>>>
>>>  my $factory = Bio::Tools::Run::RemoteBlast->
>>> new(@params);
>>>
>>>  #change a paramter
>>>  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
>>> brucei[ORGN]'
>>>
>>>  #remove a parameter
>>>  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>
>>>  my $v = 1;
>>>  #$v is just to turn on and off the messages
>>>
>>>  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
>>> '-organism' => 'Trypanosoma Brucei' );
>>>
>>>  while (my $input = $str->next_seq()){
>>>    #Blast a sequence against a database:
>>>   my $r = $factory->submit_blast($input);
>>>    #my $r = $factory->submit_blast('amino.fa');
>>>
>>>    print STDERR "waiting..." if( $v > 0 );
>>>    while ( my @rids = $factory->each_rid ) {
>>>      foreach my $rid ( @rids ) {
>>>        my $rc = $factory->retrieve_blast($rid);
>>>        if( !ref($rc) ) {
>>>          if( $rc < 0 ) {
>>>            $factory->remove_rid($rid);
>>>          }
>>>          print STDERR "." if ( $v > 0 );
>>>         sleep 5;
>>>        }
>>>     else {
>>>          my $result = $rc->next_result();
>>>          #save the output
>>>          my $filename = $result->query_name()."\.out";
>>>          $factory->save_output($filename);
>>>          $factory->remove_rid($rid);
>>>          print "\nQuery Name: ", $result->query_name(), "\n";
>>>          while ( my $hit = $result->next_hit ) {
>>>            next unless ( $v > 0);
>>>            print "\thit name is ", $hit->name, "\n";
>>>            while( my $hsp = $hit->next_hsp ) {
>>>              print "\t\tscore is ", $hsp->score, "\n";
>>>            }
>>>          }
>>>        }
>>>      }
>>>    }
>>>  }
>>>
>>> My input sequence is
>>>
>>>  ref|NC_009512.1|:385-1902
>>>>
>>> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
>>> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
>>> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
>>> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
>>> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
>>> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
>>> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
>>> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
>>> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
>>> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
>>> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
>>> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
>>> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
>>> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
>>> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
>>> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
>>> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
>>> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
>>> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
>>> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
>>> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
>>> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
>>>
>>> Please mail me regarding any queries.
>>>
>>> Regards,
>>> Roopa.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>


From mauricio at open-bio.org  Fri Nov 20 11:15:22 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Fri, 20 Nov 2009 10:15:22 -0600
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
	<7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
Message-ID: <4B06C09A.8060708@open-bio.org>

All OBF wikis and blogs have been upgraded and cleaned from the hack. 
Thanks for the heads up!

Mauricio.

Mark A. Jensen wrote:
> Andrew-- thanks!! We're on it.
> MAJ
> ----- Original Message ----- From: "Andrew Grimm" 
> <andrew.j.grimm at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, November 18, 2009 9:52 PM
> Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
> 
> 
>> Caution: read the whole email before visiting the bioperl wiki
>>
>> I was doing some bioinformatics-related searching using google, and
>> one of the hits was to the bio dot perl dot org wiki (the FAQ in
>> particular).
>>
>> When I did that, I was redirected to a ferdax dot com web site (a
>> typo-squatting of fedex?).
>>
>> Some people reckon that ferdax hacks web sites and redirects google
>> hits from the victim web site to their own web site. For example, this
>> thread at google's webmaster central
>> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all 
>>
>> (it's talking about zencart, but presumably they've since found other
>> victims)
>>
>> Just going to the website without using google may not trigger the 
>> redirect.
>>
>> Apologies if this is a false alarm, but I don't think it is.
>>
>> I won't be in contact between Friday and Monday Australian time (I'll
>> be at railscamp 6 in Melbourne), so I won't be able to answer any
>> replies.
>>
>> Thanks,
>>
>> Andrew Grimm
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From David.Messina at sbc.su.se  Fri Nov 20 11:39:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 17:39:53 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
	<c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
Message-ID: <7ECF627D-3DBF-4575-89CF-FA6348C88E8E@sbc.su.se>

Hi Roopa,

As far as I know, a BLAST report never contains the complete sequences of the hits. If it includes any part of the hit's sequence, it will be the part that matches the query.

You'll have to use the hit's ID or accession to get its complete sequence from somewhere else. You can use Bio::DB::GenBank to do that, for example.

See
http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
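
A minimal sketch of that approach, assuming BioPerl is installed and NCBI is reachable from your machine. The helper name fetch_full_seqs and the command-line handling are made up for illustration; get_Seq_by_version is the Bio::DB::GenBank method that takes an accession with its version suffix, such as XM_822292.1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fetch each accession.version through a database handle and return the
# sequence objects that were found. $db is expected to provide
# get_Seq_by_version(), as Bio::DB::GenBank does.
# (fetch_full_seqs is a hypothetical helper name, not part of BioPerl.)
sub fetch_full_seqs {
    my ($db, @accessions) = @_;
    my @seqs;
    for my $acc (@accessions) {
        # BioPerl throws on retrieval errors, so trap them per accession
        my $seq = eval { $db->get_Seq_by_version($acc) };
        if ($seq) {
            push @seqs, $seq;
        }
        else {
            warn "Could not retrieve $acc", ($@ ? ": $@" : ''), "\n";
        }
    }
    return @seqs;
}

# Run only when accessions are given on the command line, e.g.
#   perl fetch_hits.pl XM_822292.1 XM_822286.1 XM_822694.1
if (@ARGV) {
    # Loaded here so the helper above can be tested without network access
    require Bio::DB::GenBank;
    require Bio::SeqIO;

    my $db  = Bio::DB::GenBank->new;
    my $out = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'fasta');
    $out->write_seq($_) for fetch_full_seqs($db, @ARGV);
}
```

In a RemoteBlast script the accessions would come from the hits themselves (for instance $hit->accession) rather than from the command line; whether that value carries the version suffix depends on the report, so check before passing it on.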


Dave


From alessandra.bilardi at gmail.com  Fri Nov 20 12:44:18 2009
From: alessandra.bilardi at gmail.com (Alessandra)
Date: Fri, 20 Nov 2009 18:44:18 +0100
Subject: [Bioperl-l] Bio::DB::EUtilities question
Message-ID: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>

Hi all,

I'm testing Bio::DB::EUtilities, the web agent which interacts with and
retrieves data from NCBI's eUtils. My Perl script works, but only if I
call the get_Response function fewer than ~450 times; otherwise I
get this error message:

------------- EXCEPTION -------------
MSG: Response Error
Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
STACK Bio::DB::GenericWebAgent::get_Response
/usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
STACK toplevel ./wget4gbk.pl:77
-------------------------------------

wget4gbk.pl lines 76-77 are:
my $req = Bio::DB::EUtilities->new(-db => 'genome', -eutil =>
'esummary', -retmode => $mode, -rettype => $type, -id => $id);
my $entry = $req->get_Response;

I ran the Perl script more than ten times, and the error occurs at a random
point somewhere in the range of 300-600 requests. If I use another system to
request the data, I can make ~10000 requests without errors. Do I have to
set particular parameters on the EUtilities object?

Can you help me with this random exception?

Best,

-- 
 Alessandra Bilardi, Ph. D.
----
 CRIBI, University of Padova, Italy
 http://www.linkedin.com/in/bilardi
----


From maj at fortinbras.us  Fri Nov 20 13:42:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 13:42:38 -0500
Subject: [Bioperl-l] gravatars on the wiki
Message-ID: <94431678F3764E8C9A49EA4D2FCD0DBD@NewLife>

Hi all, 
You can now reveal your Gravatar (http://www.gravatar.com) on the wiki, by including 
the following markup on the page:

 <winterPreWiki>
 {{#gravatar|youremail -at- yourplace -dot- tld}}
 </winterPreWiki>

You can do the antispam measure above, or use a regular email. Invalid emails throw an error.
http://bioperl.org/wiki/Gravatars 
Happy coding, 
MAJ


From roychu at gmail.com  Fri Nov 20 15:23:21 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 12:23:21 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>

"sounds very much like you process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  Regarding asking their admins to
install, it seems to be a "they'll try to get to it but don't hold your
breath" situation.

Hmmm, I tried some other attempts, installing 1.4.0 posed no problems.
 I'm not a perl guru, so I tried to increase the build cache size from
the default, 10 MB, hoping that that may be the problem--can't imagine
how though, since I can't imagine how big the whole package version
can differ by (though honestly, I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem I guess after
the 'make' step and lists the
modules--BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
--and typically gets killed here '> Killed'

Next, I tried 1.6.0, then I get this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
Something EQUALLY appreciated would be a recommendation of a decent
enough hosting service where someone has had success installing
Bio-Perl.  I'd try to set up my Mac web sharing feature and then try
to setup the stuff locally, but I haven't yet been able to
successfully get the port forwarding feature working properly on the
Apple AirPort Extreme--perplexing.  Next, I might just try to install
via the Build.PL script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  A heads-up about any false
assumptions on my part would be greatly appreciated.

Thanks in advance,
Roy

On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>
>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>
>>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>>> if anyone has had any luck installing the module as their daemon seems
>>> to kill my process whenever I try to install it.  Dreamhost tech
>>> support attributes it to either exceeding the allocated memory cache
>>> or exceeding the processing time.  I tried to nice the process, but
>>> that didn't help for me.  Any luck or experience in resolving this
>>> would be much appreciated.  I suppose my next attempt would be to try
>>
>> Dear Roy,
>>
>> DreamHost uses Debian, so you can suggest them to install the Debian package.
>> If you are in contact with the tech service, do not hesitate to tell them to
>> contact me if they are interested by a backport of the 1.6.0 package. For
>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>
> Any reason why this is so?  We specify compatibility back to 5.6.1.
>
> Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.
>
> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.
>
>> PS: if you propose to install BioPerl as a feature in the Dreamhost panel, I
>> will vote for it :)
>>
>> Have a nice day,
>>
>> --
>> Charles Plessy
>> Debian Med packaging team,
>> http://www.debian.org/devel/debian-med
>> Tsurumi, Kanagawa, Japan
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at illinois.edu  Fri Nov 20 15:40:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 14:40:24 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <1D1B0987-3309-4281-BCE0-2737E4F0D0B1@illinois.edu>

BioPerl is pure perl.  If you believe all dependencies are installed, just unpack the dist to a specific directory and point PERL5LIB at it (for bash):

export PERL5LIB=/home/USER/bioperl/bioperl-live

Note that if you plan on doing the same for other bioperl-related modules (ex: bioperl-db) you'll need to add 'lib' to it, as they use a generic Module::Build now.

export PERL5LIB=/home/USER/bioperl/bioperl-db/lib

You can also add a 'use lib' directive in your scripts as well.  More at the following link:

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#USING_MODULES_NOT_INSTALLED_IN_THE_STANDARD_LOCATION
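
As a rough sketch of the 'use lib' variant, assuming the dist was unpacked to the same placeholder path as in the PERL5LIB example above (adjust USER and the directory to your own setup). The check at the end is only an illustration of how to confirm that the modules are actually found:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder unpack location; replace with wherever the dist was unpacked.
use lib '/home/USER/bioperl/bioperl-live';

# Confirm that BioPerl can be found before relying on it.
my $found = eval { require Bio::Seq; 1 };
print $found
    ? "Bio::Seq loaded from $INC{'Bio/Seq.pm'}\n"
    : "Bio::Seq not found; check the path given to 'use lib'\n";
```

Unlike PERL5LIB, this only affects the script that contains the directive, which can be handy on a shared host where you cannot change the environment of the web server.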

chris

On Nov 20, 2009, at 2:23 PM, Chu, Roy wrote:

> "sounds very much like you process was killed for prolonged execution
> time, or memory usage. We have a daemon in place that monitors for
> processes that take up too much of a shared web server's resources, and
> this may have kicked in (and often does when trying to install packages
> on a shared server)."
> 
> This was the explanation they had.  Regarding asking their admins to
> install, it seems is a "they'll try to get to it but don't hold your
> breath situation."
> 
> Hmmm, I tried some other attempts, installing 1.4.0 posed no problems.
> I'm not a perl guru, so I tried to increase the build cache size from
> the default, 10 MB, hoping that that may be the problem--can't imagine
> how though, since I can't imagine how big the whole package version
> can differ by (though honestly, I haven't checked).
> Whenever I try to install 1.6.1, it runs into a problem I guess after
> the 'make' step and lists the
> modules--BioPerl-1.6.0/t/Variation/SeqDiff.t
> BioPerl-1.6.0/t/Variation/SNP.t
> BioPerl-1.6.0/t/Variation/Variation_IO.t
> --and typically gets killed here '> Killed'
> 
> Next, I tried 1.6.0, then I get this:
> "(I think you ran Build.PL directly, so will use CPAN to install
> prerequisites on demand)
> CPAN: Storable loaded ok (v2.12)
> Going to read '/home/$username/.cpan/Metadata'
> Killed" (everything prior works and it seems to get further along than
> when I try to install 1.6.1)
> 
> Any insight into why this may be happening would be appreciated.
> Something EQUALLY appreciated would be a recommendation of a decent
> enough hosting service where someone has had success installing
> Bio-Perl.  I'd try to set up my Mac web sharing feature and then try
> to setup the stuff locally, but I haven't yet been able to
> successfully get the port forwarding feature working properly on the
> apple airport extreme--perplexing.  Next, I might just try to install
> via the Build.pl script.
> 
> Hmm, checking the wiki, it seems I'll still be able to run remote
> blast and use the basic seq modules, although some discrepancies and
> idiosyncrasies may be expected?  Any head-ups about any false
> assumptions by me would be greatly appreciated.
> 
> Thanks in advance,
> Roy
> 
> On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>> 
>> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>> 
>>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>> 
>>>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>>>> if anyone has had any luck installing the module as their daemon seems
>>>> to kill my process whenever I try to install it.  Dreamhost tech
>>>> support attributes it to either exceeding the allocated memory cache
>>>> or exceeding the processing time.  I tried to nice the process, but
>>>> that didn't help for me.  Any luck or experience in resolving this
>>>> would be much appreciated.  I suppose my next attempt would be to try
>>>> installing it directly and hope I don't need root...
>>> 
>>> Dear Roy,
>>> 
>>> DreamHost uses Debian, so you can suggest them to install the Debian package.
>>> If you are in contact with the tech service, do not hesitate to tell them to
>>> contact me if they are interested by a backport of the 1.6.0 package. For
>>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>> 
>> Any reason why this is so?  We specify compatibility back to 5.6.1.
>> 
>> Alex mentioned the reliance on the specific Extutils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.
>> 
>> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.
>> 
>>> PS: if you propse to install BioPerl as a feature in the Dreamhost panel, I
>>> will vote for it :)
>>> 
>>> Have a nice day,
>>> 
>>> --
>>> Charles Plessy
>>> Debian Med packaging team,
>>> http://www.debian.org/devel/debian-med
>>> Tsurumi, Kanagawa, Japan
>> 
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From charles-listes+bioperl at plessy.org  Fri Nov 20 20:07:23 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Sat, 21 Nov 2009 10:07:23 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <20091121010723.GA7786@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 07:00:45AM -0600, Chris Fields wrote:
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
> > 
> > DreamHost uses Debian, so you can suggest them to install the Debian
> > package.  If you are in contact with the tech service, do not hesitate to
> > tell them to contact me if they are interested by a backport of the 1.6.0
> > package. For version 1.6.1, it may be more difficult as it depends on perl
> > 5.10.1.
> 
> Any reason why this is so?  We specify compatibility back to 5.6.1.

Dear Chris,

you make a good point: although for building we need to either depend on perl
5.10.1 or package ExtUtils::Manifest separately, the resulting bioperl package
does not depend on such a high version. Therefore, there is no need for a
backport, and the latest Debian package can be installed on a Debian stable
(5.0/Lenny) system. I just checked the Dreamhost machine on which I happen to
have access, "waratahs", and it seems to be older, but it may nevertheless be
worth asking the admins (with the big drawback that they would have to
be asked for each update).

Have a nice week-end,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From robert.bradbury at gmail.com  Fri Nov 20 20:40:14 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 20 Nov 2009 20:40:14 -0500
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
Message-ID: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>

I run a Linux system which is in a gradual process of evolution from the
default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
Google's Chromium (IMO, perhaps the best so far).  Chromium allows one to
create a process per tab/URL so one can effectively track what it is doing.
 It also allows one to track the machine usage of these processes (through
the Developer > Task manager [shift-escape keyboard] option) which though
expensive in terms of overhead allows one to track offending windows (in
terms of memory or CPU use).  My processor recently jumped from a typical
700 MHz to 1.4 GHz speed (using the Linux Ondemand scheduler - which saves
~20 W at the wall outlet -- I've measured it) to the full tilt 2.8 GHz the
CPU is capable of.  Looking at the chrome task manager I was not surprised
to find the NY Times high on the list (they are pushing content, esp. using
Javascript) but much to my dismay the Jalview and Howto:Trees:Bioperl
appeared to be high on the list.  Now I am forced to ask myself *why* sites
which are simply distributing static information are eating up CPU on my
machine!  This is a fundamental flaw in the architecture of the sites --
wherein there should be conscious efforts to minimize user-CPU use (or avoid
Javascript entirely).  This would not be a problem if I were using Firefox
as I can easily use NoScript to block Javascript from non-approved sites.
 But it raises the question of when one should allow Javascript to run (one
would "normally" approve academic sites by default) when even the academic
sites are abusing my CPU.  There needs to be much greater awareness both on
the part of software distributors and software consumers that it is *MY* CPU
and *MY* Electricity and *MY* contribution to global warming.  And the
developers/distributors should not be sucking down those resources without
first saying "May I?" and I have the option of saying "No you may not."
 There is enough we can do productively (running low homology blast
searches) without engaging in endless wheel spinning of Javascripts or
looped GIFs.

Robert


From maj at fortinbras.us  Fri Nov 20 23:17:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:17:12 -0500
Subject: [Bioperl-l] ohlohers
Message-ID: <C003FAD20636489DBFB2D34F5955C68D@NewLife>

You can now add your Ohloh widgets and increase your carbon footprint with the less crufty:

 <winterPreWiki>
 {{#ohloh|acct_id|TYPE}}
 </winterPreWiki>

where TYPE is [Detailed|Rank|Tiny]. Taint checks aplenty.
MAJ


From maj at fortinbras.us  Fri Nov 20 23:33:02 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:33:02 -0500
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com><20091120104445.GG31318@kunpuu.plessy.org><ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <9ECC66C2F23F47469AF0F07E3F9307FC@NewLife>

Maybe 'nightmarehost' is more appropriate. I've had no problems on AWS,
but this may not be exactly what you need. MAJ
----- Original Message ----- 
From: "Chu, Roy" <roychu at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Friday, November 20, 2009 3:23 PM
Subject: Re: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN


"sounds very much like you process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  Regarding asking their admins to
install, it seems is a "they'll try to get to it but don't hold your
breath situation."

Hmmm, I tried some other attempts, installing 1.4.0 posed no problems.
 I'm not a perl guru, so I tried to increase the build cache size from
the default, 10 MB, hoping that that may be the problem--can't imagine
how though, since I can't imagine how big the whole package version
can differ by (though honestly, I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem I guess after
the 'make' step and lists the
modules--BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
--and typically gets killed here '> Killed'

Next, I tried 1.6.0, then I get this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
Something EQUALLY appreciated would be a recommendation of a decent
enough hosting service where someone has had success installing
Bio-Perl.  I'd try to set up my Mac web sharing feature and then try
to setup the stuff locally, but I haven't yet been able to
successfully get the port forwarding feature working properly on the
apple airport extreme--perplexing.  Next, I might just try to install
via the Build.pl script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  Any heads-up about any false
assumptions by me would be greatly appreciated.

Thanks in advance,
Roy

On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>
>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>
>>> Does anyone use dreamhost as a web hosting service? I'm just curious
>>> if anyone has had any luck installing the module as their daemon seems
>>> to kill my process whenever I try to install it. Dreamhost tech
>>> support attributes it to either exceeding the allocated memory cache
>>> or exceeding the processing time. I tried to nice the process, but
>>> that didn't help for me. Any luck or experience in resolving this
>>> would be much appreciated. I suppose my next attempt would be to try
>>> installing it directly and hope I don't need root...
>>
>> Dear Roy,
>>
>> DreamHost uses Debian, so you can suggest that they install the Debian package.
>> If you are in contact with the tech service, do not hesitate to tell them to
>> contact me if they are interested in a backport of the 1.6.0 package. For
>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>
> Any reason why this is so? We specify compatibility back to 5.6.1.
>
> Alex mentioned the reliance on the specific ExtUtils::Manifest version. The 
> version requested has an important bug fix, is present on CPAN, and is 
> backwards-compatible to 5.6.1. It should be fairly easy to request that as a 
> separate package.
>
> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless 
> said perl maintainer can enlighten us as to why this is an issue? This one may 
> require a ranty blog post.
>
>> PS: if you propose to install BioPerl as a feature in the Dreamhost panel, I
>> will vote for it :)
>>
>> Have a nice day,
>>
>> --
>> Charles Plessy
>> Debian Med packaging team,
>> http://www.debian.org/devel/debian-med
>> Tsurumi, Kanagawa, Japan
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



From cjfields at illinois.edu  Fri Nov 20 23:38:23 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 22:38:23 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
Message-ID: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>

Robert, 

Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in general) do not use JS, unless there is a specific addition I'm unaware of.  Now, the site wiki was recently 'parasited' for redirects, which may be the culprit, but this is now fixed.  Can you at least retest to see if this persists?

Anyone else know about this?

chris

On Nov 20, 2009, at 7:40 PM, Robert Bradbury wrote:

> I run a Linux system which is in a gradual process of evolution from the
> default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
> Google's Chromium (IMO, perhaps the best so far).  Chromium allows one to
> create a process per tab/URL so one can effectively track what it is doing.
> It also allows one to track the machine usage of these processes (through
> the Developer > Task manager [shift-escape keyboard] option) which though
> expensive in terms of overhead allows one to track offending windows (in
> terms of memory or CPU use).  My processor recently jumped from a typical
> 700 MHz to 1.4 GHz speed (using the Linux Ondemand scheduler - which saves
> ~20 W at the wall outlet -- I've measured it) to the full tilt 2.8 GHz the
> CPU is capable of.  Looking at the chrome task manager I was not surprised
> to find the NY Times high on the list (they are pushing content, esp. using
> Javascript) but much to my dismay the Jalview and Howto:Trees:Bioperl
> appeared to be high on the list.  Now I am forced to ask myself *why* sites
> which are simply distributing static information are eating up CPU on my
> machine!  This is a fundamental flaw in the architecture of the sites --
> wherein there should be conscious efforts to minimize user-CPU use (or avoid
> Javascript entirely).  This would not be a problem if I were using Firefox
> as I can easily use NoScript to block JavaScript from non-approved sites.
> But it raises the question of when one should allow Javascript to run (one
> would "normally" approve academic sites by default) when even the academic
> sites are abusing my CPU.  There needs to be much greater awareness both on
> the part of software distributors and software consumers that it is *MY* CPU
> and *MY* Electricity and *MY* contribution to global warming.  And the
> developers/distributors should not be sucking down those resources without
> first saying "May I?" and I have the option of saying "No you may not."
> There is enough we can do productively (running low homology blast
> searches) without engaging in endless wheel spinning of Javascripts or
> looped GIFs.
> 
> Robert


From sdavis2 at mail.nih.gov  Sat Nov 21 00:11:34 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 20 Nov 2009 21:11:34 -0800
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
Message-ID: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>

On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Robert,
>
> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
> general) do not use JS, unless there is a specific addition I'm unaware of.
>  Now, the site wiki was recently 'parasited' for redirects, which may be the
> culprit, but this is now fixed.  Can you at least retest to see if this
> persists?
>
> Anyone else know about this?
>
>
The page in question does include javascript, it appears from the source.
 This is a function of using mediawiki, though, I believe and not something
specific to that page.

Sean


From cjfields at illinois.edu  Sat Nov 21 00:20:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 23:20:37 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
	<264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
Message-ID: <A7AC3865-3C9A-4C6E-85B5-349240C40680@illinois.edu>

On Nov 20, 2009, at 11:11 PM, Sean Davis wrote:

> On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:
> 
>> Robert,
>> 
>> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
>> general) do not use JS, unless there is a specific addition I'm unaware of.
>> Now, the site wiki was recently 'parasited' for redirects, which may be the
>> culprit, but this is now fixed.  Can you at least retest to see if this
>> persists?
>> 
>> Anyone else know about this?
>> 
>> 
> The page in question does include javascript, it appears from the source.
> This is a function of using mediawiki, though, I believe and not something
> specific to that page.
> 
> Sean

</sound of my hand slapping my forehead>

Sean, thanks for pointing that out.

chris


From robert.bradbury at gmail.com  Sat Nov 21 13:26:05 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Sat, 21 Nov 2009 13:26:05 -0500
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
Message-ID: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>

It sounds like NCBI may be counting frequency of requests, how much data
they send or something similar.  Are you delaying the time between fetches?
 The code I've seen typically sleeps for a few seconds each time around a
loop.  You might try longer delays between fetches and see if that gets you
any more data.

Alternatively perhaps the libraries aren't reusing the TCP/IP connection
properly.  Is there a difference between the amount of memory on the
machines?  Have you watched the size of the process to see if it grows over
time?  I think the bug which prevented me from fetching a not-so-large
genome from a few months ago (eating up 3GB of memory in the process) has
not been resolved.  If so that could be your problem.

Robert

On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
<alessandra.bilardi at gmail.com>wrote:
>
>
> I'm testing Bio::DB::EUtilities, a web agent which interacts with and
> retrieves data from NCBI's eUtils. My perl script works, but only if
> I call the get_Response function fewer than ~450 times; otherwise I
> get this error message:
>
> ------------- EXCEPTION -------------
> MSG: Response Error
> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
> STACK Bio::DB::GenericWebAgent::get_Response
> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
> STACK toplevel ./wget4gbk.pl:77
>


From cjfields at illinois.edu  Sat Nov 21 14:19:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 13:19:24 -0600
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
	<deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
Message-ID: <837CE7E7-E625-4285-AD54-06FD168C0DF3@illinois.edu>

NCBI has specific rules about the repeated queries to its servers:

http://eutils.ncbi.nlm.nih.gov/#UserSystemRequirements

According to that, if you are making over 100 requests at peak times you will run into problems (they'll probably temp-block your IP), even though the limit is much more lenient now (it's 3 requests/second, whereas a year or two ago it was one request every 3 sec).  In general it's best to run something like this during off-hours.

The actual limit on number of server requests is one specific part of Bio::DB::EUtilities that hasn't been added yet, but is tentatively planned.  
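
Until that is built in, the delay can be enforced on the client side. Here is a minimal sketch using only core Perl; make_throttle and the identifiers in the loop are made-up names for illustration, not part of Bio::DB::EUtilities, and the 1/3-second interval is simply the 3 requests/second figure above.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Return a closure that blocks until at least $min_interval seconds
# have passed since the previous call (hypothetical helper).
sub make_throttle {
    my ($min_interval) = @_;
    my $last;    # time of the previous call; undef before the first one
    return sub {
        if (defined $last) {
            my $wait = $min_interval - (time() - $last);
            sleep($wait) if $wait > 0;
        }
        $last = time();
        return;
    };
}

my $throttle = make_throttle(1 / 3);    # at most ~3 calls per second

for my $id (qw(id1 id2 id3)) {          # placeholder identifiers
    $throttle->();
    # The real request would go here, e.g. the get_Response call.
    print "would fetch $id\n";
}
```

A longer interval (or running during off-hours) is the safer choice for jobs with hundreds of requests.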

chris

On Nov 21, 2009, at 12:26 PM, Robert Bradbury wrote:

> It sounds like NCBI may be counting frequency of requests, how much data
> they send or something similar.  Are you delaying the time between fetches?
> The code I've seen typically sleeps for a few seconds each time around a
> loop.  You might try longer delays between fetches and see if that gets you
> any more data.
> 
> Alternatively perhaps the libraries aren't reusing the TCP/IP connection
> properly.  Is there a difference between the amount of memory on the
> machines?  Have you watched the size of the process to see if it grows over
> time?  I think the bug which prevented me from fetching a not-so-large
> genome from a few months ago (eating up 3GB of memory in the process) has
> not been resolved.  If so that could be your problem.
> 
> Robert
> 
> On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
> <alessandra.bilardi at gmail.com>wrote:
>> 
>> 
>> I'm testing Bio::DB::EUtilities, a web agent which interacts with and
>> retrieves data from NCBI's eUtils. My perl script works, but only if
>> I call the get_Response function fewer than ~450 times; otherwise I
>> get this error message:
>> 
>> ------------- EXCEPTION -------------
>> MSG: Response Error
>> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
>> STACK Bio::DB::GenericWebAgent::get_Response
>> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
>> STACK toplevel ./wget4gbk.pl:77
>> 


From cjfields at illinois.edu  Sat Nov 21 21:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 20:58:37 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
Message-ID: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>

Jason and I were recently interviewed (Wednesday!) about BioPerl for FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and Kirsten Sanford.  The interview is now available online, so get your favorite flavor (MP3, podcast) here:

http://twit.tv/floss96

Enjoy!

chris and jason


From adsj at novozymes.com  Sun Nov 22 07:37:40 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Sun, 22 Nov 2009 13:37:40 +0100
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu> (Chris
	Fields's message of "Sat, 21 Nov 2009 20:58:37 -0600")
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
Message-ID: <87aaye91m3.fsf@topper.koldfront.dk>

On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:

> Jason and I were recently interviewed (Wednesday!) about BioPerl for
> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
> Kirsten Sanford.

Great!

How about linking to it on bioperl.org?


  :-),

   Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From cjfields at illinois.edu  Sun Nov 22 15:30:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 22 Nov 2009 14:30:01 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <87aaye91m3.fsf@topper.koldfront.dk>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
	<87aaye91m3.fsf@topper.koldfront.dk>
Message-ID: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
> 
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
> 
> Great!
> 
> How about linking to it on bioperl.org?
> 
> 
>  :-),
> 
>   Adam
> 
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main page.  

Since this is the second such interview (Jason did one a few years back for PerlCast), I'm thinking we need a media page of some sort.

chris


From maj at fortinbras.us  Sun Nov 22 15:48:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 22 Nov 2009 15:48:39 -0500
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu><87aaye91m3.fsf@topper.koldfront.dk>
	<2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
Message-ID: <247658CC6D9A4529B281F4482BD3E4BD@NewLife>

We do have http://www.bioperl.org/wiki/Category:BioPerl_Media --
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Adam Sjøgren" <adsj at novozymes.com>
Cc: <bioperl-l at bioperl.org>
Sent: Sunday, November 22, 2009 3:30 PM
Subject: Re: [Bioperl-l] BioPerl on FLOSS Weekly


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
>
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
>
> Great!
>
> How about linking to it on bioperl.org?
>
>
>  :-),
>
>   Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main 
page.

Since this is the second such interview (Jason did one a few years back for 
PerlCast), I'm thinking we need a media page of some sort.

chris


From jardim.rodrigo at gmail.com  Sun Nov 22 11:06:40 2009
From: jardim.rodrigo at gmail.com (Rodrigo Jardim)
Date: Sun, 22 Nov 2009 14:06:40 -0200
Subject: [Bioperl-l] Problems with Genbank Proteins File
Message-ID: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>

I have been having problems parsing a GenBank protein file. I think this is
because the file has a different order of fields. For example:

In most general genbank files:
========================
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi

In genbank protein files:
===================
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.

When I try to parse it, BioPerl aborts with an error message.

Any ideas?

Thanks all,

-- 
Regards,
Rodrigo Jardim
jardim.rodrigo at gmail.com


From biopython at maubp.freeserve.co.uk  Mon Nov 23 12:36:36 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 23 Nov 2009 17:36:36 +0000
Subject: [Bioperl-l] Problems with Genbank Proteins File
In-Reply-To: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
References: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
Message-ID: <320fb6e00911230936ofb9d897rbd45abb73a361250@mail.gmail.com>

On Sun, Nov 22, 2009 at 4:06 PM, Rodrigo Jardim
<jardim.rodrigo at gmail.com> wrote:
> I have been having problems parsing a GenBank protein file. I think this is
> because the file has a different order of fields. For example:
>
> ...
>
> When I try to parse it, BioPerl aborts with an error message.
>
> Any ideas?

There are some important bits of information missing - what is the error
message, and what version of BioPerl are you using?
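
In the meantime, the field-order idea can be checked without BioPerl at all. The following is only a rough sketch: keyword_order is a made-up helper, and it assumes top-level keywords start in column 1 while continuation lines are indented. It prints the keyword order for the protein record quoted in the original post (with the wrapped LOCUS line rejoined).

```perl
use strict;
use warnings;

# Return the top-level GenBank keywords (LOCUS, DEFINITION, ...) in the
# order they appear in $text. Indented continuation lines are skipped.
sub keyword_order {
    my ($text) = @_;
    my @order;
    for my $line (split /\n/, $text) {
        push @order, $1 if $line =~ /^([A-Z]+)\b/;
    }
    return @order;
}

my $record = <<'END';
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.
END

# prints: LOCUS DEFINITION ACCESSION VERSION DBSOURCE KEYWORDS SOURCE
print join(' ', keyword_order($record)), "\n";
```

Running the same helper over a record that parses cleanly gives a quick side-by-side comparison to include with the error message.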

Peter


From maj at fortinbras.us  Mon Nov 23 12:58:46 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 23 Nov 2009 12:58:46 -0500
Subject: [Bioperl-l] building samtools/Bio::DB::Sam on cygwin
Message-ID: <FD03906C0D074E1B8AFDB89A283E9FAB@NewLife>

Hi All--

I've had some hard-won success installing samtools and Lincoln's
Bio::DB::Sam under cygwin; thought some on the list would be able to
use my notes. (Yes, Jason, I'm working on Bio::Tools::Run::BWA...)


(To get the current samtools, ping
http://sourceforge.net/projects/samtools/files/samtools/0.1.7/samtools-0.1.7a.tar.bz2/download
)

* Getting samtools to make from scratch in cygwin

The following diff details the changes to the samtools Makefile I made
by hand. The key points are

-D_WIN32

and the additional variable LFLAGS and its interpolations. To get the
linker to see

libgcc libstdc++

I needed to add symlinks from /lib to the correct files in
/lib/gcc/i386-pc-cygwin/4.3.2/. Your gcc version may differ.


--- ../old/samtools-0.1.7a/Makefile 2009-11-16 10:13:43.000000000 -0500
+++ Makefile 2009-11-23 12:14:18.529000000 -0500
@@ -1,16 +1,18 @@
 CC=   gcc
 CFLAGS=  -g -Wall -O2 #-m64 #-arch ppc
-DFLAGS=  -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1
+LFLAGS=         -lws2_32 -lgcc -lcygwin -lbz2 -lz -lstdc++
+DFLAGS=  -D_WIN32 -D_FILE_OFFSET_BITS=64 -D_CURSES_LIB=1
 LOBJS=  bgzf.o kstring.o bam_aux.o bam.o bam_import.o sam.o bam_index.o \
    bam_pileup.o bam_lpileup.o bam_md.o glf.o razf.o faidx.o knetfile.o \
    bam_sort.o sam_header.o
 AOBJS=  bam_tview.o bam_maqcns.o bam_plcmd.o sam_view.o \
    bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o \
    bamtk.o kaln.o

@@ -36,13 +38,13 @@
   $(AR) -cru $@ $(LOBJS)
 
 samtools:lib $(AOBJS)
-  $(CC) $(CFLAGS) -o $@ $(AOBJS) -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam
+  $(CC) $(CFLAGS) -o $@ $(AOBJS) -Xlinker --enable-auto-import -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam $(LFLAGS)
 
 razip:razip.o razf.o knetfile.o
-  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz
+  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz -lm -lws2_32
 
 bgzip:bgzip.o bgzf.o
-  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz
+  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz -lm -lws2_32
 
 razip.o:razf.h
 bam.o:bam.h razf.h bam_endian.h kstring.h sam_header.h

* Getting Bio::DB::Sam to compile and install

Bio::DB::Sam requires not the samtools.exe, but the bam library
created during the samtools build, as well as all the samtools header
files. Create a symlink in /lib to libbam.a in the build directory (or
copy libbam.a up to /lib), and create symlinks or copy *.h into
/usr/include. Then in cygwin bash shell

$ cpan
cpan> install Bio::DB::Sam

should fly. 

Hope someone finds this useful. These mods led me to a successful
Bio::DB::Sam install--have not yet checked original code based on
Bio::DB::Sam. If they don't work for you, reply to the list.

cheers, 
MAJ 


From jcline at ieee.org  Mon Nov 23 14:13:26 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 23 Nov 2009 13:13:26 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
References: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
Message-ID: <4B0ADED6.8040901@ieee.org>

Dreamhost has terrible reliability.  I have stats going back years on a
standard dreamhost hosting account (non-dedicated server), and on some
days the web server doesn't respond.  Dreamhost service is OK for a
hobby blog; however, it is definitely *not* suitable for anything real.
Add in latency, arbitrary account limits/restrictions,  etc, and as a
hosting service, it is a bad idea to host a project there.   Although
some users apparently get lucky with server allocation and end up on a
"good server", the provider can change this at any time as well.  I
think more typically, the account users don't notice, since most are
simple bloggers.

Here's a data snip that illustrates the problem with a typical dreamhost
account:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2008-08-05     91.40     0.000     0.528     0.528     2.257     1.619
2008-08-04     89.13     0.002     0.301     0.301     1.302     0.971
2008-08-03     94.62     0.000     0.567     0.567     1.506     0.913
2008-08-02    100.00     0.000     0.335     0.335     1.475     1.079
2008-08-01    100.00     0.000     0.310     0.310     1.587     0.825
2008-07-31     93.55     0.023     0.386     0.386     1.280     0.759
2008-07-30    100.00     0.000     0.345     0.345     1.373     0.860
2008-07-29    100.00     0.000     0.358     0.358     1.335     0.757
2008-07-28    100.00     0.000     0.327     0.327     1.462     0.896
2008-07-27    100.00     0.000     0.292     0.292     1.410     0.966
2008-07-26    100.00     0.000     0.283     0.283     1.280     0.815
2008-07-25    100.00     0.000     0.297     0.297     1.231     0.853
2008-07-24    100.00     0.000     0.362     0.362     1.258     0.699
2008-07-23    100.00     0.000     0.339     0.339     1.270     0.785

----------------------------------------------------------------------
minimum        89.13     0.000     0.283     0.283     1.231     0.699
maximum       100.00     0.023     0.567     0.567     2.257     1.619
average        97.76     0.002     0.359     0.359     1.430     0.914
----------------------------------------------------------------------


Or this month:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2009-11-11    100.00     0.011     0.097     0.097     1.260     1.638
2009-11-10    100.00     0.008     0.094     0.094     1.285     1.647
2009-11-09    100.00     0.008     0.094     0.094     1.494     1.872
2009-11-08    100.00     0.015     0.101     0.101     1.509     1.894
2009-11-07    100.00     0.006     0.092     0.092     1.453     1.831
2009-11-06    100.00     0.011     0.097     0.097     1.500     1.882
2009-11-05     97.80     0.012     0.097     0.097     1.445     1.806
2009-11-04    100.00     0.010     0.096     0.096     1.235     1.605
2009-11-03     95.65     0.007     0.093     0.093     1.266     1.612
2009-11-02    100.00     0.010     0.096     0.096     1.267     1.637
2009-11-01    100.00     0.007     0.093     0.093     1.311     1.692
2009-10-31    100.00     0.009     0.095     0.095     1.225     1.594
2009-10-30    100.00     0.009     0.095     0.095     1.364     1.739
2009-10-29    100.00     0.017     0.103     0.103     1.121     1.505

----------------------------------------------------------------------
minimum        95.65     0.006     0.092     0.092     1.121     1.505
maximum       100.00     0.017     0.103     0.103     1.509     1.894
average        99.53     0.010     0.096     0.096     1.338     1.711
----------------------------------------------------------------------


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From cjfields at illinois.edu  Mon Nov 23 22:19:02 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 23 Nov 2009 21:19:02 -0600
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
Message-ID: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>

Okay, so I think it's feasible to add this into trunk.  I like the idea of optionally having a log class, if someone comes up with a nice way of adding it in I would be for it.

chris

On Nov 20, 2009, at 4:15 AM, Dave Messina wrote:

> Chris, I took a look at how you implemented this in Biome -- very nice!
> 
> 
>> I like this verbose/strict separability a lot. Should we go for it?
> 
> Me too. So yes, I think so.
> 
> 
>> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
> 
> 
> Perhaps this is a job for Log::Log4Perl or Log::Dispatch?
> http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
> http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm
> 
> 
> That might be overkill, though.
> 
> Dave
> 


From David.Messina at sbc.su.se  Tue Nov 24 11:18:22 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 24 Nov 2009 17:18:22 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
	<167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
Message-ID: <3FD2086D-062F-4706-9DC8-2A53224C4913@sbc.su.se>

> I like the idea of optionally having a log class, if someone comes up with a nice way of adding it in I would be for it.

My suggestion of the logging modules was actually to handle the various levels of verbose output -- I think both of the ones I mentioned "log" to STDERR by default.

But of course a nice side effect of using such a logging module is that it would allow optional logging to a file, too.
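
To make the idea concrete, here is a bare-bones sketch of level-based output in core Perl. It is not how Log::Log4perl or Log::Dispatch work internally, and make_logger and the level names are invented for the example; it only shows verbosity being set independently of anything else, with STDERR as the default destination and an optional filehandle for logging to a file.

```perl
use strict;
use warnings;

# Hypothetical verbosity levels, lowest number = most important.
my %LEVEL = (error => 0, warn => 1, info => 2, debug => 3);

# Build a logger that prints messages at or below $threshold to $fh
# (STDERR unless a filehandle is supplied). Returns 1 if printed, else 0.
sub make_logger {
    my ($threshold, $fh) = @_;
    die "unknown level '$threshold'" unless exists $LEVEL{$threshold};
    $fh ||= \*STDERR;
    return sub {
        my ($level, $msg) = @_;
        return 0 unless exists $LEVEL{$level};
        return 0 if $LEVEL{$level} > $LEVEL{$threshold};
        print {$fh} "[$level] $msg\n";
        return 1;
    };
}

my $log = make_logger('info');
$log->('warn',  'sequence has no accession');    # printed to STDERR
$log->('debug', 'parser state dump');            # suppressed at 'info'
```

Passing a filehandle opened for append as the second argument gives the optional file logging.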

Dave


From paolo.pavan at gmail.com  Tue Nov 24 14:28:09 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Tue, 24 Nov 2009 20:28:09 +0100
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
Message-ID: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>

Dear all,
I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
As documented in the pod, the run(@seqs) method returns the cap3 report file,
while I expected it to return a Bio::Assembly object, consistent with other
Bio::Tools::Run classes.
However, I worked around this by getting from the factory object the location
and the names of the temp output files (admittedly by accessing a private
property) and reading them via the Assembly::IO system.
I was just wondering what the properly designed way to do this job is.

Thank you for lighting the way!
Paolo


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:04:31 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:04:31 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>

Is there any way to pass a filename to Bio::DB::Fasta for the location of where to write the directory.index?
It's writing in the same dir as the fasta but I'd rather have it write in /tmp as it's part of a web app.

Thanx,

Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:21:52 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:21:52 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>

That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended up putting the filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
> 
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
> 
> 
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >


From maj at fortinbras.us  Tue Nov 24 17:18:51 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 17:18:51 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
Message-ID: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>

The code (method index_dir() ) seems to expect all the fasta files to be 
contained in that directory. Looks hairy; what about creating symlinks to your 
fasta files in a /tmp subdir and calling new() with that subdir?
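A minimal sketch of the symlink idea, in case it helps. The helper name link_fasta_into_tmp and the directory template are made up for illustration, and the assumption that the index ends up next to the links comes from the reading of index_dir() above, not from testing against every Bio::DB::Fasta version.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper: symlink each FASTA file into a fresh scratch
# directory under the system temp dir and return that directory.
sub link_fasta_into_tmp {
    my @fasta_files = @_;
    my $dir = tempdir( 'fasta_index_XXXXXX', TMPDIR => 1, CLEANUP => 0 );
    for my $file (@fasta_files) {
        my $link = File::Spec->catfile( $dir, basename($file) );
        symlink( File::Spec->rel2abs($file), $link )
            or die "Cannot symlink $file to $link: $!";
    }
    return $dir;
}

# Only runs when file names are given on the command line.
if (@ARGV) {
    my $dir = link_fasta_into_tmp(@ARGV);
    print "Linked ", scalar(@ARGV), " file(s) into $dir\n";
    # Assumption: Bio::DB::Fasta->new($dir) would then index the links
    # and write its index file inside $dir rather than beside the originals.
}
```

Calling it as "perl link_fasta.pl /path/to/test.faa" (script name hypothetical) prints the scratch directory, which can then be handed to new().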
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'bioperl-l'" <bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:04 PM
Subject: [Bioperl-l] Bio::DB::Fasta


> Is there any way to pass a filename to Bio::DB::Fasta for the location of 
> where to write the directory.index?
> It's writing in the same dir as the fasta but I'd rather have it write in /tmp 
> as it's part of a web app.
>
> Thanx,
>
> Russell
>


From florent.angly at gmail.com  Tue Nov 24 17:54:48 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 24 Nov 2009 14:54:48 -0800
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
In-Reply-To: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
References: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
Message-ID: <4B0C6438.8070405@gmail.com>

Hi Paolo,

It turns out that there is no standard for what is to be passed to the 
Bio::Tools::Run wrappers and returned by them. I noticed the 
inconsistency between the assembly wrappers recently while implementing 
support for a new wrapper. I implemented initial support for additional de 
novo assembly programs in BioPerl (454 Newbler and Minimo) a couple of 
weeks ago, and Mark Jensen added support for Maq, a program that 
assembles reads against a reference. In the process, all the assembly 
wrappers were changed to take the same type of input data (a FASTA 
sequence or an array reference of sequence objects) and return one of 
the following:
    * a Bio::Assembly::Scaffold object (the default), or
    * a Bio::Assembly::IO object, or
    * the name of a file for the output of the assembler
Use the out_type method to set up which output you want, e.g.:
    $factory->out_type('Bio::Assembly::IO');
or
    $factory->out_type('cap3_results.ace');
You'll have to use the code in the bioperl-run Subversion repository if you
want to use these new features.

Cheers,

Florent


Paolo Pavan wrote:
> Dear,
> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
> As documented in the pod, the run(@seqs) method returns the cap3 report file
> while I expect to return a Bio::Assembly object, consistently with other
> Bio::Tools::Run classes.
> However, I went around this by getting from the factory object the location
> and the names of the temp output files (actually accessing a private
> property, although) and reading them via the Assembly::IO system.
> I was just wandering what is the proper designed way to do this job.
>
> Thank you for enlighten the way!
> Paolo


From roychu at gmail.com  Tue Nov 24 18:00:58 2009
From: roychu at gmail.com (Roy)
Date: Tue, 24 Nov 2009 15:00:58 -0800
Subject: [Bioperl-l] Remote Blast - same script but different results
Message-ID: <4d7f3e450911241500y7df305acq1d03819ea1ec7d3e@mail.gmail.com>

Hi bioperl community,

I've tried searching the old lists to see if this topic has been
covered, and perhaps this question arises from my own lack of
familiarity with BLAST, but with the Perl script listed below I get
different results from remote BLAST on different runs (that is, I
either get hits or no hits at all).  I'll call the script once
and get no hits, then call it again (with the same parameters) and
get the same several hits that I got on an earlier run.  I use a
subroutine to parse the blast report information, and a boolean to
indicate whether results were returned or not.  Any insight into
what I may have missed would be appreciated.  Short question: is
this behavior typical?  My understanding of how BLAST works is that
it shouldn't be...


Thanks in advance,
Roy

#!/usr/bin/perl -w

use strict;
use warnings;
use Carp;
use Bio::Perl;
use CGI;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;
use Bio::Restriction::Analysis;
use Bio::Tools::Run::RemoteBlast;

use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::LocatableSeq;

my $five_seqobj = Bio::Seq->new(
		-seq		=>	'ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGCCAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCGAGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG',
		-display_id	=>	'genomic_a',
		-alphabet 	=>	'dna',
	);
my $three_seqobj = Bio::Seq->new(
		-seq		=>	'GTGAGTGCGCGGCCGCTCTGCGGGCGCAGAGGGAGCGGGAGGGAGCCGGCGGCACGAGGTTGGCCGGGGCAGCCTGGGCCTAGGCCAGAGGGAGGGCAGCCACAGGGTCCAGGGCGAGTGGGGGGATTGGACCAGCTGGCGGCCCCTGCAGGCTCAGGATGGGGGGCGCGGGATGGAGGGGCTGAGGAGGGGGTCTCCGGAGCCTGCCTC',
		-display_id	=>	'genomic_b',
		-alphabet 	=>	'dna',
	);

my @params = (
'-program' => 'blastn',
'-database' => 'refseq_genomic',
'-expect' => '10',
'-readmethod' => 'blastxml'
);
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$Bio::Tools::Run::RemoteBlast::HEADER{'PERC_IDENT'} = 75;
$Bio::Tools::Run::RemoteBlast::HEADER{'FORMAT_TYPE'} = 'XML';
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'HITLIST_SIZE'} = 100; # Put: limit number of hits

my $factory_a = Bio::Tools::Run::RemoteBlast->new(@params);
$factory_a->retrieve_parameter('FORMAT_TYPE', 'XML');

my $hits_a;
my $hits_b;

my $r;
my $bool_hit;
print "Submitting BLAST query - 5' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
# the 5' query object is declared above as $five_seqobj
$r = $factory_a->submit_blast($five_seqobj);
# list context, so both the flag and the hit table are captured
($bool_hit, $hits_a) = fetch_blast_report($factory_a);
unless ($bool_hit) {
	print "\nNo hits\n";
	print "Re-submitting BLAST query - 5' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_a->submit_blast($five_seqobj);
	($bool_hit, $hits_a) = fetch_blast_report($factory_a);
	if ($bool_hit == 0) { print "No hits\n"; }
	sleep 5;
}

my $factory_b = Bio::Tools::Run::RemoteBlast->new(@params);
print "\n--------------------------------------------------\n\n";
print "Submitting BLAST query - 3' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
# the 3' query object is declared above as $three_seqobj
$r = $factory_b->submit_blast($three_seqobj);
# list context, so both the flag and the hit table are captured
($bool_hit, $hits_b) = fetch_blast_report($factory_b);
unless ($bool_hit) {
	print " No hits\n";
	print "Re-submitting BLAST query - 3' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_b->submit_blast($three_seqobj);
	($bool_hit, $hits_b) = fetch_blast_report($factory_b);
	if ($bool_hit == 0) { print " No hits\n"; }
	sleep 5;
}

print "\nbye\n\n";

print "$hits_a\n$hits_b\n";

exit;

sub fetch_blast_report {
	my ($factory) = @_;
	my $v = 1;
	my $bool_hit = 0;
	my $hits = '';
	
	print STDERR "waiting...";
	while (my @rids = $factory->each_rid) {
		foreach my $rid (@rids) {
			print STDERR ".";
			my $rc = $factory->retrieve_blast($rid);
			# retrieves blast report from remote blast queue,
			# returns -1 on error, 0 on 'job not finished', Bio::SearchIO object
			# args, remote blast id (rid)
			if (!ref($rc)) {
				# ref EXPR returns a non-empty string only if EXPR is a reference,
				# so $rc is a plain status code here
				if ($rc < 0) {
					$factory->remove_rid($rid);
				}
				print STDERR "." if ($v > 0);
				# is this printing out as multiple dots? when and why?
				sleep 5;
			} else {
				$bool_hit = 1;
				my $result = $rc->next_result();
				unless ($result->num_hits > 0) {
					$bool_hit = 0;
				}
				# returns: Bio::Search::Result::ResultI object
				$factory->remove_rid($rid);
				print "\ndatabase:\t", $result->database_name,"\n";
				print "query name:\t", $result->query_name,"\n";
				print "query length\t", $result->query_length,"\n";
				print "num hits\t", $result->num_hits,"\n";
				if ($result->num_hits) {
					# $result->hits returns an array of hits
					# $results->no_hits_found, boolean vs $#{@hits} ie. filtering\
					while (my $hit = $result->next_hit) {
					
					print "\nhit name:\t", $hit->name,"\n";	
					print "description:\t", $hit->description,"\n";	
					print "locus:\t", $hit->locus,"\n";	
					print "algorithm: ", $hit->algorithm,"\thit length: ", $hit->length,"\thit ranking: ", $hit->rank,"\n";
					while (my $hsp = $hit->next_hsp) {
						print "evalue: ", $hsp->evalue,"\tscore: ", $hsp->score,"\tpercent_id: ", $hsp->percent_identity,"\n";
						print "query_start: ", $hsp->query->start,"\tquery_end: ", $hsp->query->end;
						print "\tquery_length: ", $hsp->query->length,"\tquery_strand: ", $hsp->strand('query'), "\n";
						print "subject_start: ", $hsp->subject->start,"\tsubject_end: ", $hsp->subject->end;
						print "\tsubject_length: ", $hsp->subject->length,"\tsubject_strand: ", $hsp->strand('subject'), "\n\n";
						my $aln = $hsp->get_aln;
						if ($aln->is_flush) {
							foreach my $seq ($aln->each_seq) {
								print $seq->seq,"\n";
							}
							print $aln->gap_line, "\n";
							print $aln->consensus_string(95), "\n\n";
						}

						$hits .= $hit->name."\t".$hsp->subject->start."\t".$hsp->subject->end."\t".$hsp->strand('subject')."\n";
					}
				}		
			}
		}
	}
	return ($bool_hit, $hits);
}
}
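
For reference, a minimal offline sketch of the polling convention described in the comments above (a negative value on error, 0 while the job is not finished, an object once the report is ready). MockFactory and poll_until_done are made-up names, and the mock only imitates the three calls used in the script; the point of the sketch is that the function returns after the RID queue is empty rather than after a single pass over it.

```perl
use strict;
use warnings;

# Mock standing in for a remote BLAST factory (hypothetical, runs offline):
# each RID answers "not finished" (0) a fixed number of times, then a report.
package MockFactory;
sub new {
    my ( $class, %pending ) = @_;
    return bless { pending => {%pending} }, $class;
}
sub each_rid   { my $self = shift; return sort keys %{ $self->{pending} } }
sub remove_rid { my ( $self, $rid ) = @_; delete $self->{pending}{$rid}; return }
sub retrieve_blast {
    my ( $self, $rid ) = @_;
    return 0 if $self->{pending}{$rid}-- > 0;    # job still running
    return { rid => $rid };                      # stand-in for a report object
}

package main;

# Keep polling until every RID has left the queue, then return the reports.
sub poll_until_done {
    my ( $factory, $wait_seconds ) = @_;
    my @reports;
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                $factory->remove_rid($rid) if $rc < 0;    # error: drop this RID
                sleep $wait_seconds if $wait_seconds;      # pending: wait, retry
                next;
            }
            push @reports, $rc;
            $factory->remove_rid($rid);
        }
    }
    return @reports;    # reached only once the queue is empty
}

my @done = poll_until_done( MockFactory->new( RID1 => 2, RID2 => 0 ), 0 );
print scalar(@done), " report(s) retrieved\n";
```

With a real factory the wait would be a few seconds rather than 0, as in the script above.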


From maj at fortinbras.us  Tue Nov 24 23:12:13 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 23:12:13 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
Message-ID: <3ECFA0236D1B467181EE63C8C6BE7E1F@NewLife>

I seem to be able to do
$db = Bio::DB::Fasta->new("$tmp/test.faa");
without a problem- something in the mixing of named and unnamed parameters?
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'bioperl-l'" 
<bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:21 PM
Subject: RE: [Bioperl-l] Bio::DB::Fasta


That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended putting the 
filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
>
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
>
>
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell
> >


From maj at fortinbras.us  Wed Nov 25 12:25:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 12:25:30 -0500
Subject: [Bioperl-l] question for all regarding a sam-based Bio::Assembly::IO
Message-ID: <1E72D5B0A190448FA27545DB5B68638D@NewLife>

Short-readers, 

I'm working on an Assembly::IO class for sam alignments.
I'm currently making a decision about handling multiple reference sequences:
would you prefer that next_assembly() return an assembly that covers all reference
sequences, or that next_assembly iterates over each reference sequence?
(Or both?)

thanks for your input-
MAJ


From timbourine81 at gmail.com  Wed Nov 25 12:40:52 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:40:52 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
	new file
Message-ID: <4B0D6C24.2080308@gmail.com>

Dear bioperl users,

I am a real newbie and have a (maybe very trivial) question.

I searched the mailing list archive and many howtos but I have not found
a concrete answer to my problem. So hopefully you can help me :)

Background: I use the latest Bioperl version (installed it two weeks
ago).
When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
containing different sequences, I get a BLAST output with many queries,
each having several hits / sbjcts.

My problem is how to parse *all* hits of *one* query into a single new
file, and to do this for all the queries I have in my BLAST output file.

Or is it better the other way round: first make fasta files with only
a single sequence inside and BLAST each file? But how can I automate that
using Bioperl?

I tried Bio::SearchIO but can only parse all queries and their
respective hits into one single file...
I think iteration is also necessary here, but I do not really know how
to include that in Bio::SearchIO.
Or do I have to use the module Bio::Index::Blast?

I can index a file (see below), but I have no idea what comes next...

###How I index a file...

#!/usr/bin/perl -w

$ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";

use Bio::Index::Fasta;


$file_name = "8_to_BLAST_two_seq_index.fasta";
$id = "48882";
$inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
-write_flag => 1);
$inx->make_index($file_name);


Hopefully, you can give me at least hints what to look for.

A big THANKS in advance!

Cheers,

Tim


From timbourine81 at gmail.com  Wed Nov 25 12:53:34 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:53:34 +0100
Subject: [Bioperl-l] How to parse different (fasta) files
Message-ID: <4B0D6F1E.8@gmail.com>

Hey everybody,

another question from me...if you do not mind :)

My situation is like this: I have parsed a standalone BLAST output using
SearchIO, keeping only the hit names. Now I have a second fasta file with
the same sequences as in the BLAST database, but including an alignment
(meaning "." and "-"). (Unfortunately, there is no way to make a BLAST
database from fasta files that include the alignment.)
My intention is now to take the name of the hit sequences (BLAST output)
and to get the corresponding aligned sequences (fasta file incl.
alignment) and putting it in a new file.
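
Roughly, the lookup described above might be sketched like this in plain Perl. The toy aligned FASTA text and the hit names are invented for illustration, and read_fasta is a hypothetical helper, not a BioPerl function; real input would come from the aligned fasta file and the parsed BLAST report.

```perl
use strict;
use warnings;

# Read FASTA records from a filehandle into a hash: id => sequence.
# Alignment characters ('.' and '-') are kept untouched.
sub read_fasta {
    my ($fh) = @_;
    my ( %seq_for, $id );
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^>(\S+)/ ) { $id = $1; $seq_for{$id} = ''; }
        elsif ( defined $id )     { $seq_for{$id} .= $line; }
    }
    return \%seq_for;
}

# Toy aligned FASTA and toy hit names (both made up for illustration).
my $aligned   = ">seq1 desc\nAC-G..T\nAA--\n>seq2\n..ACGT-\n>seq3\nTT-TT..\n";
my @hit_names = qw(seq3 seq1 missing);

open my $in, '<', \$aligned or die "Cannot open in-memory FASTA: $!";
my $seq_for = read_fasta($in);
close $in;

# Collect the aligned sequence for every hit name, in hit order.
my $selected = '';
for my $name (@hit_names) {
    if ( exists $seq_for->{$name} ) {
        $selected .= ">$name\n$seq_for->{$name}\n";
    }
    else {
        warn "No aligned sequence for $name\n";
    }
}
print $selected;
```

The ids here are matched on the first word after ">", so the hit names from the BLAST report would need to use the same identifiers as the aligned file.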

Is anybody out there who has tried that before?

Again, I am an absolute greenhorn in using (Bio)perl. Maybe it is very
simple :D

Looking forward to your answers.

All the best,

Tim
-- 
Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From maj at fortinbras.us  Wed Nov 25 13:20:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 13:20:03 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
	innew file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>

hey Tim--

Sounds like you need to go about collecting your queries inside out:

my %hits_by_query;
# repeat for each $result handed back by your SearchIO parser
for my $hit ($result->hits) {
  push @{$hits_by_query{$result->query_name}}, $hit;
}

I believe now each hash element, keyed by the query name, will contain
an arrayref to the set of hits associated with that query.
From here, I believe

use Bio::Search::Result::BlastResult;
use Bio::SearchIO;

foreach my $qid ( keys %hits_by_query ) {
  my $result = Bio::Search::Result::BlastResult->new();
  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
  $blio->write_result($result);
}

will do what you want.
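
The grouping idea on its own, as a small sketch with plain hashes standing in for result and hit objects. The field names "query" and "hit" and the output file names are assumptions for illustration, not BioPerl API.

```perl
use strict;
use warnings;

# Stand-in data: one record per (query, hit) pair, as might be collected
# while walking a parsed report. Field names are made up for illustration.
my @pairs = (
    { query => 'seqA', hit => 'gi|111' },
    { query => 'seqB', hit => 'gi|222' },
    { query => 'seqA', hit => 'gi|333' },
);

# Hash of array refs: query name => [ hit names ].
my %hits_by_query;
push @{ $hits_by_query{ $_->{query} } }, $_->{hit} for @pairs;

# One output file per query, named after the query.
for my $qid ( sort keys %hits_by_query ) {
    my $file = "$qid.hits.txt";
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$_\n" for @{ $hits_by_query{$qid} };
    close $out or die "Cannot close $file: $!";
}
```

Note the ">" mode in open: without it the file is opened for reading, which is the same trap as with the -file argument to a SearchIO writer.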

hope this helps -
Mark

----- Original Message ----- 
From: "Tim" <timbourine81 at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 25, 2009 12:40 PM
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query innew 
file


> Dear bioperl users,
>
> I am a real newbie and have - maybe a very trivial - question.
>
> I searched the mailing list archive and many howtos but I have not found
> a concrete answer to my problem. So hopefully you can help me :)
>
> Background: I use the latest Bioperl version (installed it two weeks
> before).
> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> including different sequences, I get a BLAST output with many queries
> each having several hits / sbjcts.
>
> My problem is how to parse *all* hits of *one* query into a single new
> file. And this for all the queries I have in my BLAST output file.
>
> Or is it better the other way round; first to make fasta files with only
> single sequences inside and BLAST each file? But how can I automize that
> using Bioperl?
>
> I tried Bio::SearchIO but can only parse all queries and their
> respective hits in only one file...
> I think iteration is also necessary here, but I do not really know how
> to include that into Bio::SearchIO.
> Or do I have to use Module:Bio::Index::Blast?
>
> I can index a file (see below), but I have no idea what comes next...
>
> ###How I index a file...
>
> #!/usr/bin/perl -w
>
> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
>
> use Bio::Index::Fasta;
>
>
> $file_name = "8_to_BLAST_two_seq_index.fasta";
> $id = "48882";
> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> -write_flag => 1);
> $inx->make_index($file_name);
>
>
> Hopefully, you can give me at least hints what to look for.
>
> A big THANKS in advance!
>
> Cheers,
>
> Tim


From Russell.Smithies at agresearch.co.nz  Wed Nov 25 14:07:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 26 Nov 2009 08:07:26 +1300
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
 in new file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085701@exchsth.agresearch.co.nz>

Hi Tim,
Here's some code for a job I'm working on at the moment that contains all the bits you'll probably need.
It's extracting 2 species-specific databases from nr (based on tax ids), doing a blast, then parsing the results and creating a substitution matrix. I was initially using Bio::DB::EUtilities to query and retrieve sequences but I kept getting errors and time-outs from NCBI when pulling back large numbers of sequences.
It should give you a rough idea of how to use Bio::Tools::Run::StandAloneBlast, Bio::DB::Fasta and Bio::SearchIO.

Email me directly if you want further explanation as it's not well commented ;-)

Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz

=======================================

#!/usr/local/bin/perl

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::DB::Fasta;

use Storable;

# Parameters: <query> <subject> <number or percentage of searches>
# Percentage can be specified as either 20p, 20P or 20%
# So for 20% of rice sequences blasted against oil palm:
#    4530 51953 20p   (4530=rice,51953=oil_palm, 20p=20%)
# Or for 20 searches:
#      4530 51953 20
#
my ( $q, $s, $c ) = @ARGV;

my $nr = "/data/databases/flatfile/illuminati_blastdata/nr";
my $tax_file = "/data/anonftp/pub/mirror/taxonomy/gi_taxid_prot.dmp.gz";
my $tmp = "/tmp/tax";


my %stats      = ();
my $total_subs = 0;

my $min_hsp_len      = 0;
my $min_hsp_identity = 0;
my $num_searches     = $c || 10;
my $blast_e          = '1e-6';
my $count            = 0;

# check if all the fasta and blast files exist
# if not, extract new fasta and re-formatdb the database
foreach my $t ( $q, $s ) {
  foreach ( map { "$tmp/$t.$_" } qw(faa list phr pin psq) ) {
    unless ( -e $_ ) {
      print "Creating database for $t\n";
      &create_database($t);
      last;
    }
  }
}

my @params = (
               -database => "$tmp/$q",
               -program  => 'blastp',
               -e        => $blast_e,
               -outfile  => "$tmp/blast.out",
               -v        => '1',
               -b        => '1'
);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params) or die $!;

# load the query sequences into a db
# makes it easier to randomly access them
my $db = Bio::DB::Fasta->new( "$tmp", -glob => "$s.faa", -reindex => 1 );

my @ids      = $db->ids;
my $id_count = scalar @ids;    # number of ids, not the last index
die "No sequences\n" unless $id_count;

# if a percentage is requested, calculate
# the required number of searches
if ( $num_searches =~ m/(\d+)[pP%]/ ) {
  $num_searches = int( ( $1 / 100 ) * $id_count );
  warn
"Searching random $1 percent ($num_searches) of $id_count sequences from taxid $q\n";
}

my $summary_file = "$tmp/".$$."_summary.txt";
open( OUT, ">", $summary_file ) or die $!;
print OUT
"#Summary of $num_searches random blast searches from taxid $q against taxid $s.\n";
print OUT "#Parameters used were:\n";
print OUT "#blast_e: $blast_e\n";
print OUT "#min_hsp_len: $min_hsp_len\n";
print OUT "#min_hsp_identity: $min_hsp_identity\n";
print OUT "\n";

# rand(@ids) so that the last id can be picked as well
while ( my $seq = $db->get_Seq_by_id( $ids[ rand @ids ] ) ) {
  next unless $seq;

  warn "Processing ", $seq->id, "\n";
  eval {
    my $blast_report = $factory->blastall($seq);
    sleep 5;
  };

  my $blast_in = new Bio::SearchIO( -format => "blast", -file => "$tmp/blast.out" );

  while ( my $result = $blast_in->next_result ) {
    if ( $result->num_hits <= 0 ) {
      warn "No hits for ", $result->query_accession, "\n";
      print OUT "No hits for ", $result->query_accession, "\n";
      next;
    }
    $count++;
    while ( my $hit = $result->next_hit ) {
      while ( my $hsp = $hit->next_hsp ) {
        warn sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );
        print OUT sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );

        # http://www.bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods
        if ( $hsp->length('total') > $min_hsp_len ) {
          if ( $hsp->percent_identity >= $min_hsp_identity ) {
            my @query_string = split '', $hsp->query_string;
            my @homol_string = split '', $hsp->homology_string;
            my @hit_string   = split '', $hsp->hit_string;
            for ( my $i = 0; $i <= $#query_string; $i++ ) {
              next unless $homol_string[$i] =~ /\+/;
              $stats{ $query_string[$i] }{ $hit_string[$i] }++;
              $total_subs++;
            }
          }
        }
      }
    }
  }
  unlink "$tmp/blast.out" if -e "$tmp/blast.out";    # double quotes so $tmp interpolates
  last if $count >= $num_searches;
}


# create summary frequency list
my %summary = ();
for my $query ( keys %stats ) {
  for my $hit ( keys %{ $stats{$query} } ) {
    $summary{"$query->$hit"} =
      sprintf( "%6f", $stats{$query}{$hit} / $total_subs );
  }
}

print OUT "\n";

# sort by descending frequencies and print to summary file
foreach my $k ( sort { $summary{$b} <=> $summary{$a} } keys %summary ) {
  print OUT "$k\t", $summary{$k}, "\n" unless $k =~ /TOTAL/;
}

print OUT "\n\n";

# print substitution matrix
my $i     = 0;
my @prots = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $sep   = "\t";

print OUT sprintf( "%7s %s", $_, $sep ) foreach ( "       ", @prots );
print OUT "\n";

foreach my $x (@prots) {
  print OUT sprintf( "%7s|%s", $prots[ $i++ ], $sep );
  foreach my $y (@prots) {
    my $val =
      defined( $stats{$x}{$y} )
      ? sprintf( "%0.6f", $stats{$x}{$y} / $total_subs )
      : "--------";
    print OUT sprintf( "%s%s", $val, $sep );
  }
  print OUT "\n";
}
close OUT;


open(IN, $summary_file) or die $!;
print $_ while(<IN>);
close IN;


# extract sequences from nr database based on taxid.
sub create_database {
  my $txid      = shift;
  my %hash      = ();
  my $gi_stored = "/tmp/gi.dat";

  if ( -e $gi_stored ) {
    %hash = %{ retrieve($gi_stored) };
  }
  else {
    open( TXID, "zcat $tax_file | " ) or die $!;
    while (<TXID>) {
      chomp;
      my ( $gi, $tx ) = split( "\t", $_ );
      push( @{ $hash{$tx} }, $gi );
    }
    close TXID;

    store( \%hash, $gi_stored );
  }

  my $txlist = "$tmp/$txid.list";
  my $txseq  = "$tmp/$txid.faa";
	
	# defined(@array) is not allowed in current Perl; test for a non-empty list instead
	die "No sequences found for taxid $txid\n" unless $hash{$txid} && @{ $hash{$txid} };
	my $num_seqs =  scalar( @{ $hash{$txid} });
	warn "Found $num_seqs sequences for taxid $txid in $tax_file\n";

  open OUT, ">", $txlist or die $!;
  print OUT "$_\n" foreach ( @{ $hash{$txid} } );
  close OUT;

  my $cmd = "fastacmd -d $nr -i $txlist -t T -o $txseq 2>/dev/null";
  system $cmd;

  my $count = `grep -c '>' $txseq`;
  $count =~ s/\n//;
	warn "Could only extract $count sequences from $nr\n";

  $cmd = "formatdb -p T -i $tmp/$txid.faa -n $tmp/$txid -l $tmp/formatdb.log";
  system $cmd;

  $cmd = "fastacmd -d $tmp/$txid -I";
  system $cmd;

  warn "Check the formatdb.log for any errors\n";
}


=======================================
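
The counting step in the middle of the script, pulled out as a small stand-alone sketch. The function name count_conservative_subs and the toy alignment strings are made up; in the script the three strings come from $hsp->query_string, $hsp->homology_string and $hsp->hit_string.

```perl
use strict;
use warnings;

# Count query->hit residue pairs at positions the homology line marks '+'.
sub count_conservative_subs {
    my ( $query, $homology, $hit ) = @_;
    my %count;
    my $total = 0;
    for my $i ( 0 .. length($homology) - 1 ) {
        next unless substr( $homology, $i, 1 ) eq '+';
        $count{ substr( $query, $i, 1 ) }{ substr( $hit, $i, 1 ) }++;
        $total++;
    }
    return ( \%count, $total );
}

# Toy alignment strings, invented for illustration.
my ( $count, $total ) = count_conservative_subs( 'MKVLAE', 'MK+L+E', 'MKILSE' );

# Print each substitution with its share of all counted substitutions.
for my $q ( sort keys %$count ) {
    for my $h ( sort keys %{ $count->{$q} } ) {
        printf "%s->%s\t%.6f\n", $q, $h, $count->{$q}{$h} / $total;
    }
}
```

Here the two '+' positions give one V->I and one A->S substitution, each with a frequency of 0.5.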


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Tim
> Sent: Thursday, 26 November 2009 6:41 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
> new file
> 
> Dear bioperl users,
> 
> I am a real newbie and have - maybe a very trivial - question.
> 
> I searched the mailing list archive and many howtos but I have not found
> a concrete answer to my problem. So hopefully you can help me :)
> 
> Background: I use the latest Bioperl version (installed it two weeks
> before).
> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> including different sequences, I get a BLAST output with many queries
> each having several hits / sbjcts.
> 
> My problem is how to parse *all* hits of *one* query into a single new
> file. And this for all the queries I have in my BLAST output file.
> 
> Or is it better the other way round; first to make fasta files with only
> single sequences inside and BLAST each file? But how can I automize that
> using Bioperl?
> 
> I tried Bio::SearchIO but can only parse all queries and their
> respective hits in only one file...
> I think iteration is also necessary here, but I do not really know how
> to include that into Bio::SearchIO.
> Or do I have to use Module:Bio::Index::Blast?
> 
> I can index a file (see below), but I have no idea what comes next...
> 
> ###How I index a file...
> 
> #!/usr/bin/perl -w
> 
> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> 
> use Bio::Index::Fasta;
> 
> 
> $file_name = "8_to_BLAST_two_seq_index.fasta";
> $id = "48882";
> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> -write_flag => 1);
> $inx->make_index($file_name);
> 
> 
> Hopefully, you can give me at least hints what to look for.
> 
> A big THANKS in advance!
> 
> Cheers,
> 
> Tim
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Wed Nov 25 14:21:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 14:21:27 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
Message-ID: <815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>

whoops: change the following line:
my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );

to

my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );

(I always forget that...)
MAJ

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 25, 2009 1:20 PM
Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew 
file


> hey Tim--
>
> Sounds like you need to go about collecting your queries inside out:
>
> my %hits_by_query;
> for ($result->hits) {
>  push @{$hits_by_query{$hit->name}} $hit;
> }
>
> I believe now each hash element, keyed by the query name, will contain
> an arrayref to the set of hits assoc with that query.
> From here, I believe
>
> use Bio::Search::Result::BlastResult;
> use Bio::SearchIO;
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> will do what you want.
>
> hope this helps -
> Mark
>


From alden.huang at gmail.com  Thu Nov 26 05:54:30 2009
From: alden.huang at gmail.com (Alden Huang)
Date: Thu, 26 Nov 2009 02:54:30 -0800
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
Message-ID: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>

Hey rob,

Sorting Intolerant from Tolerant
http://sift.jcvi.org/

~alden

...a bit late, I know; I just read your post now while cleaning the inbox

On Fri, Nov 6, 2009 at 9:35 AM, Robert Bradbury
<robert.bradbury at gmail.com> wrote:
> Is there a function in the library (or has someone written one) that can
> take a genbank entry and determine which mutations are harmful?
>
> It would be used to produce a table summary of:
>  GENE          # SNP      # BadSNP
>
> One kind of gets this from NCBI: if you look up a gene name in the "GENE" db
> and then go to the "GeneView" on the dbSNP page, it has the information I want,
> but largely in a graphical format, while I simply want numbers I can dump
> into a spreadsheet.
>
> I don't think it would be hard: fetch the gene, run through the features for
> the SNP database, figure out whether they are good or bad SNPs, accumulate
> the statistics and dump it. I think the functions available are flexible
> enough to do it but I can't believe nobody has already done it. It could be
> a bit more complex in that one could do an analysis to see if the mutations
> are in a conserved domain or mutations that code for Cysteine or Methionine
> (or other potentially "critical" amino acids) but since "critical" is in the
> eye of the beholder there would have to be some kind of callback to a
> scoring function.
>
> Thanks,
> Robert


From robert.bradbury at gmail.com  Thu Nov 26 06:27:50 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 06:27:50 -0500
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
	<9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
Message-ID: <deaa866a0911260327j5b57d16erfcbe5b996e1a6e64@mail.gmail.com>

On Thu, Nov 26, 2009 at 5:54 AM, Alden Huang <alden.huang at gmail.com> wrote:
>
> Sorting Intolerant from Tolerant
> http://sift.jcvi.org/
>
>
Ah yes, thank you very much.  This looks very much like a tool that can be
adapted for various uses.

Robert


From jason at bioperl.org  Thu Nov 26 12:16:17 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Nov 2009 09:16:17 -0800
Subject: [Bioperl-l] question about a Bio::Tree::Tree method
In-Reply-To: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
References: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
Message-ID: <14F4B8C9-A1F4-436B-813F-50E139932D3D@bioperl.org>

Emilio - please ask your questions on the list - many people there can  
help answer questions.

get_nodes returns all the nodes in the tree, the options specify the  
order they are returned in.  Depending on your question the order  
probably won't matter so you can just call it without any arguments  
like in the examples and the HOWTO.

The documentation for the method says:
  Title   : get_nodes
  Usage   : my @nodes = $tree->get_nodes()
  Function: Return list of Bio::Tree::NodeI objects
  Returns : array of Bio::Tree::NodeI objects
  Args    : (named values) hash with one value
            order => 'b|breadth' first order or
                     'd|depth' first order

So you can provide no arguments and get the default (breadth-first I  
believe) or you can specify
-order => 'd'
or
-order => 'depth'

to get the nodes in depth-first order.
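
To illustrate the difference between the two orders, here is a minimal sketch in plain Perl that does not use BioPerl at all. The toy tree (a hash of child lists) and the names %children, breadth_first and depth_first are made up for the illustration; how Bio::Tree::Tree orders the children of a node internally may differ.

```perl
use strict;
use warnings;

# Toy tree: each key is a node, each value lists its children.
my %children = (
    root => [ 'A', 'B' ],
    A    => [ 'A1', 'A2' ],
    B    => [ 'B1' ],
);

# Breadth-first: visit all nodes at one level before going deeper.
sub breadth_first {
    my ($start) = @_;
    my @queue = ($start);
    my @order;
    while (@queue) {
        my $node = shift @queue;
        push @order, $node;
        push @queue, @{ $children{$node} || [] };
    }
    return @order;
}

# Depth-first (pre-order): follow each branch to its tips first.
sub depth_first {
    my ($node) = @_;
    return ( $node, map { depth_first($_) } @{ $children{$node} || [] } );
}

print join( ' ', breadth_first('root') ), "\n";    # root A B A1 A2 B1
print join( ' ', depth_first('root') ),   "\n";    # root A A1 A2 B B1
```

Both calls return the same set of nodes; only the sequence differs, which is why the order rarely matters if you just want to loop over every node.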

-jason
On Nov 26, 2009, at 7:19 AM, miglio83 at libero.it wrote:

> Hi Jason,
> I'm Emilio Siena, a PhD student of the University of Perugia.
> I have
> a question about the method "get_nodes" of the  "Bio::Tree::Tree"  
> class.
> In
> particular I didn't understand which type of arguments it accepts  
> and in which
> format an argument should be given.
>
> Thank you in advance!
>
> Emilio

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Thu Nov 26 12:40:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 26 Nov 2009 12:40:45 -0500
Subject: [Bioperl-l] Bio::Assembly::IO::sam is alpha
Message-ID: <599F8BABCD2848EFA98FB24A4419674E@NewLife>

in bioperl-live/trunk with plenty pod; bravehearts can (please!) test on .bam files
cheers, MAJ


From mauricio at open-bio.org  Thu Nov 26 16:45:43 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Thu, 26 Nov 2009 15:45:43 -0600
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <4B0EF707.6080202@open-bio.org>

Hi Jonathan,

Any chance it can be webcasted? I'm sure it would attract a lot of 
remote attendees ;)

Regards,
Mauricio.


Jonathan Warren wrote:
> We are considering running a Distributed Annotation System workshop here 
> at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If 
> you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last years (1st 
> day for beginners, 2nd for both beginners and advanced users, 3rd day 
> for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what 
> you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> 
> 
> 
> 
> 
> 
> 
> 


From robert.bradbury at gmail.com  Thu Nov 26 21:06:40 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 21:06:40 -0500
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
Message-ID: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>

I'm currently running near my process limit and running sequence fetches
from swissprot (I've also had this happen with getting gi's from NCBI) and
am running out of processes about halfway through the set I'm trying to
fetch [1].

Now, is there someplace in the bioperl documentation that documents where
one is supposed to wait() for defunct processes after each sequence fetch?
I'm encountering the problem both when the sequence fetches succeed and
when they fail.

Thanks in advance.
Robert

1. This is due to a bug in chromium's use of flash that involves it leaving
many defunct processes that are uncollected and therefore count towards
one's "process limit".
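
For reference, the general Perl idiom for collecting defunct children is sketched below. Whether and where BioPerl itself forks during a fetch is not established in this message, so fetch_one is only a hypothetical stand-in that forks a short-lived child, and reap_children is a made-up helper name. The sketch assumes a Unix-like system.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';    # provides WNOHANG

# Reap every child that has already exited, without blocking.
# Returns the number of children collected.
sub reap_children {
    my $reaped = 0;
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $reaped++;
    }
    return $reaped;
}

# Hypothetical stand-in for one fetch that spawns a helper process.
sub fetch_one {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;    # child: exit immediately
    return $pid;
}

for my $id ( 1 .. 3 ) {
    fetch_one();
    reap_children();    # collect whatever has finished so far
}

# Block until any remaining children have exited and been collected.
1 while waitpid( -1, 0 ) > 0;
```

Calling the non-blocking reaper after each fetch keeps finished children from piling up against the process limit; the final blocking loop collects any that were still running.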


From kanzure at gmail.com  Thu Nov 26 21:12:46 2009
From: kanzure at gmail.com (Bryan Bishop)
Date: Thu, 26 Nov 2009 20:12:46 -0600
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
In-Reply-To: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
References: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
Message-ID: <55ad6af70911261812q583277d5l71df0d66e756f617@mail.gmail.com>

On Thu, Nov 26, 2009 at 8:06 PM, Robert Bradbury wrote:
> I'm currently running near my process limit and running sequence fetches
> from swissprot (I've also had this happen with getting gi's from NCBI) and
> am running out of processes about halfway through the set I'm trying to
> fetch [1].

Hey Robert, sorry for the off-topic question, but I was wondering if
you're the same Robert Bradbury from the extropy-chat list. Hi?

- Bryan
http://heybryan.org/
1 512 203 0507


From paolo.pavan at gmail.com  Fri Nov 27 06:35:03 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Fri, 27 Nov 2009 12:35:03 +0100
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
	Bio::Tools::Run::Cap3 usage question)
Message-ID: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>

Dear Florent,
Thank you for your kind answer and for the effort you have put into this module.
Since you are working on these topics, I would like to seize the opportunity and
ask you about a few doubts I have, if you agree, of course
:-)
Some time ago I tried to work with BioPerl, loading the data from an ACE
file produced by Newbler; my need was to extract part of a contig as
an alignment of reads, and I thought I could do it with a slice() method, since I
saw that Bio::Assembly::Contig implements the Bio::AlignI interface. Unfortunately I
realized that this interface is inherited but not implemented.
I tried to hack it by adding a slice method which would act on a
Bio::Alignment created from the array of LocatableSeqs representing the
reads.

This is the question:
If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig class
stores read information in:
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{
     _align_clipping:READ_NAME}
     _aligned_coord:READ_NAME}
     _quality_clipping:READ_NAME}

Each of these 3 features, _align_clipping, _aligned_coord and
_quality_clipping, contains a Bio::SeqFeature::Generic. Which of them is
most suitable for the purpose described above, the slice method?
One more question, if you will forgive me for going on so long, follows from the
previous one: the purpose of these 3 features per read is not perfectly clear
to me. Can you explain it?

Thank you very much for the time you spend on this.
Bye bye,
Paolo


2009/11/24 Florent Angly <florent.angly at gmail.com>

> Hi Paolo,
>
> It turns out that there is no standard for what is to be passed to the
> Bio::Tools::Run wrappers and returned by them. I noticed the inconsistency
> between the assembly wrappers recently while implementing support for new
> wrappers. I implemented initial support for additional de novo assembly
> programs in BioPerl (454 Newbler and Minimo) a couple of weeks ago and Mark
> Jensen added support for Maq, a program that assembles reads against a
> reference. In the process, all the assembly wrappers were changed to take
> the same type of input data (a FASTA sequence or an array reference of
> sequence objects) and return one of the following:
>   * a Bio::Assembly::Scaffold object (the default), or
>   * a Bio::Assembly::IO object, or
>   * the name of a file for the output of the assembler
> Use the out_type method to set up which output you want, e.g.:
>   $factory->out_type('Bio::Assembly::IO');
> or
>   $factory->out_type('cap3_results.ace');
> You'll have to use the code in the bioperl-run subversion if you want to
> use these new features.
>
> Cheers,
>
> Florent
>
>
>
>
> Paolo Pavan wrote:
>
>> Dear,
>> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
>> As documented in the pod, the run(@seqs) method returns the cap3 report
>> file
>> while I expect to return a Bio::Assembly object, consistently with other
>> Bio::Tools::Run classes.
>> However, I worked around this by getting from the factory object the
>> location
>> and the names of the temp output files (admittedly by accessing a private
>> property) and reading them via the Assembly::IO system.
>> I was just wondering what the properly designed way to do this job is.
>>
>> Thank you for lighting the way!
>> Paolo


From jw12 at sanger.ac.uk  Thu Nov 26 09:57:35 2009
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 26 Nov 2009 14:57:35 +0000
Subject: [Bioperl-l] DAS workshop 7th-9th April 2010
Message-ID: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>

We are considering running a Distributed Annotation System workshop  
here at the Sanger/EBI in the UK subject to decent demand.
The workshop will be held from Wednesday 7th-Friday 9th April 2010. If  
you would be interested in attending either to present or just take part
then please email me jw12 at sanger.ac.uk

The format of the workshop is likely to be similar to last years (1st  
day for beginners, 2nd for both beginners and advanced users, 3rd day  
for advanced), information for which can be found here:
http://www.dasregistry.org/course.jsp

If you would like to present then please send a short summary of what  
you would like to talk about.

Thanks

Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From timbourine81 at googlemail.com  Thu Nov 26 11:02:30 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Thu, 26 Nov 2009 17:02:30 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <4B0EA44D.2050507@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
Message-ID: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>

Oops, sent too early...

Hey Mark,

thanks for the answer. But I am still struggling, especially with where to put
your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; if there is no comma, there was the
error: Scalar found where operator expected at
12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
###        (Missing operator before  $hit?)
###Useless use of push with no values at
12_BLAST_two_sequence_each_query_one_file.PL line 7.
###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
"} $hit"
###BEGIN not safe after errors--compilation aborted at
12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
###here, every time this error appears: Name "main::result" used only once:
possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
###error: Can't call method "hits" on an undefined value at
12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Should I need to use a module? are the commands here at the right
position? errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

###where are the files stored? what are their names? Sorry, but I cannot
figure that out :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From rtbio.2009 at gmail.com  Sat Nov 28 02:53:43 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Sat, 28 Nov 2009 08:53:43 +0100
Subject: [Bioperl-l] Linking of two cgi scripts
Message-ID: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>

hello everyone,

I have a small question.

I would like to link two cgi scripts i.e.,

I have an input sequence being entered in a text area

ex:->gi|at442323|...
ATGCCCCCTTGGAACCAAAAAAA....

So I would like to compare this with the query sequences. These query
sequences would come from a BLAST script in the module blast.pm.
So once I enter the input sequence and request a BLAST search using the submit
button, my request should go to a program which performs the BLAST search. After
this, the sequences obtained from BLAST have to be returned to a program
Roopa.pm, which compares the input sequence and the sequences obtained from
BLAST.

But I am unable to provide this link between the CGI scripts (i.e., one
script to use BLAST, the other script to compare the sequences and send the
results to the browser).

Could any one help me in this regard?

Regards,
Roopa.


From s.denaxas at gmail.com  Sat Nov 28 05:56:15 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Sat, 28 Nov 2009 10:56:15 +0000
Subject: [Bioperl-l] Linking of two cgi scripts
In-Reply-To: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
References: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
Message-ID: <bba689ec0911280256u602b8f9dpffe9483189c56536@mail.gmail.com>

Hello,

Why do they both have to be CGI scripts? Can't all the processing
happen server side, i.e. both BLAST and comparison of returned
results?

If that is strictly a requirement, you could:

a) get input from user on script A, i.e. the input sequence
b) do an HTTP request from the CGI to the other script B using LWP::UserAgent
c) get results from script B, pass on to comparison module
d) return results to user

As I said, this will be clunky so either do everything in one go or
consider AJAX
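
Doing everything in one go could look roughly like the sketch below: one script takes the input sequence, runs the search, and passes the hits straight to the comparison step, so no second CGI request is needed. run_blast, count_matches and handle_request are hypothetical placeholders (the real versions would live in blast.pm and Roopa.pm, whose interfaces are not shown in this thread), and the hard-coded hits are made-up data.

```perl
use strict;
use warnings;

# Hypothetical placeholder for the BLAST step; returns hit sequences.
sub run_blast {
    my ($query) = @_;
    return ( 'ATGCCCCCTTGG', 'ATGCACCCTTGA' );    # made-up hits
}

# Hypothetical comparison: count identical positions between two strings.
sub count_matches {
    my ( $s1, $s2 ) = @_;
    my $len = length($s1) < length($s2) ? length($s1) : length($s2);
    my $same = 0;
    for my $i ( 0 .. $len - 1 ) {
        $same++ if substr( $s1, $i, 1 ) eq substr( $s2, $i, 1 );
    }
    return $same;
}

# One request, one script: search, compare, build the report.
sub handle_request {
    my ($input) = @_;
    my @report;
    for my $hit ( run_blast($input) ) {
        push @report, sprintf( "%s\t%d", $hit, count_matches( $input, $hit ) );
    }
    return join( "\n", @report ) . "\n";
}

print "Content-type: text/plain\n\n";
print handle_request('ATGCCCCCTTGG');
```

The point of the design is that the two steps are ordinary subroutine calls inside one process, so nothing has to be passed between scripts over HTTP.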

hope this helps
Spiros



From maj at fortinbras.us  Sat Nov 28 11:23:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 11:23:53 -0500
Subject: [Bioperl-l] Run wrappers for BWA and Samtools
Message-ID: <7F56A6EEEB0E4EE291D5340F27DF7D3A@NewLife>

Hi All, 

Run wrappers for the bwa assembler and the samtools suite
are now available as beta in the bioperl-run/trunk. The bwa 
wrapper allows you to run a canned assembly pipeline, or 
to execute individual bwa components. The assembly pipeline
can return a Bio::Assembly::Scaffold object via the new 
Bio::Assembly::IO::sam module in bioperl-live/trunk
(this requires lstein's Bio::DB::Sam, from CPAN). Details at

http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_BWA

and, of course, in the pod. 

Cheers, 
MAJ


From maj at fortinbras.us  Sat Nov 28 21:55:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 21:55:42 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
Message-ID: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>

Hi Tim--
There's a bug in my code; should be
for my $hit ($result->hits) {
...
}
and you're right about the comma. My bad.

But I don't think you need this-- you're already looping over your
query sequences and doing blastn on each one. So in the middle of
your loop, you can simply write the blast result that you got:

my $blio = Bio::SearchIO->new( -file => 
">".$query->id.".bls", -format=>"blast" );
$blio->write_result($result);

and forget about the foreach my $qid loop entirely.

The files should show up in the directory from which you're
running the script.
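
For the general pattern (group records by query, then write one file per query), here is a minimal sketch in plain Perl without any BioPerl objects. The (query, hit) pairs are made-up data, and group_by_query and write_groups are hypothetical helper names. It shows the push with a named loop variable and the comma, and the '>' write mode in a three-argument open.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';
use File::Spec;

# Made-up (query, hit) pairs standing in for parsed BLAST results.
my @pairs = (
    [ 'query1', 'hitA' ],
    [ 'query2', 'hitB' ],
    [ 'query1', 'hitC' ],
);

# Build a hash of array refs: query name => list of its hits.
sub group_by_query {
    my @records = @_;
    my %hits_by_query;
    for my $rec (@records) {
        my ( $query, $hit ) = @$rec;
        push @{ $hits_by_query{$query} }, $hit;    # note the comma
    }
    return %hits_by_query;
}

# Write one file per query; '>' opens for writing, replacing old content.
sub write_groups {
    my ( $dir, %groups ) = @_;
    my @files;
    for my $query ( sort keys %groups ) {
        my $path = File::Spec->catfile( $dir, "$query.txt" );
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} "$_\n" for @{ $groups{$query} };
        close $fh or die "Cannot close $path: $!";
        push @files, $path;
    }
    return @files;
}

# A temporary directory is used here; it is removed when the script exits.
my $dir   = tempdir( CLEANUP => 1 );
my @files = write_groups( $dir, group_by_query(@pairs) );
print "$_\n" for @files;
```

In a real script the output directory would be wherever you want the per-query files to end up, rather than a temporary one.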
cheers, MAJ


----- Original Message ----- 
From: "Tim Koehler" <timbourine81 at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, November 26, 2009 11:02 AM
Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of eachqueryinnew 
file


Oops, sent too early...

Hey Mark,

thanks for the answer. But I am still struggling, especially with where to put
your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; if there is no comma, there was the
error: Scalar found where operator expected at
12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
###        (Missing operator before  $hit?)
###Useless use of push with no values at
12_BLAST_two_sequence_each_query_one_file.PL line 7.
###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
"} $hit"
###BEGIN not safe after errors--compilation aborted at
12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
###here, every time this error appears: Name "main::result" used only once:
possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
###error: Can't call method "hits" on an undefined value at
12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Do I need to use a module? Are the commands here in the right
position? Errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

###where are the files stored? What are their names? Sorry, but I cannot
figure that out :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

> Hey Mark,
>
> thanks for the answer
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sound like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >>> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast'
> );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not
> found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round; first to make fasta files with
> only
> >>> single sequences inside and BLAST each file? But how can I automize
> that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >
>
> Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999



From maj at fortinbras.us  Sat Nov 28 22:32:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 22:32:42 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
Message-ID: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>

The HOWTOs appear to have a more restrictive copyright
than FDL-- in particular, the blurb at the bottom of the 
HOWTO page asks users to use the documents for personal 
use only. I'm for this; I think we should therefore have some 
explicit license for these that specifies this kind of restriction, 
and then express that on each howto and in BioPerl:Copyright.
Any thoughts on the right license and whether this is a good plan?
MAJ


From florent.angly at gmail.com  Sat Nov 28 22:47:45 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 28 Nov 2009 19:47:45 -0800
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
 Bio::Tools::Run::Cap3 usage question)
In-Reply-To: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
References: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
Message-ID: <4B11EEE1.8070907@gmail.com>

Hi Paolo,

The aligned reads of a contig are stored in 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_seq}. To implement a slice() 
method, you could retrieve the reads using get_seq_ids(), 
get_seq_by_name() or get_seq_by_pos(). To retrieve the position of an 
aligned read in the contig, use get_seq_coord() which returns a 
Bio::SeqFeature::Generic object (from 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_aligned_coord:READ_NAME}) 
on which you can call the start() and end() methods.

I'm not entirely sure what 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_align_clipping:READ_NAME} 
and {_quality_clipping:READ_NAME} are. I believe that they represent the 
clear range of the read/contig.
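
A rough sketch of the coordinate test a slice() would need. It assumes each read's contig coordinates have already been pulled out (for example via get_seq_coord() and its start() and end() methods); plain hashes with made-up read names stand in for those objects, and slice_reads() is a hypothetical helper, not a BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical reads with 1-based, inclusive contig coordinates.
my @reads = (
    { id => 'read1', start =>  1, end =>  40 },
    { id => 'read2', start => 30, end =>  90 },
    { id => 'read3', start => 95, end => 140 },
);

# Return the reads overlapping [$from, $to], with coordinates clipped to the window.
sub slice_reads {
    my ( $from, $to, @in ) = @_;
    my @out;
    for my $r (@in) {
        next if $r->{end} < $from or $r->{start} > $to;    # no overlap with the window
        push @out, {
            id    => $r->{id},
            start => ( $r->{start} > $from ? $r->{start} : $from ),
            end   => ( $r->{end}   < $to   ? $r->{end}   : $to ),
        };
    }
    return @out;
}

my @slice = slice_reads( 35, 92, @reads );
printf "%s: %d-%d\n", $_->{id}, $_->{start}, $_->{end} for @slice;
```

Trimming the read sequences themselves to the clipped coordinates would be the next step; that part is left out here.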

Hope it helps,

Florent


Paolo Pavan wrote:
> Dear Florent,
> Thank you for your kind answer and for the effort you have put into this module.
> Since you are working on these topics, I would like to seize the day 
> and ask you some questions about doubts I have, if you 
> agree, of course :-)
> Some time ago I tried to work with bioperl, loading the data from an 
> ACE file generated by Newbler; my need was to extract part of the 
> contig as an alignment of reads, and I thought to do it with a slice() 
> method, since I saw that Bio::Assembly::Contig implements the Bio::AlignI 
> interface. Unfortunately I realized that this interface is inherited 
> but not implemented.
> I tried to hack it by adding a slice method which would act on a 
> Bio::Alignment created from the array of LocatableSeqs representing 
> the reads.
>
> This is the question:
> If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig 
> class stores read information in:
> Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{
>      _align_clipping:READ_NAME}
>      _aligned_coord:READ_NAME}
>      _quality_clipping:READ_NAME}
>
> Each of these 3 features (_align_clipping, _aligned_coord, 
> _quality_clipping) contains a Bio::SeqFeature::Generic. Which of them 
> is most suitable for the purpose described above, the slice method?
> One more question, if you will forgive me for going on so long, follows 
> from the previous one: the purpose of these 3 features per read is not 
> perfectly clear to me. Can you explain it?
>
> Thank you very much for the time you spend on this.
> Bye bye,
> Paolo


From bimber at wisc.edu  Sun Nov 29 00:31:25 2009
From: bimber at wisc.edu (Ben Bimber)
Date: Sat, 28 Nov 2009 23:31:25 -0600
Subject: [Bioperl-l] using bioperl to compare sequences
Message-ID: <9f985cdc0911282131l350bc525gd9ad4717c101ac63@mail.gmail.com>

Hello,

I have a couple of years of programming experience, but am reasonably new to
perl and extremely new to bioperl.  I have been reading through the
bioperl documentation and am trying to understand the best way to
approach a particular problem.  I'm hoping someone could offer some
tips and point me in the right direction.  If someone has solved this
sort of problem before, I'd prefer not to reinvent things.  Here's
what I'm trying to do:

1. Our lab generates mRNA sequence data, consisting of alleles of a given
   gene or genes.
2. I want to compare each of these sequences against a reference using
   BLAST or clustalw (I will need the ability to choose at run time).
3. Take the result of this alignment, then record positions of difference
   between the experimental sequence and reference sequence (SNPs).
4. Translate the corresponding AA change(s) associated with each SNP.
   There can be overlapping ORFs.

I see that bioperl has modules for BLAST and clustal.  I've also been
looking at the modules under variation.  I haven't fully wrapped my
head around them, but they look to be what I'd use for SNP detection.

Has anyone written code to perform similar things and, if so, would
you be willing to share specific examples?  Anything concrete to see
exactly how these modules operate would be extremely helpful.

Thanks in advance for any tips or help.
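
One way to picture the "record positions of difference" and "translate the AA change" steps, as a bare-bones sketch in plain Perl rather than a BioPerl recipe. It assumes the two sequences are already aligned, gap-free, of equal length, and in reading frame 1; the sequences are invented and find_snps() is a hypothetical helper. Overlapping ORFs are not handled. The codon table is the standard genetic code; in BioPerl proper, Bio::Tools::CodonTable would be the more likely route for translation.

```perl
use strict;
use warnings;

# Standard genetic code, codons enumerated in TCAG order.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %aa_for;
my $i = 0;
for my $x (@bases) {
    for my $y (@bases) {
        for my $z (@bases) { $aa_for{"$x$y$z"} = substr( $aas, $i++, 1 ); }
    }
}

# Compare two aligned, gap-free coding sequences (frame 1) position by position.
sub find_snps {
    my ( $ref, $alt ) = map { uc $_ } @_;
    die "sequences differ in length\n" unless length($ref) == length($alt);
    my @snps;
    for my $pos ( 0 .. length($ref) - 1 ) {
        my $rb = substr( $ref, $pos, 1 );
        my $ab = substr( $alt, $pos, 1 );
        next if $rb eq $ab;
        my $codon_start = $pos - $pos % 3;    # 0-based start of the affected codon
        my $ref_aa = $aa_for{ substr( $ref, $codon_start, 3 ) } // 'X';
        my $alt_aa = $aa_for{ substr( $alt, $codon_start, 3 ) } // 'X';
        push @snps, {
            pos       => $pos + 1,            # 1-based nucleotide position
            change    => $rb . '>' . $ab,
            aa_change => $ref_aa . ( $codon_start / 3 + 1 ) . $alt_aa,
        };
    }
    return @snps;
}

my @snps = find_snps( 'ATGGCCAAA', 'ATGGTCAAA' );
printf "%d %s %s\n", @{$_}{qw(pos change aa_change)} for @snps;
```

Here the single C-to-T difference at nucleotide 5 changes codon 2 from GCC to GTC, i.e. alanine to valine.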


From jason at bioperl.org  Sun Nov 29 10:54:53 2009
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 29 Nov 2009 07:54:53 -0800
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in new file
In-Reply-To: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
Message-ID: <897A8DB4-AF29-4601-A1E5-9A04D9D8C151@bioperl.org>

or
while( my $hit = $result->next_hit ) {
}
On Nov 28, 2009, at 6:55 PM, Mark A. Jensen wrote:

> Hi Tim--
> There's a bug in my code; should be
> for my $hit ($result->hits) {
> ...
> }
> and you're right about the comma. My bad.
>
> But I don't think you need this-- you're already looping over your
> query sequences and doing blastn on each one. So in the middle of
> your loop, you can simply write the blast result that you got:
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls",
>                                -format=>"blast" );
> $blio->write_result($result);
>
> and forget about the foreach my $qid loop entirely.
>
> The files should show up in the directory from which you're
> running the script.
> cheers, MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From suzi at berkeleybop.org  Sun Nov 29 23:03:09 2009
From: suzi at berkeleybop.org (Suzanna Lewis)
Date: Sun, 29 Nov 2009 20:03:09 -0800
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <3AD3C819-4BAA-4D90-B141-9611F48C5CAD@ berkeleybop.org>

I/we (Gregg) would be interested in attending. We'd present an update on the collaborative, web-based version of Apollo. We will be working with Ian Holmes and Mitch Skinner using JBrowse for basic display.

-S


On Nov 26, 2009, at 6:57 AM, Jonathan Warren wrote:

> We are considering running a Distributed Annotation System workshop here at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last year's (1st day for beginners, 2nd for both beginners and advanced users, 3rd day for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
> 


From maj at fortinbras.us  Mon Nov 30 09:31:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 09:31:27 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
Message-ID: <513F1C824EF84974993A76F0CC719CDF@NewLife>

Well, it has a history, Jason's point. So the question could
be: "is this still a valid issue"? A while back, a user on the wiki,
with natural and good intentions, removed the authorship and revision
info from a couple of the HOWTOs; it is more wiki-like,
after all. But Chris had some objections to that, which I
seconded, mainly on the basis of the special status that
seems implied by the copyright note on the HOWTO
page. I also think that the nature of the howto is somewhat
different from other info on the site -- that developers themselves
put a lot of time into explaining how to use their modules, and
that in this world where devs get paid by recognition, it is a reasonable
thing to allow this extra horn-tooting. Now, that is a policy
that could be completely separable from the issue of copyright.
However, devs may also get paid by using their materials in teaching
seminars. The dilemma would be that people who like to use the
wiki are people who like to share, and so it feels unnatural to
withhold from the community the materials they develop,  but
people who like to share also like to eat and wear shoes...
so I'm interested in everyone's thoughts about it.
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" 
<jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
Sent: Monday, November 30, 2009 9:16 AM
Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki


> Mark,
>
> Let me ask you a question, and don't take this question as an implicit 
> criticism of your suggestion, it is not. Why would you want this more 
> restrictive copyright?
>
> Brian O.
>
> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>
>> The HOWTOs appear to have a more restrictive copyright
>> than FDL-- in particular, the blurb at the bottom of the
>> HOWTO page asks users to use the documents for personal
>> use only. I'm for this; I think we should therefore have some
>> explicit license for these that specifies this kind of restriction,
>> and then express that on each howto and in BioPerl:Copyright.
>> Any thoughts on the right license and whether this is a good plan?
>> MAJ
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> 


From bosborne11 at verizon.net  Mon Nov 30 10:15:32 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 10:15:32 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <513F1C824EF84974993A76F0CC719CDF@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
	<513F1C824EF84974993A76F0CC719CDF@NewLife>
Message-ID: <54671455-A02C-4139-8C39-AC17B50D5CE6@verizon.net>

Mark,

I have no objection to a more restrictive copyright, and I also have  
no objection to using FDL, or things like it.

Brian O.

On Nov 30, 2009, at 9:31 AM, Mark A. Jensen wrote:

> Well, it has a history, Jason's point. So the question could
> be: "is this still a valid issue"? A while back, a user on the wiki,
> with natural and good intentions, removed the authorship and revision
> info from a couple of the HOWTOs; it is more wiki-like,
> after all. But Chris had some objections to that, which I
> seconded, mainly on the basis of the special status that
> seems implied by the copyright note on the HOWTO
> page. I also think that the nature of the howto is somewhat
> different from other info on the site -- that developers themselves
> put a lot of time in to explaining how to use their modules, and
> that in this world where devs get paid by recognition, it is a  
> reasonable
> thing to allow this extra horn-tooting. Now, that is a policy
> that could be completely separable from the issue of copyright.
> However, devs may also get paid by using their materials in teaching
> seminars. The dilemma would be that people who like to use the
> wiki are people who like to share, and so it feels unnatural to
> withhold from the community the materials they develop,  but
> people who like to share also like to eat and wear shoes...
> so I'm interested in everyone's thoughts about it.


From bosborne11 at verizon.net  Mon Nov 30 09:16:07 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 09:16:07 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
Message-ID: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>

Mark,

Let me ask you a question, and don't take this question as an implicit  
criticism of your suggestion, it is not. Why would you want this more  
restrictive copyright?

Brian O.

On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:

> The HOWTOs appear to have a more restrictive copyright
> than FDL-- in particular, the blurb at the bottom of the
> HOWTO page asks users to use the documents for personal
> use only. I'm for this; I think we should therefore have some
> explicit license for these that specifies this kind of restriction,
> and then express that on each howto and in BioPerl:Copyright.
> Any thoughts on the right license and whether this is a good plan?
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Mon Nov 30 12:41:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 12:41:44 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in new file
In-Reply-To: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
	<c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
Message-ID: <8C288FEF9CEB4055B0CDD19267FBA26C@NewLife>

thanks Tim! corrected (I hope) in r16432... 
MAJ
  ----- Original Message ----- 
  From: Tim Koehler 
  To: Smithies, Russell 
  Cc: Mark A. Jensen ; bioperl-l at lists.open-bio.org 
  Sent: Monday, November 30, 2009 12:23 PM
  Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each query in new file


  Hello everybody,

  thanks a lot for the overwhelming answers! All of these code samples are different flavors, and all of them worked.

  For me the code added below works best. But I think I found a bug in ...Bio/SearchIO/blast.pm. 
  There the DEFAULT_BLAST_... variable is set to Bio::Search::Writer::HitTableWriter instead of Bio::SearchIO::Writer::HitTableWriter. I also tried setting this variable to HTMLResultWriter and others.

  So again: THANKS for the support!

  Cheers, 
  Tim

  #!/usr/bin/perl -w

  use strict;
  use Bio::Tools::Run::StandAloneBlast;
  use Bio::SeqIO;
  use Bio::SearchIO;
  ### add here the writer you want
  use Bio::SearchIO::Writer::HitTableWriter;
  use Bio::Search::Result::BlastResult;

  use Data::Dumper;

  my $Seq_in = Bio::SeqIO->new( -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
                                -format => "fasta" );

  while ( my $query = $Seq_in->next_seq() ) {
    warn "Processing ", $query->id, "\n";
    my $factory =
      Bio::Tools::Run::StandAloneBlast->new(
                   program  => "blastn",
                   database => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
                   _READMETHOD => "Blast"
      );

    my $blast_report = $factory->blastall($query);
    sleep 5;

    # just write the result we got for this query into a
    # new blast-formatted file...named after the id of the query seq...
    my $result = $blast_report->next_result;
    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
    $blio->write_result($result);

    # below, just looking at the current blast result
  ###this does not appear in the output files
    while ( my $result = $blast_report->next_result ) {
      ## $result is a Bio::Search::Result::ResultI compliant object
      while ( my $hit = $result->next_hit ) {
        ## $hit is a Bio::Search::Hit::HitI compliant object
        while ( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if ( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 75 ) {
              print "Query= ", $result->query_name,
                "Hit= ",        $hit->name,
                "Length= ",     $hsp->length('total'),
                "Percent_id= ", $hsp->percent_identity,
                "Subject=",     $hsp->hit_string, "\n";
            }
          }
        }
      }
    }
  }

   
  On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

    Changed it to a generic result and added a writer and it seems to work:


      foreach my $qid ( keys %hits_by_query ) {

        warn "qid = $qid\n";

        my $res = Bio::Search::Result::GenericResult->new(-algorithm => "blastn") or die $!;

       # print Dumper $res;

        foreach my $h ( @{ $hits_by_query{$qid} } ){

                         warn "adding hit ", $h->name, "\n";

                         $res->add_hit($h) if defined($h);

                               }

        my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();

        my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file => ">$qid\.bls\.html", -format => "blast" ) or die $!;

        $blio->write_result($res);

      }


    From: Mark A. Jensen [mailto:maj at fortinbras.us] 
    Sent: Monday, 30 November 2009 10:19 a.m.
    To: Smithies, Russell; 'Tim Koehler'


    Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


    My thought here was that since Tim's already going one at a time thru

    his queries, my scrap was not really necessary: 


    use strict;

    use Bio::Tools::Run::StandAloneBlast;

    use Bio::SeqIO;

    use Bio::SearchIO;

    use Bio::Search::Result::BlastResult;


    use Data::Dumper;


    my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",

                                  -format => "fasta" );


    while ( my $query = $Seq_in->next_seq() ) {

           warn "Processing ",$query->id, "\n";

      my $factory =

        Bio::Tools::Run::StandAloneBlast->new(

                     program  => "blastn",

                     database => "/data/databases/flatfile/illuminati_blastdata/nt",

                     _READMETHOD => "Blast"

        );


      my $blast_report = $factory->blastall($query);

      sleep 5;


      # just write the result we got for this query into a 

       #new blast-formatted file...named after the id of the query seq...  

     my $result = $blast_report->next_result;

    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;

      $blio->write_result($result);


      # below, just looking at the current blast result

      while ( my $result = $blast_report->next_result ) {

        ## $result is a Bio::Search::Result::ResultI compliant object

        while ( my $hit = $result->next_hit ) {

          ## $hit is a Bio::Search::Hit::HitI compliant object

          while ( my $hsp = $hit->next_hsp ) {

            ## $hsp is a Bio::Search::HSP::HSPI compliant object

            if ( $hsp->length('total') > 50 ) {

              if ( $hsp->percent_identity >= 75 ) {

                print "Query= ", $result->query_name,

                  "Hit= ",        $hit->name,

                  "Length= ",     $hsp->length('total'),

                  "Percent_id= ", $hsp->percent_identity,

                  "Subject=",     $hsp->hit_string, "\n";

              }

            }

          }

        }

      }

    }

      ----- Original Message ----- 

      From: Smithies, Russell 

      To: 'Tim Koehler' ; 'maj at fortinbras.us' 

      Sent: Sunday, November 29, 2009 3:58 PM

      Subject: RE: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hi Tim

      With various people writing the "howtos" and other docs, the examples are bound to have differing names for the variables used, but as long as you're consistent, it should all fit together.


      I think I've almost got your code working, just getting errors from Bio::Search::Result::BlastResult, which I'm not entirely sure how to use. Perhaps Mark can get this bit going?


      --Russell

      ===============================


      use strict;

      use Bio::Tools::Run::StandAloneBlast;

      use Bio::SeqIO;

      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;


      use Data::Dumper;


      my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",

                                    -format => "fasta" );


      while ( my $query = $Seq_in->next_seq() ) {

             warn "Processing ",$query->id, "\n";

        my $factory =

          Bio::Tools::Run::StandAloneBlast->new(

                       program  => "blastn",

                       database => "/data/databases/flatfile/illuminati_blastdata/nt",

                       _READMETHOD => "Blast"

          );


        my $blast_report = $factory->blastall($query);

        sleep 5;


        my %hits_by_query;


             while ( my $result = $blast_report->next_result ) {

               foreach my $hit ( $result->hits ) {

                           warn "Pushed a hit for ",$hit->name, "\n";

                 push( @{ $hits_by_query{ $hit->name } }, $hit );

               }

             }


        foreach my $qid ( keys %hits_by_query ) {

                    warn "qid = $qid\n";

          my $res = Bio::Search::Result::BlastResult->new() or die $!;

          print Dumper $res;

          foreach my $h ( @{ $hits_by_query{$qid} } ){

                           warn "adding hit ", $h->name, "\n";

                           $res->add_hit($h) if defined($h);

                                 }

          my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format => "blast" ) or die $!;

          $blio->write_result($res);

        }


        while ( my $result = $blast_report->next_result ) {

          ## $result is a Bio::Search::Result::ResultI compliant object

          while ( my $hit = $result->next_hit ) {

            ## $hit is a Bio::Search::Hit::HitI compliant object

            while ( my $hsp = $hit->next_hsp ) {

              ## $hsp is a Bio::Search::HSP::HSPI compliant object

              if ( $hsp->length('total') > 50 ) {

                if ( $hsp->percent_identity >= 75 ) {

                  print "Query= ", $result->query_name,

                    "Hit= ",        $hit->name,

                    "Length= ",     $hsp->length('total'),

                    "Percent_id= ", $hsp->percent_identity,

                    "Subject=",     $hsp->hit_string, "\n";

                }

              }

            }

          }

        }

      }

      ===============================


      From: Tim Koehler [mailto:timbourine81 at googlemail.com] 
      Sent: Friday, 27 November 2009 10:24 p.m.
      To: Smithies, Russell; maj at fortinbras.us
      Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hey guys,

      please do not get me wrong: I did not mean to put the workload on you. So far I had only found the HOWTOs, but in them the variable naming changes over time (e.g. $in to $Seq_in), and some things I simply could not find.
      Now I got a tip where else to search: the scrapbook and deobfuscator.

      I immediately will have a look at that.

      This is the first time I have touched Linux / Perl commands; that's why, after several days of trial and many errors ;), I thought of asking the mailing list.

      I was very happy about your fast answers!

      Cheers and a nice weekend,

      Tim

      On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com> wrote:

      oops, sent too early...

      Hey Mark,

      thanks for the answer. But I am still struggling, especially where to put in your code.

      Here is the code I have so far:

      #!/usr/bin/perl -w

      ### should I put your code here as push is a perl command?


      my %hits_by_query;
      for ($result->hits) {

      ### I inserted a comma after name}}; if there is no comma, there was the error: Scalar found where operator expected at 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
      ###        (Missing operator before  $hit?)
      ###Useless use of push with no values at 12_BLAST_two_sequence_each_query_one_file.PL line 7.
      ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
      ###BEGIN not safe after errors--compilation aborted at 12_BLAST_two_sequence_each_query_one_file.PL line 13.


       push @{$hits_by_query{$hit->name}}, $hit;

      ###here, every time this error appears: Name "main::result" used only once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
      ###error: Can't call method "hits" on an undefined value at 12_BLAST_two_sequence_each_query_one_file.PL line 5.


      }


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;

      my $Seq_in = Bio::SeqIO->new (
      -file => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
      -format => 'fasta'
      );
      while (my $query = $Seq_in->next_seq()) {


      my $factory = Bio::Tools::Run::StandAloneBlast->new(

      'program' => 'blastn',
      'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
      _READMETHOD => "Blast"
      );

      my $blast_report = $factory->blastall($query);

      ### Should I need to use a module? are the commands here at the right position? errors, e.g., Global symbol "$hit" requires explicit package name
      #my %hits_by_query;
      #for ($result->hits) {
      ### inserted comma after name}}
      # push @{$hits_by_query{$hit->name}}, $hit;
      #}


      foreach my $qid ( keys %hits_by_query ) {
       my $result = Bio::Search::Result::BlastResult->new();
       $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
       my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
       $blio->write_result($result);
      } 

      ###where are the files stored? what is their name. Sorry, but I cannot get behind that :(

      while( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object


        while( my $hit = $result->next_hit ) {

         ## $hit is a Bio::Search::Hit::HitI compliant object


         while( my $hsp = $hit->next_hsp ) {

          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
           if ( $hsp->percent_identity >= 75 ) {
           print  "Query= ",        $result->query_name,
              "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
              "Subject=",        $hsp->hit_string,"\n";
           }
          }
         }
        }
      }
      }

      Again, a big thanks in advance :)

      All the best,

      Tim

      On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

      Hey Mark,

      thanks for the answer


      On 25.11.2009 20:21, Mark A. Jensen wrote:
      > whoops: change the following line:
      > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >
      > to
      >
      > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
      >
      > (I always forget that...)
      > MAJ
      >
      > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
      > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
      > Sent: Wednesday, November 25, 2009 1:20 PM
      > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
      > queryinnew file
      >
      >
      >> hey Tim--
      >>
      >> Sound like you need to go about collecting your queries inside out:
      >>
      >> my %hits_by_query;
      >> for ($result->hits) {
      >>  push @{$hits_by_query{$hit->name}} $hit;
      >> }
      >>
      >> I believe now each hash element, keyed by the query name, will contain
      >> an arrayref to the set of hits assoc with that query.
      >> From here, I believe
      >>
      >> use Bio::Search::Result::BlastResult;
      >> use Bio::SearchIO;
      >>
      >> foreach my $qid ( keys %hits_by_query ) {
      >>  my $result = Bio::Search::Result::BlastResult->new();
      >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
      >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >>  $blio->write_result($result);
      >> }
      >>
      >> will do what you want.
      >>
      >> hope this helps -
      >> Mark
      >>
      >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
      >> To: <bioperl-l at lists.open-bio.org>
      >> Sent: Wednesday, November 25, 2009 12:40 PM
      >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
      >> query innew file
      >>
      >>
      >>> Dear bioperl users,
      >>>
      >>> I am a real newbie and have - maybe a very trivial - question.
      >>>
      >>> I searched the mailing list archive and many howtos but I have not found
      >>> a concrete answer to my problem. So hopefully you can help me :)
      >>>
      >>> Background: I use the latest Bioperl version (installed it two weeks
      >>> before).
      >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
      >>> including different sequences, I get a BLAST output with many queries
      >>> each having several hits / sbjcts.
      >>>
      >>> My problem is how to parse *all* hits of *one* query into a single new
      >>> file. And this for all the queries I have in my BLAST output file.
      >>>
      >>> Or is it better the other way round; first to make fasta files with only
      >>> single sequences inside and BLAST each file? But how can I automate that
      >>> using Bioperl?
      >>>
      >>> I tried Bio::SearchIO but can only parse all queries and their
      >>> respective hits in only one file...
      >>> I think iteration is also necessary here, but I do not really know how
      >>> to include that into Bio::SearchIO.
      >>> Or do I have to use Module:Bio::Index::Blast?
      >>>
      >>> I can index a file (see below), but I have no idea what comes next...
      >>>
      >>> ###How I index a file...
      >>>
      >>> #!/usr/bin/perl -w
      >>>
      >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
      >>>
      >>> use Bio::Index::Fasta;
      >>>
      >>>
      >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
      >>> $id = "48882";
      >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
      >>> -write_flag => 1);
      >>> $inx->make_index($file_name);
      >>>
      >>>
      >>> Hopefully, you can give me at least hints what to look for.
      >>>
      >>> A big THANKS in advance!
      >>>
      >>> Cheers,
      >>>
      >>> Tim
      >>> _______________________________________________
      >>> Bioperl-l mailing list
      >>> Bioperl-l at lists.open-bio.org
      >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>>
      >>>
      >>
      >> _______________________________________________
      >> Bioperl-l mailing list
      >> Bioperl-l at lists.open-bio.org
      >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>
      >>
      >

      Tim Köhler
      MPI for Terrestrial Microbiology
      Karl-von-Frisch-Straße
      D-35043 Marburg / Germany

      Email: koehlerd at mpi-marburg.mpg.de
      Phone: +49 6421 178-740
      Fax:   +49 6421 178-999




From timbourine81 at googlemail.com  Mon Nov 30 12:23:58 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Mon, 30 Nov 2009 18:23:58 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
Message-ID: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>

Hello everybody,

thanks a lot for the overwhelming number of answers! All of these scripts are
different flavors, and all of them worked.

For me, the code added below works best. But I think I found a bug in
...Bio/SearchIO/blast.pm.
There the DEFAULT_BLAST_... variable is set to
Bio::Search::Writer::HitTableWriter instead of
Bio::SearchIO::Writer::HitTableWriter. I also tried changing this variable to
HTMLResultWriter and other writers.
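
A misspelled default class name like the one described above tends to surface
only when the class is first loaded. As a minimal sketch (class_loads is a
hypothetical helper, not part of BioPerl, and the outcome for the two Bio::
names depends on what is installed locally), one can check whether a given
writer class name resolves at all:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns 1 if the named
# class can be loaded with require, 0 otherwise.
sub class_loads {
    my ($class) = @_;
    # Only accept well-formed package names before handing them to eval.
    return 0 unless $class =~ /\A\w+(?:::\w+)*\z/;
    my $ok = eval "require $class; 1";
    return $ok ? 1 : 0;
}

# Compare the name reported in blast.pm with the name of the real module.
for my $class ( 'Bio::SearchIO::Writer::HitTableWriter',
                'Bio::Search::Writer::HitTableWriter' ) {
    printf "%-40s %s\n", $class,
        class_loads($class) ? "loads" : "does not load";
}
```

On a machine with BioPerl installed, one would expect only the
Bio::SearchIO::Writer::... name to load, but that is an assumption about the
local setup rather than something this sketch guarantees.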

So again: THANKS for the support!

Cheers,
Tim

#!/usr/bin/perl -w

use strict;

use Bio::Tools::Run::StandAloneBlast;

use Bio::SeqIO;

use Bio::SearchIO;

### add here the writer you want
use Bio::SearchIO::Writer::HitTableWriter;

use Bio::Search::Result::BlastResult;


use Data::Dumper;


my $Seq_in = Bio::SeqIO->new( -file   =>
"/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",

                              -format => "fasta" );


while ( my $query = $Seq_in->next_seq() ) {

       warn "Processing ",$query->id, "\n";

  my $factory =

    Bio::Tools::Run::StandAloneBlast->new(

                 program  => "blastn",

                 database =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",

                 _READMETHOD => "Blast"

    );


  my $blast_report = $factory->blastall($query);

  sleep 5;


  # just write the result we got for this query into a

   #new blast-formatted file...named after the id of the query seq...

  my $result = $blast_report->next_result;

  my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format =>
"blast" ) or die $!;

  $blio->write_result($result);


  # below, just looking at the current blast result

###this does not appear in the output files

  while ( my $result = $blast_report->next_result ) {

    ## $result is a Bio::Search::Result::ResultI compliant object

    while ( my $hit = $result->next_hit ) {

      ## $hit is a Bio::Search::Hit::HitI compliant object

      while ( my $hsp = $hit->next_hsp ) {

        ## $hsp is a Bio::Search::HSP::HSPI compliant object

        if ( $hsp->length('total') > 50 ) {

          if ( $hsp->percent_identity >= 75 ) {

            print "Query= ", $result->query_name,

              "Hit= ",        $hit->name,

              "Length= ",     $hsp->length('total'),

              "Percent_id= ", $hsp->percent_identity,

              "Subject=",     $hsp->hit_string, "\n";

          }

        }

      }

    }

  }

}
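
One possible reason the filtering loop in the script above prints nothing:
if the report for a single query holds only one result (an assumption, not
something confirmed in this thread), the first call to next_result already
consumes it, and the later while loop has nothing left to read. The sketch
below uses a hypothetical make_iterator stand-in rather than a real BioPerl
object to show the effect:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a report object: hands out each item once,
# then returns undef, the way a next_* style iterator typically behaves.
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

my $next_result = make_iterator('result_for_query_1');

my $first = $next_result->();        # consumes the only result
my $seen_in_loop = 0;
while ( my $r = $next_result->() ) {
    $seen_in_loop++;                 # never reached: nothing is left
}

print "first=$first, seen in loop=$seen_in_loop\n";
# prints "first=result_for_query_1, seen in loop=0"
```

If that is what happens here, running the hit/HSP loops on the $result
already in hand, instead of calling next_result a second time, should make
the filtered lines appear.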


On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

>  Changed it to a generic result and added a writer and it seems to work:
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>     warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::GenericResult->new(-algorithm =>
> "blastn") or die $!;
>
>    # print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();
>
>     my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file =>
> ">$qid\.bls\.html", -format => "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>
>
> *From:* Mark A. Jensen [mailto:maj at fortinbras.us]
> *Sent:* Monday, 30 November 2009 10:19 a.m.
> *To:* Smithies, Russell; 'Tim Koehler'
>
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> My thought here was that since Tim's already going one at a time thru
>
> his queries, my scrap was not really necessary:
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>   # just write the result we got for this query into a
>
>    #new blast-formatted file...named after the id of the query seq...
>
>  my $result = $blast_report->next_result;
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format =>
> "blast" ) or die $!;
>
>   $blio->write_result($result);
>
>
>
>   # below, just looking at the current blast result
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
>  ----- Original Message -----
>
> *From:* Smithies, Russell <Russell.Smithies at agresearch.co.nz>
>
> *To:* 'Tim Koehler' <timbourine81 at googlemail.com> ; 'maj at fortinbras.us'
>
> *Sent:* Sunday, November 29, 2009 3:58 PM
>
> *Subject:* RE: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hi Tim
>
> With various people writing the "howtos" and other docs, the examples are
> bound to have differing names for the variables used, but as long as you're
> consistent, it should all fit together.
>
>
>
> I think I've almost got your code working, just getting errors from
> Bio::Search::Result::BlastResult, which I'm not entirely sure how to use.
> Perhaps Mark can get this bit going?
>
>
>
> --Russell
>
> ===============================
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>
>
>   my %hits_by_query;
>
>
>
>        while ( my $result = $blast_report->next_result ) {
>
>          foreach my $hit ( $result->hits ) {
>
>                      warn "Pushed a hit for ",$hit->name, "\n";
>
>            push( @{ $hits_by_query{ $hit->name } }, $hit );
>
>          }
>
>        }
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>               warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::BlastResult->new() or die $!;
>
>     print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format =>
> "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
> ===============================
>
>
>
> *From:* Tim Koehler [mailto:timbourine81 at googlemail.com]
> *Sent:* Friday, 27 November 2009 10:24 p.m.
> *To:* Smithies, Russell; maj at fortinbras.us
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hey guys,
>
> please do not get me wrong: I did not mean to put the workload on you. So
> far I had only found the HOWTOs, but in them the variable naming changes
> over time (e.g. $in to $Seq_in), and some things I simply could not find.
> Now I got a tip where else to search: the scrapbook and deobfuscator.
>
> I immediately will have a look at that.
>
> This is the first time I have touched Linux / Perl commands; that's why,
> after several days of trial and many errors ;), I thought of asking the
> mailing list.
>
> I was very happy about your fast answers!
>
> Cheers and a nice weekend,
>
> Tim
>
> On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com>
> wrote:
>
> oops, sent too early...
>
> Hey Mark,
>
> thanks for the answer. But I am still struggling, especially where to put
> in your code.
>
> Here is the code I have so far:
>
> #!/usr/bin/perl -w
>
> ### should I put your code here as push is a perl command?
>
>
> my %hits_by_query;
> for ($result->hits) {
>
> ### I inserted a comma after name}}; if there is no comma, there was the error: Scalar found where operator expected at 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
> ###        (Missing operator before  $hit?)
> ###Useless use of push with no values at 12_BLAST_two_sequence_each_query_one_file.PL line 7.
> ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
> ###BEGIN not safe after errors--compilation aborted at 12_BLAST_two_sequence_each_query_one_file.PL line 13.
>
>
>  push @{$hits_by_query{$hit->name}}, $hit;
>
> ###here, every time this error appears: Name "main::result" used only once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> ###error: Can't call method "hits" on an undefined value at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
>
>
> }
>
>
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
> my $Seq_in = Bio::SeqIO->new (
> -file =>
> "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
> -format => 'fasta'
> );
> while (my $query = $Seq_in->next_seq()) {
>
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(
>
> 'program' => 'blastn',
> 'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
> _READMETHOD => "Blast"
> );
>
> my $blast_report = $factory->blastall($query);
>
> ### Should I need to use a module? are the commands here at the right position? errors, e.g., Global symbol "$hit" requires explicit package name
> #my %hits_by_query;
> #for ($result->hits) {
> ### inserted comma after name}}
> # push @{$hits_by_query{$hit->name}}, $hit;
> #}
>
>
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> ###where are the files stored? what is their name. Sorry, but I cannot get behind that :(
>
> while( my $result = $blast_report->next_result ) {
>   ## $result is a Bio::Search::Result::ResultI compliant object
>
>
>   while( my $hit = $result->next_hit ) {
>
>    ## $hit is a Bio::Search::Hit::HitI compliant object
>
>
>    while( my $hsp = $hit->next_hsp ) {
>
>     ## $hsp is a Bio::Search::HSP::HSPI compliant object
>     if( $hsp->length('total') > 50 ) {
>      if ( $hsp->percent_identity >= 75 ) {
>      print  "Query= ",        $result->query_name,
>         "Hit= ",        $hit->name,
>             "Length= ",     $hsp->length('total'),
>             "Percent_id= ", $hsp->percent_identity,
>         "Subject=",        $hsp->hit_string,"\n";
>      }
>     }
>    }
>   }
> }
> }
>
> Again, a big thanks in advance :)
>
> All the best,
>
> Tim
>
> On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:
>
> Hey Mark,
>
> thanks for the answer
>
>
>
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sound like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >>> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round; first to make fasta files with only
> >>> single sequences inside and BLAST each file? But how can I automize that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >>
> >>
> >
>
> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
>
>
>
>
>
>
>
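
A minimal sketch of the grouping step discussed in this thread, written with plain hashes instead of BioPerl objects so that it runs without a BLAST report. The records and the group_by_query helper are hypothetical. In a real script the key would presumably come from $result->query_name, and a relative file name such as "$qid.bls" is created in the current working directory of the script.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for parsed BLAST hits: each record names its query and hit.
my @records = (
    { query => 'q1', hit => 'hitA' },
    { query => 'q2', hit => 'hitB' },
    { query => 'q1', hit => 'hitC' },
);

# Collect hit names into a hash of array refs keyed by query name.
sub group_by_query {
    my @recs = @_;
    my %hits_by_query;
    for my $rec (@recs) {
        # push needs a comma between the array and the value being added
        push @{ $hits_by_query{ $rec->{query} } }, $rec->{hit};
    }
    return \%hits_by_query;
}

my $grouped = group_by_query(@records);

# One output file name per query; the name is relative to the working directory.
for my $qid ( sort keys %$grouped ) {
    my $file = "$qid.bls";
    print "$file: @{ $grouped->{$qid} }\n";
}
```

Using an explicit loop variable (my $rec) rather than the implicit $_ avoids the 'Global symbol "$hit" requires explicit package name' error under use strict.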


From maj at fortinbras.us  Sun Nov  1 23:47:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 1 Nov 2009 23:47:15 -0500
Subject: [Bioperl-l] annotations
Message-ID: <5150801225E0484D95DC51B2D00AE519@NewLife>

I'm cogitating on features and annotations. For a RichSeq, one gets the set of annotations by

$seq->annotation->get_Annotations

while getting features by 

$seq->get_Features

Is there a reason not to have a method in SeqI 

sub get_Annotations { shift->annotation->get_Annotations }

to allow a user to do what seems natural from a user's perspective, viz. $seq->get_Annotations? I imagine this might save hundreds of hours of frustration, integrated over all newbies.
MAJ


From cjfields at illinois.edu  Mon Nov  2 08:08:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Nov 2009 07:08:54 -0600
Subject: [Bioperl-l] annotations
In-Reply-To: <5150801225E0484D95DC51B2D00AE519@NewLife>
References: <5150801225E0484D95DC51B2D00AE519@NewLife>
Message-ID: <6920A9E1-D221-4CF8-9866-0ADBDB254C19@illinois.edu>

On Nov 1, 2009, at 10:47 PM, Mark A. Jensen wrote:

> I'm cogitating on features and annotations. For a RichSeq, one gets  
> the set of annotations by
>
> $seq->annotation->get_Annotations
>
> while getting features by
>
> $seq->get_Features
>
> Is there a reason not to have a method in SeqI
>
> sub get_Annotations { shift->annotation->get_Annotations }
>
> to allow a user to do what seems natural from a user's perspective,  
> viz. $seq->get_Annotations? I imagine this might save hundreds of  
> hours of frustration, integrated over all newbies.
> MAJ

One could add the methods to delegate to annotation() (that's  
essentially what I'm planning on doing for Biome).

chris
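
A minimal sketch of the delegation idea discussed in this thread. The My::AnnotationCollection and My::Seq classes are hypothetical stand-ins, not the real Bio::SeqI or Bio::Annotation::Collection code; only the one-line forwarding method mirrors the proposal.

```perl
use strict;
use warnings;

# Hypothetical minimal annotation collection.
package My::AnnotationCollection;
sub new { my ( $class, @ann ) = @_; return bless { ann => [@ann] }, $class }
sub get_Annotations { my $self = shift; return @{ $self->{ann} } }

# Hypothetical minimal sequence class that holds a collection.
package My::Seq;
sub new { my ( $class, %args ) = @_; return bless { annotation => $args{annotation} }, $class }
sub annotation { return $_[0]->{annotation} }

# The proposed convenience method: forward the call, and any arguments, to annotation().
sub get_Annotations { my $self = shift; return $self->annotation->get_Annotations(@_) }

package main;

my $seq = My::Seq->new(
    annotation => My::AnnotationCollection->new( 'comment', 'reference' ),
);
print join( ', ', $seq->get_Annotations ), "\n";
```

Passing @_ through keeps the shortcut usable if the underlying method accepts arguments, at the cost of one extra method call per lookup.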


From kiekyon.huang at gmail.com  Tue Nov  3 10:14:39 2009
From: kiekyon.huang at gmail.com (Kie Kyon Huang)
Date: Tue, 3 Nov 2009 23:14:39 +0800
Subject: [Bioperl-l] render_blast problem
Message-ID: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>

Hi,

I was trying to follow the HOWTO:Graphics at
http://www.bioperl.org/wiki/HOWTO:Graphics

When running the command line in cygwin

$ perl render_blast1.pl data1.txt | display -

I get the following error line,

bash: display: command not found

I also tried

$ perl render_blast1.pl data1.txt > data1.png

however, I was unable to open the data1.png file using Microsoft
Office Picture Manager or windows Photo Gallery

Thanks

Huang


From biopython at maubp.freeserve.co.uk  Tue Nov  3 10:45:37 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 15:45:37 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
Message-ID: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>

On Tue, Nov 3, 2009 at 3:14 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
> Hi,
>
> I was trying to follow the HOWTO:Graphics at
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> When running the command line in cygwin
>
> $ perl render_blast1.pl data1.txt | display -
>
> I get the following error line,
>
> bash: display: command not found

That makes sense on Windows, since display is a Unix
command line tool.

> I also tried
>
> $ perl render_blast1.pl data1.txt > data1.png

Based on the wiki, I think that ought to have worked.

> however, I was unable to open the data1.png file using Microsoft
> Office Picture Manager or windows Photo Gallery

Did you do this step?:
>> Important!  If you are on a Windows platform, you need to put
>> STDOUT into binary mode so that the PNG file does not go
>> through Window's carriage return/linefeed transformations.
>> Before the final print statement, put the statement
>> binmode(STDOUT). This advice also applies to certain older
>> versions of RedHat, which ship with a patched (and possibly
>> broken) version of Perl.

(BioPerl devs - couldn't that be added to the default
render_blast1.pl script with an if statement checking for
Windows?)

Peter
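
A minimal sketch of the platform check suggested above. The needs_binmode helper is hypothetical, and treating 'cygwin' the same as 'MSWin32' is an assumption based on the report in this thread rather than on documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: true for platforms where text-mode output may rewrite line endings.
sub needs_binmode {
    my ($os) = @_;
    return $os eq 'MSWin32' || $os eq 'cygwin' ? 1 : 0;
}

# $^O holds the name of the operating system this perl was built for.
binmode(STDOUT) if needs_binmode($^O);

# The image bytes would be printed after this point, e.g. print $panel->png;
```

Since binmode has no effect on Unix-like systems, calling it unconditionally before printing the image is also a reasonable choice.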


From biopython at maubp.freeserve.co.uk  Tue Nov  3 11:04:59 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 16:04:59 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
	<a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
Message-ID: <320fb6e00911030804r62e50da6w373bbb61e9823f28@mail.gmail.com>

Mailing list CC'd - solved :)

On Tue, Nov 3, 2009 at 3:55 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
>
> ok, that fix it
> i forget sometimes what platform am i on.
> thanks

Great.

Peter


From amackey at virginia.edu  Tue Nov  3 12:09:00 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Tue, 3 Nov 2009 12:09:00 -0500
Subject: [Bioperl-l] svn errors?
Message-ID: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>

[ajm6q at lc4 bioperl-live]$ svn update
svn: Decompression of svndiff data failed


I'll admit to not having svn updated in awhile; A clean, anonymous svn co
failed with the same message:

[...]
A    bioperl-live/Bio/Structure/StructureI.pm
A    bioperl-live/Bio/Structure/IO
svn: Decompression of svndiff data failed

-Aaron

P.S. I used this command: svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live


From cjfields at illinois.edu  Tue Nov  3 12:17:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:17:10 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <8C5FC42D-F957-45AC-9AAC-876ACC9D77E0@illinois.edu>

Aaron,

Yep, this was reported to support (a couple of users on #bioperl  
reported the same problem).  Chris D. is looking into it.

I'm wondering if it's worth setting up a second mirror to github for  
this purpose.

chris

On Nov 3, 2009, at 11:09 AM, Aaron Mackey wrote:

> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
>
>
> I'll admit to not having svn updated in awhile; A clean, anonymous  
> svn co
> failed with the same message:
>
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
>
> -Aaron
>
> P.S. I used this command: svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live


From cjfields at illinois.edu  Tue Nov  3 12:19:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:19:56 -0600
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
Message-ID: <8336341C-C7B4-4740-A7C3-E2DE5FDAF651@illinois.edu>


On Nov 3, 2009, at 9:45 AM, Peter wrote:

> ...
> Did you do this step?:
>>> Important!  If you are on a Windows platform, you need to put
>>> STDOUT into binary mode so that the PNG file does not go
>>> through Window's carriage return/linefeed transformations.
>>> Before the final print statement, put the statement
>>> binmode(STDOUT). This advice also applies to certain older
>>> versions of RedHat, which ship with a patched (and possibly
>>> broken) version of Perl.
>
> (BioPerl devs - couldn't that be added to the default
> render_blast1.pl script with an if statement checking for
> Windows?)
>
> Peter

Yes, that should be added.  I'll work on it.

chris


From mauricio at open-bio.org  Tue Nov  3 12:20:52 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Tue, 03 Nov 2009 11:20:52 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <4AF06674.30506@open-bio.org>

Hi Aaron,

This was reported a few days ago. Chris Dagdigian is working today on a 
fix for it.

Mauricio.

Aaron Mackey wrote:
> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
> 
> 
> I'll admit to not having svn updated in awhile; A clean, anonymous svn co
> failed with the same message:
> 
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
> 
> -Aaron
> 
> P.S. I used this command: svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live
> 


From rachitasharma at gmail.com  Tue Nov  3 17:12:11 2009
From: rachitasharma at gmail.com (Rachita Sharma)
Date: Tue, 3 Nov 2009 14:12:11 -0800
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>

I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(        -format => 'blast',
                                -file =>
"BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");


while( my $result = $in->next_result ) {
while( my $hit = $result->next_hit ) {

$sth->execute($result->query_name, $hit->name, $hit->significance);
print "Query executed!\n";

}
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw
/usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


From cjfields at illinois.edu  Tue Nov  3 22:42:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 21:42:55 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
References: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
Message-ID: <DD8E7843-7181-45AD-95B1-FD877D0A5D4E@illinois.edu>

Rachita,

You'll have to give us more to go on than this.  The best thing to do  
is file a bug report and attach an example PSI-BLAST report and code  
that causes the problem.  The $sth->execute(...) is a bit odd, but  
that shouldn't cause the error in question.

Also, make sure to stipulate the OS, version of BioPerl, and perl  
version.

chris

On Nov 3, 2009, at 4:12 PM, Rachita Sharma wrote:

> I am having trouble parsing PSI-BLAST results. Please help.
>
> The code is:
> my $in = new Bio::SearchIO(        -format => 'blast',
>                                -file =>
> "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");
>
>
> while( my $result = $in->next_result ) {
> while( my $hit = $result->next_hit ) {
>
> $sth->execute($result->query_name, $hit->name, $hit->significance);
> print "Query executed!\n";
>
> }
> }
>
> The error is:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: no data for midline  ***** No hits found ******
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
> STACK: Bio::SearchIO::blast::next_result
> /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
> STACK: BSubVCpsiRblast.pl:92
> -----------------------------------------------------------


From alexl at users.sourceforge.net  Wed Nov  4 02:30:21 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Wed, 04 Nov 2009 02:30:21 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
Message-ID: <msd43yycfm.fsf@allele2.localdomain>

Does the version of ExtUtils::Manifest really need to be strictly
greater than or equal to 1.52?

Currently this blocks me from updating the Fedora package of BioPerl to
1.6.1, because the perl that Fedora ships includes ExtUtils::Manifest
1.51_01, and hence the build fails with:

Checking prerequisites...
 - ERROR: ExtUtils::Manifest (1.51_01) is installed, but we need version >= 1.52

Full logs are here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1787483
http://koji.fedoraproject.org/koji/getfile?taskID=1787483&name=build.log

This is true even with the version of Perl in rawhide/F-12 etc.
(ExtUtils::Manifest is in the base perl package).

If it really is necessary, I would like to be armed with a good argument
why it needs to be updated, since the Perl package maintainer would have
to update the entire Perl package simply to get a more recent version of
one small subpackage.

Regards,
Alex


From jluis.lavin at unavarra.es  Wed Nov  4 03:43:35 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 09:43:35 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in a
 single list query
Message-ID: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>


Hello all,

I'm a newbie who is having terrible trouble trying to retrieve a list of
multiple sequences from NCBI and write them to a single file in Fasta
format.
The code I've written seems to read mylist and retrieve the sequences, but
it kinda overwrites them so that I only get the last sequence on the list.
I've been told to ask the people on this mailing list for help, since you
may have come across this problem too, or at least will know how to solve
it...

Here is my code, which basically consists of an STDIN prompt for the list
to be read into an array and a loop that reads each entry (stopping when
the list ends) and retrieves a sequence each time the loop runs,
writing that sequence to a fasta file. I only get one sequence back,
although it seems to perform the retrieval for each of the
sequences on the list...


#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
print "Enter your list name:";
my $archivo=<STDIN>;
chomp $archivo;
die ("Can't open input\n") unless (open(INFILE, $archivo));
my @lista = <INFILE>;
foreach my $seq (@lista) {
    if ($seq eq '') {
        die ("empty list")
        }
    else {
my $db = new Bio::DB::GenPept("-format" => "Fasta");
my $seqobj = $db->get_Seq_by_acc($seq);
my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;


An example list of sequences can be this one:

YP_003107578.1
YP_003106103.1
YP_003106552.1
YP_003106560.1
YP_003107053.1
YP_003107450.1
YP_003108000.1
YP_003105023.1
YP_003105264.1

Thanks in advance for your help ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From e.osimo at gmail.com  Wed Nov  4 04:54:52 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Wed, 4 Nov 2009 10:54:52 +0100
Subject: [Bioperl-l] Bio::Graphics and picture format
Message-ID: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>

Hello everyone,
do you know if it is possible to generate an image with Bio::Graphics in a
vector format? Is there a list of available formats?
Thanks
Emanuele


From David.Messina at sbc.su.se  Wed Nov  4 04:52:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 10:52:53 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>

>
> The code I've written seems to read mylist and retrieve the sequences, but
> it kinda overwrites them so that I only get the last sequence on the list.
>

With this line

my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta", -format =>
'fasta');


you are opening the filehandle for the output file inside your loop, so each
time it is writing over the previous file with an empty file. Then, you
write a single sequence to that file with this line

$out->write_seq($seqobj);


So when you are done, you just have the last sequence in the output file.

If you move the opening of the output filehandle outside the loop (it needs
to be done only once), then it should work as you expect.

Also, I notice the newline characters are not being removed from your
sequence IDs  (actually I'm a little surprised that the sequences are being
retrieved). Just to be safe, you may want to add the line

chomp @lista;


after

my @lista = <INFILE>;


Dave
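
A minimal sketch that reproduces the overwrite behaviour described above without BioPerl or network access. It uses plain file handles in place of Bio::SeqIO, on the assumption that a '>' file argument truncates the file in the same way; the helper names and the temporary directory are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir( CLEANUP => 1 );
my @ids = ( "YP_003107578.1\n", "YP_003106103.1\n", "YP_003106552.1\n" );
chomp @ids;    # strip the trailing newlines left by reading the list file

# Reopening with '>' on every pass truncates the file, so only the last record survives.
sub write_reopening {
    my ( $file, @records ) = @_;
    for my $rec (@records) {
        open my $out, '>', $file or die "Cannot open $file: $!";
        print {$out} "$rec\n";
        close $out;
    }
}

# Opening once before the loop keeps every record.
sub write_once {
    my ( $file, @records ) = @_;
    open my $out, '>', $file or die "Cannot open $file: $!";
    print {$out} "$_\n" for @records;
    close $out;
}

sub count_lines {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

write_reopening( "$dir/reopened.txt", @ids );
write_once( "$dir/once.txt", @ids );
printf "reopened: %d line(s), opened once: %d line(s)\n",
    count_lines("$dir/reopened.txt"), count_lines("$dir/once.txt");
```

In the original script the equivalent change would be to create the Bio::SeqIO output object, and probably the database object as well, before the foreach loop.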


From jluis.lavin at unavarra.es  Wed Nov  4 05:14:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:14:40 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
Message-ID: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>

Thank you very very much Dave,
I've had a really frustrating time trying to find out what I was doing
wrong; it has been so frustrating that I was about to quit Bioperl.
Now I can try to focus on BLAST parsing for my comparative genomic analysis.

You're great on this mailing list, because you give fast and neat advice
on all the questions asked here by newbies like me ;)


On Wed, 4 November 2009, 10:52, Dave Messina wrote:
>>
>> The code I've written seems to read mylist and retrieve the sequences, but
>> it kinda overwrites them so that I only get the last sequence on the list.
>>
>
> With this line
>
> my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta", -format =>
> 'fasta');
>
>
> you are opening the filehandle for the output file inside your loop, so each
> time it is writing over the previous file with an empty file. Then, you
> write a single sequence to that file with this line
>
> $out->write_seq($seqobj);
>
>
> So when you are done, you just have the last sequence in the output file.
>
> If you move the opening of the output filehandle outside the loop (it needs
> to be done only once), then it should work as you expect.
>
> Also, I notice the newline characters are not being removed from your
> sequence IDs  (actually I'm a little surprised that the sequences are being
> retrieved). Just to be safe, you may want to add the line
>
> chomp @lista;
>
>
> after
>
> my @lista = <INFILE>;
>
>
>
>
> Dave
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From hrh at fmi.ch  Wed Nov  4 05:05:17 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Wed, 04 Nov 2009 11:05:17 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <C717106D.54F2%hrh@fmi.ch>

Hi

try

my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
                                     ^

this way you no longer overwrite your existing file, but append the next
sequence.

Regards, Hans
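
A minimal sketch of one consequence of append mode, again using plain file handles rather than Bio::SeqIO and hypothetical helper names: '>>' keeps earlier records within a run, but it also keeps the output of earlier runs, so running the script twice doubles the file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/extracted.txt";

# Simulates one run of a script that reopens the file in append mode for each record.
sub run_with_append {
    my ( $path, @records ) = @_;
    for my $rec (@records) {
        open my $out, '>>', $path or die "Cannot open $path: $!";
        print {$out} "$rec\n";
        close $out;
    }
}

sub count_lines {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

run_with_append( $file, qw(seq1 seq2) );
my $after_first = count_lines($file);

run_with_append( $file, qw(seq1 seq2) );    # a second run adds to the old output
my $after_second = count_lines($file);

print "after first run: $after_first, after second run: $after_second\n";
```

Deleting the old file before the loop (for example with unlink), or opening the output once with '>' outside the loop, avoids the accumulation.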


On 11/4/09 9:43 AM, "jluis.lavin at unavarra.es" <jluis.lavin at unavarra.es>
wrote:

> 
> Hello all,
> 
> I'm a newbie who is having terrible trouble trying to retrieve a list of
> multiple sequences from NCBI and write them to a single file in Fasta
> format.
> The code I've written seems to read mylist and retrieve the sequences, but
> it kinda overwrites them so that I only get the last sequence on the list.
> I've been told to ask the people on this mailing list for help, since you
> may have come across this problem too, or at least will know how to solve
> it...
>
> Here is my code, which basically consists of an STDIN prompt for the list
> to be read into an array and a loop that reads each entry (stopping when
> the list ends) and retrieves a sequence each time the loop runs,
> writing that sequence to a fasta file. I only get one sequence back,
> although it seems to perform the retrieval for each of the
> sequences on the list...
> 
> 
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::GenPept;
> use Bio::DB::GenBank;
> use Bio::SeqIO;
> print "Enter your list name:";
> my $archivo=<STDIN>;
> chomp $archivo;
> die ("Can't open input\n") unless (open(INFILE, $archivo));
> my @lista = <INFILE>;
> foreach my $seq (@lista) {
>     if ($seq eq '') {
>         die ("empty list")
>         }
>     else {
> my $db = new Bio::DB::GenPept("-format" => "Fasta");
> my $seqobj = $db->get_Seq_by_acc($seq);
> my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
> -format => 'fasta');
> $out->write_seq($seqobj);
> }
> }
> exit;
> 
> 
> An example list of sequences can be this one:
> 
> YP_003107578.1
> YP_003106103.1
> YP_003106552.1
> YP_003106560.1
> YP_003107053.1
> YP_003107450.1
> YP_003108000.1
> YP_003105023.1
> YP_003105264.1
> 
> Thanks in advance for your help ;)


From jluis.lavin at unavarra.es  Wed Nov  4 05:25:38 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:25:38 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 asingle list query
In-Reply-To: <C717106D.54F2%hrh@fmi.ch>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<C717106D.54F2%hrh@fmi.ch>
Message-ID: <1834.130.206.164.153.1257330338.squirrel@webmail.unavarra.es>

Thank you very much for your answer Hans!!!
It works perfectly, also a neat and fast solution, like Dave's.

Blessings to you all ;)

On Wed, 4 November 2009, 11:05, Hotz, Hans-Rudolf wrote:
> Hi
>
> try
>
> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>                                      ^
>
> this way you no longer overwrite your existing file, but append the next
> sequence.
>
> Regards, Hans
>
>
>
> On 11/4/09 9:43 AM, "jluis.lavin at unavarra.es" <jluis.lavin at unavarra.es>
> wrote:
>
>>
>> Hello all,
>>
>> I'm a newbie who is having terrible trouble trying to retrieve a list of
>> multiple sequences from NCBI and write them to a single file in Fasta
>> format.
>> The code I've written seems to read mylist and retrieve the sequences, but
>> it kinda overwrites them so that I only get the last sequence on the list.
>> I've been told to ask the people on this mailing list for help, since you
>> may have come across this problem too, or at least will know how to solve
>> it...
>>
>> Here is my code, which basically consists of an STDIN prompt for the list
>> to be read into an array and a loop that reads each entry (stopping when
>> the list ends) and retrieves a sequence each time the loop runs,
>> writing that sequence to a fasta file. I only get one sequence back,
>> although it seems to perform the retrieval for each of the
>> sequences on the list...
>>
>>
>> #!/usr/bin/perl -w
>> use strict;
>> use Bio::DB::GenPept;
>> use Bio::DB::GenBank;
>> use Bio::SeqIO;
>> print "Enter your list name:";
>> my $archivo=<STDIN>;
>> chomp $archivo;
>> die ("Can't open input\n") unless (open(INFILE, $archivo));
>> my @lista = <INFILE>;
>> foreach my $seq (@lista) {
>>     if ($seq eq '') {
>>         die ("empty list")
>>         }
>>     else {
>> my $db = new Bio::DB::GenPept("-format" => "Fasta");
>> my $seqobj = $db->get_Seq_by_acc($seq);
>> my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
>> -format => 'fasta');
>> $out->write_seq($seqobj);
>> }
>> }
>> exit;
>>
>>
>> An example list of sequences can be this one:
>>
>> YP_003107578.1
>> YP_003106103.1
>> YP_003106552.1
>> YP_003106560.1
>> YP_003107053.1
>> YP_003107450.1
>> YP_003108000.1
>> YP_003105023.1
>> YP_003105264.1
>>
>> Thanks in advance for your help ;)
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From scott at scottcain.net  Wed Nov  4 08:26:02 2009
From: scott at scottcain.net (Scott Cain)
Date: Wed, 4 Nov 2009 08:26:02 -0500
Subject: [Bioperl-l] Bio::Graphics and picture format
In-Reply-To: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
References: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
Message-ID: <0FB17FBC-16BE-4A9F-AC75-983D3B4ECE7D@scottcain.net>

Hi Emanuele,

It is possible to use GD::SVG instead of GD to generate SVG graphics.   
To use it, you provide an argument of "-image_class  GD::SVG" to the  
constructor of Bio::Graphics::Panel.  See the perldoc of  
Bio::Graphics::Panel for more info.

Scott


On Nov 4, 2009, at 4:54 AM, Emanuele Osimo wrote:

> Hello everyone,
> do you know if it is possible to generate an image with  
> Bio::Graphics in a
> vector format? Is there a list of available formats?
> Thanks
> Emanuele

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From b3sn7 at UNB.ca  Tue Nov  3 12:30:24 2009
From: b3sn7 at UNB.ca (Sharma, Rachita)
Date: Tue,  3 Nov 2009 13:30:24 -0400
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <1257269424.4af068b045434@webmail.unb.ca>


I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(	-format => 'blast',
				-file => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");


while( my $result = $in->next_result ) {
while( my $hit = $result->next_hit ) {

$sth->execute($result->query_name, $hit->name, $hit->significance);
print "Query executed!\n";  

}
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


*******************************
Rachita Sharma
Research Assistant (PhD Student)
University of New Brunswick, NB, CANADA
email: Rachita.Sharma at unb.ca
Phone no: 503-895-3619
*******************************


From cjfields at illinois.edu  Wed Nov  4 08:53:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:53:35 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <msd43yycfm.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
Message-ID: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>

Alex,

Not sure why ExtUtils::Manifest can't be bundled as a separate perl  
package alone.  It is part of perl core but it's also available on  
CPAN separately from perl itself:

http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

The commit that added this requirement is linked below, BTW.  v1.52 is a
bug-fix release that allows spaces in file names in the MANIFEST, which
is why it is required.

http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

chris
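
A minimal sketch of why an installed 1.51_01 does not satisfy a ">= 1.52" prerequisite. The meets_minimum helper is hypothetical and only approximates what the build tool does: it drops the underscore that marks a developer release and compares the result numerically.

```perl
use strict;
use warnings;

# Hypothetical helper: compare a version string numerically after dropping the
# underscore of a developer release (e.g. '1.51_01' becomes 1.5101).
sub meets_minimum {
    my ( $installed, $required ) = @_;
    ( my $numeric = $installed ) =~ tr/_//d;
    return $numeric >= $required ? 1 : 0;
}

for my $installed (qw(1.51_01 1.52 1.57)) {
    printf "%-8s satisfies >= 1.52? %s\n",
        $installed, ( meets_minimum( $installed, 1.52 ) ? 'yes' : 'no' );
}
```

The installed version can be checked from a shell with: perl -MExtUtils::Manifest -e 'print ExtUtils::Manifest->VERSION'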

On Nov 4, 2009, at 1:30 AM, Alex Lancaster wrote:

> Does the version of ExtUtils::Manifest really need to be strictly
> greater than or equal to 1.52?
>
> Currently this blocks me updating the Fedora package of BioPerl to
> 1.6.1, because the version of perl that Fedora ships is on 1.51 and
> hence the build fails with:
>
> Checking prerequisites...
> - ERROR: ExtUtils::Manifest (1.51_01) is installed, but we need  
> version >= 1.52
>
> Full logs are here:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1787483
> http://koji.fedoraproject.org/koji/getfile?taskID=1787483&name=build.log
>
> This is true even with the version of Perl in rawhide/F-12 etc.
> (ExtUtils::Manifest is in the base perl package).
>
> If it really is necessary, I would like to be armed with a good
> argument why it needs to be updated, since the Perl package
> maintainer would have to update the entire Perl package simply to
> get a more recent version of one small subpackage.
>
> Regards,
> Alex
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Nov  4 08:55:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:55:34 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <1257269424.4af068b045434@webmail.unb.ca>
References: <1257269424.4af068b045434@webmail.unb.ca>
Message-ID: <70E34111-4E70-463D-86EE-06926EA57073@illinois.edu>

Rachita,

Asked and answered yesterday.  Please submit as a bug.

chris



From David.Messina at sbc.su.se  Wed Nov  4 09:11:43 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 15:11:43 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es> 
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com> 
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>

Aw shucks, José, glad I could be of help. There are plenty of people who
answer questions around here, but my timezone sometimes gives me an
advantage for the European ones. :)


Dave


From daniel.gaston at gmail.com  Wed Nov  4 09:45:04 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 10:45:04 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040645j1b28e727p5d7bf47a04db160b@mail.gmail.com>

Hi Everyone,

I have recently been playing around with SwissProt format flatfiles and want
to extract sequences based on subcellular localization. I notice in going
through the code for swiss.pm and swissdriver.pm that in both (more so in
swissdriver.pm) there are several steps where organelle information based on
the OG line could be extracted and added to the data structure but isn't. It
seems that in both cases the OG line is being added to the generic
lumping of data from the OC, OS, and OX lines in order to extract species
names and taxonomy information, while everything else is discarded. Is there
a particular reason for this, or is it just a simple oversight? On the surface
at least it looks like a relatively simple modification to make, although I
admit that I am not terribly adept at manipulating these SeqIO
data structures.

Thanks for your time,

Dan


From daniel.gaston at gmail.com  Wed Nov  4 12:12:10 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 13:12:10 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040912pfd2483fwe44cd098beed73c7@mail.gmail.com>

Sorry folks, it appears I was just being a bonehead and didn't look closely
enough into the Bio::Annotation and Bio::Species objects that store all of
this data.

Dan



From jluis.lavin at unavarra.es  Thu Nov  5 10:28:23 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:28:23 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
Message-ID: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>


Hello to all,

I'm trying to write a script to retrieve a list of sequences from a local
FASTA file (for example, a FASTA archive where all the protein models of an
organism are stored). I would use this file as a kind of "local
database" (sorry if I am mixing up a few concepts...).
I've been reading the BioPerl HOWTOs and I came across the
Bio::Index::Fasta tool.
If I haven't misunderstood what I read (which is quite possible given my
low level of programming), this indexing tool should do the job.
I wrote a couple of scripts based on the documentation I read about this
tool, but I don't seem to be able to create the index file to be used
later (to retrieve the sequences from).
-First of all, I want to ask the people in this forum whether
Bio::Index::Fasta is the right choice for this task.
-Then I'll beg you to take a look at my scripts, because I don't seem to
catch the bug...

Best wishes to you all and thanks in advance ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 10:39:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:39:05 -0500
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <A28922858F64480ABD8A6696E269023C@NewLife>

José -- It looks like this is a good solution to your problem. Please send your
script so we can look at it-
cheers Mark


From jluis.lavin at unavarra.es  Thu Nov  5 10:46:36 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:46:36 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct
 use]
Message-ID: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>


---------------------------- Original Message ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Date:    Thu, November 5, 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts: the first one creates the index and
the second one retrieves the sequence list from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2)And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not total rubbish...

Thanks in advance ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN




From maj at fortinbras.us  Thu Nov  5 10:37:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:37:53 -0500
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI ina
	single list query
In-Reply-To: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
	<628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
Message-ID: <49075FDFF6764EE48E932D95EB994221@NewLife>

True, Dave, you compete only with crazed east coast core developers who're doing 
"just one more thing" at 2am....


From hrh at fmi.ch  Thu Nov  5 11:02:48 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 05 Nov 2009 17:02:48 +0100
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <C718B5B8.5561%hrh@fmi.ch>


Jluis

> -Then I'll beg you to take a look at my scripts, because I don't seem to
> catch the bug...

You haven't attached/included any scripts, have you?


Anyway, have you considered using BLAST indices (created with the additional
flag "-o") together with the tool 'fastacmd' (which is also included in the
NCBI blast binaries) as a simple (and very fast) alternative for fetching
sequences?


Regards, Hans


From maj at fortinbras.us  Thu Nov  5 11:02:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:02:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
Message-ID: <1984ED07F36C446284B25F617964B6C6@NewLife>

Hey José,
The first thing that jumps out is the index file name. Looks
like you create it as
PC9.fasta.idx
But you read it as
PC9.fasta
Not an unusual mistake. Do
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and see if it works.
MAJ
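
A minimal sketch of how the two scripts might look once both derive the
index name the same way, assuming BioPerl is installed. The helper names
(index_name_for, read_ids, build_index, extract_ids) are illustrative and
not part of BioPerl. The sketch also opens the ID list explicitly and strips
line endings, since <lCS.txt> in the posted script is a filename glob rather
than a read of the file's contents:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: derive the index name from the FASTA name, so the
# code that builds the index and the code that reads it always agree.
sub index_name_for {
    my ($fasta) = @_;
    return "$fasta.idx";
}

# Hypothetical helper: read one ID per line, dropping line endings and blanks.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while ( my $line = <$fh> ) {
        $line =~ s/\s+\z//;    # strips "\n" and any trailing "\r"
        push @ids, $line if length $line;
    }
    return @ids;
}

# Build the index for one FASTA file.
sub build_index {
    my ($fasta) = @_;
    require Bio::Index::Fasta;    # loaded lazily; needs BioPerl installed
    my $inx = Bio::Index::Fasta->new(
        -filename   => index_name_for($fasta),
        -write_flag => 1,
    );
    $inx->make_index($fasta);
    return;
}

# Fetch the listed IDs and append them to a FASTA output file.
sub extract_ids {
    my ( $fasta, $list_file, $out_file ) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;

    open my $fh, '<', $list_file or die "Cannot open $list_file: $!";
    my @ids = read_ids($fh);
    close $fh;
    die "empty list\n" unless @ids;

    my $inx = Bio::Index::Fasta->new( -filename => index_name_for($fasta) );
    my $out = Bio::SeqIO->new( -file => ">>$out_file", -format => 'fasta' );
    for my $id (@ids) {
        my $seq = $inx->fetch($id);
        if   ($seq) { $out->write_seq($seq) }
        else        { warn "ID not found in index: $id\n" }
    }
    return;
}
```

For example, build_index('PC9.fasta') once, followed by
extract_ids('PC9.fasta', 'LCS.txt', 'extracted_seqs.fasta') whenever
sequences are needed.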


From jluis.lavin at unavarra.es  Thu Nov  5 11:21:57 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 17:21:57 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
 correct use]
In-Reply-To: <1984ED07F36C446284B25F617964B6C6@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
Message-ID: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>

Thank you very much Mark, that's a good point :$
I guess your correction refers to the second script, doesn't it?

If so, there is still a problem with the first script: it doesn't
create the PC9.fasta.idx file; instead it creates two files named:
-PC9.fasta.idx.pag
-PC9.fasta.idx.dir

which clearly seem to be related to some kind of indexing process... but,
unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
find it anywhere...
Forgive me if I'm talking nonsense...

Thank you very much again for your help ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 11:39:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:39:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
	<2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
Message-ID: <A1ACC4B552514872B77208248B31977C@NewLife>

Yes, these are the files created by SDBM, Perl's built-in DBM manager. You
should be able to open the index simply with
$inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and the dbm will know what to do--
cheers MAJ
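
The suffixes can be seen without BioPerl at all. A small sketch using the
SDBM_File module that ships with Perl (the name demo.idx and the temporary
directory are made up for illustration); on most platforms it should list
demo.idx.dir and demo.idx.pag, and no plain demo.idx:

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # core module, no BioPerl needed
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $base = "$dir/demo.idx";      # hypothetical index name

# Tie a hash to an SDBM database called "demo.idx" and store one record.
tie my %db, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $base: $!";
$db{seq1} = 'some record';
untie %db;

# List what was actually written: SDBM appends its own suffixes.
opendir my $dh, $dir or die "Cannot read $dir: $!";
my @files = sort grep { !/^\./ } readdir $dh;
closedir $dh;
print "$_\n" for @files;
```

The base name is what gets passed when reopening the database, which is why
'PC9.fasta.idx' is the name to give Bio::Index::Fasta->new.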
----- Original Message ----- 
From: <jluis.lavin at unavarra.es>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
Sent: Thursday, November 05, 2009 11:21 AM
Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct 
use]


> Thank you very much Mark, that?s a good point :$
> I guess your correction is referred to the second script, isn?t it?
>
> If it is so, there is still a problem with the first script, it doesn?t
> create the PC9.fasta.idx file, instead it creates two files named:
> -PC9.fasta.idx.pag
> -PC9.fasta.idx.dir
>
> which seem to be clearly related with some kind of indexing process...but,
> unless the PC9.fasta.idx file is only virtual or remains hidden, I can?t
> find it anywhere...
> Forgive me if I?m talking nosense...
>
> Thank you very much again for your help ;)
>
>
> El Jue, 5 de Noviembre de 2009, 17:02, Mark A. Jensen escribi?:
>> Hey Jos?,
>> The first thing that jumps out it the index file name. Looks
>> like you create it as
>> PC9.fasta.idx
>> But you read it as
>> PC9.fasta
>> Not an unusual mistake. Do
>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> and see if it works.
>> MAJ
>> ----- Original Message -----
>> From: <jluis.lavin at unavarra.es>
>> To: <bioperl-l at lists.open-bio.org>
>> Sent: Thursday, November 05, 2009 10:46 AM
>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
>> correct
>> use]
>>
>>
>>
>>
>> ---------------------------- Mensaje original ----------------------------
>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
>> From:    jluis.lavin at unavarra.es
>> Fecha:   Jue, 5 de Noviembre de 2009, 16:46
>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>> --------------------------------------------------------------------------
>>
>> Hi Mark,
>>
>> I?ve actually got two scripts, the first one is to create the index and
>> the second one is to retrieve the sequence lis from the indexed file.
>>
>> 1)Here is the Index creation script:
>>
>> #!/c:/Perl -w
>> use strict;
>> use Bio::Index::Fasta;
>> use strict;
>>
>> print "Enter file for indexing: \n";
>> my $Index_File_Name = <STDIN>;
>> my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
>>     -write_flag => 1);
>> $inx->make_index(my $File_Name);
>>
>> 2)And here is the sequence retrieval script:
>>
>> #!/c:/Perl -w
>> use Bio::Index::Fasta;
>> use strict;
>> #PC9.fasta is my genomic file
>> my $Index_File_Name ="PC9.fasta";
>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>> #LCS.txt is my sequences list
>> @ARGV = <lCS.txt>;
>> foreach  my $id (@ARGV) {
>> if ($id eq ''){
>> die ("empty list")
>> }
>> else {
>> my $seqobj = $inx->fetch($id);
>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>> -format => 'fasta');
>> $out->write_seq($seqobj);
>> }
>> }
>> exit;
>> }
>>
>> I hope this code is not a total scum...
>>
>> Thanks in advance ;)
>>
>>
>>
>> El Jue, 5 de Noviembre de 2009, 16:39, Mark A. Jensen escribi?:
>>> Jos? -- It looks like this is a good solution to your problem. Please
>>> send
>>> you
>>> script so we can look at it-
>>> cheers Mark
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 10:28 AM
>>> Subject: [Bioperl-l] A question about iBio::Index: and its correct use
>>>
>>>
>>>
>>> Hello to all,
>>>
>>> I?m trying to write a script to retrieve a list of sequences from a
>>> local
>>> FASTA file (for example a fasta archive where all the protein models of
>>> an
>>> organism are stored). This file would be used by me as some kind "local
>>> database" (sorry if I mistake a few concepts...)
>>> I?ve been reading the BioPerl HOWTOs and I came across the
>>> Bio::Index::Fasta tool.
>>> If I didn?t misunderstood what I read (which can be easy because my low
>>> level on programming) this Indexing tool should do the job.
>>> I wrote a couple of scripts based on the documentation i read about this
>>> tool, but I don?t seem to be able to create the index file to be used
>>> later (to retrieve the sequences from).
>>> -First of all, I want to ask the people in this forum if the
>>> Bio::Index::Fasta is the right one to chose for this tasks.
>>> -Then I?ll beg you to take a look at my scripts, because I don?t seem to
>>> catch the bug...
>>>
>>> Best wishes to you all and thanks in advance ;)
>>>
>>> --
>>> Jos? Luis Lav?n Trueba, PhD
>>>
>>> Dpto. de Producci?n Agraria
>>> Grupo de Gen?tica y Microbiolog?a
>>> Universidad P?blica de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
>>
>>
>> --
>> Dr. José Luis Lavín Trueba
>>
>> Dpto. de Producción Agraria
>> Grupo de Genética y Microbiología
>> Universidad Pública de Navarra
>> 31006 Pamplona
>> Navarra
>> SPAIN
>>
>>
>>
>>
>>
>>
>
>
> -- 
> Dr. José Luis Lavín Trueba
>
> Dpto. de Producción Agraria
> Grupo de Genética y Microbiología
> Universidad Pública de Navarra
> 31006 Pamplona
> Navarra
> SPAIN
>
>
>
> 


From jluis.lavin at unavarra.es  Thu Nov  5 12:48:12 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 18:48:12 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <C718B5B8.5561%hrh@fmi.ch>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
	<C718B5B8.5561%hrh@fmi.ch>
Message-ID: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>

Thanks a lot for your help, Hans.
It's a little too hard for me to understand this awesome information
you've just given me and turn it into a script... I hope I can use it in
the near future anyway ;)
The issue here is that the sequences I'm indexing are not generated by the
NCBI nor stored there... although I believe you're just referring to the
tool itself and not to retrieval from the NCBI.

Thanks again, you're all great at giving advice to newbies like me ;)

Best wishes to you all


On Thu, 5 November 2009, 17:02, Hotz, Hans-Rudolf wrote:
>
>
>
> Jluis
>
>> -Then I'll beg you to take a look at my scripts, because I don't seem to
>> catch the bug...
>
> you haven't attached/included any scripts, have you?
>
>
> Anyway, have you considered using BLAST indices (created with the
> additional flag "-o") together with the tool 'fastacmd' (which is also
> included in the NCBI blast binaries) as a simple (and very fast)
> alternative for fetching sequences?
>
>
> Regards, Hans
>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From florent.angly at gmail.com  Thu Nov  5 13:00:19 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 05 Nov 2009 10:00:19 -0800
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<C718B5B8.5561%hrh@fmi.ch>
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
Message-ID: <4AF312B3.9060009@gmail.com>

Hans-Rudolf was talking about a way to retrieve sequences from a BLAST 
database. If you use BLAST locally, then your database is local too.
More info here: 
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
Florent


jluis.lavin at unavarra.es wrote:
> Thanks a lot for your help, Hans.
> It's a little too hard for me to understand this awesome information
> you've just given me and turn it into a script... I hope I can use it in
> the near future anyway ;)
> The issue here is that the sequences I'm indexing are not generated by the
> NCBI nor stored there... although I believe you're just referring to the
> tool itself and not to retrieval from the NCBI.
>
> Thanks again, you're all great at giving advice to newbies like me ;)
>
> Best wishes to you all
>
>
> On Thu, 5 November 2009, 17:02, Hotz, Hans-Rudolf wrote:
>   
>>
>> Jluis
>>
>>     
>>> -Then I'll beg you to take a look at my scripts, because I don't seem to
>>> catch the bug...
>>>       
>> you haven't attached/included any scripts, have you?
>>
>>
>> Anyway, have you considered using BLAST indices (created with the
>> additional flag "-o") together with the tool 'fastacmd' (which is also
>> included in the NCBI blast binaries) as a simple (and very fast)
>> alternative for fetching sequences?
>>
>>
>> Regards, Hans
>>
>>
>>
>>     
>
>
>   


From valiente at lsi.upc.edu  Fri Nov  6 03:06:48 2009
From: valiente at lsi.upc.edu (valiente at lsi.upc.edu)
Date: Fri, 6 Nov 2009 09:06:48 +0100 (CET)
Subject: [Bioperl-l] Bio::SeqIO::genbank.pm
Message-ID: <45737.147.83.59.225.1257494808.squirrel@webmail.lsi.upc.edu>


There is a line in Bio::SeqIO::genbank.pm to convert data in classification
lines into a classification array by splitting only on ';' or '.' so that a
classification that is 2 or more words will still get matched:

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;\.]+/, $class_lines;

But this will break organism names that have a dot inside, such as
"Salmonella enterica subsp. enterica serovar Typhimurium", which is now
being broken into "Salmonella enterica subsp" and "enterica serovar
Typhimurium".

Changing [;\.] to [;] solves this issue:

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;]+/, $class_lines;

Does anybody want to further test it before I commit this change?

Thanks,
Gabriel
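
The effect of the two split patterns quoted above can be checked in isolation. What follows is only a minimal sketch: the classification string is made up for the illustration (it is not taken from a real GenBank record), and the helper name clean() is an assumption standing in for the map block, not something from genbank.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical classification text, invented for this illustration.
my $class_lines =
    'Bacteria; Proteobacteria; Salmonella enterica subsp. enterica serovar Typhimurium.';

# Trim each field and collapse runs of whitespace, as the map block does.
# Works on a copy so that the caller's values are never modified.
sub clean {
    my @fields = @_;
    for (@fields) { s/^\s+//; s/\s+$//; s/\s{2,}/ /g }
    return @fields;
}

# Current pattern: split on runs of ';' or '.'.
my @old = clean( split /(?<!subgen)[;\.]+/, $class_lines );

# Proposed pattern: split on runs of ';' only.
my @new = clean( split /(?<!subgen)[;]+/, $class_lines );

print scalar(@old), " fields with [;\\.]: ", join( ' | ', @old ), "\n";
print scalar(@new), " fields with [;]:   ", join( ' | ', @new ), "\n";
```

With this input the current pattern yields four fields and cuts the organism name after "subsp", while the proposed pattern yields three and keeps the name whole. One side effect is visible in the sketch: with [;] alone, a terminating '.' stays attached to the last field. Whether the surrounding parser already strips that dot is not shown in the message, so it may be worth checking as part of the testing.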


From jluis.lavin at unavarra.es  Fri Nov  6 03:44:45 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Fri, 6 Nov 2009 09:44:45 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <4AF312B3.9060009@gmail.com>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<
	C718B5B8.5561%hrh@fmi.ch> 
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
	<4AF312B3.9060009@gmail.com>
Message-ID: <1222.130.206.164.153.1257497085.squirrel@webmail.unavarra.es>

Thank you for the info, Florent!
I'll try to read all the information at the link you provided and figure
out how to make it work and whether it is worthwhile for me. I work with
several sequence files that come from multiple databases (JGI, BROAD,
Genolevures or NCBI), and the protein IDs from each of those databases
differ from NCBI's. Maybe it would be easier to write a script that lets
me supply a FASTA file with all the protein models of a single organism,
parses it, and then extracts the sequences in a given list (using the
"ID style" of the particular database) than to create a BLAST index for
each organism I need to work with... Did I explain the issue correctly?
Anyway, since I don't know anything about this tool you and Hans pointed
me to, I can easily be wrong...
Thank you for showing me the local BLAST index tool; I'll read the
documentation carefully and study all its possibilities.
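
The script described above (parse a FASTA file, then pull out the sequences named in a list) can be sketched in plain Perl. This is a hedged illustration, not the Bio::Index::Fasta approach discussed earlier in the thread: it keeps everything in memory instead of building an on-disk index, the sub name read_fasta() is an assumption, and the IDs and sequences are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read FASTA text from a filehandle and return a hash ref of ID => sequence.
# The ID is the first whitespace-delimited word after '>', so whatever
# "ID style" the source database uses is kept as-is.
sub read_fasta {
    my ($fh) = @_;
    my ( %seq_for, $id );
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;
        if ( $line =~ /^>(\S+)/ ) {
            $id = $1;
            $seq_for{$id} = '';
        }
        elsif ( defined $id ) {
            $line =~ s/\s+//g;
            $seq_for{$id} .= $line;
        }
    }
    return \%seq_for;
}

# Made-up in-memory FASTA data; a real script would open a file instead.
my $fasta = <<'END';
>jgi|Example|101 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>jgi|Example|102 another made-up protein
MSLNEQW
END

open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";
my $seq_for = read_fasta($fh);
close $fh;

# List of IDs to extract; the second one is deliberately absent.
my @wanted = ( 'jgi|Example|102', 'jgi|Example|999' );

my ( $out, @missing ) = ('');
for my $id (@wanted) {
    if ( exists $seq_for->{$id} ) {
        $out .= ">$id\n$seq_for->{$id}\n";
    }
    else {
        push @missing, $id;
    }
}

print $out;
print "not found: @missing\n" if @missing;
```

Holding every sequence in a hash is fine for one organism's protein models; for much larger files an on-disk index (the job Bio::Index::Fasta or a BLAST index does) avoids re-reading the whole file on every run.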

Best wishes

JL


On Thu, 5 November 2009, 19:00, Florent Angly wrote:
> Hans-Rudolf was talking about a way to retrieve sequences from a BLAST
> database. If you use BLAST locally, then your database is local too.
> More info here:
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
> Florent
>
>
> jluis.lavin at unavarra.es wrote:
>> Thanks a lot for your help, Hans.
>> It's a little too hard for me to understand this awesome information
>> you've just given me and turn it into a script... I hope I can use it in
>> the near future anyway ;)
>> The issue here is that the sequences I'm indexing are not generated by
>> the NCBI nor stored there... although I believe you're just referring to
>> the tool itself and not to retrieval from the NCBI.
>>
>> Thanks again, you're all great at giving advice to newbies like me ;)
>>
>> Best wishes to you all
>>
>>
>> On Thu, 5 November 2009, 17:02, Hotz, Hans-Rudolf wrote:
>>
>>>
>>> Jluis
>>>
>>>
>>>> -Then I'll beg you to take a look at my scripts, because I don't seem
>>>> to catch the bug...
>>>>
>>> you haven't attached/included any scripts, have you?
>>>
>>>
>>> Anyway, have you considered using BLAST indices (created with the
>>> additional flag "-o") together with the tool 'fastacmd' (which is also
>>> included in the NCBI blast binaries) as a simple (and very fast)
>>> alternative for fetching sequences?
>>>
>>>
>>> Regards, Hans
>>>
>>>
>>>
>>>
>>
>>
>>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Fri Nov  6 07:45:01 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 07:45:01 -0500
Subject: [Bioperl-l] Bioperl
In-Reply-To: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
References: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
Message-ID: <AE7A03CA8F45495C9F8D940AC0EC6D69@NewLife>

Hi Resmi-
You should look at http://bioperl.org/ under "Installation" for 
information on getting and installing BioPerl. An introduction 
to working with trees in BioPerl is at this link:
http://www.bioperl.org/wiki/HOWTO:Trees
cheers, 
Mark

----- Original Message ----- 
  From: Resmi S. 
  To: maj at fortinbras.us 
  Sent: Friday, November 06, 2009 7:27 AM
  Subject: Bioperl


  Respected Sir,
  I am Resmi S, a second-year MSc Bioinformatics student. I am doing my project on phylogenetic tree construction using BioPerl. I am not very familiar with the BioPerl modules, so could you please send me the names of the BioPerl modules needed for my project? I also need to know where I can get these modules. If they come from CPAN, please send me the location or link. I kindly request you to send me the details soon.

  Yours Sincerely,
     Resmi S,
     II MSc Bioinformatics,
     School of Biotechnology,
     Amrita Vishwa Vidyapeetham,
      Email : amm08bi019 at students.amrita.ac.in




From robert.bradbury at gmail.com  Fri Nov  6 12:35:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 6 Nov 2009 12:35:22 -0500
Subject: [Bioperl-l] Function that determines serious mutations
Message-ID: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>

Is there a function in the library (or has someone written one) that can
take a genbank entry and determine which mutations are harmful?

It would be used to produce a table summary of:
  GENE          # SNP      # BadSNP

One kind of gets this from NCBI: if you look up a gene name in the "GENE" db
and then go to the "GeneView" in dbSNP page, it has the information I want,
but largely in a graphical format, while I simply want numbers I can dump
into a spreadsheet.

I don't think it would be hard: fetch the gene, run through the features for
the SNP database, figure out whether they are good or bad SNPs, accumulate
the statistics and dump it.  I think the functions available are flexible
enough to do it, but I can't believe nobody has already done it.  It could be
a bit more complex in that one could do an analysis to see if the mutations
are in a conserved domain or code for Cysteine or Methionine (or other
potentially "critical" amino acids), but since "critical" is in the
eye of the beholder there would have to be some kind of callback to a
scoring function.

Thanks,
Robert


From nevoband at igb.uiuc.edu  Fri Nov  6 15:58:05 2009
From: nevoband at igb.uiuc.edu (kleenix)
Date: Fri, 6 Nov 2009 12:58:05 -0800 (PST)
Subject: [Bioperl-l]  StandAloneBlast Unallowed parameter
Message-ID: <26230896.post@talk.nabble.com>


I'm not sure if I'm doing this wrong. I am trying to use the -m parameter of
blastall through the StandAloneBlast BioPerl class.
When I add 'm'=>0 to @params I get an "Unallowed parameter:" error.
Am I adding the parameter wrong? I'm using StandAloneBlast version 1.51.

Thanks

-Nevo


From veronica.xiaoyu at gmail.com  Fri Nov  6 17:25:04 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 6 Nov 2009 17:25:04 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change the
	description's name of each hit?
Message-ID: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>

Hi,

I'm using Bio::SearchIO::Writer::HTMLResultWriter to help me convert a BLAST
output file into HTML.

Does anybody know how to parse and change the description of each hit?

Calling $hit->description returns the hit's description, but it does not
seem to allow the description to be modified.

Thank you very much,
Xiaoyu


From maj at fortinbras.us  Fri Nov  6 19:40:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 19:40:17 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change
	thedescription's name of each hit?
In-Reply-To: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
References: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
Message-ID: <11592B31D9924FA7A8638D90AE4A3F4A@NewLife>

Xiaoyu-
That method should work to change the description; are you doing

$hit->description('This is my new description');

This method returns the old description when you change the value:

$hit->description('old');
$str = $hit->description('new'); # $str eq 'old'
$str = $hit->description;            # $str eq 'new'
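
The accessor behaviour shown in the three lines above can be tried without BLAST output. The class below is a hypothetical stand-in written only for this illustration (My::Hit is not a BioPerl class); it simply mimics a combined getter/setter that returns the previous value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT the BioPerl hit object.
package My::Hit;

sub new {
    my ( $class, %args ) = @_;
    return bless { description => $args{description} }, $class;
}

# Combined getter/setter: returns the value held before the call and
# stores the argument if one was passed.
sub description {
    my ( $self, @args ) = @_;
    my $previous = $self->{description};
    $self->{description} = $args[0] if @args;
    return $previous;
}

package main;

my $hit = My::Hit->new( description => 'old' );

my $returned = $hit->description('new');    # setter call returns the old value
my $current  = $hit->description;           # plain getter call

print "setter returned: $returned\n";
print "now stored:      $current\n";
```

The practical consequence is that the return value of the setting call should not be read as confirmation of the new value; a separate getter call shows what is actually stored.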

MAJ

----- Original Message ----- 
From: "Xiaoyu Liang" <veronica.xiaoyu at gmail.com>
To: <Bioperl-l at lists.open-bio.org>
Sent: Friday, November 06, 2009 5:25 PM
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change 
thedescription's name of each hit?


> Hi,
>
> I'm using Bio::SearchIO::Writer HTMLResultWriter help me parse BLAST out
> file into HTML.
>
> Anybody knows how to parse and change the description name of each hit?
>
> By using hit->description can call hits' description, but it is not allowed
> to be modified.
>
> Thank you very much,
> Xiaoyu
>
> 


From Daniel.Lang at biologie.uni-freiburg.de  Sun Nov  8 09:50:48 2009
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Sun, 08 Nov 2009 15:50:48 +0100
Subject: [Bioperl-l] arguments to call back functions in GBrowse2
Message-ID: <4AF6DAC8.8070204@biologie.uni-freiburg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Lincoln,

a while back (May 29, 2009; 09:08pm) you replied to an even older thread
("Re: Access the parent of a Bio::DB::SeqFeature within a gbrowse config
callback function").

I missed your reply and did not follow it up back then, sorry!

I'm currently facing the same issue again with gbrowse2. I have a
callback function for "balloon click". Following your last reply I
expected 5 arguments, but I am getting only three: $feature,$panel,$track.

In principle, I am using the latest releases/checkouts...
Which modules do I need to look at/update for this functionality?

Furthermore, is there a possibility to share global variables between
gbrowse2 and slaves? Should this work via init_code?
Should modules initialized in a conf be in the scope of a slave?

If not can I introduce modules via the slave config files, or do I need
to alter the slave scripts?


Thanks, again!

Cheers,
Daniel


PS: gbrowse2 rocks!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkr22sUACgkQmJnbCpJAG3A2MgCdG61bNRGMFVWExagzMFejKMjO
FiUAn16nQNemDGSy8nJBS5dUHQMnDgrP
=ODxn
-----END PGP SIGNATURE-----


From maj at fortinbras.us  Sun Nov  8 11:09:43 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:09:43 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
Message-ID: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>

Hi All- 
Any plans in the works for a _possibly_fastq sequence guesser?
MAJ


From maj at fortinbras.us  Sun Nov  8 11:20:55 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:20:55 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
In-Reply-To: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
References: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
Message-ID: <E2407ED235C24BFF9A03377416109318@NewLife>

Never mind; got it covered-- MAJ
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "bioperl-l" <bioperl-l at lists.open-bio.org>
Sent: Sunday, November 08, 2009 11:09 AM
Subject: [Bioperl-l] GuessSeqFormat: fastq?


> Hi All- 
> Any plans in the works for a _possibly_fastq sequence guesser?
> MAJ
> 
>


From saikari78 at gmail.com  Mon Nov  9 10:47:10 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 15:47:10 +0000
Subject: [Bioperl-l] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090747p6702c62fibd7e8310d3a72dae@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From saikari78 at gmail.com  Mon Nov  9 11:05:57 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:05:57 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From cjfields at illinois.edu  Mon Nov  9 11:27:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 10:27:10 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
Message-ID: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>

On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:

> Hi,
>
> I'm using Bioperl to retrieve records from PubChem.
> I'm trying to find a way-but have been unsuccessful- to retrieve  
> from a
> compound record, the reference to the protein(s) that can synthesize  
> the
> compound.
> Thanks very much.
>
> saikari

The below bioperl script returns the GI for proteins that correspond  
to the substance passed on the command line; invoke using 'perl  
pc_substance.pl substance_requested'.  It probably needs more fiddling  
to catch everything but it should get you started.

For other bits and pieces (such as how to retrieve the raw sequence  
files), please see the EUtilities HOWTO:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

----------------------------------------

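#!/usr/bin/perl

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

# substance name (search term) from the command line
my $substance = shift or die "Usage: $0 substance_requested\n";

# search PubChem Substance and keep the result set on the history server
my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                      -db => 'pcsubstance',
                                      -term => $substance,
                                      -usehistory => 'y');

my $hist = $eutil->next_History || die "No history data returned\n";

# follow the links from the stored substance hits to the protein database
$eutil->reset_parameters(-eutil => 'elink',
                        -history => $hist,
                        -db      => 'protein',
                        -dbfrom  => 'pcsubstance',
                        -retmax  => 1000);

# print the linked protein GIs as a comma-separated list
say join(',',$eutil->get_ids);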


From saikari78 at gmail.com  Mon Nov  9 11:41:20 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:41:20 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
Message-ID: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>

Fabulous! Huge help.
saikari

On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu> wrote:

>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>
> Hi,
>>
>> I'm using Bioperl to retrieve records from PubChem.
>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>> compound record, the reference to the protein(s) that can synthesize the
>> compound.
>> Thanks very much.
>>
>> saikari
>>
>
> The below bioperl script returns the GI for proteins that correspond to the
> substance passed on the command line; invoke using 'perl pc_substance.pl
> substance_requested'.  It probably needs more fiddling to catch everything
> but it should get you started.
>
> For other bits and pieces (such as how to retrieve the raw sequence files),
> please see the EUtilities HOWTO:
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> chris
>
> ----------------------------------------
>
> #!/usr/bin/perl -w
>
> use 5.010;
> use strict;
> use warnings;
> use Bio::DB::EUtilities;
>
> my $substance = shift;
>
> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                     -db => 'pcsubstance',
>                                     -term => $substance,
>                                     -usehistory => 'y');
>
> my $hist = $eutil->next_History || die;
>
> $eutil->reset_parameters(-eutil => 'elink',
>                       -history => $hist,
>                       -db      => 'protein',
>                       -dbfrom  => 'pcsubstance',
>                       -retmax  => 1000);
>
> say join(',',$eutil->get_ids);
>


From gc11song at gmail.com  Mon Nov  9 13:08:48 2009
From: gc11song at gmail.com (Guangchun Song)
Date: Mon, 9 Nov 2009 12:08:48 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
Message-ID: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>

Hello,

I'm a new BioPerl user.  I'm working on a project to determine the
status of all putative SNPs (non-synonymous vs. synonymous) and to
predict the translational effect of non-synonymous mutations as benign
or harmful.  I'm trying to use BioPerl to get the DNA sequence and
translate it to protein sequence for the SNPs that are in a gene's coding
region.  Could someone tell me how to do it?
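
To make the synonymous vs. non-synonymous part of the question concrete, here is a minimal plain-Perl sketch. It rests on several assumptions: the coding sequence is already extracted, is on the forward strand, starts in frame at the first base of the start codon and contains no introns; the sub name classify_snp() and the example sequence are made up. In BioPerl itself, sequence objects offer a translate method that could replace the hand-built codon table.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
my @bases   = qw(T C A G);
my $aa_line = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %aa_for;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $aa_for{"$b1$b2$b3"} = substr( $aa_line, $i++, 1 );
        }
    }
}

# Classify a single-base substitution in a coding sequence.
#   $cds : coding DNA, in frame from the first base of the start codon
#   $pos : 1-based position of the SNP within $cds
#   $alt : the alternative base
# Returns (reference amino acid, variant amino acid, class).
sub classify_snp {
    my ( $cds, $pos, $alt ) = @_;
    $cds = uc $cds;
    my $codon_start = 3 * int( ( $pos - 1 ) / 3 );    # 0-based offset
    die "position $pos is not inside a complete codon\n"
        if $pos < 1 || $codon_start + 3 > length($cds);

    my $ref_codon = substr( $cds, $codon_start, 3 );
    my $alt_codon = $ref_codon;
    substr( $alt_codon, ( $pos - 1 ) % 3, 1 ) = uc $alt;

    # Codons with ambiguous bases are reported as 'X'.
    my $ref_aa = $aa_for{$ref_codon} // 'X';
    my $alt_aa = $aa_for{$alt_codon} // 'X';
    my $class  = $ref_aa eq $alt_aa ? 'synonymous' : 'non-synonymous';
    return ( $ref_aa, $alt_aa, $class );
}

# Made-up coding sequence: ATG GAG GAA TAA
my $cds = 'ATGGAGGAATAA';

my @r1 = classify_snp( $cds, 5, 'T' );    # second base of codon 2: GAG -> GTG
my @r2 = classify_snp( $cds, 9, 'G' );    # third base of codon 3:  GAA -> GAG

print join( ' ', @r1 ), "\n";
print join( ' ', @r2 ), "\n";
```

The sketch only answers whether the amino acid changes. Predicting whether a non-synonymous change is benign or harmful is a separate problem that a codon table cannot settle.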

Thanks,

-Guangchun Song


From robert.bradbury at gmail.com  Mon Nov  9 16:15:33 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 9 Nov 2009 16:15:33 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
Message-ID: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>

On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>
> I'm new bioperl user.  I' working on a project: To determine the
> status of all tutative SNPs such as non-synonymous vs. synonymous, and
> predict the tranlational effect of non-synonymous mutations as benign
> or malicious.  I'm trying to use bioperl to get the DNA sequence and
> translate to protein sequence for the SNPs that are in gene's coding
> region.  Could someone tell me how to do it?
>
>
I too would like to know if this information is available.  I've recently
been working with the dbSNP results from NCBI but they display the results
in a graphical format rather than data that one can play with and ask
questions of like "What is the most disease causing gene in the Human
Genome?" or "What are the critical proteins damaged by gene defects in the
Human Genome?" ... "In terms of premature deaths, extended health care
requirements, loss of quality of life, etc.?"

The same types of questions can be applied to the dog and cat genomes, where
there is emotional value, or the cow, horse, pig, etc. genomes, where there is
economic value.

The value of BioPerl would increase significantly if there were
functionality that would allow easy access to "these mutations may have
negative/positive impact" (which means you need a function that qualifies
mutations by degree) and allow for impact to be subjectively determined
(implying there must be some callback function to provide a user
quality/impact rating).

For example:
   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
@critical_domain, $callback)
Where $callback could "rate" differences about the protein and position and
the "type of interest" (e.g. metal binding amino acids, structural changing
amino acids, critical catalysis amino acids, etc.).

A default callback would be based on some evolving definition of "critical"
changes which result in human disease for example.
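
As a rough idea of what such an interface might look like, here is a minimal sketch. Everything in it is hypothetical: protein_compare() is not an existing BioPerl function, the two sequences are made up and assumed to be already aligned without gaps, and the scoring weights in the callback are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare two equal-length protein strings position by
# position and let a user-supplied callback score every difference.
sub protein_compare {
    my ( $query, $reference, $callback ) = @_;
    die "sequences must have the same length\n"
        unless length($query) == length($reference);

    my @differences;
    for my $i ( 0 .. length($reference) - 1 ) {
        my $ref_aa = substr( $reference, $i, 1 );
        my $qry_aa = substr( $query,     $i, 1 );
        next if $ref_aa eq $qry_aa;
        push @differences, {
            position => $i + 1,    # 1-based
            ref      => $ref_aa,
            query    => $qry_aa,
            score    => $callback->( $i + 1, $ref_aa, $qry_aa ),
        };
    }
    return @differences;
}

# Example callback with made-up weights: a change that gains or loses a
# cysteine or methionine is rated higher than any other change.
my %critical_aa = map { $_ => 1 } qw(C M);
my $score_cb = sub {
    my ( $position, $ref_aa, $qry_aa ) = @_;
    return ( $critical_aa{$ref_aa} || $critical_aa{$qry_aa} ) ? 10 : 1;
};

# Made-up sequences for illustration only.
my @diffs = protein_compare( 'MKCLAY', 'MKSLVY', $score_cb );

for my $d (@diffs) {
    print "$d->{position}: $d->{ref} -> $d->{query} (score $d->{score})\n";
}
```

Keeping the rating logic in the callback means the comparison routine stays neutral about what "critical" means; a domain-aware callback could use the position argument to look up conserved regions.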

This is a "required" capability to be able to determine things like the
"adaptability" of a species -- those with fewest critical mutation points
may have better adaptability to mutation increasing circumstances.

Please pardon any errors in perl syntax/usage; it's been a while since I've
written perl and I'd really rather be coding in C.

Robert


From maj at fortinbras.us  Mon Nov  9 16:56:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 9 Nov 2009 16:56:24 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA
	sequencesaround novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <3ED3D387B5DE4248A218D42882369925@NewLife>

I agree that BioPerl would significantly increase in value with
such a module; in fact, the BioTeam would probably buy us out.
My opinion is that the entire GWAS enterprise is the search for
such a callback function, for humans anyway. For those engaged
in this quest, if BioPerl doesn't provide a Maserati, it at least provides
good Italian-made (among others) parts.
MAJ
----- Original Message ----- 
From: "Robert Bradbury" <robert.bradbury at gmail.com>
To: "Guangchun Song" <gc11song at gmail.com>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Monday, November 09, 2009 4:15 PM
Subject: Re: [Bioperl-l] how to get the protein sequences from DNA 
sequencesaround novel SNPs?


> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>>
>> I'm new bioperl user.  I' working on a project: To determine the
>> status of all tutative SNPs such as non-synonymous vs. synonymous, and
>> predict the tranlational effect of non-synonymous mutations as benign
>> or malicious.  I'm trying to use bioperl to get the DNA sequence and
>> translate to protein sequence for the SNPs that are in gene's coding
>> region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've recently
> been working with the dbSNP results from NCBI but they display the results
> in a graphical format rather than data that one can play with and ask
> questions of like "What is the most disease causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat genomes where
> there is emotional value or the cow, horse, pig, etc. genomes where there is
> economic value?
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may have
> negative/positive impact" (which means you need a function that qualifies
> mutations by degree) and allow for impact to be subjectively determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
> @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and position and
> the "type of interest" (e.g. metal binding amino acids, structural changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of "critical"
> changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like the
> "adaptability" of a species -- those with fewest critical mutation points
> may have better adaptability to mutation increasing circumstances.
>
> Please pardon any errors in perl syntax/usage its been a while since I've
> written perl and I'd really rather be coding in C.
>
> Robert
>
> 


From alexl at users.sourceforge.net  Mon Nov  9 18:44:07 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Mon, 09 Nov 2009 18:44:07 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu> (Chris
	Fields's message of "Wed, 4 Nov 2009 07:53:35 -0600")
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
Message-ID: <nmocnbuuuw.fsf@allele2.localdomain>

>>>>> Chris Fields  writes:

> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
> perl package alone.  It is part of perl core but it's also available
> on CPAN separately from perl itself:

> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

Hi Chris,

Yes, in principle it would be possible to have this split out as a
separate package (currently it's a "subpackage" under the main perl
package), unfortunately that's just not the way it's currently done in
Fedora (probably because it's part of the core set and they like to
update all relevant packages in one step) and I have little control over
that.

As I suspected, the perl maintainer is not at all enthusiastic for
updating the whole of perl just for that package (except for rawhide
which would mean that bioperl 1.6.1 would not be available until F-13,
about 6 months from now).  See:

http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1

Obviously I am not happy with this situation either, because it will
freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
recommend any temporary workarounds in the meantime?

> This is the commit message for that BTW.  This allows spaces in file
> names for the MANIFEST.  v1.52 is a bug fix and is required.

> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

Perhaps I could create a patch that renamed files with spaces in them to
ones with no spaces and then rename them again upon installation.

Can you point me to which files are the problematic ones that triggered
the dependency for 1.52?  Perhaps I can figure a workaround.

Meanwhile I will press the maintainer of perl in Fedora to perhaps
reconsider his position (e.g. if another update for perl is going out
for another reason, like a security update, perhaps he could roll in the
1.52 update at the same time).

Cheers,
Alex


From cjfields at illinois.edu  Mon Nov  9 19:50:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 18:50:00 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <nmocnbuuuw.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
	<nmocnbuuuw.fsf@allele2.localdomain>
Message-ID: <29EA2398-F60B-48F2-AFE7-39A44011C451@illinois.edu>

On Nov 9, 2009, at 5:44 PM, Alex Lancaster wrote:

>>>>>> Chris Fields  writes:
>
>> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
>> perl package alone.  It is part of perl core but it's also available
>> on CPAN separately from perl itself:
>
>> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm
>
> Hi Chris,
>
> Yes, in principle it would be possible to have this split out as a
> separate package (currently it's a "subpackage" under the main perl
> package), unfortunately that's just not the way it's currently done in
> Fedora (probably because it's part of the core set and they like to
> update all relevant packages in one step) and I have little control
> over that.
>
> As I suspected, the perl maintainer is not at all enthusiastic for
> updating the whole of perl just for that package (except for rawhide
> which would mean that bioperl 1.6.1 would not be available until F-13,
> about 6 months from now).  See:
>
> http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1
>
> Obviously I am not happy with this situation either, because it will
> freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
> recommend any temporary workarounds in the meantime?

Well, if you don't absolutely require the MANIFEST for the final  
package you can forego the requirement.  The file in question that  
triggered the requirement is a data file used only for testing:

t/data/test 2.txt
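
To see whether a given machine would trip over this, something like the
following could be used. It is only a sketch: files_with_spaces is a
made-up helper name, and the 1.52 threshold is simply the version named
in this thread.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use ExtUtils::Manifest;

# Hypothetical helper: list files under $dir whose relative path
# contains whitespace (the kind of MANIFEST entry discussed above).
sub files_with_spaces {
    my ($dir) = @_;
    my @hits;
    find({
        no_chdir => 1,    # keep $_ as the full path
        wanted   => sub {
            return unless -f $_;
            my $rel = File::Spec->abs2rel($_, $dir);
            push @hits, $rel if $rel =~ /\s/;
        },
    }, $dir);
    @hits = sort @hits;
    return @hits;
}

# UNIVERSAL::VERSION dies if the installed module is older than requested.
my $new_enough = eval { ExtUtils::Manifest->VERSION(1.52); 1 } ? 1 : 0;
print "ExtUtils::Manifest $ExtUtils::Manifest::VERSION ",
      ($new_enough ? "is" : "is not"), " >= 1.52\n";

# Usage: perl check_spaces.pl /path/to/source/tree
print "$_\n" for @ARGV ? files_with_spaces($ARGV[0]) : ();
```

Run against an unpacked source tree, it prints the installed version
check followed by any file names that contain whitespace.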

>> This is the commit message for that BTW.  This allows spaces in file
>> names for the MANIFEST.  v1.52 is a bug fix and is required.
>
>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
>
> Perhaps I could create a patch that renamed files with spaces in them
> to ones with no spaces and then rename them again upon installation.
>
> Can you point me to which files are the problematic ones that
> triggered the dependency for 1.52?  Perhaps I can figure a workaround.
>
> Meanwhile I will press the maintainer of perl in Fedora to perhaps
> reconsider his position (e.g. if another update for perl is going out
> for another reason, like a security update, perhaps he could roll in
> the 1.52 update at the same time).
>
> Cheers,
> Alex

I would point out that this is a fairly significant bug fix for  
ExtUtils::Manifest.  A newer point release of perl is now available  
(5.10.1) that contains the fix and has a fix for a performance  
regression that popped up in 5.10.0.

chris


From jay at jays.net  Mon Nov  9 19:05:51 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 9 Nov 2009 18:05:51 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
Message-ID: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>

Many thanks to Ewan Birney et al. for Bio::Index::*

I can throw away my awful grep based index-by-accession stuff.   :)

Any chance someone has also written an organism based index mechanism?  
Something like...

while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
    print $seq->display_id . "\n";
}

Thanks,

j


From cjfields at illinois.edu  Mon Nov  9 22:55:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 21:55:01 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
Message-ID: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>

On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:

> Many thanks to Ewan Birney et al. for Bio::Index::*
>
> I can throw away my awful grep based index-by-accession stuff.   :)
>
> Any chance someone has also written an organism based index
> mechanism? Something like...
>
> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>   print $seq->display_id . "\n";
> }
>
> Thanks,
>
> j

It should work via id_parser(); from Bio::Index::GenBank:

    $inx->id_parser(\&get_id);
    # make the index
    $inx->make_index($file_name);

    # here is where the retrieval key is specified
    sub get_id {
       my $line = shift;
       $line =~ /clone="(\S+)"/;
       $1;
    }

Change the code ref to deal with the line you want and parse the name
out.  Caveat: this may not be absolutely perfect (it only passes in one
line at a time, and some species lines will wrap).  I'm also not sure
how this would work in cases where multiple sequences from the same
species are present.

The other option is to preparse everything and tie a hash to store a  
species->UID map, then use that along with your Bio::Index index to  
grab what you need.
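
A minimal sketch of that second option, in plain Perl so it does not
depend on how the index was built. The names organism_map and
accessions_for are made up for illustration, only the first line of each
ORGANISM field is read, and it assumes the Bio::Index index is keyed by
accession so the results can be handed to $inx->fetch().

```perl
use strict;
use warnings;

# Build an organism => [accessions] map from GenBank flat-file text.
sub organism_map {
    my ($fh) = @_;
    my (%map, $acc);
    while (my $line = <$fh>) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $acc = $1;                     # first accession on the line
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.+?)\s*$/ && defined $acc) {
            push @{ $map{$1} }, $acc;
        }
        elsif ($line =~ m{^//}) {
            undef $acc;                    # end of record
        }
    }
    return \%map;
}

# Accessions for every organism whose name matches $pattern.
sub accessions_for {
    my ($map, $pattern) = @_;
    my @names = grep { /$pattern/ } keys %$map;
    return map { @{ $map->{$_} } } sort @names;
}

# Usage: perl organisms.pl file.gb Xanthomonas
if (@ARGV == 2) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for accessions_for(organism_map($fh), $ARGV[1]);
}
```

Because one organism maps to a list of accessions, this also covers the
case of several sequences from the same species.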

chris


From cjfields at illinois.edu  Mon Nov  9 23:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 22:58:32 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <435BA1A8-2CCB-4D7A-8909-84F8135C439F@illinois.edu>

On Nov 9, 2009, at 3:15 PM, Robert Bradbury wrote:

> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com>  
> wrote:
>>
>> I'm a new bioperl user.  I'm working on a project to determine the
>> status of all putative SNPs, such as non-synonymous vs. synonymous,
>> and predict the translational effect of non-synonymous mutations as
>> benign or malicious.  I'm trying to use bioperl to get the DNA
>> sequence and translate it to protein sequence for the SNPs that are
>> in a gene's coding region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've
> recently been working with the dbSNP results from NCBI but they
> display the results in a graphical format rather than as data that one
> can play with and ask questions of, like "What is the most disease
> causing gene in the Human Genome?" or "What are the critical proteins
> damaged by gene defects in the Human Genome?" ... "In terms of
> premature deaths, extended health care requirements, loss of quality
> of life, etc.?"
>
> The same types of questions can be applied to the dog and cat genomes
> where there is emotional value or the cow, horse, pig, etc. genomes
> where there is economic value.
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may
> have negative/positive impact" (which means you need a function that
> qualifies mutations by degree) and allow for impact to be subjectively
> determined (implying there must be some callback function to provide a
> user quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene,
>       @critical_aa, @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and
> position and the "type of interest" (e.g. metal binding amino acids,
> structure changing amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of
> "critical" changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like
> the "adaptability" of a species -- those with fewest critical mutation
> points may have better adaptability to mutation increasing
> circumstances.
>
> Please pardon any errors in perl syntax/usage; it's been a while since
> I've written perl and I'd really rather be coding in C.
>
> Robert

I will say that most of the information from the SNP database is  
available in various formats (see following link under 'Retrieval  
Types'):

http://www.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html

You can access this information, as well as the full XML, using  
something like the following script.

chris

------------------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $term = shift;
my $eutil  = Bio::DB::EUtilities->new(-eutil    => 'esearch',
                                       -db       => 'snp',
                                       -term     => $term,
                                       -usehistory => 'y',
                                       -retmax   => 100);

my $hist = $eutil->next_History || die "No history returned";

# for SNP XML, change retmode to 'xml'
$eutil->set_parameters(-eutil   => 'efetch',
                        -history => $hist,
                        -retmode => 'text',
                        -rettype => 'flt');

# dumps to STDOUT
say $eutil->get_Response->content;
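
On the original question (synonymous vs. non-synonymous): once the
coding sequence and the SNP's position within it are known, the call
itself is small. The following is a plain-Perl sketch under stated
assumptions: the standard genetic code, a CDS that starts in frame, and
a 1-based SNP position counted within the CDS. classify_snp is a
made-up name; in BioPerl, Bio::Tools::CodonTable could take the place
of the hand-built table.

```perl
use 5.010;
use strict;
use warnings;

# Standard genetic code, codons enumerated in T, C, A, G order.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# $cds: in-frame coding DNA; $pos: 1-based SNP position in the CDS;
# $alt: the variant base.  Returns (ref_aa, alt_aa, class).
sub classify_snp {
    my ($cds, $pos, $alt) = @_;
    $cds = uc $cds;
    die "Position $pos is outside the CDS\n"
        if $pos < 1 || $pos > length($cds);
    my $codon_start = 3 * int(($pos - 1) / 3);    # 0-based codon offset
    my $ref_codon   = substr($cds, $codon_start, 3);
    die "Incomplete codon at position $pos\n" if length($ref_codon) < 3;
    my $alt_codon = $ref_codon;
    substr($alt_codon, ($pos - 1) % 3, 1) = uc $alt;
    # Codons with ambiguous bases are reported as 'X'.
    my ($ref_aa, $alt_aa) =
        map { $codon_table{$_} // 'X' } $ref_codon, $alt_codon;
    my $class = $ref_aa eq $alt_aa ? 'synonymous' : 'non-synonymous';
    return ($ref_aa, $alt_aa, $class);
}

# Example: G->A at CDS position 4 turns codon GCT into ACT.
my ($ref, $var, $class) = classify_snp('ATGGCTTAA', 4, 'A');
say "$ref -> $var: $class";
```

Judging whether a non-synonymous change is benign or damaging is a
separate problem that this sketch does not attempt.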


From jluis.lavin at unavarra.es  Tue Nov 10 05:43:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Tue, 10 Nov 2009 11:43:40 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
 itscorrect use]
In-Reply-To: <A1ACC4B552514872B77208248B31977C@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
Message-ID: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>

Hello again,

I tried what Mark told me, modifying the line of code he pointed out, but
there's still a problem that I believe must be due to the sequence names.
The sequence headers in my Fasta file have this format:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1

The part to the right of the pipe changes depending on the program used to
create the gene model, for example:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1
>PleosPC9_1_123413|genemark.2731_g
>PleosPC9_1_52065|e_gw1.3.64.1

So I guess I need to parse my IDs somehow so that the program detects only
the first part of the fasta header (the "protein name") and doesn't get
confused by the other side of the pipe...

This is the corrected code I wrote following Mark's indications, but I
still don't have any idea about the parsing issue...

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use Bio::SeqIO;
# PC9.fasta is my genomic file
my $Index_File_Name = "PC9.fasta";
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
# LCS.txt is my sequences list
@ARGV = <LCS.txt>;
foreach my $id (@ARGV) {
    if ($id eq '') {
        die("empty list");
    }
    else {
        my $seqobj = $inx->fetch($id);
        my $out = Bio::SeqIO->new(-file   => ">>index_extracted.fasta",
                                  -format => 'fasta');
        $out->write_seq($seqobj);
    }
}
exit;

Thanks in advance

P.S. Might there be a faster way of extracting those sequences using plain Perl?




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From saikari78 at gmail.com  Tue Nov 10 06:41:11 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Tue, 10 Nov 2009 11:41:11 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
Message-ID: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>

Thanks again very much for your help and the script.
I've been trying it, but I fail to find any protein record linked to a
record in the pcsubstance database.
Do you think that it is because no links have been defined between the 2
databases, or that I am just unlucky and that no link exists for the
particular records I'm testing?
Thanks again

saikari

On Mon, Nov 9, 2009 at 4:41 PM, saikari keitele <saikari78 at gmail.com> wrote:

> Fabulous!. Huge help.
> saikari
>
>   On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu>wrote:
>
>>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>>
>> Hi,
>>>
>>> I'm using Bioperl to retrieve records from PubChem.
>>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>>> compound record, the reference to the protein(s) that can synthesize the
>>> compound.
>>> Thanks very much.
>>>
>>> saikari
>>>
>>
>> The below bioperl script returns the GI for proteins that correspond to
>> the substance passed on the command line; invoke using 'perl
>> pc_substance.pl substance_requested'.  It probably needs more fiddling to
>> catch everything but it should get you started.
>>
>> For other bits and pieces (such as how to retrieve the raw sequence
>> files), please see the EUtilities HOWTO:
>>
>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>>
>> chris
>>
>> ----------------------------------------
>>
>> #!/usr/bin/perl -w
>>
>> use 5.010;
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $substance = shift;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -db => 'pcsubstance',
>>                                     -term => $substance,
>>                                     -usehistory => 'y');
>>
>> my $hist = $eutil->next_History || die;
>>
>> $eutil->reset_parameters(-eutil => 'elink',
>>                       -history => $hist,
>>                       -db      => 'protein',
>>                       -dbfrom  => 'pcsubstance',
>>                       -retmax  => 1000);
>>
>> say join(',',$eutil->get_ids);
>>
>
>


From heyne at informatik.uni-freiburg.de  Tue Nov 10 07:55:06 2009
From: heyne at informatik.uni-freiburg.de (Steffen Heyne)
Date: Tue, 10 Nov 2009 13:55:06 +0100
Subject: [Bioperl-l] problem with alignments and sequence locations
Message-ID: <4AF962AA.7060908@informatik.uni-freiburg.de>

Hi,

I'm using Bioperl for my research and it is very useful! Thank you!

Currently I have a problem with the location tags of sequences. I read in
seed alignments of Rfam (in stockholm format, but I think it is similar
for other formats).

If the location is like:

AB194432.1/908-846

the start/end values are changed to

$seq->start = 846
$seq->end = 908

and therefore the new location (e.g. $seq->get_nse) is:

AB194432.1/846-908

The $seq->strand tag is correctly set to -1 in this case, but if the
alignment is written out again (clustal, stockholm,...) this strand info
is lost and the sequences have this "wrong" location. But this
information is important with respect to the sequence accession number.

Is there a way to set the location back to the original one or is this
behavior desired? Any manual setting with $seq->start($val) failed due
to automatic checking.

I'm using bioperl 1.6.1

Thanks!

steffen


-- 
---
Steffen Heyne, Dipl.-Bioinf.
Lehrstuhl für Bioinformatik
Institut für Informatik
Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany

Tel: (+49) 761 203 8239
Fax: (+49) 761 203 7462
Mail: heyne at informatik.uni-freiburg.de


From cjfields at illinois.edu  Tue Nov 10 08:58:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 07:58:52 -0600
Subject: [Bioperl-l] problem with alignments and sequence locations
In-Reply-To: <4AF962AA.7060908@informatik.uni-freiburg.de>
References: <4AF962AA.7060908@informatik.uni-freiburg.de>
Message-ID: <DF72C01A-410F-4391-B33E-4884D7CB859E@illinois.edu>

On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with the location tags of sequences. I read
> in seed alignments of Rfam (in stockholm format, but I think it is
> similar for other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g. $seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the
> alignment is written out again (clustal, stockholm,...) this strand
> info is lost and the sequences have this "wrong" location. But this
> information is important with respect to the sequence accession number.
>
> Is there a way to set the location back to the original one or is
> this behavior desired? Any manual setting with $seq->start($val)
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We recently discussed amending the NSE format
because of this (the subject came up over the last few months or so), but
it has fallen through the cracks.  Fortunately it is very easy to fix (the
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in?  It will change  
output for only those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
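
As a rough illustration of that round trip (plain Perl, not the
LocatableSeq code; parse_nse and format_nse are hypothetical names, and
a single-base range is assumed to be on the plus strand):

```perl
use strict;
use warnings;

# Parse "name/from-to" into (name, start, end, strand) with start <= end.
sub parse_nse {
    my ($nse) = @_;
    my ($name, $from, $to) = $nse =~ m{^(.+)/(\d+)-(\d+)$}
        or die "Not a name/start-end string: $nse\n";
    return $from <= $to
        ? ($name, $from, $to,  1)
        : ($name, $to, $from, -1);    # reversed coordinates: minus strand
}

# Write coordinates back out, reversed again when the strand is -1.
sub format_nse {
    my ($name, $start, $end, $strand) = @_;
    ($start, $end) = ($end, $start) if defined $strand && $strand < 0;
    return "$name/$start-$end";
}

for my $nse ('AB194432.1/908-846', 'AB194432.1/846-908') {
    my ($name, $start, $end, $strand) = parse_nse($nse);
    print "$nse: start=$start end=$end strand=$strand -> ",
          format_nse($name, $start, $end, $strand), "\n";
}
```

The point of the sketch is only that strand carries the information
needed to reproduce the original string on output.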

chris


From cjfields at illinois.edu  Tue Nov 10 09:05:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 08:05:51 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
	<a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
Message-ID: <738F6320-B87A-4541-B9FA-20273ABA96B9@illinois.edu>

On Nov 10, 2009, at 5:41 AM, saikari keitele wrote:

> Thanks again very much for your help and the script.
> I've been trying it, but I fail to find any protein record linked to a
> record in the pcsubstance database.
> Do you think that it is because no links have been defined between
> the 2 databases, or that I am just unlucky and that no link exists
> for the particular records I'm testing?
> Thanks again
>
> saikari

It's probably that no links have been defined.  I have found similar  
problems in the past with pubchem, in that not all substances have  
proteins associated with them.  Most proteins linked to are those with  
a deposited structure.

There are a few other databases to check out; KEGG and the BioCyc dbs
(like EcoCyc) come to mind.  I don't think we have a generic remote
query engine set up for any of those unfortunately (unless there is
one I'm unaware of), but I know BioCyc comes with its own set of
tools (including perl- and java-based query tools) and can be set up
locally, which is likely much faster and more in line with what you
need.

chris

...


From jason at bioperl.org  Tue Nov 10 13:47:02 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:47:02 -0800
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
	itscorrect use]
In-Reply-To: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
	<3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
Message-ID: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>

Page 44 has the custom ID info or look at documentation for  
Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if  
you read the perldoc for the module.

  http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf

Don't re-open SeqIO each time; just do it once at the beginning,
outside of the loop, and then call write_seq within the loop.

One nuance of OO programming vs procedural is that there is some
outside state information that can persist in an object, but
conceptually, you want to open a filehandle once and just keep writing
to it.
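
Put together, the two points could look roughly like this. It is a
sketch, not tested against the PC9 data: get_id, read_ids and extract
are made-up names, the index has to be rebuilt (delete the old .idx
files first) for the new keys to take effect, and the BioPerl modules
are only loaded when the script is actually run with three arguments.

```perl
use strict;
use warnings;

# Key each record by the header text before the first pipe, e.g.
# ">PleosPC9_1_103820|fgenesh1_pg.3_#_1" -> "PleosPC9_1_103820".
sub get_id {
    my ($header) = @_;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : undef;
}

# Read IDs from a list file, one per line, skipping blank lines.
sub read_ids {
    my ($list_file) = @_;
    open my $fh, '<', $list_file or die "Cannot open $list_file: $!";
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # also strips the newline and any \r
        push @ids, $line if length $line;
    }
    close $fh;
    return @ids;
}

sub extract {
    my ($fasta, $list_file, $out_file) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;

    my $inx = Bio::Index::Fasta->new(-filename   => "$fasta.idx",
                                     -write_flag => 1);
    $inx->id_parser(\&get_id);
    $inx->make_index($fasta);

    # Open the output stream once, outside the loop.
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');
    for my $id (read_ids($list_file)) {
        my $seq = $inx->fetch($id);
        if ($seq) { $out->write_seq($seq) }
        else      { warn "No entry for $id\n" }
    }
}

# Usage: perl extract.pl PC9.fasta LCS.txt index_extracted.fasta
extract(@ARGV) if @ARGV == 3;
```

Reading the list file line by line also avoids relying on @ARGV to
carry the IDs.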

-jason
On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:

> Hello again,
>
> I tried what Mark told me, modifying the line of code he pointed out,
> but there's still a problem that I believe must be due to the sequence
> names. The sequence headers in my Fasta file have this format:
>
> >PleosPC9_1_103820|fgenesh1_pg.3_#_1
>
> The part to the right of the pipe changes depending on the program
> used to create the gene model, for example:
>
> >PleosPC9_1_103820|fgenesh1_pg.3_#_1
> >PleosPC9_1_123413|genemark.2731_g
> >PleosPC9_1_52065|e_gw1.3.64.1
>
> So I guess I need to parse my IDs somehow so that the program detects
> only the first part of the fasta header (the "protein name") and
> doesn't get confused by the other side of the pipe...
>
> This is the corrected code I wrote following Mark's indications, but I
> still don't have any idea about the parsing issue...
>
> #!/c:/Perl -w
> use Bio::Index::Fasta;
> use strict;
> #PC9.fasta is my genomic file
> my $Index_File_Name ="PC9.fasta";
> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
> #LCS.txt is my sequences list
> @ARGV = <LCS.txt>;
> foreach  my $id (@ARGV) {
> if ($id eq ''){
> die ("empty list")
> }
> else {
> my $seqobj = $inx->fetch($id);
> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
> -format => 'fasta');
> $out->write_seq($seqobj);
> }
> }
> exit;
> }
>
> Thanks in advance
>
> P.S. Might there be a faster way of extracting those sequences using
> plain Perl?
>
>
>
>
> On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
>> Yes, these are files created by the SDBM, Perl's internal db manager.
>> You should be able to open the index by simply
>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> and the dbm will know what to do--
>> cheers MAJ
>> ----- Original Message -----
>> From: <jluis.lavin at unavarra.es>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>> Sent: Thursday, November 05, 2009 11:21 AM
>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
>> its correct use]
>>
>>
>>> Thank you very much Mark, that's a good point :$
>>> I guess your correction refers to the second script, doesn't it?
>>>
>>> If so, there is still a problem with the first script: it doesn't
>>> create the PC9.fasta.idx file, instead it creates two files named:
>>> -PC9.fasta.idx.pag
>>> -PC9.fasta.idx.dir
>>>
>>> which seem to be clearly related to some kind of indexing process...
>>> but, unless the PC9.fasta.idx file is only virtual or remains hidden,
>>> I can't find it anywhere...
>>> Forgive me if I'm talking nonsense...
>>>
>>> Thank you very much again for your help ;)
>>>
>>>
>>> On Thu, November 5, 2009, 17:02, Mark A. Jensen wrote:
>>>> Hey José,
>>>> The first thing that jumps out is the index file name. Looks
>>>> like you create it as
>>>> PC9.fasta.idx
>>>> But you read it as
>>>> PC9.fasta
>>>> Not an unusual mistake. Do
>>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>>> and see if it works.
>>>> MAJ
>>>> ----- Original Message -----
>>>> From: <jluis.lavin at unavarra.es>
>>>> To: <bioperl-l at lists.open-bio.org>
>>>> Sent: Thursday, November 05, 2009 10:46 AM
>>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
>>>> correct use]
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------- Original Message
>>>> ----------------------------
>>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its
>>>> correct use
>>>> From:    jluis.lavin at unavarra.es
>>>> Date:    Thu, November 5, 2009, 16:46
>>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>>> --------------------------------------------------------------------------
>>>>
>>>> Hi Mark,
>>>>
>>>> I've actually got two scripts: the first one creates the index and
>>>> the second one retrieves the sequence list from the indexed file.
>>>>
>>>> 1)Here is the Index creation script:
>>>>
>>>> #!/c:/Perl -w
>>>> use strict;
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>>
>>>> print "Enter file for indexing: \n";
>>>> my $Index_File_Name = <STDIN>;
>>>> my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
>>>>    -write_flag => 1);
>>>> $inx->make_index(my $File_Name);
>>>>
>>>> 2)And here is the sequence retrieval script:
>>>>
>>>> #!/c:/Perl -w
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>> #PC9.fasta is my genomic file
>>>> my $Index_File_Name ="PC9.fasta";
>>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>>> #LCS.txt is my sequences list
>>>> @ARGV = <lCS.txt>;
>>>> foreach  my $id (@ARGV) {
>>>> if ($id eq ''){
>>>> die ("empty list")
>>>> }
>>>> else {
>>>> my $seqobj = $inx->fetch($id);
>>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>>> -format => 'fasta');
>>>> $out->write_seq($seqobj);
>>>> }
>>>> }
>>>> exit;
>>>> }
>>>>
>>>> I hope this code is not a total scum...
>>>>
>>>> Thanks in advance ;)
>>>>
>>>>
>>>>
>>>> On Thu, November 5, 2009, 16:39, Mark A. Jensen wrote:
>>>>> José -- It looks like this is a good solution to your problem.
>>>>> Please send your script so we can look at it-
>>>>> cheers Mark
>>>>> ----- Original Message -----
>>>>> From: <jluis.lavin at unavarra.es>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>>> Subject: [Bioperl-l] A question about iBio::Index: and its  
>>>>> correct use
>>>>>
>>>>>
>>>>>
>>>>> Hello to all,
>>>>>
>>>>> I'm trying to write a script to retrieve a list of sequences from
>>>>> a local FASTA file (for example, a FASTA archive where all the
>>>>> protein models of an organism are stored). I would use this file
>>>>> as a kind of "local database" (sorry if I am mixing up a few
>>>>> concepts...)
>>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>>> Bio::Index::Fasta tool.
>>>>> If I didn't misunderstand what I read (which is quite possible,
>>>>> given my low level of programming), this indexing tool should do
>>>>> the job.
>>>>> I wrote a couple of scripts based on the documentation I read
>>>>> about this tool, but I don't seem to be able to create the index
>>>>> file to be used later (to retrieve the sequences from).
>>>>> -First of all, I want to ask the people in this forum whether
>>>>> Bio::Index::Fasta is the right one to choose for this task.
>>>>> -Then I'd ask you to take a look at my scripts, because I can't
>>>>> seem to catch the bug...
>>>>>
>>>>> Best wishes to you all and thanks in advance ;)
>>>>>
>>>>> --
>>>>> José Luis Lavín Trueba, PhD
>>>>>
>>>>> Dpto. de Producción Agraria
>>>>> Grupo de Genética y Microbiología
>>>>> Universidad Pública de Navarra
>>>>> 31006 Pamplona
>>>>> Navarra
>>>>> SPAIN
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Dr. José Luis Lavín Trueba
>>>>
>>>> Dpto. de Producción Agraria
>>>> Grupo de Genética y Microbiología
>>>> Universidad Pública de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>
>>>
>>> --
>>> Dr. José Luis Lavín Trueba
>>>
>>> Dpto. de Producción Agraria
>>> Grupo de Genética y Microbiología
>>> Universidad Pública de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>>
>>>
>>
>>
>
>
> -- 
> Dr. José Luis Lavín Trueba
>
> Dpto. de Producción Agraria
> Grupo de Genética y Microbiología
> Universidad Pública de Navarra
> 31006 Pamplona
> Navarra
> SPAIN
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
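
A note on the index file name in the quoted scripts: the index is created as PC9.fasta.idx. If the DBM backend on that machine is SDBM (an assumption; which backend is picked depends on the Perl build), that name is only a base name and the data end up in a pair of files with .pag and .dir suffixes. The sketch below uses the core SDBM_File module directly, not Bio::Index::Fasta, with a temporary directory and a made-up key, just to show where the files come from.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;
use File::Temp qw(tempdir);

# Throwaway directory; the base name mirrors the one in the thread.
my $dir  = tempdir( CLEANUP => 1 );
my $base = "$dir/PC9.fasta.idx";

# Create the database and store one made-up entry.
tie my %index, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
    or die "cannot tie $base: $!";
$index{'PleosPC9_1_103820'} = 'record 1';
untie %index;

# List what SDBM actually wrote to disk.
opendir my $dh, $dir or die "cannot read $dir: $!";
my @files = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;
print "$_\n" for sort @files;

# Reopen by base name; the .pag/.dir suffixes are never spelled out.
tie my %again, 'SDBM_File', $base, O_RDONLY, 0666
    or die "cannot reopen $base: $!";
my $value = $again{'PleosPC9_1_103820'};
untie %again;
print "$value\n";
```

The listing should show the two suffixed files and no file named exactly PC9.fasta.idx, which would explain why opening the index by its base name is still the right call.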


From jason at bioperl.org  Tue Nov 10 13:50:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:50:00 -0800
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>

You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html

On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:

> On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:
>
>> Many thanks to Ewan Birney et. al. for Bio::Index::*
>>
>> I can throw away my awful grep based index-by-accession stuff.   :)
>>
>> Any chance someone has also written an organism based index  
>> mechanism? Something like...
>>
>> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>>  print $seq->display_id . "\n";
>> }
>>
>> Thanks,
>>
>> j
>
> It should work via id_parser(); from Bio::Index::GenBank:
>
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
>
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }
>
> Change the code ref to deal with the line you want and parse the name
> out.  Caveat: this may not be absolutely perfect (it only passes in
> a line at a time, and some species lines will wrap).  I'm also not sure
> how this would work in cases where multiple sequences from the same
> species are present.
>
> The other option is to preparse everything and tie a hash to store a  
> species->UID map, then use that along with your Bio::Index index to  
> grab what you need.
>
> chris

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
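
For the second option Chris mentions (preparse everything and keep a species-to-ID map next to the index), here is a minimal plain-Perl sketch. It is not part of Bio::Index::GenBank; the function names organism_map and accessions_matching are made up for illustration, and it keeps the map in memory rather than in a tied hash. It stores a list of accessions per organism, so several sequences from the same species are not a problem.

```perl
use strict;
use warnings;

# Build { organism name => [ accession, ... ] } from a GenBank flat file.
# Only the first line of each ORGANISM field is read, so a wrapped
# species name will be cut short.
sub organism_map {
    my ($fh) = @_;
    my ( %map, $acc );
    while ( my $line = <$fh> ) {
        if ( $line =~ /^ACCESSION\s+(\S+)/ ) {
            $acc = $1;
        }
        elsif ( $line =~ /^\s+ORGANISM\s+(.+?)\s*$/ ) {
            push @{ $map{$1} }, $acc if defined $acc;
        }
        elsif ( $line =~ m{^//} ) {
            undef $acc;    # end of record
        }
    }
    return \%map;
}

# All accessions whose organism name matches a compiled pattern.
sub accessions_matching {
    my ( $map, $pattern ) = @_;
    my @hits;
    for my $name ( sort keys %$map ) {
        push @hits, @{ $map->{$name} } if $name =~ $pattern;
    }
    return @hits;
}

# Usage: perl by_organism.pl sequences.gbk Xanthomonas
if ( @ARGV == 2 ) {
    my ( $file, $word ) = @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!";
    my $map = organism_map($fh);
    close $fh;
    print "$_\n" for accessions_matching( $map, qr/\Q$word\E/ );
}
```

Each accession printed this way can then be handed to the index's fetch() call, as in the scripts elsewhere on this list.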


From jluis.lavin at unavarra.es  Wed Nov 11 10:01:18 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 11 Nov 2009 16:01:18 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
 anditscorrect use]
In-Reply-To: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.sq
	uirrel@webmail.unavarra.es><A1ACC4B552514872B77208248B31977C@NewLife><3471.
	130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
	<E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
Message-ID: <2979.130.206.164.153.1257951678.squirrel@webmail.unavarra.es>

Hi once again,
I have modified the script following the instructions Jason gave me (at
least what I understood of them; remember, it is my first time trying to
learn a programming language... and I'm not the smartest guy in the
class, hehe), but it seems I didn't fix the problem...
Here's the new code I wrote:

#!/c:/Perl -w
	use strict;
        use Bio::Index::Fasta;
	use Bio::DB::Fasta;
	use Bio::SeqIO;
	use IO::File;

# assign files to scalars
my $index_file = 'PC91.fasta';
my $id_list = 'LCS2.txt';

# open index file
my $db = Bio::DB::Fasta->new($index_file) or die;

# open the id list
my $in = IO::File->new($id_list) or die;

# open FASTA to write
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');

# retrieve ids loop
foreach my $id ($in) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = my $inx->fetch($id);
$out->write_seq($seqobj);
}
}

# parse fasta headers
sub my_makeid {
my $id = shift;
if ( $id =~ /^>[^:]+:(\S+)/ ) {
return $1;
} elsif ($id =~ /^>(\S+)/) {
return $1;
} else {
warn("cannot parse ID for $id\n");
}
}
exit;

Would anyone please take a look at it...

Thanks in advance ;)


On Tue, November 10, 2009, 19:47, Jason Stajich wrote:
> Page 44 has the custom ID info or look at documentation for
> Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if
> you read the perldoc for the module.
>
>   http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf
>
> Don't re-open SeqIO each time; just do it once at the beginning,
> outside of the loop, and then call write_seq within the loop.
>
> One nuance of OO programming versus procedural programming is that
> some state information can persist in an object, but conceptually
> you want to open a filehandle once and just keep writing to it.
>
> -jason
> On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:
>
>> Hello again,
>>
>> I tried what Mark told me, modifying the line of code he pointed out,
>> but there's still a problem that I believe must be due to the sequence
>> names.
>> The sequence headers in my FASTA file have this format:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>
>> The part to the right of the pipe changes depending on the program
>> used to create the gene model, for example:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>> PleosPC9_1_123413|genemark.2731_g
>>> PleosPC9_1_52065|e_gw1.3.64.1
>>
>> So I guess I need to parse my IDs somehow so that the program looks
>> only at the first part of the FASTA header (the "protein name") and
>> doesn't get confused by the other side of the pipe...
>>
>> This is the corrected code I wrote following Mark's indications, but I
>> still don't have any idea about the parsing issue...
>>
>> #!/c:/Perl -w
>> use Bio::Index::Fasta;
>> use strict;
>> #PC9.fasta is my genomic file
>> my $Index_File_Name ="PC9.fasta";
>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> #LCS.txt is my sequences list
>> @ARGV = <LCS.txt>;
>> foreach  my $id (@ARGV) {
>> if ($id eq ''){
>> die ("empty list")
>> }
>> else {
>> my $seqobj = $inx->fetch($id);
>> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
>> -format => 'fasta');
>> $out->write_seq($seqobj);
>> }
>> }
>> exit;
>> }
>>
>> Thanks in advance
>>
>> P.S. Might there be a faster way of extracting those sequences using
>> plain Perl?
>>
>>
>>
>>
>> On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
>>> Yes, these are files created by SDBM, Perl's internal db manager.
>>> You should be able to open the index by simply
>>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>> and the dbm will know what to do--
>>> cheers MAJ
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 11:21 AM
>>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
>>> and its
>>> correct
>>> use]
>>>
>>>
>>>> Thank you very much Mark, that's a good point :$
>>>> I guess your correction refers to the second script, doesn't it?
>>>>
>>>> If so, there is still a problem with the first script: it doesn't
>>>> create the PC9.fasta.idx file. Instead it creates two files named:
>>>> -PC9.fasta.idx.pag
>>>> -PC9.fasta.idx.dir
>>>>
>>>> which seem to be clearly related to some kind of indexing
>>>> process... but, unless the PC9.fasta.idx file is only virtual or
>>>> remains hidden, I can't find it anywhere...
>>>> Forgive me if I'm talking nonsense...
>>>>
>>>> Thank you very much again for your help ;)
>>>>
>>
>>
>> --
>> Dr. José Luis Lavín Trueba
>>
>> Dpto. de Producción Agraria
>> Grupo de Genética y Microbiología
>> Universidad Pública de Navarra
>> 31006 Pamplona
>> Navarra
>> SPAIN
>>
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN
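
A few things in the script above look like they would stop it from working as posted: the foreach runs once over the filehandle object instead of over the lines of the ID list, "my $inx->fetch" calls a method on a freshly declared (undefined) variable while the database handle is actually $db, and my_makeid is defined but never handed to anything. Since the earlier message also asked about plain Perl, here is a minimal sketch that avoids BioPerl altogether. The function names (short_id, read_ids, extract_records) are invented for this example, and it assumes the ID list holds one ID per line, with or without the part after the pipe.

```perl
use strict;
use warnings;

# Part of a FASTA header (or list entry) before the first pipe:
# ">PleosPC9_1_103820|fgenesh1_pg.3_#_1" gives "PleosPC9_1_103820".
sub short_id {
    my ($text) = @_;
    return $text =~ /^>?([^|\s]+)/ ? $1 : undef;
}

# Read one ID per line, dropping line endings and blank lines.
sub read_ids {
    my ($fh) = @_;
    my %wanted;
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;    # also removes a Windows "\r"
        my $id = short_id($line);
        $wanted{$id} = 1 if defined $id;
    }
    return \%wanted;
}

# Copy every record whose short ID is wanted; return how many were copied.
sub extract_records {
    my ( $in, $out, $wanted ) = @_;
    my ( $keep, $count ) = ( 0, 0 );
    while ( my $line = <$in> ) {
        if ( $line =~ /^>/ ) {
            my $id = short_id($line);
            $keep = ( defined $id && $wanted->{$id} ) ? 1 : 0;
            $count++ if $keep;
        }
        print {$out} $line if $keep;
    }
    return $count;
}

# Usage: perl extract.pl PC91.fasta LCS2.txt index_extracted.fasta
if ( @ARGV == 3 ) {
    my ( $fasta, $list, $result ) = @ARGV;
    open my $ids, '<', $list   or die "cannot open $list: $!";
    open my $in,  '<', $fasta  or die "cannot open $fasta: $!";
    open my $out, '>', $result or die "cannot write $result: $!";
    my $wanted = read_ids($ids);
    die "empty list\n" unless %$wanted;
    my $n = extract_records( $in, $out, $wanted );
    close $out or die "cannot close $result: $!";
    print "$n record(s) written\n";
}
```

This reads the FASTA file once from top to bottom, so it needs no index at all; for repeated lookups in a large file the indexed BioPerl route is probably still the better fit. The same short_id logic is the kind of thing the custom ID hook in Bio::DB::Fasta or Bio::Index::Fasta is meant to receive, as Jason's reply points out.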


From maj at fortinbras.us  Wed Nov 11 18:48:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 11 Nov 2009 18:48:33 -0500
Subject: [Bioperl-l] Maq assembly wrapper ready for beta testing
Message-ID: <4057E5A862B845EA8BB153888075590C@NewLife>

Hi All-

New modules are available in the core and in bioperl-run for
working with Heng Li's short read assembler "maq"
(http://maq.sourceforge.net/maq-man.shtml). Bio::Tools::Run::Maq
allows a quick assembly call with a canned maq pipeline, and also
allows individual maq commands to be called separately. 
It uses Bio::Assembly::IO::maq  (a read-only module) to deliver
a Bio::Assembly::Scaffold from maq output. 

If you're interested, see
http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_maq
and update your core and bioperl-run. The code inherits from Florent's
excellent new Bio::Tools::Run::AssemblerBase -- kudos to him!!

tests are in bioperl-run/trunk/t/Maq.t, see them for myriad examples
send me the bugs
MAJ


From clarsen at vecna.com  Thu Nov 12 12:22:26 2009
From: clarsen at vecna.com (Chris Larsen)
Date: Thu, 12 Nov 2009 12:22:26 -0500
Subject: [Bioperl-l] Polyproteins, ribo slippage,
	and mat_peptide in  viruses?
In-Reply-To: <320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
References: <B0218AEF-3CEB-4E06-B8DF-7B302D024797@vecna.com>
	<320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
Message-ID: <7BBAE077-4D76-46C2-BF66-363F5A017278@vecna.com>

All,

This is a short followup on the prior thread of discussion, regarding
computing mature peptide sequences for viruses. The topic has gone
underwater for the time being as we solve some problems with source
data. The biopython effort and contributors on this board have given
good guidance, and we now have scripts that function (thanks mostly to
pcock). However, the source data on which everything relies is suspect:

   mat_peptide	15118..16914	<===
		/product="nsp13"	
		/note="helicase"
I can tell you the virus community does not want to rely heavily on
those position numbers. Furthermore, we have found fewer complete source
genomes for viruses than for bacteria, and more virus-to-virus variation
in the data fields annotated in the GBK file (Gene, CDS, ORF, Protein,
Polyprotein, mat_peptide, db_xref). In fact, the community will have to
come together significantly on how these molecules are defined in
public repositories before a mature scripting effort becomes reliable,
public and well received. Because of the variation in viruses, it's not
even clear at this point what a 'gene' is. I will let you know how we
proceed when more sequence data has been fully analyzed, and we can
think about making any Perl-based solution a new viral protein module.

Thanks,

Chris

-- 

Christopher Larsen, Ph.D.
Sr. Scientist / Grants Manager
Vecna Technologies
6404 Ivy Lane #500
Greenbelt, MD 20770
Phone: (240) 965-4525
Fax: (240) 547-6133
240-737-4525


From David.Messina at sbc.su.se  Thu Nov 12 14:20:54 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 Nov 2009 20:20:54 +0100
Subject: [Bioperl-l] highest PAML version supported?
Message-ID: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>

Hi everyone,

What is the latest version of PAML (specifically codeml) that I can use with
bioperl-live and bioperl-run?

I looked around and couldn't find where (or if) this is documented.


With PAML version 4.3a against the current trunk of both -live and -run I
see this:
------------- EXCEPTION Bio::Root::NotImplemented -------------
MSG: Unknown format of PAML output did not see seqtype
STACK Bio::Tools::Phylo::PAML::_parse_summary
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
STACK Bio::Tools::Phylo::PAML::next_result
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
STACK toplevel ../bin/cluster_kaks:251
---------------------------------------------------------------

...which I suspect (but haven't confirmed) is due to a change in the file
format.


Dave


From jason at bioperl.org  Thu Nov 12 14:29:22 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 12 Nov 2009 11:29:22 -0800
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
Message-ID: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>

prolly 3.15 or so.

it really needs a maintainer!!!

On Nov 12, 2009, at 11:20 AM, Dave Messina wrote:

> Hi everyone,
>
> What is the latest version of PAML (specifically codeml) that I can  
> use with
> bioperl-live and bioperl-run?
>
> I looked around and couldn't find where (or if) this is documented.
>
>
> With PAML version 4.3a against the current trunk of both -live and - 
> run I
> see this:
> ------------- EXCEPTION Bio::Root::NotImplemented -------------
> MSG: Unknown format of PAML output did not see seqtype
> STACK Bio::Tools::Phylo::PAML::_parse_summary
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
> STACK Bio::Tools::Phylo::PAML::next_result
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
> STACK toplevel ../bin/cluster_kaks:251
> ---------------------------------------------------------------
>
> ...which I suspect (but haven't confirmed) is due to a change in the  
> file
> format.
>
>
> Dave

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From scott at scottcain.net  Fri Nov 13 09:48:43 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 13 Nov 2009 09:48:43 -0500
Subject: [Bioperl-l] January GMOD meeting announcement
Message-ID: <4536f7700911130648j40eb2d82g2594adaccf476d73@mail.gmail.com>

Hello,

I am pleased to announce that the January GMOD meeting will be taking
place on January 14 and 15 in San Diego at the Best Western Seven Seas
(the same location as last year).  Please see this page for
registration information:

  http://gmod.org/wiki/January_2010_GMOD_Meeting

When you go to that page, please take a moment to add suggestions for
the agenda.  There is no registration fee for this meeting, however
there is limited space, so please register early.

The proprietors of the Best Western have given us an excellent room
rate, and extended it to the previous week, so that people attending
the GMOD meeting and the Plant and Animal Genome meeting before it may
stay at the Best Western the entire time.

Please direct follow up questions to the gmod-devel mailing list:
https://lists.sourceforge.net/lists/listinfo/gmod-devel

Thanks and I look forward to seeing you in San Diego!
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From j.inoue at ucl.ac.uk  Sat Nov 14 14:20:29 2009
From: j.inoue at ucl.ac.uk (Jun Inoue)
Date: Sat, 14 Nov 2009 19:20:29 +0000
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
Message-ID: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>

Dear All,

I just started to learn BioPerl for phylogenetics.
I usually use perl v5.10.0 on Mac OS 10.5.8.
I would like to ask for a hint on how to calculate the branch lengths
from root to tip for all species in a tree in Newick format.

Please see the following web site, where I explain what I want to do
and show my simple (unfinished) script.
http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html

Thank you for your help.

Best,
Jun Inoue
http://www.geocities.jp/ancientfishtree/index_eng.html


From maj at fortinbras.us  Sat Nov 14 16:47:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 14 Nov 2009 16:47:37 -0500
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
In-Reply-To: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
References: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
Message-ID: <3BC179984D5E49868C4F12D181D82B8D@NewLife>

Hi Jun,

Some hints: incorporate

@leaves = $tree->get_leaf_nodes;

and

use Bio::Tree::TreeFunctionsI;
$distance = $tree->distance( $node_a, $node_b );

cheers, Mark

----- Original Message ----- 
From: "Jun Inoue" <j.inoue at ucl.ac.uk>
To: <bioperl-l at lists.open-bio.org>
Cc: "?? ?" <j.inoue at ucl.ac.uk>
Sent: Saturday, November 14, 2009 2:20 PM
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths


> Dear All,
>
> I just started to learn BioPerl for phylogenetics.
> Usually I am using perl v5.10.0 on my Mac OS 10.5.8.
> I would like to ask you a hint to calculate the Branch lengths
> from root to tip for all species in NEWICK TREE format.
>
> Please see the following web site.
> I am explaining what I want to do and
> showing my easy script (not completed).
> http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html
>
> Thank you for your help.
>
> Best,
> Jun Inoue
> http://www.geocities.jp/ancientfishtree/index_eng.html
>
> 
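
To make the quantity concrete: the root-to-tip length of a species is the sum of the branch lengths on the path from the root down to that leaf. The sketch below computes it in plain Perl straight from a Newick string, without Bio::TreeIO; the function name root_to_tip is made up, and it assumes unquoted names with no whitespace and no bracketed comments. With BioPerl the same numbers should come from asking the tree for the distance between the root node and each leaf, as hinted above. As far as I recall the documented call form is distance(-nodes => [$node_a, $node_b]), but that is worth checking in the Bio::Tree::TreeFunctionsI perldoc.

```perl
use strict;
use warnings;

# Return { leaf name => summed branch length from root to that leaf }.
# Assumes unquoted names without whitespace and no bracketed comments.
sub root_to_tip {
    my ($newick) = @_;
    $newick =~ s/\s+//g;
    my ( %dist, @open, $closed );
    for my $tok ( $newick =~ /([(),;]|[^(),;]+)/g ) {
        if ( $tok eq '(' ) {
            push @open, [];          # leaves seen inside this clade
            undef $closed;
        }
        elsif ( $tok eq ')' ) {
            $closed = pop @open;     # clade that has just been closed
            push @{ $open[-1] }, @$closed if @open;
        }
        elsif ( $tok eq ',' or $tok eq ';' ) {
            undef $closed;
        }
        else {
            my ( $name, $len ) = split /:/, $tok;
            $len = 0 unless defined $len && length $len;
            if ($closed) {           # label and length of an internal node
                $dist{$_} += $len for @$closed;
                undef $closed;
            }
            else {                   # a leaf
                $dist{$name} = $len;
                push @{ $open[-1] }, $name if @open;
            }
        }
    }
    return \%dist;
}

my $dist = root_to_tip('((A:1,B:2):3,C:4);');
printf "%s\t%g\n", $_, $dist->{$_} for sort keys %$dist;
```

For the small tree in the last two lines, A and B sit below an internal branch of length 3, so their totals are 1+3 and 2+3, while C hangs directly off the root. A branch length written on the root itself is added to every leaf.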


From jay at jays.net  Sun Nov 15 20:23:38 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 15 Nov 2009 19:23:38 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <F8052B51-85FB-44B9-9254-9AD1E964FA7B@jays.net>

On Nov 9, 2009, at 9:55 PM, Chris Fields wrote:
> It should work via id_parser(); from Bio::Index::GenBank:
> 
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
> 
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }

This worked great for me today (tackling a different problem than the original).  Thanks!!

j


From veronica.xiaoyu at gmail.com  Fri Nov 13 15:35:48 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 13 Nov 2009 15:35:48 -0500
Subject: [Bioperl-l] Bio::Graphics::Panel question
Message-ID: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>

Hi,

I'm using Bio::Graphics to parse BLAST results and generate images. But
sometimes, in the middle of the output image, a hit is drawn in white
even though I set it to another color. I attached a picture here as an
example. This doesn't occur all the time; usually it works well. I'm
wondering whether I did something wrong, or whether it depends on the
BLAST result?

Thank you,
Xiaoyu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BLAST_problem.jpg
Type: image/jpeg
Size: 51888 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091113/57550aa9/attachment-0003.jpg>

From ryan_bogard at hms.harvard.edu  Sun Nov 15 22:30:22 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Sun, 15 Nov 2009 19:30:22 -0800 (PST)
Subject: [Bioperl-l]  Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
Message-ID: <26366421.post@talk.nabble.com>


In advance, any advice would be greatly appreciated! I have installed
bioperl-588pm via fink but I am having difficulties calling the modules in
script. The following is added to .profile (bash):
PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB

If I change this to /sw/lib/perl5 then I get an @INC error, as use Bio::PERL
cannot be located.

The environment variables are as follows:

MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
INFOPATH=/sw/share/info:/sw/info:/usr/share/info


This is the perl script I'm attempting to run:
#!/sw/bin/perl5.8.8
use strict;
use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

Here is the error output:

dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

dyld: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

Trace/BPT trap

I have looked through many forum postings and attempted the solutions
offered in those instances, but none seem to work in my case. I'm not sure
if it's because I have perl 5.10.0 installed while attempting to call
bioperl 5.8.8; however, others seem to have it working just fine.

Thank you, Ryan 
-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
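
One possible reading of the dyld error is that compiled (XS) modules built for one perl are being loaded by a different perl binary. That is a guess, not a diagnosis. A small check along these lines prints which perl is actually running and flags library directories whose path names a different Perl version; the helper name mismatched_dirs is made up for this sketch, and it only looks at directory names, not at the binaries themselves.

```perl
use strict;
use warnings;

# Return the directories whose path names a Perl version
# (e.g. ".../5.8.8/...") different from the given running version.
sub mismatched_dirs {
    my ( $running, @dirs ) = @_;
    my @bad;
    for my $dir (@dirs) {
        my @versions = $dir =~ m{(?:^|/)(5\.\d+\.\d+)(?=/|$)}g;
        push @bad, $dir if grep { $_ ne $running } @versions;
    }
    return @bad;
}

my $running = sprintf '%vd', $^V;    # e.g. "5.10.0"
print "running perl $running ($^X)\n";
print "suspicious: $_\n" for mismatched_dirs( $running, @INC );
```

Running it once with each perl on the machine (the system one and the one under /sw/bin) shows whether PERL5LIB is pushing one version's directories into the other's @INC. Separately, the posted script uses $seq_object without declaring it, so under "use strict" it would need "my $seq_object" before it can compile.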


From e.osimo at gmail.com  Mon Nov 16 02:04:40 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 16 Nov 2009 08:04:40 +0100
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>

Hello Ryan,
unfortunately, if you upgraded to 10.6 without formatting, I have to tell
you that you'll be in big trouble with perl and with everything you
installed from the command line... In the upgrade process, everything
in the system folders (perl and bioperl among them) is erased without
being uninstalled, so you'll find a lot of folders with the same name
but no contents.
I suggest that you format your machine and reinstall 10.6 from scratch,
as I did. Then you'll be able to install mysql (I had to install
mysql-5.4.3-beta-osx10.5, the only one that works on 10.6) and, working
with the perl 5.10 that is already installed, you'll install bioperl
with no effort.
Bye
Emanuele



From ryan_bogard at hms.harvard.edu  Mon Nov 16 08:43:19 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 05:43:19 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <26372079.post@talk.nabble.com>


Mac OS X 10.6 was a fresh install on a new MacBook Pro. I'm not sure whether
I will hit the same issues, but it's worth a shot, as I have little on my
computer and reinstalling to start over wouldn't be too difficult. What
method did you use to install bioperl? I used fink, and I am not sure the
available stable version is the one I need. I will install from the command
line this time around and let you know how it turns out.

Thank you!




From maj at fortinbras.us  Mon Nov 16 08:48:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 16 Nov 2009 08:48:17 -0500
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26372079.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
Message-ID: <8D822081B13F49C2A37677D3A47F38B4@NewLife>

Ryan,
I'm not a mac person, but Koen has said (see 
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you 
want.
cheers
Mark


From cjfields at illinois.edu  Mon Nov 16 10:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:00:09 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <49681E01-E95D-4FC6-AE42-6E57ED43AAA2@illinois.edu>

On Nov 16, 2009, at 1:04 AM, Emanuele Osimo wrote:

> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.

> I suggest you, as I did, to format your pc and reinstall 10.6 from scratch.
> Then youl'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele

Just starting from scratch isn't always the best solution (though it is the cleanest).  In this case I don't think anything you mention applies, as the error reports an unresolved perl symbol.  My guess is conflicting perl builds, probably between your system 5.10.0 (Snow Leopard) and your fink-installed perl 5.8.8 (they are binary incompatible).  Also, remember that Snow Leopard is primarily 64-bit, so it might be best to work out whether your fink is compiling 64-bit or 32-bit binaries.

In this case, I would just uninstall the fink-based perl and either use the system one (Snow Leopard = 5.10.0), or roll your own and install 5.10.1 locally or in /usr/local.  Do NOT replace the system one, as that will likely break your OS.

In my experience, and not to bash on fink or MacPorts, I never had much luck with their perl installs.  Unless I plan on using only fink or MacPorts for my OS (not likely in my case), I find they tend to cause problems in the long term unless one uses them to install packages with very few dependencies, and even then you need to make sure fink is configured to compile the correct binary.  For instance, they're fairly good for gd, libxml2, etc., but beyond that one may run into odd, version-specific dependencies with some packages, such as relying on perl 5.8.8 (but not perl 5.10.x), db42 (instead of db44), etc.  I've ended up in the past with 2-3 different perl versions, Berkeley DB versions, etc.
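
One way to look for the kind of mixed-build setup described above is to ask the running interpreter what it is and where it searches for modules. The following is only a rough sketch, not a definitive diagnosis: the helper names version_in_path and mismatched_dirs are made up for illustration, and the check only catches library directories that carry a different perl version in their path. Two builds of the same version (threaded vs. unthreaded, 32- vs. 64-bit) would not be flagged, which is why the archname is printed as well.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Config;

# Return the perl version embedded in a library path such as
# /sw/lib/perl5/5.8.8, or undef if the path has no version segment.
sub version_in_path {
    my ($path) = @_;
    return $path =~ m{/(5\.\d+\.\d+)(?:/|$)} ? $1 : undef;
}

# From a list of directories, keep those whose version segment
# differs from the version of the running interpreter.
sub mismatched_dirs {
    my ($running, @dirs) = @_;
    return grep {
        my $v = version_in_path($_);
        defined $v && $v ne $running;
    } @dirs;
}

my $running = sprintf '%vd', $^V;    # dotted version of this perl

print "interpreter: $^X\n";
print "version:     $running\n";
print "archname:    $Config{archname}\n";
print "suspect dir: $_\n" for mismatched_dirs($running, @INC);
```

Run it with the same interpreter named in the failing script's shebang line. Any "suspect dir" line means @INC (often via PERL5LIB) points at modules built for another perl version.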

chris



From cjfields at illinois.edu  Mon Nov 16 10:01:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:01:01 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <8D822081B13F49C2A37677D3A47F38B4@NewLife>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
Message-ID: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>

Actually, why not just install via CPAN?  Any particular reason?

chris

On Nov 16, 2009, at 7:48 AM, Mark A. Jensen wrote:

> Ryan,
> I'm not a mac person, but Koen has said (see http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
> to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you want.
> cheers
> Mark


From Kevin.M.Brown at asu.edu  Mon Nov 16 10:49:13 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 08:49:13 -0700
Subject: [Bioperl-l] Bio::Graphics::Panel question
In-Reply-To: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
References: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B40663EDB9@EX02.asurite.ad.asu.edu>

To tell whether this is a bug, I (and probably the real devs) would need
to see the relevant part of your code and the BLAST file that triggers
the issue. It could be a mismatch between your color-choice callback and
the BLAST data (e.g. your color picker lacks an option for a value that
the data comes in with, and so returns a blank value).
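
To illustrate the failure mode suggested above, here is a minimal sketch of a color picker that always returns a defined color. The thresholds, color names and the function name color_for_score are invented for the example and are not taken from the poster's code. The commented-out hookup assumes the usual Bio::Graphics convention that a callback receives the feature as its first argument.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical score thresholds and colour names, highest first.
my @SCORE_COLORS = (
    [ 200, 'red'    ],
    [ 80,  'orange' ],
    [ 40,  'green'  ],
);
my $DEFAULT_COLOR = 'gray';

# Map a hit score to a colour name. Missing, non-numeric or
# below-threshold scores fall through to the default, so the
# caller never receives an empty value.
sub color_for_score {
    my ($score) = @_;
    return $DEFAULT_COLOR unless defined $score && looks_like_number($score);
    for my $rule (@SCORE_COLORS) {
        my ($min, $color) = @$rule;
        return $color if $score >= $min;
    }
    return $DEFAULT_COLOR;
}

# Assumed hookup (not run here):
#   -bgcolor => sub { my $feature = shift; color_for_score($feature->score) }

for my $score (250, 60, 5, undef) {
    printf "%s => %s\n", (defined $score ? $score : 'undef'),
        color_for_score($score);
}
```

A picker written as a chain of "return X if ..." statements with no final return gives back an empty value for any score outside the listed ranges, which would be consistent with the occasional white hits described in the original question.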

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Xiaoyu Liang
Sent: Friday, November 13, 2009 1:36 PM
To: Bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Bio::Graphics::Panel question

Hi,

I'm using Bio::Graphics to parse the blast result and generate images.
But sometimes, in the middle of the output image, a hit's color is
white, even though I set it to other colors. I attached a picture here
as an example. This doesn't occur all the time; usually it works well.
I'm wondering if I did something wrong, or whether it depends on the
blast result?

Thank you,
Xiaoyu


From ryan_bogard at hms.harvard.edu  Mon Nov 16 11:57:16 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 08:57:16 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
	<58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
Message-ID: <26375418.post@talk.nabble.com>


I read that posting by Koen and used the unstable tree after the first
attempt; however, the errors still persisted. I just finished a fresh
install and I will just follow Mr. Fields's advice and use CPAN.
Thank you all for the help!
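
After a CPAN install, a quick way to confirm that the interpreter you run can actually load the modules is a check along these lines. This is a sketch under assumptions: module_version is a made-up helper name, and the two module names are just the ones most relevant to this thread. Note that a dyld failure like the one reported earlier aborts the whole process and cannot be caught by eval, so a crash here would itself point to a binary mismatch.

```perl
use strict;
use warnings;

# Try to load a module by name. Returns its version string
# ('0' if the module sets none) on success, undef on failure.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = eval { $module->VERSION };
    return defined $version ? $version : '0';
}

for my $module (qw(Bio::Perl Bio::SeqIO)) {
    my $version = module_version($module);
    if (defined $version) {
        print "$module loads under $^X (version $version)\n";
    }
    else {
        print "$module cannot be loaded by $^X\n";
    }
}
```

Separately, the test script quoted in this thread declares use strict but assigns to $seq_object without my. Once the modules load, perl would reject that line at compile time, so it should read my $seq_object = get_sequence(...).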


Chris Fields-5 wrote:
> 
> Actually, why not just install via CPAN?  Any particular reason?
> 
> chris



From krishna.aneesh at gmail.com  Mon Nov 16 02:00:15 2009
From: krishna.aneesh at gmail.com (Aneesh K)
Date: Mon, 16 Nov 2009 12:30:15 +0530
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
Message-ID: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>

Hi,

I just started to use Bioperl modules. They are really useful and interesting.
Now I am stuck on "Tree objects and phylogenetic trees".
I couldn't find any documentation/examples about reading/parsing phylip tree
files.

Please tell me where I can get some sample code for this.

Waiting for your reply.

Thanks
Aneesh.K
Mob. 09646181517


From David.Messina at sbc.su.se  Mon Nov 16 12:33:36 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 16 Nov 2009 18:33:36 +0100
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
	<D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
Message-ID: <B0AEE42A-A40A-4BB9-9A1C-98381CBB4CA9@sbc.su.se>

Hi everyone,

I just committed support for parsing codeml 4.3a (August 2009) to bioperl-live. I added new tests and all PAML-related tests pass, but please report any problems you have to the list.

Note that I haven't tested the other PAML 4.3a executables to see if there are format changes with those. If you get the chance to try any and it doesn't work, let me know and I'll try to add support for them.

(Note that these changes are only to the PAML parsing code; Bio::Tools::Run already appears to handle 4.3a just fine.)


Dave


From jason at bioperl.org  Mon Nov 16 12:34:57 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Nov 2009 09:34:57 -0800
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <D1D4E0B9-4741-4D45-84B6-6BB57B6E2B1E@bioperl.org>

Is this at all helpful for your question?
http://www.bioperl.org/wiki/HOWTO:Trees

The trees are in 'newick' (New Hampshire) format; I don't think
there is a separate phylip format for trees.
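
As a starting point, reading a newick tree with Bio::TreeIO can be sketched as below. The tree string, the temporary file and the helper name leaf_names are invented for the example; the sketch loads Bio::TreeIO lazily so that it reports a missing install instead of dying.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return an array reference with the leaf names of the first tree
# in a newick file, or undef when Bio::TreeIO cannot be loaded.
sub leaf_names {
    my ($file) = @_;
    return unless eval { require Bio::TreeIO; 1 };
    my $in   = Bio::TreeIO->new(-file => $file, -format => 'newick');
    my $tree = $in->next_tree or return [];
    return [ map { $_->id } $tree->get_leaf_nodes ];
}

# Write a tiny made-up tree to a temporary file so the sketch
# does not depend on any existing data.
my ($fh, $path) = tempfile(SUFFIX => '.nwk', UNLINK => 1);
print {$fh} "((A:0.1,B:0.2):0.05,C:0.3);\n";
close $fh;

my $names = leaf_names($path);
print $names ? "leaves: @$names\n" : "Bio::TreeIO is not installed\n";
```

For a tree file written by a PHYLIP program, passing its path to leaf_names instead of the temporary file should be enough, on the assumption that the file holds plain newick text.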

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From roy.chaudhuri at gmail.com  Mon Nov 16 12:31:49 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 16 Nov 2009 17:31:49 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <4B018C85.6020801@gmail.com>

Hi Aneesh,

See the Bioperl trees howto:
http://www.bioperl.org/wiki/HOWTO:Trees

Roy.

Aneesh K wrote:
> Hi,
> 
> I just started to use Bioperl modules. It's really useful and interesting.
> Now I am stuck on "Tree objects and phylogenetic trees".
> I couldn't find any documentation or examples for reading/parsing PHYLIP tree
> files.
> 
> Please tell me where I can get some sample code for this.
> 
> Waiting for your reply.
> 
> Thanks
> Aneesh.K
> Mob. 09646181517


-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From Kevin.M.Brown at asu.edu  Mon Nov 16 13:22:07 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 11:22:07 -0700
Subject: [Bioperl-l] FW:  Bio::Graphics::Panel question
Message-ID: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>

Please keep your responses on the list for more timely help.
 

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University 

 
________________________________

From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com] 
Sent: Monday, November 16, 2009 9:34 AM
To: Kevin Brown
Subject: Re: [Bioperl-l] Bio::Graphics::Panel question


Hi Kevin, 

Thank you for your quick response. I have attached the BLAST .out file here.
My code follows. I keep an array holding the color for each hit, and when I
printed the array out, nothing was missing.

my $track = $panel->add_track(
                              -glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                $feature = shift;
                                #print "--".$feature."\n";
                                return unless $feature->has_tag('description');
                                my ($description) = $feature->each_tag_value('description');
                                my ($id) = $feature->display_name;
                                my @records= split(/\|/,$description);
                                my $score = $feature->score;
                                #print $id.":".$score."\n";
                                if($score >=200){
                                        push (@color_array,1);
                                }elsif($score >=80){
                                        push (@color_array,2);
                                }elsif($score >=50){
                                        push (@color_array,3);
                                }elsif($score >= 40){
                                        push (@color_array,4);
                                }else{
                                        push (@color_array,5);
                                }
                                
                                if($type == 1){
                                        "Species:Arabidopsis TF Family:$records[1] Score=$score";
                                }elsif($type == 2){
                                        if(scalar(@records)==5){
                                                "Species:$records[1] TF Family:$records[2] Accepted Name:$records[3] Score=$score";
                                        }else{
                                                "Species:$records[1] TF Family:$records[2] Score=$score";
                                        }
                                }else{
                                        "";
                                }
                               },
                               -bgcolor => sub{
                                        return unless $feature->has_tag('description');
                                        if($color_array[$index] == 1 ){
                                                $color = 'red';
                                        }
                                        if($color_array[$index]== 2){
                                                $color = 'orange';
                                        }
                                        if($color_array[$index]== 3){
                                                $color = 'green';
                                        }
                                        if($color_array[$index]== 4){
                                                $color = 'blue';
                                        }
                                        if($color_array[$index]== 5){
                                                $color = 'black';
                                        }
                                        #if ($index == 20){
                                        #        $color = 'black';
                                        #}
                                        #print $index."--".$color_array[$index]."\n";
                                        $index++;
                                        
                                        #print $feature."\n";
                                        #print $feature->display_name."\n";
                                        return $color;
                               },
                             );


Best regards,
Xiaoyu


On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
wrote:


	To really be able to tell if this was a bug, I (and probably the real
	devs) would need to see that part of your code and the Blast file that
	is having this issue, as it could be your callback for color choice vs
	the blast object (e.g. your color picker is missing an option that the
	data comes in with and so returns with a blank value).
	

	-----Original Message-----
	From: bioperl-l-bounces at lists.open-bio.org
	[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
Xiaoyu Liang
	Sent: Friday, November 13, 2009 1:36 PM
	To: Bioperl-l at lists.open-bio.org
	Subject: [Bioperl-l] Bio::Graphics::Panel question
	
	Hi,
	
	I'm using Bio::Graphics to parse the blast result and generate images.
	But sometimes, in the middle of the output image, a hit's color is
	white, even though I set it to another color. I attached the picture
	here as an example. This doesn't occur all the time; usually it works
	well. I'm wondering if I did something wrong, or whether it depends on
	the blast result?
	
	Thank you,
	Xiaoyu
	
	
	

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1258388779.out
Type: application/octet-stream
Size: 32599 bytes
Desc: 1258388779.out
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091116/cb23e40d/attachment-0003.obj>

From paolo.pavan at gmail.com  Mon Nov 16 14:06:06 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 16 Nov 2009 20:06:06 +0100
Subject: [Bioperl-l] bioperl-ext installation issue
Message-ID: <56be91b60911161106w69e20fd9k133a465e8d4f8a3f@mail.gmail.com>

Hi everybody,
I have problems installing the bioperl-ext package; any help is much
appreciated.
1)

   - I start by trying cpan i /bioperl-ext/; the only resource available is
   /B/BI/BIRNEY/bioperl-ext-1.4 (is it ok?)
   - I install Inline::MakeMaker and Inline::C, then
   - i/BIRNEY/bioperl-ext-1.4/ fails because I don't have the staden package

2) I try to install io_lib-1.8.10.tar as suggested by the README (
ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/io_lib/), installation fails after:
...
gcc -g -O2 -o makeSCF makeSCF.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o extract_seq.o `test -f extract_seq.c || echo './'`extract_seq.c
/bin/sh ../libtool --mode=link gcc  -g -O2   -o extract_seq  extract_seq.o
../read/libread.la
gcc -g -O2 -o extract_seq extract_seq.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o index_tar.o `test -f index_tar.c || echo './'`index_tar.c
index_tar.c: In function 'main':
index_tar.c:12: error: two or more data types in declaration specifiers
make[2]: *** [index_tar.o] Error 1
make[2]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10/progs'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10'
make: *** [all-recursive-am] Error 2

3) I give up on staden, because I actually need pSW, and try to install from
Makefile.PL in Bio/Ext/Align, but installation fails after:
...
Align.xs:18: warning: 'not_here' defined but not used
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
gcc  -shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic Align.o  -o
../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a    \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local
symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory
`/home/root/.cpan/sources/authors/id/B/BI/BIRNEY/bioperl-ext-1.4/Bio/Ext/Align'
make: *** [subdirs] Error 2

I have also made some other attempts, such as force install Bio::Ext::Align,
without success, but I'm sure I'm missing something trivial that I can't spot.
Can someone help me?

Thank you,
Paolo


From lincoln.stein at gmail.com  Mon Nov 16 15:08:20 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 16 Nov 2009 15:08:20 -0500
Subject: [Bioperl-l] FW: Bio::Graphics::Panel question
In-Reply-To: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
Message-ID: <6dce9a0b0911161208q2f826d83s319184f0cacca097@mail.gmail.com>

Hi,

I think you should modify your color selection code as follows:


                                       if($color_array[$index] == 1 ){
                                               $color = 'red';
                                       }
                                       elsif($color_array[$index]== 2){
                                               $color = 'orange';
                                       }
                                       elsif($color_array[$index]== 3){
                                               $color = 'green';
                                       }
                                       elsif($color_array[$index]== 4){
                                               $color = 'blue';
                                       }
                                       elsif($color_array[$index]== 5){
                                               $color = 'black';
                                       }
                                       else { die "unexpected color array value $color_array[$index]" }
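
An alternative that sidesteps the shared @color_array/$index bookkeeping
altogether is to compute the colour from the feature inside the -bgcolor
callback. This is only a sketch, not a tested fix: score_to_color is a
hypothetical helper, the thresholds are copied from the -description
callback quoted below, and it relies on the callback receiving the feature
as its first argument, as the -description callback in that code does.

```perl
use strict;
use warnings;

# Map a hit score to a colour name, using the same thresholds as the
# -description callback in the original post.
sub score_to_color {
    my ($score) = @_;
    return 'red'    if $score >= 200;
    return 'orange' if $score >= 80;
    return 'green'  if $score >= 50;
    return 'blue'   if $score >= 40;
    return 'black';
}

# Hypothetical use inside add_track: the colour depends only on the
# feature passed in, so no shared array or counter is needed.
#   -bgcolor => sub { my $feature = shift; score_to_color($feature->score) },

# Quick check with a few made-up scores.
print join(',', map { score_to_color($_) } 250, 80, 45, 10), "\n";
```

With this arrangement the colour of a hit no longer depends on the order in
which the two callbacks happen to be invoked.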

Lincoln

On Mon, Nov 16, 2009 at 1:22 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:

> Please keep your responses on the list for more timely help.
>
>
> Kevin Brown
> Center for Innovations in Medicine
> Biodesign Institute
> Arizona State University
>
>
>
> ________________________________
>
> From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com]
> Sent: Monday, November 16, 2009 9:34 AM
> To: Kevin Brown
> Subject: Re: [Bioperl-l] Bio::Graphics::Panel question
>
>
> Hi Kevin,
>
> Thank you for your quick response. I have attached the BLAST .out file here.
> My code follows. I keep an array holding the color for each hit, and when I
> printed the array out, nothing was missing.
>
> my $track = $panel->add_track(
>                              -glyph       => 'graded_segments',
>                              -label       => 1,
>                              -connector   => 'dashed',
>                              -font2color  => 'red',
>                              -sort_order  => 'high_score',
>                              -description => sub {
>                                $feature = shift;
>                                #print "--".$feature."\n";
>                                return unless $feature->has_tag('description');
>                                my ($description) = $feature->each_tag_value('description');
>                                my ($id) = $feature->display_name;
>                                my @records= split(/\|/,$description);
>                                my $score = $feature->score;
>                                #print $id.":".$score."\n";
>                                if($score >=200){
>                                        push (@color_array,1);
>                                }elsif($score >=80){
>                                        push (@color_array,2);
>                                }elsif($score >=50){
>                                        push (@color_array,3);
>                                }elsif($score >= 40){
>                                        push (@color_array,4);
>                                }else{
>                                        push (@color_array,5);
>                                }
>
>                                if($type == 1){
>                                        "Species:Arabidopsis TF Family:$records[1] Score=$score";
>                                }elsif($type == 2){
>                                        if(scalar(@records)==5){
>                                                "Species:$records[1] TF Family:$records[2] Accepted Name:$records[3] Score=$score";
>                                        }else{
>                                                "Species:$records[1] TF Family:$records[2] Score=$score";
>                                        }
>                                }else{
>                                        "";
>                                }
>                               },
>                               -bgcolor => sub{
>                                        return unless $feature->has_tag('description');
>                                        if($color_array[$index] == 1 ){
>                                                $color = 'red';
>                                        }
>                                        if($color_array[$index]== 2){
>                                                $color = 'orange';
>                                        }
>                                        if($color_array[$index]== 3){
>                                                $color = 'green';
>                                        }
>                                        if($color_array[$index]== 4){
>                                                $color = 'blue';
>                                        }
>                                        if($color_array[$index]== 5){
>                                                $color = 'black';
>                                        }
>                                        #if ($index == 20){
>                                        #        $color = 'black';
>                                        #}
>                                        #print $index."--".$color_array[$index]."\n";
>                                        $index++;
>
>                                        #print $feature."\n";
>                                        #print $feature->display_name."\n";
>                                        return $color;
>                               },
>                             );
>
>
> Best regards,
> Xiaoyu
>
>
> On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
> wrote:
>
>
>        To really be able to tell if this was a bug, I (and probably the
>        real devs) would need to see that part of your code and the Blast
>        file that is having this issue, as it could be your callback for
>        color choice vs the blast object (e.g. your color picker is missing
>        an option that the data comes in with and so returns with a blank
>        value).
>
>
>        -----Original Message-----
>        From: bioperl-l-bounces at lists.open-bio.org
>        [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Xiaoyu Liang
>        Sent: Friday, November 13, 2009 1:36 PM
>        To: Bioperl-l at lists.open-bio.org
>        Subject: [Bioperl-l] Bio::Graphics::Panel question
>
>        Hi,
>
>        I'm using Bio::Graphics to parse the blast result and generate
>        images. But sometimes, in the middle of the output image, a hit's
>        color is white, even though I set it to another color. I attached
>        the picture here as an example. This doesn't occur all the time;
>        usually it works well. I'm wondering if I did something wrong, or
>        whether it depends on the blast result?
>
>        Thank you,
>        Xiaoyu
>
>
>
>
>
>
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From ryan_bogard at hms.harvard.edu  Mon Nov 16 16:44:25 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 13:44:25 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <26379710.post@talk.nabble.com>


Thank you all for your help! I was able to get bioperl working via manual
download and install. It was a combination of permissions issues and X86_64
vs. X86_32 compatibility issues. Using fink to download and install seems to
have given me a combination of 32 and 64 associated files (I probably did
something wrong in config). 


rbogard wrote:
> 
> In advance, any advice would be greatly appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> a script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
> 
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL cannot be located.
> 
> The environment variables are as follows:
> 
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
> 
> 
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> Here is the error output:
> 
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> Trace/BPT trap
> 
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
> 
> Thank you, Ryan 
> 

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26379710.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From jay at jays.net  Mon Nov 16 17:02:10 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 16 Nov 2009 16:02:10 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
	<2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
Message-ID: <60ADD3A9-D38B-4A39-A5CE-C8118DEC1242@jays.net>

On Nov 10, 2009, at 12:50 PM, Jason Stajich wrote:
> You might also look at what mygenbank does:
> http://homepage.mac.com/iankorf/mygenbank.html

It appears, perhaps, that BioSQL can provide *foo* searching like so:

http://www.biosql.org/wiki/Schema_Overview#TAXON.2C_TAXON_NAME

 SELECT DISTINCT include.ncbi_taxon_id FROM taxon
    INNER JOIN taxon AS include ON
      (include.left_value BETWEEN taxon.left_value
        AND taxon.right_value)
 WHERE taxon.taxon_id IN
   (SELECT taxon_id FROM taxon_name
    WHERE name LIKE '%fungi%')

So I think we're going to chase that for a while.
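
For anyone unfamiliar with the nested-set encoding that query relies on
(each taxon's left_value/right_value pair brackets those of all its
descendants), here is a toy illustration in plain Perl. The taxa and numbers
are made up for the example and are not real BioSQL or NCBI data, and
taxa_under is a hypothetical helper, not part of any library.

```perl
use strict;
use warnings;

# Toy nested-set table: each taxon stores the left/right numbers assigned
# by a depth-first walk of the tree. All rows are invented for illustration.
my @taxon = (
    { name => 'root',          left => 1, right => 10 },
    { name => 'Fungi',         left => 2, right => 7  },
    { name => 'Ascomycota',    left => 3, right => 4  },
    { name => 'Basidiomycota', left => 5, right => 6  },
    { name => 'Metazoa',       left => 8, right => 9  },
);

# Return the names of every taxon lying inside the subtree of any taxon
# whose name matches $pattern (the matching taxon itself is included).
# This mirrors "include.left_value BETWEEN taxon.left_value AND
# taxon.right_value" in the SQL above.
sub taxa_under {
    my ($pattern) = @_;
    my @hits = grep { $_->{name} =~ /$pattern/i } @taxon;
    my %seen;
    for my $hit (@hits) {
        for my $t (@taxon) {
            $seen{ $t->{name} } = 1
                if $t->{left} >= $hit->{left} && $t->{left} <= $hit->{right};
        }
    }
    my @names = sort keys %seen;
    return @names;
}

print join(', ', taxa_under('fungi')), "\n";
```

The point of the encoding is that a whole subtree can be selected with one
range comparison, without walking parent links.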

I didn't see a *foo* search in MyGenBank?

Thanks,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From roy.chaudhuri at gmail.com  Tue Nov 17 06:24:07 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 17 Nov 2009 11:24:07 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
	<4B018C85.6020801@gmail.com>
	<9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
Message-ID: <4B0287D7.5050702@gmail.com>

Hi Aneesh,

Please keep your replies on the mailing list, that way someone else can 
respond, which would be particularly useful in this case since I know 
nothing about MapIO.

Roy.

Aneesh K wrote:
> Thanks for your reply. 
> 
> I would also like to learn about "Genetic Maps", and to use the
> MapIO object.
> However, I'm not familiar with genetic maps or the mapmaker format.
> 
> Please tell me where I can get some examples of the mapmaker format
> and some example scripts that use the MapIO object.
> 
> Hoping for your reply.
> 
> Aneesh.K
> Mob. 09646181517
> 
> 
> 
> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
> <mailto:roy.chaudhuri at gmail.com>> wrote:
> 
>     Hi Aneesh,
> 
>     See the Bioperl trees howto:
>     http://www.bioperl.org/wiki/HOWTO:Trees
> 
>     Roy.
> 
> 
>     Aneesh K wrote:
> 
>         Hi,
> 
>         I just started to use Bioperl modules. It's really useful and
>         interesting.
>         Now I am stuck on "Tree objects and phylogenetic trees".
>         I couldn't find any documentation or examples for reading/parsing
>         PHYLIP tree files.
> 
>         Please tell me where I can get some sample code for this.
> 
>         Waiting for your reply.
> 
>         Thanks
>         Aneesh.K
>         Mob. 09646181517
> 
> 
> 
>     -- 
>     Dr. Roy Chaudhuri
>     Department of Veterinary Medicine
>     University of Cambridge, U.K.
> 
> 


From maj at fortinbras.us  Tue Nov 17 07:50:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 17 Nov 2009 07:50:06 -0500
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <4B0287D7.5050702@gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com><4B018C85.6020801@gmail.com><9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
	<4B0287D7.5050702@gmail.com>
Message-ID: <394F62D51F15405BBCF8BB50DA0FF336@NewLife>

Aneesh, 
Have a look in the t/Map directory of the BioPerl distribution. These
are test scripts that are also examples of usage. The t/data directory
will contain the datafiles that the tests use; these will provide example data.
cheers 
Mark 
----- Original Message ----- 
From: "Roy Chaudhuri" <roy.chaudhuri at gmail.com>
To: "Aneesh K" <krishna.aneesh at gmail.com>; <bioperl-l at bioperl.org>
Sent: Tuesday, November 17, 2009 6:24 AM
Subject: Re: [Bioperl-l] Regarding Bio::TreeIO Object


> Hi Aneesh,
> 
> Please keep your replies on the mailing list, that way someone else can 
> respond, which would be particularly useful in this case since I know 
> nothing about MapIO.
> 
> Roy.
> 
> Aneesh K wrote:
>> Thanks for your reply. 
>> 
>> I would also like to learn about "Genetic Maps", and to use the
>> MapIO object.
>> However, I'm not familiar with genetic maps or the mapmaker format.
>> 
>> Please tell me where I can get some examples of the mapmaker format
>> and some example scripts that use the MapIO object.
>> 
>> Hoping for your reply.
>> 
>> Aneesh.K
>> Mob. 09646181517
>> 
>> 
>> 
>> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
>> <mailto:roy.chaudhuri at gmail.com>> wrote:
>> 
>>     Hi Aneesh,
>> 
>>     See the Bioperl trees howto:
>>     http://www.bioperl.org/wiki/HOWTO:Trees
>> 
>>     Roy.
>> 
>> 
>>     Aneesh K wrote:
>> 
>>         Hi,
>> 
>>         I just started to use Bioperl modules. It's really useful and
>>         interesting.
>>         Now I am stuck on "Tree objects and phylogenetic trees".
>>         I couldn't find any documentation or examples for reading/parsing
>>         PHYLIP tree files.
>> 
>>         Please tell me where I can get some sample code for this.
>> 
>>         Waiting for your reply.
>> 
>>         Thanks
>>         Aneesh.K
>>         Mob. 09646181517
>> 
>> 
>> 
>>     -- 
>>     Dr. Roy Chaudhuri
>>     Department of Veterinary Medicine
>>     University of Cambridge, U.K.
>> 
>> 
> 
> 
>


From veronica.xiaoyu at gmail.com  Wed Nov 18 12:18:33 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Wed, 18 Nov 2009 12:18:33 -0500
Subject: [Bioperl-l] how to visualize multiple sequences alignments
Message-ID: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>

Hi,

I'm wondering whether there are any modules that can be used for visualizing
multiple sequence alignments, such as the result from ClustalW?

Thank you very much,
Xiaoyu


From jason at bioperl.org  Wed Nov 18 13:23:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Nov 2009 10:23:05 -0800
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
Message-ID: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>

Try Jalview: http://www.jalview.org/

On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:

> Hi,
>
> I'm wondering whether there are any modules that can be used for
> visualizing multiple sequence alignments, such as the result from
> ClustalW?
>
> Thank you very much,
> Xiaoyu

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From andrew.j.grimm at gmail.com  Wed Nov 18 21:52:31 2009
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Thu, 19 Nov 2009 13:52:31 +1100
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
Message-ID: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>

Caution: read the whole email before visiting the bioperl wiki

I was doing some bioinformatics-related searching using google, and
one of the hits was to the bio dot perl dot org wiki (the FAQ in
particular).

When I did that, I was redirected to a ferdax dot com web site (a
typo-squatting of fedex?).

Some people reckon that ferdax hacks web sites and redirects google
hits from the victim web site to their own web site. For example, this
thread at google's webmaster central
http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
(it's talking about zencart, but presumably they've since found other
victims)

Just going to the website without using google may not trigger the redirect.

Apologies if this is a false alarm, but I don't think it is.

I won't be in contact between Friday and Monday Australian time (I'll
be at railscamp 6 in Melbourne), so I won't be able to answer any
replies.

Thanks,

Andrew Grimm


From maj at fortinbras.us  Wed Nov 18 22:14:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 18 Nov 2009 22:14:44 -0500
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
Message-ID: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>

Andrew-- thanks!! We're on it.
MAJ
----- Original Message ----- 
From: "Andrew Grimm" <andrew.j.grimm at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 18, 2009 9:52 PM
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?


> Caution: read the whole email before visiting the bioperl wiki
>
> I was doing some bioinformatics-related searching using google, and
> one of the hits was to the bio dot perl dot org wiki (the FAQ in
> particular).
>
> When I did that, I was redirected to a ferdax dot com web site (a
> typo-squatting of fedex?).
>
> Some people reckon that ferdax hacks web sites and redirects google
> hits from the victim web site to their own web site. For example, this
> thread at google's webmaster central
> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
> (it's talking about zencart, but presumably they've since found other
> victims)
>
> Just going to the website without using google may not trigger the redirect.
>
> Apologies if this is a false alarm, but I don't think it is.
>
> I won't be in contact between Friday and Monday Australian time (I'll
> be at railscamp 6 in Melbourne), so I won't be able to answer any
> replies.
>
> Thanks,
>
> Andrew Grimm
>
> 


From sandipan.chowdhury at physiology.wisc.edu  Thu Nov 19 01:49:45 2009
From: sandipan.chowdhury at physiology.wisc.edu (Sandipan Chowdhury)
Date: Thu, 19 Nov 2009 00:49:45 -0600
Subject: [Bioperl-l] accessing EMBL database
Message-ID: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>

Hi,
 
I have 3 questions, all related to the retrieval of sequences from online databases.
 
(1) I have been trying to download a protein sequence from the EMBL database and trying to write the sequence into a text file, as a string. I am using the following code: 
 
use Bio::DB::EMBL;
open b,">","s.txt";
$em_obj = Bio::DB::EMBL->new;
  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
  $s_str = $seq_obj->seq;
  print b "$s_str\n";
close b;
 
The script is not working and gives the message:
"MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl"
 
I am not sure what this means. A similar version of the script works for the Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way around this so that I can download the embl sequence?
 
(2) Also, is there any way I can download sequences from DDBJ (DNA Data Bank of Japan)?
 
(3) Can GI numbers be used to retrieve the sequences? If so, how?
 
Answers to these questions would be greatly appreciated. I am very new to Perl/Bioperl and am not really familiar with the advanced programming features, so I would need your help to find my way out of this situation.
 
Many Thanks
Sandipan
 

From maj at fortinbras.us  Thu Nov 19 08:10:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 08:10:07 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>

Sandipan-- That id (CAB95729) returns "No entries" from EMBL.
I would agree that the error message is not really informative.
The module documentation warns:

      # remember that EMBL_ID does not equal GenBank_ID!
so I would check that.
MAJ
----- Original Message ----- 
From: "Sandipan Chowdhury" <sandipan.chowdhury at physiology.wisc.edu>
To: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 1:49 AM
Subject: [Bioperl-l] accessing EMBL database


> Hi,
>
> I have 3 questions all related to the retreival of sequences from online 
> databases.
>
> (1) I have been trying to download a protein sequence from the EMBL database 
> and trying to write the sequence into a text file, as a string. I am using the 
> following code:
>
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>  $s_str = $seq_obj->seq;
>  print b "$s_str\n";
> close b;
>
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>
> I am not sure what this means. A similar version of the script works for the 
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way 
> around this so that I can download the embl sequence?
>
> (2) Also, is there anyway I can download sequences from DDBJ (database of 
> Japan)?
>
> (3) Can GI numbers be used to retreive the sequences? If so then how?
>
> Answers to these questions would be greatly appreciated. I am very new to 
> Perl/Bioperl and am not really familiar with the advanced programming 
> features, so I would need to your help to find my way out of this situation.
>
> Many Thanks
> Sandipan


From hrh at fmi.ch  Thu Nov 19 08:23:29 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 19 Nov 2009 14:23:29 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <C72B0561.5887%hrh@fmi.ch>


Sandipan


> I have 3 questions all related to the retreival of sequences from online
> databases.
>  
> (1) I have been trying to download a protein sequence from the EMBL database
> and trying to write the sequence into a text file, as a string. I am using the
> following code: 
>  
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>   $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>   $s_str = $seq_obj->seq;
>   print b "$s_str\n";
> close b;
>  
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>  
> I am not sure what this means. A similar version of the script works for the
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
> around this so that I can download the embl sequence?

"CAB95729" is a protein sequence, ie a translation of the CDS of
'AJ277028.1'.

As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, i.e.
nucleotide sequences.


> (2) Also, is there anyway I can download sequences from DDBJ (database of
> Japan)?

Unless it is for network or speed reasons, why do you want to download data
from DDBJ? It contains the same data as GenBank and EMBL. Those three
databases exchange their data on a daily basis.
  
> (3) Can GI numbers be used to retreive the sequences? If so then how?

Have you looked at Bio::DB::EUtilities? See the 'HOWTOs' page in the
Bioperl Wiki.


Regards, Hans


> Answers to these questions would be greatly appreciated. I am very new to
> Perl/Bioperl and am not really familiar with the advanced programming
> features, so I would need to your help to find my way out of this situation.
>  
> Many Thanks
> Sandipan
>  


From cjfields at illinois.edu  Thu Nov 19 08:47:16 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 07:47:16 -0600
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <C72B0561.5887%hrh@fmi.ch>
References: <C72B0561.5887%hrh@fmi.ch>
Message-ID: <95D416ED-7630-40A1-ABA5-A3C3525D25B1@illinois.edu>


On Nov 19, 2009, at 7:23 AM, Hotz, Hans-Rudolf wrote:

> 
> Sandipan
> 
> 
>> I have 3 questions all related to the retreival of sequences from online
>> databases.
>> 
>> (1) I have been trying to download a protein sequence from the EMBL database
>> and trying to write the sequence into a text file, as a string. I am using the
>> following code: 
>> 
>> use Bio::DB::EMBL;
>> open b,">","s.txt";
>> $em_obj = Bio::DB::EMBL->new;
>>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>>  $s_str = $seq_obj->seq;
>>  print b "$s_str\n";
>> close b;
>> 
>> The script is not working and gives the messege:
>> "MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl"
>> 
>> I am not sure what this means. A similar version of the script works for the
>> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
>> around this so that I can download the embl sequence?
> 
> "CAB95729" is a protein sequence, ie a translation of the CDS of
> 'AJ277028.1'.
> 
> As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, ie the
> nucleotides sequence
> 
> 
> 
>> (2) Also, is there anyway I can download sequences from DDBJ (database of
>> Japan)?
> 
> Unless, for network/speed reason, why do you want to download data from
> DDBJ? It contains the same data as GenBank and EMBL. Those three databases
> exchange their data on a daily basis.
> 
>> (3) Can GI numbers be used to retreive the sequences? If so then how?
> 
> Have you looked at Bio::DB::Eutilities ? See the 'HOWTOs'  page in the
> Bioperl Wiki
> 
> 
> 
> Regards, Hans
> 
> 
> 
>> Answers to these questions would be greatly appreciated. I am very new to
>> Perl/Bioperl and am not really familiar with the advanced programming
>> features, so I would need to your help to find my way out of this situation.
>> 
>> Many Thanks
>> Sandipan

To add to that, if you want the protein sequences as a Bio::Seq you can use Bio::DB::GenPept (Bio::DB::EUtilities will retrieve raw data only).

chris
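
A minimal sketch of the retrieval discussed in this thread, assuming BioPerl is installed and the remote service is reachable. The helper name write_seq_string is hypothetical (it is not a BioPerl function); it accepts any database object that provides get_Seq_by_acc, such as Bio::DB::GenPept for the protein accession, traps a failed lookup with eval, and writes through a lexical filehandle with checked open and close.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch one record by accession and write its
# sequence string to a file. Returns 1 on success, 0 if the lookup failed.
sub write_seq_string {
    my ($db, $acc, $path) = @_;

    # get_Seq_by_acc may throw (as in the stack trace above), so trap it.
    my $seq_obj = eval { $db->get_Seq_by_acc($acc) };
    if (!$seq_obj) {
        my $why = $@ ? $@ : "no record returned\n";
        warn "Could not retrieve '$acc': $why";
        return 0;
    }

    open(my $out, '>', $path) or die "Cannot open '$path' for writing: $!";
    print {$out} $seq_obj->seq, "\n";
    close($out) or die "Cannot close '$path': $!";
    return 1;
}

# Usage (needs BioPerl and network access, so it is left commented out):
#   require Bio::DB::GenPept;
#   write_seq_string(Bio::DB::GenPept->new, 'CAB95729', 's.txt');
```

Passing the database object in keeps the helper independent of which Bio::DB::* class is used, so the same code could be tried with Bio::DB::EMBL and the nucleotide accession AJ277028.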


From David.Messina at sbc.su.se  Thu Nov 19 09:04:55 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 Nov 2009 15:04:55 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
Message-ID: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>

> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave
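
A small sketch of the one-line form proposed above, done on the user side rather than inside BioPerl. The helper name short_error is hypothetical, and the sketch assumes the exception stringifies to the MSG:/STACK: text shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce BioPerl-style exception text to one line.
# Uses the "MSG:" line when present, otherwise the first non-blank line.
sub short_error {
    my ($text) = @_;
    return "ERROR: $1" if $text =~ /^MSG:[ \t]*(.*\S)/m;
    my ($first) = grep { /\S/ } split /\n/, $text;
    return defined $first ? "ERROR: $first" : 'ERROR: (no message)';
}

# Abbreviated copy of the trace quoted in this thread.
my $trace = <<'END_TRACE';
MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: trial2.pl
END_TRACE

print short_error($trace), "\n";
```

In a script this could wrap a risky call, for example eval { ... }; print STDERR short_error("$@"), "\n" if $@; the full trace stays available in $@ for anyone who wants it.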


From maj at fortinbras.us  Thu Nov 19 09:17:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 09:17:05 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
Message-ID: <FADF827A6CE34C959062F2D93849E15A@NewLife>

I'm inclined to agree. Lots of responses to questions here begin
"Well, as the error message said, you need to check...", which suggests
people tend towards "I broke it! Write the list!". I do find it hairy when
my errors are way down in the object tree.
----- Original Message ----- 
From: "Dave Messina" <David.Messina at sbc.su.se>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 9:04 AM
Subject: Re: [Bioperl-l] accessing EMBL database


> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with 
BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of 
complicated stuff, with colons and slashes and line numbers, spewing out at 
them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave


From rtbio.2009 at gmail.com  Thu Nov 19 09:55:27 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Thu, 19 Nov 2009 15:55:27 +0100
Subject: [Bioperl-l] Remote blast
Message-ID: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>

Hello everybody,

I have a problem. I would like to use remote BLAST to find sequences
matching an input sequence.

Example: I would like to find sequences that match a Trypanosoma brucei
sequence.

I want the output to contain only Trypanosoma brucei sequences matching my
query. When I tried using RemoteBlast against the nr database, I got
sequences from different organisms such as E. coli, Pseudomonas, etc.

Could you please tell me how this can be solved?

My code is as follows.

use Bio::Tools::Run::RemoteBlast;
  use strict;
  my $prog = 'blastn';
  my $db   = 'nr';
  my $e_val= '1e-10';
 my $organism= 'Trypanosoma Brucei';

  my @params = ( '-prog' => $prog,
         '-data' => $db,
         '-expect' => $e_val,
         '-readmethod' => 'SearchIO',
         '-Organism'   => $organism );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

  #change a parameter
  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]'

  #remove a parameter
  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

  my $v = 1;
  #$v is just to turn on and off the messages

  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' , '-organism' => 'Trypanosoma Brucei' );

  while (my $input = $str->next_seq()){
    #Blast a sequence against a database:
   my $r = $factory->submit_blast($input);
    #my $r = $factory->submit_blast('amino.fa');

    print STDERR "waiting..." if( $v > 0 );
    while ( my @rids = $factory->each_rid ) {
      foreach my $rid ( @rids ) {
        my $rc = $factory->retrieve_blast($rid);
        if( !ref($rc) ) {
          if( $rc < 0 ) {
            $factory->remove_rid($rid);
          }
          print STDERR "." if ( $v > 0 );
         sleep 5;
        }
     else {
          my $result = $rc->next_result();
          #save the output
          my $filename = $result->query_name()."\.out";
          $factory->save_output($filename);
          $factory->remove_rid($rid);
          print "\nQuery Name: ", $result->query_name(), "\n";
          while ( my $hit = $result->next_hit ) {
            next unless ( $v > 0);
            print "\thit name is ", $hit->name, "\n";
            while( my $hsp = $hit->next_hsp ) {
              print "\t\tscore is ", $hsp->score, "\n";
            }
          }
        }
      }
    }
  }

My input sequence is

>ref|NC_009512.1|:385-1902
GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA

Please mail me regarding any queries.

Regards,
Roopa.


From cjfields at illinois.edu  Thu Nov 19 10:30:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 09:30:34 -0600
Subject: [Bioperl-l] verbosity and error stack, was  accessing EMBL database
In-Reply-To: <FADF827A6CE34C959062F2D93849E15A@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
	<FADF827A6CE34C959062F2D93849E15A@NewLife>
Message-ID: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>

Mark, Dave,

This could be based on verbose(). 

          Level      w     t     d    st
verbose   < 0        -     +     -    -/+
verbose     0        +     +     -    -/+
verbose     1        +     +     +    +/+
verbose   > 1        +* -> +     +    +/+
* converts to throw()
w = warn
t = throw
d = debug
st = stack trace

warn() is set up that way now, you don't get a stack trace unless verbose() is > 0.  throw() could be the same; would be a simple fix, really.

My only problem with the current state of things is (I think we've delved down this path before) verbosity level is tied to exception strictness as seen above, and they're really two separate concepts, at least to me.  Verbosity of 1 or more doesn't necessarily mean I want an elevated level of strictness along with it.  For instance, one might want very strict exceptions w/o the noise, or (conversely) lots of debugging output but no warnings. 

(aside: another small nit, but I haven't exactly liked that the global level of strictness is designated by a env. variable with DEBUG in the name, but that's just me).

I've been thinking it would be nice to have simple separate verbose/strict switches (this is the way it's implemented in Biome).  This would allow some finer grained control over output:

          Level      d    st
verbose     0        -    -
verbose     1        +    +
Default = BIOPERLDEBUG || 0 # current situation

          Level      w     t
strict      -1       -     +
strict      0        +     +
strict      1        +* -> +
* converts to throw()
Default = BIOPERLSTRICT || 0

We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.

chris
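
A rough sketch of the separate verbose/strict switches described above, written as a standalone class so it can be run without BioPerl. The package name Sketch::Reporter and the method names are hypothetical, BIOPERLSTRICT is the proposed (not existing) environment variable from the table, and this is not how Bio::Root::Root or Biome implement it.

```perl
use strict;
use warnings;

{
    package Sketch::Reporter;    # hypothetical class, not part of BioPerl
    use Carp ();

    # verbose and strict are independent; each falls back to its own
    # environment variable and then to 0. (The // operator needs Perl 5.10+.)
    sub new {
        my ($class, %args) = @_;
        return bless {
            verbose => $args{verbose} // $ENV{BIOPERLDEBUG}  // 0,
            strict  => $args{strict}  // $ENV{BIOPERLSTRICT} // 0,
        }, $class;
    }

    # verbose >= 1 enables debug output and stack traces, nothing else.
    sub debug {
        my ($self, $msg) = @_;
        print STDERR "DEBUG: $msg\n" if $self->{verbose} >= 1;
        return;
    }

    sub throw {
        my ($self, $msg) = @_;
        my $text = "ERROR: $msg\n";
        $text .= Carp::longmess('STACK') if $self->{verbose} >= 1;
        die $text;
    }

    # strict < 0 silences warnings, 0 prints them, >= 1 converts to throw().
    sub warning {
        my ($self, $msg) = @_;
        return if $self->{strict} < 0;
        $self->throw($msg) if $self->{strict} >= 1;
        CORE::warn("WARNING: $msg\n");
        return;
    }
}

# Strict exceptions without the noise: the warning becomes a one-line error.
my $reporter = Sketch::Reporter->new(verbose => 0, strict => 1);
eval { $reporter->warning('EMBL stream with no ID') };
print $@;
```

Keeping the two settings in separate fields means the opposite combination (verbose => 1, strict => -1) gives debug output and stack traces while suppressing warnings, which the current single verbose() level cannot express.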

On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:

> I'm inclined to agree. Lots of responses to questions here that begin
> "Well, as the error message said, you need to check...", which means
> people tend towards "I broke it! Write the list!". I do find it hairy when
> my errors are way down in the object tree.
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: <bioperl-l at bioperl.org>
> Sent: Thursday, November 19, 2009 9:04 AM
> Subject: Re: [Bioperl-l] accessing EMBL database
> 
> 
>> I would agree that the error message is not really informative.
> 
> Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.
> 
> I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.
> 
> Perhaps the stack dump should be turned off by default?
> 
> Wouldn't this:
> 
> ERROR: EMBL stream with no ID. Not embl in my book
> 
> 
> 
> Be a lot clearer than this?:
> 
> MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl
> 
> 
> 
> Just a thought. This has probably been discussed before.
> Dave


From roy.chaudhuri at gmail.com  Thu Nov 19 11:10:28 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Thu, 19 Nov 2009 16:10:28 +0000
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
Message-ID: <4B056DF4.2030502@gmail.com>

Hi Roopa,

I think that the -Organism parameter that you specify for 
Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to 
it in the documentation:
http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm

You have the correct approach in your code - limiting the search to the 
Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. 
If you uncomment the line (and add a semicolon afterwards), the program 
runs correctly, but no hits are reported below your threshold e-value. 
If you change the value of $e_val to 10 then some T.brucei hits are 
reported.

Roy.
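
For reference, the two changes described above might look like this when applied to the setup part of the script. This is only a sketch of that part, assuming BioPerl 1.6.x with Bio::Tools::Run::RemoteBlast installed; the submit and retrieve loop from the original script would follow unchanged.

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastn',
    -data       => 'nr',
    -expect     => '10',          # relaxed from 1e-10, as suggested above
    -readmethod => 'SearchIO',
);

# Limit the search with an Entrez query (note the closing semicolon).
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]';
```

The -Organism argument is left out here because, as noted above, it does not appear in the module documentation.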

Roopa Raghuveer wrote:
> Hello everybody,
> 
> I have a problem. I would like to use remote blast to find sequences
> matching for an input sequence.
> 
> Ex:-I would like to search sequences which match Trypanosoma Brucei
> sequence.
> 
> I want the output to be only Trypanosoma Brucei sequences matching with my
> query.When i tried to use remoteblast to nr database,I got sequences from
> different organisms like E.coli,Pseudomonas etc.,
> 
> Could you please tell me how can this be solved...?
> 
> My code is as follows.
> 
> use Bio::Tools::Run::RemoteBlast;
>   use strict;
>   my $prog = 'blastn';
>   my $db   = 'nr';
>   my $e_val= '1e-10';
>  my $organism= 'Trypanosoma Brucei';
> 
>   my @params = ( '-prog' => $prog,
>          '-data' => $db,
>          '-expect' => $e_val,
>          '-readmethod' => 'SearchIO',
>          '-Organism'   => $organism );
> 
>   my $factory = Bio::Tools::Run::RemoteBlast->
> new(@params);
> 
>   #change a paramter
>   #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
> brucei[ORGN]'
> 
>   #remove a parameter
>   #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> 
>   my $v = 1;
>   #$v is just to turn on and off the messages
> 
>   my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
> '-organism' => 'Trypanosoma Brucei' );
> 
>   while (my $input = $str->next_seq()){
>     #Blast a sequence against a database:
>    my $r = $factory->submit_blast($input);
>     #my $r = $factory->submit_blast('amino.fa');
> 
>     print STDERR "waiting..." if( $v > 0 );
>     while ( my @rids = $factory->each_rid ) {
>       foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         if( !ref($rc) ) {
>           if( $rc < 0 ) {
>             $factory->remove_rid($rid);
>           }
>           print STDERR "." if ( $v > 0 );
>          sleep 5;
>         }
>      else {
>           my $result = $rc->next_result();
>           #save the output
>           my $filename = $result->query_name()."\.out";
>           $factory->save_output($filename);
>           $factory->remove_rid($rid);
>           print "\nQuery Name: ", $result->query_name(), "\n";
>           while ( my $hit = $result->next_hit ) {
>             next unless ( $v > 0);
>             print "\thit name is ", $hit->name, "\n";
>             while( my $hsp = $hit->next_hsp ) {
>               print "\t\tscore is ", $hsp->score, "\n";
>             }
>           }
>         }
>       }
>     }
>   }
> 
> My input sequence is
> 
>> ref|NC_009512.1|:385-1902
> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
> 
> Please mail me regarding any queries.
> 
> Regards,
> Roopa.


From clements at nescent.org  Thu Nov 19 12:46:32 2009
From: clements at nescent.org (Dave Clements)
Date: Thu, 19 Nov 2009 18:46:32 +0100
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
	<FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
Message-ID: <f135c01c0911190946t7488718brfed76b975f6d2b2@mail.gmail.com>

Hi Xiaoyu,

I would also take a look at GBrowse_syn, a perl based solution built with
the GBrowse genome browser framework.

See http://gmod.org/wiki/GBrowse_syn.

Cheers,

Dave C.

On Wed, Nov 18, 2009 at 7:23 PM, Jason Stajich <jason at bioperl.org> wrote:

> try jalview http://www.jalview.org/
>
>
> On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:
>
>  Hi,
>>
>> I'm wondering Is there any modules that can be used for visualizing
>> multiple
>> sequences alignments? like the result from ClustalW?
>>
>> Thank you very much,
>> Xiaoyu
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>


-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/January_2010_GMOD_Meeting


From maj at fortinbras.us  Thu Nov 19 18:37:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 18:37:05 -0500
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
	<FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
Message-ID: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>

I like this verbose/strict separability a lot. Should we go for it?
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 10:30 AM
Subject: [Bioperl-l] verbosity and error stack, was accessing EMBL database


> Mark, Dave,
>
> This could be based on verbose().
>
>          Level      w     t     d    st
> verbose   < 0        -     +     -    -/+
> verbose     0        +     +     -    -/+
> verbose     1        +     +     +    +/+
> verbose   > 1        +* -> +     +    +/+
> * converts to throw()
> w = warn
> t = throw
> d = debug
> st = stack trace
>
> warn() is set up that way now, you don't get a stack trace unless verbose() is 
>  > 0.  throw() could be the same; would be a simple fix, really.
>
> My only problem with the current state of things is (I think we've delved down 
> this path before) verbosity level is tied to exception strictness as seen 
> above, and they're really two separate concepts, at least to me.  Verbosity of 
> 1 or more doesn't necessarily mean I want an elevated level of strictness 
> along with it.  For instance, one might want very strict exceptions w/o the 
> noise, or (conversely) lots of debugging output but no warnings.
>
> (aside: another small nit, but I haven't exactly liked that the global level 
> of strictness is designated by a env. variable with DEBUG in the name, but 
> that's just me).
>
> I've been thinking it would be nice to have simple separate verbose/strict 
> switches (this is the way it's implemented in Biome).  This would allow some 
> finer grained control over output:
>
>          Level      d    st
> verbose     0        -    -
> verbose     1        +    +
> Default = BIOPERLDEBUG || 0 # current situation
>
>          Level      w     t
> strict      -1       -     +
> strict      0        +     +
> strict      1        +* -> +
> * converts to throw()
> Default = BIOPERLSTRICT || 0
>
> We could even allow finer-grained control of verbosity (states which cover all 
> combinations) w/o affecting strictness.
>
> chris
>
> On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:
>
>> I'm inclined to agree. Lots of responses to questions here that begin
>> "Well, as the error message said, you need to check...", which means
>> people tend towards "I broke it! Write the list!". I do find it hairy when
>> my errors are way down in the object tree.
>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <bioperl-l at bioperl.org>
>> Sent: Thursday, November 19, 2009 9:04 AM
>> Subject: Re: [Bioperl-l] accessing EMBL database
>>
>>
>>> I would agree that the error message is not really informative.
>>
>> Agreed that it could be better, but I wonder whether part of the problem with 
>> BioPerl error messages is the stack dump.
>>
>> I think a lot of eyes just glaze right over when they see a big wad of 
>> complicated stuff, with colons and slashes and line numbers, spewing out at 
>> them.
>>
>> Perhaps the stack dump should be turned off by default?
>>
>> Wouldn't this:
>>
>> ERROR: EMBL stream with no ID. Not embl in my book
>>
>>
>>
>> Be a lot clearer than this?:
>>
>> MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl
>>
>>
>>
>> Just a thought. This has probably been discussed before.
>> Dave


From michael.watson at bbsrc.ac.uk  Fri Nov 20 05:07:10 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 20 Nov 2009 10:07:10 +0000
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>

Hello

I was just wondering if anyone had had time to look into this?

I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937

Thanks
Mick

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
Sent: 27 October 2009 09:01
To: 'Jason Stajich'
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output

Hi Jason

They both print 0 also.

A bug report it is

Mick

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: 26 October 2009 18:46
To: michael watson (IAH-C)
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output


Is this -m9 -d 0 output or standard default?  I think the strand is  
parsed in the HSP parsing.

Can you double check what $hsp->query->strand and $hsp->hit->strand  
prints?

A full example report as a bug request will be next step if that  
doesn't resolve.

-jason
On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:

> Dear all
>
> Where does this go?  Perhaps I am doing something wrong.
>
> Fasta35 output puts the strand in the hit list at the top:
>
> cluster_99033:3                                (  23) [r]  115 37.9   
> 0.0011
> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
> 0.963   27
>
> The [r] stands for reverse and the [f] stands for forward.
>
> There is also the text "rev-comp" after the hit line further down.
>
> However, when I parse fasta35 output using SearchIO and output the  
> strand of the HSP:
>
> print $hsp->strand('hit'), ",";
> print $hsp->strand('query'), "\n";
>
> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
> for "I don't know which strand it's on").
>
> So the information is there, but it's not getting parsed.   
> Alternatively, I've missed something and will feel a bit foolish.
>
> Currently using BioPerl 1.6.0
>
> Thanks
> Mick
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From David.Messina at sbc.su.se  Fri Nov 20 05:15:11 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 11:15:11 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
Message-ID: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>

Chris, I took a look at how you implemented this in Biome -- very nice!


> I like this verbose/strict separability a lot. Should we go for it?

Me too. So yes, I think so.


> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.


Perhaps this is a job for Log::Log4perl or Log::Dispatch?
http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm


That might be overkill, though.
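
Short of pulling in a logging framework, the separation could be sketched in plain Perl. This is only one reading of the proposal quoted above, not the Biome or Bio::Root::Root code: the class name My::Reporter, the method names, and the verbosity levels (-1, 0, 1) are assumptions made for illustration. Verbosity controls how much text is produced; strictness alone controls whether a warning becomes fatal.

```perl
use strict;
use warnings;

{
    # Hypothetical class for illustration; not part of BioPerl or Biome.
    package My::Reporter;

    sub new {
        my ($class, %args) = @_;
        # Strictness falls back to the BIOPERLSTRICT environment variable, else 0.
        my $strict = defined $args{strict} ? $args{strict}
                   : $ENV{BIOPERLSTRICT}   ? 1
                   :                         0;
        my $verbose = defined $args{verbose} ? $args{verbose} : 0;
        return bless { strict => $strict, verbose => $verbose }, $class;
    }

    # Verbosity decides only how much text is produced (assumed levels):
    # -1 = nothing, 0 = message only, 1 = message plus call stack.
    sub format_message {
        my ($self, $msg) = @_;
        return '' if $self->{verbose} < 0;
        my $text = "MSG: $msg\n";
        if ($self->{verbose} > 0) {
            my $depth = 0;
            while (my @frame = caller($depth++)) {
                $text .= "STACK: $frame[3] $frame[1]:$frame[2]\n";
            }
        }
        return $text;
    }

    # Strictness decides only whether the warning is fatal.
    sub warn_or_throw {
        my ($self, $msg) = @_;
        if ($self->{strict}) {
            # Even a silent reporter keeps the bare message when it dies.
            die($self->format_message($msg) || "MSG: $msg\n");
        }
        my $text = $self->format_message($msg);
        print STDERR $text if length $text;
        return;
    }
}

# Silent but strict: nothing is printed, yet the problem is still fatal.
my $reporter = My::Reporter->new(verbose => -1, strict => 1);
eval { $reporter->warn_or_throw('EMBL stream with no ID') };
print "caught: $@" if $@;
```

With verbose => 0 and strict => 0 the same call would print just the one MSG line and carry on, which is the short error form suggested earlier in the thread.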

Dave


From roychu at gmail.com  Fri Nov 20 05:21:54 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 02:21:54 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
Message-ID: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>

Hi,

Does anyone use dreamhost as a web hosting service?  I'm just curious
if anyone has had any luck installing the module as their daemon seems
to kill my process whenever I try to install it.  Dreamhost tech
support attributes it to either exceeding the allocated memory cache
or exceeding the processing time.  I tried to nice the process, but
that didn't help for me.  Any luck or experience in resolving this
would be much appreciated.  I suppose my next attempt would be to try
installing it directly and hope I don't need root...

Thanks,
Roy


From s.denaxas at gmail.com  Fri Nov 20 05:27:42 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Fri, 20 Nov 2009 11:27:42 +0100
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <bba689ec0911200227g1a8d717elce0daebf6a96c6aa@mail.gmail.com>

Hello,

normally you don't need to be root -
http://sial.org/howto/perl/life-with-cpan/non-root/
Kind of disturbing that their tech support cannot give you a straight
answer on why they are killing the process.

Good luck
Spiros

On Fri, Nov 20, 2009 at 11:21 AM, Chu, Roy <roychu at gmail.com> wrote:

> I suppose my next attempt would be to try
> installing it directly and hope I don't need root...
>
> Thanks,
> Roy
>


From charles-listes+bioperl at plessy.org  Fri Nov 20 05:44:45 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Fri, 20 Nov 2009 19:44:45 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <20091120104445.GG31318@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
> 
> Does anyone use dreamhost as a web hosting service?  I'm just curious
> if anyone has had any luck installing the module as their daemon seems
> to kill my process whenever I try to install it.  Dreamhost tech
> support attributes it to either exceeding the allocated memory cache
> or exceeding the processing time.  I tried to nice the process, but
> that didn't help for me.  Any luck or experience in resolving this
> would be much appreciated.  I suppose my next attempt would be to try
> installing it directly and hope I don't need root...

Dear Roy,

DreamHost uses Debian, so you can suggest that they install the Debian package.
If you are in contact with the tech service, do not hesitate to tell them to
contact me if they are interested in a backport of the 1.6.0 package. For
version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

PS: if you propose installing BioPerl as a feature in the Dreamhost panel, I
will vote for it :)

Have a nice day,

--  
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Fri Nov 20 07:51:39 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 06:51:39 -0600
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
	<8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <E9D5435B-07D6-46A9-AA84-C9667FA0CEDE@illinois.edu>

Mick,

Short answer, no.  It was in the queue to be fixed at some point in 1.6.x, but that queue is quite long.  I'm pushing it into the queue specifically for 1.6.2, so it should be addressed soon.
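
Until that lands, a possible stopgap is to read the [f]/[r] flag straight from the hit list at the top of the report. The sketch below is not the eventual Bio::SearchIO fix; the function name strands_from_hit_list is made up, and the line layout is assumed from the two summary lines quoted in this thread, so it may need adjusting for other FASTA output options.

```perl
use strict;
use warnings;

# Map hit IDs to a strand (1 for [f], -1 for [r]) using the summary lines
# of a FASTA report. Lines that do not look like hit summaries are skipped.
sub strands_from_hit_list {
    my (@lines) = @_;
    my %strand_of;
    for my $line (@lines) {
        # Hit ID, optional description, "( length)", then the [f] or [r] flag.
        if ($line =~ /^(\S+)\s.*?\(\s*\d+\)\s+\[([fr])\]/) {
            $strand_of{$1} = $2 eq 'f' ? 1 : -1;
        }
    }
    return \%strand_of;
}

# The two summary lines from the original report, rejoined onto one line each.
my $strands = strands_from_hit_list(
    'cluster_99033:3                                (  23) [r]  115 37.9   0.0011',
    'cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  0.963   27',
);
printf "%s => %d\n", $_, $strands->{$_} for sort keys %$strands;
```

The resulting hash can be consulted by hit name while looping over the hits from SearchIO, in place of the 0 that $hsp->strand currently returns.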

chris

On Nov 20, 2009, at 4:07 AM, michael watson (IAH-C) wrote:

> Hello
> 
> I was just wondering if anyone had had time to look into this?
> 
> I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937
> 
> Thanks
> Mick
> 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
> Sent: 27 October 2009 09:01
> To: 'Jason Stajich'
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> Hi Jason
> 
> They both print 0 also.
> 
> A bug report it is
> 
> Mick
> 
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
> Sent: 26 October 2009 18:46
> To: michael watson (IAH-C)
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> 
> Is this -m9 -d 0 output or standard default?  I think the strand is  
> parsed in the HSP parsing.
> 
> Can you double check what $hsp->query->strand and $hsp->hit->strand  
> prints?
> 
> A full example report as a bug request will be next step if that  
> doesn't resolve.
> 
> -jason
> On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:
> 
>> Dear all
>> 
>> Where does this go?  Perhaps I am doing something wrong.
>> 
>> Fasta35 output puts the strand in the hit list at the top:
>> 
>> cluster_99033:3                                (  23) [r]  115 37.9   
>> 0.0011
>> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
>> 0.963   27
>> 
>> The [r] stands for reverse and the [f] stands for forward.
>> 
>> There is also the text "rev-comp" after the hit line further down.
>> 
>> However, when I parse fasta35 output using SearchIO and output the  
>> strand of the HSP:
>> 
>> print $hsp->strand('hit'), ",";
>> print $hsp->strand('query'), "\n";
>> 
>> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
>> for "I don't know which strand it's on").
>> 
>> So the information is there, but it's not getting parsed.   
>> Alternatively, I've missed something and will feel a bit foolish.
>> 
>> Currently using BioPerl 1.6.0
>> 
>> Thanks
>> Mick
>> 
>> 
>> 
> 
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> 
> 


From cjfields at illinois.edu  Fri Nov 20 08:00:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 07:00:45 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <20091120104445.GG31318@kunpuu.plessy.org>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
Message-ID: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>


On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:

> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>> 
>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>> if anyone has had any luck installing the module as their daemon seems
>> to kill my process whenever I try to install it.  Dreamhost tech
>> support attributes it to either exceeding the allocated memory cache
>> or exceeding the processing time.  I tried to nice the process, but
>> that didn't help for me.  Any luck or experience in resolving this
>> would be much appreciated.  I suppose my next attempt would be to try
>> installing it directly and hope I don't need root...
> 
> Dear Roy,
> 
> DreamHost uses Debian, so you can suggest that they install the Debian package.
> If you are in contact with the tech service, do not hesitate to tell them to
> contact me if they are interested in a backport of the 1.6.0 package. For
> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

Any reason why this is so?  We specify compatibility back to 5.6.1.

Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.  

A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.

> PS: if you propose installing BioPerl as a feature in the Dreamhost panel, I
> will vote for it :)
> 
> Have a nice day,
> 
> --  
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan

chris


From rtbio.2009 at gmail.com  Fri Nov 20 10:52:09 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Fri, 20 Nov 2009 16:52:09 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
Message-ID: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>

Hello everybody,

I have tried to use Remote blast on Trypanosoma brucei sequences and could
get certain hits. But I am unable to retrieve the complete sequences of the
hits, i.e., I am unable to parse the blast output file to get the complete
sequences of the hits. Here is my code.

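#!/usr/bin/perl -w
use strict;
use Bio::SearchIO;

# Usage: script.pl <blast_report_file> <minimum_fraction_identical>
my $blast_report = Bio::SearchIO->new('-format' => 'blast',
                                      '-file'   => $ARGV[0]);
my $result = $blast_report->next_result;
my $level = $ARGV[1];

my (@arr, @arr1, @arr2);
while( my $hit = $result->next_hit) {
       print $hit->name, "\n";
       push(@arr1,$hit->name);
       while( my $hsp = $hit->next_hsp()) {
        if ($hsp->frac_identical() >= $level) {
            # hit_string holds only the aligned part of the hit sequence
            #print $hsp->hit_string, "\n";
            push(@arr,$hsp->hit_string);
        }
    }
}
# Split hit names such as ref|XM_822286.1| into fields.
# The pipe must be escaped; an unescaped /|/ splits between every character.
foreach my $name (@arr1) {
    push(@arr2,split(/\|/,$name));
}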

Here, I am trying to use the blast output file to get the complete sequence
of each hit, but I could not get the complete sequence.

Input (excerpt from the BLAST report file Tb09.211.2410.out):

             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  661   TTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCC
720

Query  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780

Query  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840

Query  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900

Query  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960

Query  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005
             |||||||||||||||||||||||||||||||||||||||||||||
Sbjct  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005

>ref|XM_822286.1| Trypanosoma brucei TREU927 protein kinase A catalytic
subunit
isoform 2 (Tb09.211.2360) partial mRNA
Length=1011

 Score = 1622 bits (1798),  Expect = 0.0
 Identities = 944/974 (96%), Gaps = 0/974 (0%)
 Strand=Plus/Plus

Query  32    TGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
91
             |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  38    TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
97

Query  92    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
151
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  98    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
157

Query  152   ATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGA
211
             |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Sbjct  158   ATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGA
217

Query  212   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
271
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  218   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
277

Query  272   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
331
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  278   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
337

Query  332   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
391
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  338   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
397

Query  392   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
451
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  398   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
457

Query  452   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
511
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  458   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
517

Query  512   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
571
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  518   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
577

Query  572   TAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGT
631
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

The report continues like this.

The output I got is:
ATGACGACAACTCCCACTGGTGATGGCCAACTGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCCAATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCTCCTCCACTAACCCCTTCGCAACAGG
TTGCATTCCGTGGTTTTTAG

TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGTTCAAATTCCCCAATTGGTTTGACTCCCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATCACGCTCCCATTCCTGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGGGATAAGCGGTTGCCCCCGTTAGCACCATCACAACAATTGGAGTTCCGTGGGTTTTAG
GGATGATGACCGATTGTACCTCCTCCTCGAGTATGTGGTGGGTGGCGAGCTGT

TCTCCCACCTCCGGAAGGCGGGAAAATTCCCTAATGATGTAGCCAAGTTCTACTCCGCAGAAGTGGTTTTGGCGTTTGAATATATTCATGAGTGCGGCATCGTATACCGTGACTTGAAGCCAGAAAATGTGCTTTTGGACAAGCAGGGAAACATTAAGATTACGGACTTTGGGTTCGCGAAACGCGTTAGGGACAGAACGTACACGCTATGTGGGACTCCAGAGTATCTTGCGCCGGAGATAATCCAAAGTAAAGGTCACGATCGGGCTGTGGATTGGTGGACACTCGGAATTCTTCTCTATGAGATGCTTGTCGGTTATCCTCCTTTTTTCGACGAGAGTCCTTTTAGAACATACGAAAAAATTTTAGAGGGGAAACTTCAGTTTCCAAAGTGGGTGGAGATGCGGGCGAAGGACCTCATAAAGAGTTTTTTAACAATTGAACCAACGAAACG

That is, it only gives the regions where the best alignments (the best hits)
were found.

I want the complete sequences, i.e., the sequences corresponding to the accession
numbers
XM_822292.1
XM_822286.1
XM_822694.1

The database used in Remote blast was RefSeq (refseq_rna); the organism was
Trypanosoma brucei.

Can anyone please help me solve this problem?

Regards,
Roopa.
On Fri, Nov 20, 2009 at 12:30 PM, Roopa Raghuveer <rtbio.2009 at gmail.com> wrote:

>
> Hello Roy,
>
> Thanks a lot for your reply. My code is working for my sequence now.
>
> Thanks a lot.
>
> Regards,
> Roopa.
>
> On Thu, Nov 19, 2009 at 5:10 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>
>> Hi Roopa,
>>
>> I think that the -Organism parameter that you specify for
>> Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to it
>> in the documentation:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm
>>
>> You have the correct approach in your code - limiting the search to the
>> Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. If
>> you uncomment the line (and add a semicolon afterwards), the program runs
>> correctly, but no hits are reported below your threshold e-value. If you
>> change the value of $e_val to 10 then some T.brucei hits are reported.
>>
>> Roy.
>>
>> Roopa Raghuveer wrote:
>>
>>> Hello everybody,
>>>
>>> I have a problem. I would like to use remote blast to find sequences
>>> matching for an input sequence.
>>>
>>> Ex:-I would like to search sequences which match Trypanosoma Brucei
>>> sequence.
>>>
>>> I want the output to be only Trypanosoma Brucei sequences matching with
>>> my
>>> query.When i tried to use remoteblast to nr database,I got sequences from
>>> different organisms like E.coli,Pseudomonas etc.,
>>>
>>> Could you please tell me how can this be solved...?
>>>
>>> My code is as follows.
>>>
>>> use Bio::Tools::Run::RemoteBlast;
>>>  use strict;
>>>  my $prog = 'blastn';
>>>  my $db   = 'nr';
>>>  my $e_val= '1e-10';
>>>  my $organism= 'Trypanosoma Brucei';
>>>
>>>  my @params = ( '-prog' => $prog,
>>>         '-data' => $db,
>>>         '-expect' => $e_val,
>>>         '-readmethod' => 'SearchIO',
>>>         '-Organism'   => $organism );
>>>
>>>  my $factory = Bio::Tools::Run::RemoteBlast->
>>> new(@params);
>>>
>>>  #change a paramter
>>>  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
>>> brucei[ORGN]'
>>>
>>>  #remove a parameter
>>>  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>
>>>  my $v = 1;
>>>  #$v is just to turn on and off the messages
>>>
>>>  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
>>> '-organism' => 'Trypanosoma Brucei' );
>>>
>>>  while (my $input = $str->next_seq()){
>>>    #Blast a sequence against a database:
>>>   my $r = $factory->submit_blast($input);
>>>    #my $r = $factory->submit_blast('amino.fa');
>>>
>>>    print STDERR "waiting..." if( $v > 0 );
>>>    while ( my @rids = $factory->each_rid ) {
>>>      foreach my $rid ( @rids ) {
>>>        my $rc = $factory->retrieve_blast($rid);
>>>        if( !ref($rc) ) {
>>>          if( $rc < 0 ) {
>>>            $factory->remove_rid($rid);
>>>          }
>>>          print STDERR "." if ( $v > 0 );
>>>         sleep 5;
>>>        }
>>>     else {
>>>          my $result = $rc->next_result();
>>>          #save the output
>>>          my $filename = $result->query_name()."\.out";
>>>          $factory->save_output($filename);
>>>          $factory->remove_rid($rid);
>>>          print "\nQuery Name: ", $result->query_name(), "\n";
>>>          while ( my $hit = $result->next_hit ) {
>>>            next unless ( $v > 0);
>>>            print "\thit name is ", $hit->name, "\n";
>>>            while( my $hsp = $hit->next_hsp ) {
>>>              print "\t\tscore is ", $hsp->score, "\n";
>>>            }
>>>          }
>>>        }
>>>      }
>>>    }
>>>  }
>>>
>>> My input sequence is
>>>
>>>  ref|NC_009512.1|:385-1902
>>>>
>>> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
>>> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
>>> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
>>> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
>>> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
>>> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
>>> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
>>> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
>>> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
>>> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
>>> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
>>> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
>>> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
>>> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
>>> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
>>> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
>>> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
>>> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
>>> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
>>> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
>>> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
>>> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
>>>
>>> Please mail me regarding any queries.
>>>
>>> Regards,
>>> Roopa.
>>>
>>
>>
>


From mauricio at open-bio.org  Fri Nov 20 11:15:22 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Fri, 20 Nov 2009 10:15:22 -0600
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
	<7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
Message-ID: <4B06C09A.8060708@open-bio.org>

All OBF wikis and blogs have been upgraded and cleaned from the hack. 
Thanks for the heads up!

Mauricio.

Mark A. Jensen wrote:
> Andrew-- thanks!! We're on it.
> MAJ
> ----- Original Message ----- From: "Andrew Grimm" 
> <andrew.j.grimm at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, November 18, 2009 9:52 PM
> Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
> 
> 
>> Caution: read the whole email before visiting the bioperl wiki
>>
>> I was doing some bioinformatics-related searching using google, and
>> one of the hits was to the bio dot perl dot org wiki (the FAQ in
>> particular).
>>
>> When I did that, I was redirected to a ferdax dot com web site (a
>> typo-squatting of fedex?).
>>
>> Some people reckon that ferdax hacks web sites and redirects google
>> hits from the victim web site to their own web site. For example, this
>> thread at google's webmaster central
>> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all 
>>
>> (it's talking about zencart, but presumably they've since found other
>> victims)
>>
>> Just going to the website without using google may not trigger the 
>> redirect.
>>
>> Apologies if this is a false alarm, but I don't think it is.
>>
>> I won't be in contact between Friday and Monday Australian time (I'll
>> be at railscamp 6 in Melbourne), so I won't be able to answer any
>> replies.
>>
>> Thanks,
>>
>> Andrew Grimm
>>
>>
> 
> 


From David.Messina at sbc.su.se  Fri Nov 20 11:39:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 17:39:53 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
	<c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
Message-ID: <7ECF627D-3DBF-4575-89CF-FA6348C88E8E@sbc.su.se>

Hi Roopa,

As far as I know, a BLAST report never contains the complete sequences of the hits. If it includes any part of the hit's sequence, it will be the part that matches the query.

You'll have to use the hit's ID or accession to get its complete sequence from somewhere else. You can use Bio::DB::GenBank to do that, for example.

See
http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
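
A minimal sketch of that second step, assuming hit names of the form ref|XM_822286.1| as in your report. The helper names accession_from_hit_name and fetch_full_sequences are made up for this example; get_Seq_by_acc is the Bio::DB::GenBank call, which needs network access, and whether a versioned accession such as XM_822286.1 is accepted as-is is worth checking against your installed version.

```perl
use strict;
use warnings;

# Pull the accession out of a BLAST hit name such as "ref|XM_822286.1|".
# Falls back to the name itself when no known database tag is found.
sub accession_from_hit_name {
    my ($name) = @_;
    my @fields = split /\|/, $name;
    for my $i (0 .. $#fields - 1) {
        return $fields[$i + 1] if $fields[$i] =~ /^(?:ref|gb|emb|dbj)$/;
    }
    return $name;
}

# Fetch the full-length records and write them as FASTA to STDOUT.
# BioPerl is loaded here so the helper above works without it.
sub fetch_full_sequences {
    my (@accessions) = @_;
    require Bio::DB::GenBank;
    require Bio::SeqIO;
    my $gb  = Bio::DB::GenBank->new;
    my $out = Bio::SeqIO->new(-format => 'fasta', -fh => \*STDOUT);
    for my $acc (@accessions) {
        my $seq = $gb->get_Seq_by_acc($acc);
        $out->write_seq($seq) if $seq;
    }
    return;
}

# Runs only when hit names are given on the command line, e.g.
#   perl fetch_hits.pl 'ref|XM_822292.1|' 'ref|XM_822286.1|'
fetch_full_sequences(map { accession_from_hit_name($_) } @ARGV) if @ARGV;
```

Inside a SearchIO loop, $hit->accession may already give the accession directly, in which case the name-splitting helper is not needed.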


Dave


From alessandra.bilardi at gmail.com  Fri Nov 20 12:44:18 2009
From: alessandra.bilardi at gmail.com (Alessandra)
Date: Fri, 20 Nov 2009 18:44:18 +0100
Subject: [Bioperl-l] Bio::DB::EUtilities question
Message-ID: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>

Hi all,

I'm testing Bio::DB::EUtilities, the web agent which interacts with and
retrieves data from NCBI's eUtils. My perl script works, but only if I
call the get_Response function fewer than ~450 times; otherwise I
get this error message:
------------- EXCEPTION -------------
MSG: Response Error
Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
STACK Bio::DB::GenericWebAgent::get_Response
/usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
STACK toplevel ./wget4gbk.pl:77
-------------------------------------

wget4gbk.pl lines 76-77 are:
my $req = Bio::DB::EUtilities->new(-db => 'genome', -eutil =>
'esummary', -retmode => $mode, -rettype => $type, -id => $id);
my $entry = $req->get_Response;

I ran the script more than ten times, and the error appears at a random
point somewhere between 300 and 600 requests. If I use another system to
request the data, I can make ~10000 requests without errors. Do I have to
set particular parameters on the EUtilities object?

Can you help me with this random exception?

Best,

-- 
 Alessandra Bilardi, Ph. D.
----
 CRIBI, University of Padova, Italy
 http://www.linkedin.com/in/bilardi
----


From maj at fortinbras.us  Fri Nov 20 13:42:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 13:42:38 -0500
Subject: [Bioperl-l] gravatars on the wiki
Message-ID: <94431678F3764E8C9A49EA4D2FCD0DBD@NewLife>

Hi all, 
You can now reveal your Gravatar (http://www.gravatar.com) on the wiki, by including 
the following markup on the page:

 <winterPreWiki>
 {{#gravatar|youremail -at- yourplace -dot- tld}}
 </winterPreWiki>

You can do the antispam measure above, or use a regular email. Invalid emails throw an error.
http://bioperl.org/wiki/Gravatars 
Happy coding, 
MAJ


From roychu at gmail.com  Fri Nov 20 15:23:21 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 12:23:21 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>

"sounds very much like you process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  As for asking their admins to
install it, it seems to be a "they'll try to get to it, but don't hold
your breath" situation.

Hmmm, I made some other attempts; installing 1.4.0 posed no problems.
I'm not a perl guru, so I tried increasing the build cache size from
the default 10 MB, hoping that might be the problem. I can't imagine
how, though, since I doubt the package size differs that much between
versions (though honestly, I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem, I guess after
the 'make' step, and lists these files:
BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
and it typically gets killed here: '> Killed'

Next, I tried 1.6.0 and got this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
EQUALLY appreciated would be a recommendation of a decent hosting
service where someone has had success installing BioPerl.  I'd try to
set up my Mac's web sharing feature and then set things up locally,
but I haven't yet been able to get port forwarding working properly on
the Apple AirPort Extreme--perplexing.  Next, I might just try to
install via the Build.PL script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  A heads-up about any false
assumptions on my part would be greatly appreciated.

Thanks in advance,
Roy

On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>
>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>
>>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>>> if anyone has had any luck installing the module as their daemon seems
>>> to kill my process whenever I try to install it.  Dreamhost tech
>>> support attributes it to either exceeding the allocated memory cache
>>> or exceeding the processing time.  I tried to nice the process, but
>>> that didn't help for me.  Any luck or experience in resolving this
>>> would be much appreciated.  I suppose my next attempt would be to try
>>> installing it directly and hope I don't need root...
>>
>> Dear Roy,
>>
>> DreamHost uses Debian, so you can suggest that they install the Debian package.
>> If you are in contact with the tech service, do not hesitate to tell them to
>> contact me if they are interested in a backport of the 1.6.0 package. For
>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>
> Any reason why this is so?  We specify compatibility back to 5.6.1.
>
> Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.
>
> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.
>
>> PS: if you propose to install BioPerl as a feature in the Dreamhost panel, I
>> will vote for it :)
>>
>> Have a nice day,
>>
>> --
>> Charles Plessy
>> Debian Med packaging team,
>> http://www.debian.org/devel/debian-med
>> Tsurumi, Kanagawa, Japan
>
> chris


From cjfields at illinois.edu  Fri Nov 20 15:40:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 14:40:24 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <1D1B0987-3309-4281-BCE0-2737E4F0D0B1@illinois.edu>

BioPerl is pure perl.  If you believe all dependencies are installed, just unpack the dist to a specific directory and point PERL5LIB at it (for bash):

export PERL5LIB=/home/USER/bioperl/bioperl-live

Note that if you plan on doing the same for other bioperl-related modules (ex: bioperl-db) you'll need to add 'lib' to it, as they use a generic Module::Build now.

export PERL5LIB=/home/USER/bioperl/bioperl-db/lib

You can also add a 'use lib' directive to your scripts.  More at the following link:

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#USING_MODULES_NOT_INSTALLED_IN_THE_STANDARD_LOCATION
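
For instance, the 'use lib' variant might look like the sketch below. The path is the same placeholder used in the PERL5LIB example above, so substitute wherever you unpacked the distribution; the print statement is only there to show the effect.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder path from the PERL5LIB example; substitute your own location.
use lib '/home/USER/bioperl/bioperl-live';

# 'use lib' prepends the directory to @INC at compile time, so modules
# unpacked there are found before any system-wide copies.
print "First entry in \@INC: $INC[0]\n";

# With the path in place, BioPerl modules load as usual, e.g.:
#   use Bio::SeqIO;
```

Compared with PERL5LIB, this keeps the setting inside the script, which can be handy on shared hosts where the environment of a CGI process is not under your control.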

chris



From charles-listes+bioperl at plessy.org  Fri Nov 20 20:07:23 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Sat, 21 Nov 2009 10:07:23 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <20091121010723.GA7786@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 07:00:45AM -0600, Chris Fields wrote:
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
> > 
> > DreamHost uses Debian, so you can suggest them to install the Debian
> > package.  If you are in contact with the tech service, do not hesitate to
> > tell them to contact me if they are interested by a backport of the 1.6.0
> > package. For version 1.6.1, it may be more difficult as it depends on perl
> > 5.10.1.
> 
> Any reason why this is so?  We specify compatibility back to 5.6.1.

Dear Chris,

you make a good point: although for building we need to either depend on perl
5.10.1 or package ExtUtils::Manifest separately, the resulting bioperl package
does not depend on such a high version. Therefore, there is no need for a
backport, and the latest Debian package can be installed on a Debian stable
(5.0/Lenny) system. I just checked the Dreamhost machine to which I happen to
have access, "waratahs", and it seems to be older, but it may still be
worth asking the admins (with the big drawback that they would have to
be asked for each update).

Have a nice week-end,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From robert.bradbury at gmail.com  Fri Nov 20 20:40:14 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 20 Nov 2009 20:40:14 -0500
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
Message-ID: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>

I run a Linux system which is in a gradual process of evolution from the
default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
Google's Chromium (IMO, perhaps the best so far).  Chromium allows one to
create a process per tab/URL, so one can effectively track what each is
doing.  It also allows one to track the machine usage of these processes
(through the Developer > Task manager [shift-escape keyboard] option),
which, though expensive in terms of overhead, allows one to find offending
windows (in terms of memory or CPU use).

My processor recently jumped from its typical 700 MHz to 1.4 GHz range
(using the Linux ondemand CPU frequency governor, which saves ~20 W at the
wall outlet -- I've measured it) to the full-tilt 2.8 GHz the CPU is
capable of.  Looking at the Chromium task manager, I was not surprised to
find the NY Times high on the list (they are pushing content, esp. using
Javascript), but much to my dismay the Jalview and Howto:Trees:Bioperl
pages also appeared high on the list.

Now I am forced to ask myself *why* sites which are simply distributing
static information are eating up CPU on my machine!  This is a fundamental
flaw in the architecture of the sites; there should be conscious efforts
to minimize user CPU use (or avoid Javascript entirely).  This would not
be a problem if I were using Firefox, as I can easily use NoScript to block
Javascript from non-approved sites.  But it raises the question of when one
should allow Javascript to run (one would "normally" approve academic sites
by default) when even the academic sites are abusing my CPU.

There needs to be much greater awareness, on the part of both software
distributors and software consumers, that it is *MY* CPU and *MY*
electricity and *MY* contribution to global warming.  Developers and
distributors should not be sucking down those resources without first
asking "May I?", and I should have the option of saying "No, you may not."
There is enough we can do productively (running low-homology blast
searches) without engaging in endless wheel-spinning of Javascripts or
looped GIFs.

Robert


From maj at fortinbras.us  Fri Nov 20 23:17:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:17:12 -0500
Subject: [Bioperl-l] ohlohers
Message-ID: <C003FAD20636489DBFB2D34F5955C68D@NewLife>

You can now add your Ohloh widgets and increase your carbon footprint with the less crufty:

 <winterPreWiki>
 {{#ohloh|acct_id|TYPE}}
 </winterPreWiki>

where TYPE is [Detailed|Rank|Tiny]. Taint checks aplenty.
MAJ


From maj at fortinbras.us  Fri Nov 20 23:33:02 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:33:02 -0500
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com><20091120104445.GG31318@kunpuu.plessy.org><ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <9ECC66C2F23F47469AF0F07E3F9307FC@NewLife>

Maybe 'nightmarehost' is more appropriate. I've had no problems on AWS,
but this may not be exactly what you need. MAJ


From cjfields at illinois.edu  Fri Nov 20 23:38:23 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 22:38:23 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
Message-ID: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>

Robert, 

Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in general) do not use JS, unless there is a specific addition I'm unaware of.  Now, the site wiki was recently 'parasited' for redirects, which may be the culprit, but this is now fixed.  Can you at least retest to see if this persists?

Anyone else know about this?

chris



From sdavis2 at mail.nih.gov  Sat Nov 21 00:11:34 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 20 Nov 2009 21:11:34 -0800
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
Message-ID: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>

On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Robert,
>
> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
> general) do not use JS, unless there is a specific addition I'm unaware of.
>  Now, the site wiki was recently 'parasited' for redirects, which may be the
> culprit, but this is now fixed.  Can you at least retest to see if this
> persists?
>
> Anyone else know about this?
>
>
Judging from the source, the page in question does include JavaScript.
I believe this is a function of using MediaWiki, though, and not something
specific to that page.

Sean


From cjfields at illinois.edu  Sat Nov 21 00:20:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 23:20:37 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
	<264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
Message-ID: <A7AC3865-3C9A-4C6E-85B5-349240C40680@illinois.edu>

On Nov 20, 2009, at 11:11 PM, Sean Davis wrote:

> On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:
> 
>> Robert,
>> 
>> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
>> general) do not use JS, unless there is a specific addition I'm unaware of.
>> Now, the site wiki was recently 'parasited' for redirects, which may be the
>> culprit, but this is now fixed.  Can you at least retest to see if this
>> persists?
>> 
>> Anyone else know about this?
>> 
>> 
> The page in question does include javascript, it appears from the source.
> This is a function of using mediawiki, though, I believe and not something
> specific to that page.
> 
> Sean

</sound of my hand slapping my forehead>

Sean, thanks for pointing that out.

chris


From robert.bradbury at gmail.com  Sat Nov 21 13:26:05 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Sat, 21 Nov 2009 13:26:05 -0500
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
Message-ID: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>

It sounds like NCBI may be counting frequency of requests, how much data
they send or something similar.  Are you delaying the time between fetches?
 The code I've seen typically sleeps for a few seconds each time around a
loop.  You might try longer delays between fetches and see if that gets you
any more data.
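
A rough sketch of that idea, pausing between fetches and retrying a failed one after a growing delay. The helper name with_retries and the delay values are made up for illustration; the Bio::DB::EUtilities call in the usage comment uses only arguments from the original script and is left commented out because it needs BioPerl and network access.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run $code and return its result; on failure
# (exception or undef result) wait, double the delay, and try again.
sub with_retries {
    my ($code, %opt) = @_;
    my $tries = $opt{tries} || 3;
    my $delay = defined $opt{delay} ? $opt{delay} : 5;    # seconds
    for my $attempt (1 .. $tries) {
        my $result = eval { $code->() };
        return $result if defined $result;
        die "Giving up after $tries attempts: $@" if $attempt == $tries;
        warn "Attempt $attempt failed, sleeping ${delay}s\n";
        sleep $delay;
        $delay *= 2;
    }
}

# Usage sketch (commented out: needs BioPerl and a network connection):
# for my $id (@ids) {
#     my $entry = with_retries(sub {
#         Bio::DB::EUtilities->new(-db => 'genome', -eutil => 'esummary',
#                                  -id => $id)->get_Response;
#     });
#     sleep 3;    # pause between fetches
# }
```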

Alternatively perhaps the libraries aren't reusing the TCP/IP connection
properly.  Is there a difference between the amount of memory on the
machines?  Have you watched the size of the process to see if it grows over
time?  I think the bug which prevented me from fetching a not-so-large
genome from a few months ago (eating up 3GB of memory in the process) has
not been resolved.  If so that could be your problem.

Robert

On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
<alessandra.bilardi at gmail.com>wrote:
>
>
> I'm testing Bio::DB::EUtilities - webagent which interacts with and
> retrieves data from NCBI's eUtils. My perl script works but it works
> only if I request less than ~450 times get_Response function.. else I
> have got this error message:
>
> ------------- EXCEPTION -------------
> MSG: Response Error
> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
> STACK Bio::DB::GenericWebAgent::get_Response
> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
> STACK toplevel ./wget4gbk.pl:77
>


From cjfields at illinois.edu  Sat Nov 21 14:19:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 13:19:24 -0600
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
	<deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
Message-ID: <837CE7E7-E625-4285-AD54-06FD168C0DF3@illinois.edu>

NCBI has specific rules about the repeated queries to its servers:

http://eutils.ncbi.nlm.nih.gov/#UserSystemRequirements

According to that, if you make more than 100 requests at peak times you will run into problems (they'll probably temp-block your IP), even though the required delay between requests is much shorter now (the limit is 3 requests/second, whereas a year or two ago it was one request every 3 sec).  In general it's best to run something like this during off-hours.

Enforcing the limit on the number of server requests is one part of Bio::DB::EUtilities that hasn't been implemented yet, but is tentatively planned.
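
Until then, a script can throttle itself. The following is only a sketch using core Perl (Time::HiRes); make_throttle is a hypothetical helper, not part of Bio::DB::EUtilities, and the rate of 3 per second is simply the figure mentioned above, so check NCBI's current rules before relying on it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);    # sub-second time() and sleep()

# Hypothetical helper: returns a closure that blocks until at least
# 1/$per_second seconds have passed since the previous call.
sub make_throttle {
    my ($per_second) = @_;
    my $min_gap = 1 / $per_second;
    my $last    = 0;
    return sub {
        my $wait = $last + $min_gap - time;
        sleep $wait if $wait > 0;
        $last = time;
        return;
    };
}

my $throttle = make_throttle(3);    # at most 3 requests per second

# Usage sketch (commented out: needs BioPerl and a network connection):
# for my $id (@ids) {
#     $throttle->();
#     my $entry = Bio::DB::EUtilities->new(-db => 'genome',
#         -eutil => 'esummary', -id => $id)->get_Response;
# }
```

Spacing the calls evenly is the simplest policy; it never bursts, which is a bit stricter than a plain per-second cap requires.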

chris



From cjfields at illinois.edu  Sat Nov 21 21:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 20:58:37 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
Message-ID: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>

Jason and I were recently interviewed (Wednesday!) about BioPerl for FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and Kirsten Sanford.  The interview is now available online, so get your favorite flavor (MP3, podcast) here:

http://twit.tv/floss96

Enjoy!

chris and jason


From adsj at novozymes.com  Sun Nov 22 07:37:40 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Sun, 22 Nov 2009 13:37:40 +0100
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu> (Chris
	Fields's message of "Sat, 21 Nov 2009 20:58:37 -0600")
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
Message-ID: <87aaye91m3.fsf@topper.koldfront.dk>

On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:

> Jason and I were recently interviewed (Wednesday!) about BioPerl for
> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
> Kirsten Sanford.

Great!

How about linking to it on bioperl.org?


  :-),

   Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From cjfields at illinois.edu  Sun Nov 22 15:30:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 22 Nov 2009 14:30:01 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <87aaye91m3.fsf@topper.koldfront.dk>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
	<87aaye91m3.fsf@topper.koldfront.dk>
Message-ID: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
> 
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
> 
> Great!
> 
> How about linking to it on bioperl.org?
> 
> 
>  :-),
> 
>   Adam
> 
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main page.  

Since this is the second such interview (Jason did one a few years back for PerlCast), I'm thinking we need a media page of some sort.

chris


From maj at fortinbras.us  Sun Nov 22 15:48:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 22 Nov 2009 15:48:39 -0500
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu><87aaye91m3.fsf@topper.koldfront.dk>
	<2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
Message-ID: <247658CC6D9A4529B281F4482BD3E4BD@NewLife>

We do have http://www.bioperl.org/wiki/Category:BioPerl_Media --
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Adam Sjøgren" <adsj at novozymes.com>
Cc: <bioperl-l at bioperl.org>
Sent: Sunday, November 22, 2009 3:30 PM
Subject: Re: [Bioperl-l] BioPerl on FLOSS Weekly


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
>
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
>
> Great!
>
> How about linking to it on bioperl.org?
>
>
>  :-),
>
>   Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main 
page.

Since this is the second such interview (Jason did one a few years back for 
PerlCast), I'm thinking we need a media page of some sort.

chris


From jardim.rodrigo at gmail.com  Sun Nov 22 11:06:40 2009
From: jardim.rodrigo at gmail.com (Rodrigo Jardim)
Date: Sun, 22 Nov 2009 14:06:40 -0200
Subject: [Bioperl-l] Problems with Genbank Proteins File
Message-ID: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>

I have a problem parsing GenBank protein files. I think it is because
these files have a different order of fields. For example:

In most general genbank files:
========================
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi

In genbank protein files:
===================
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.

When I try to parse such a file, BioPerl aborts with an error message.

Any ideas?

Thanks all,

-- 
Atc,
Rodrigo Jardim
jardim.rodrigo at gmail.com


From biopython at maubp.freeserve.co.uk  Mon Nov 23 12:36:36 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 23 Nov 2009 17:36:36 +0000
Subject: [Bioperl-l] Problems with Genbank Proteins File
In-Reply-To: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
References: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
Message-ID: <320fb6e00911230936ofb9d897rbd45abb73a361250@mail.gmail.com>

On Sun, Nov 22, 2009 at 4:06 PM, Rodrigo Jardim
<jardim.rodrigo at gmail.com> wrote:
> I have a problem parsing GenBank protein files. I think it is because
> these files have a different order of fields. For example:
>
> ...
>
> When I try to parse such a file, BioPerl aborts with an error message.
>
> Any ideas?

There are some important bits of information missing - what is the error
message, and what version of BioPerl are you using?

Peter
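
Both missing pieces can be collected in one go. A minimal sketch, assuming the records are read with Bio::SeqIO in 'genbank' format; genbank_parse_report is a hypothetical helper name, and BioPerl is loaded at run time so that the script also says so when it is not installed:

```perl
use strict;
use warnings;

# Hypothetical helper: read every record from a GenBank file and return a
# one-line report with the BioPerl version and any exception text.
sub genbank_parse_report {
    my ($file) = @_;
    no warnings 'once';   # the version variable only exists after require

    my $loaded = eval { require Bio::SeqIO; require Bio::Root::Version; 1 };
    return "BioPerl could not be loaded: $@" unless $loaded;

    my $version = defined $Bio::Root::Version::VERSION
                ? $Bio::Root::Version::VERSION
                : 'unknown';

    my $count = 0;
    my $ok = eval {
        my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
        $count++ while $in->next_seq;
        1;
    };
    return $ok
        ? "BioPerl $version parsed $count record(s) from $file"
        : "BioPerl $version failed on $file after $count record(s): $@";
}

# Run as: perl report.pl proteins.gbk   (file name is a placeholder)
print genbank_parse_report($ARGV[0]), "\n" if @ARGV;
```

Posting that one line of output, plus the first record that fails, is usually enough for someone on the list to reproduce the problem.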


From maj at fortinbras.us  Mon Nov 23 12:58:46 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 23 Nov 2009 12:58:46 -0500
Subject: [Bioperl-l] building samtools/Bio::DB::Sam on cygwin
Message-ID: <FD03906C0D074E1B8AFDB89A283E9FAB@NewLife>

Hi All--

I've had some hard-won success installing samtools and Lincoln's
Bio::DB::Sam under cygwin; thought some on the list would be able to
use my notes. (Yes, Jason, I'm working on Bio::Tools::Run::BWA...)


(To get the current samtools, ping
http://sourceforge.net/projects/samtools/files/samtools/0.1.7/samtools-0.1.7a.tar.bz2/download
)

* Getting samtools to make from scratch in cygwin

The following diff details the changes to the samtools Makefile I made
by hand. The key points are

-D_WIN32

and the additional variable LFLAGS and its interpolations. To get the
linker to see

libgcc libstdc++

I needed to add symlinks from /lib to the correct files in
/lib/gcc/i386-pc-cygwin/4.3.2/. Your gcc version may differ.


--- ../old/samtools-0.1.7a/Makefile 2009-11-16 10:13:43.000000000 -0500
+++ Makefile 2009-11-23 12:14:18.529000000 -0500
@@ -1,16 +1,18 @@
 CC=   gcc
 CFLAGS=  -g -Wall -O2 #-m64 #-arch ppc
-DFLAGS=  -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1
+LFLAGS=         -lws2_32 -lgcc -lcygwin -lbz2 -lz -lstdc++
+DFLAGS=  -D_WIN32 -D_FILE_OFFSET_BITS=64 -D_CURSES_LIB=1
 LOBJS=  bgzf.o kstring.o bam_aux.o bam.o bam_import.o sam.o bam_index.o \
    bam_pileup.o bam_lpileup.o bam_md.o glf.o razf.o faidx.o knetfile.o \
    bam_sort.o sam_header.o
 AOBJS=  bam_tview.o bam_maqcns.o bam_plcmd.o sam_view.o \
    bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o \
    bamtk.o kaln.o

@@ -36,13 +38,13 @@
   $(AR) -cru $@ $(LOBJS)
 
 samtools:lib $(AOBJS)
-  $(CC) $(CFLAGS) -o $@ $(AOBJS) -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam
+  $(CC) $(CFLAGS) -o $@ $(AOBJS) -Xlinker --enable-auto-import -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam $(LFLAGS)
 
 razip:razip.o razf.o knetfile.o
-  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz
+  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz -lm -lws2_32
 
 bgzip:bgzip.o bgzf.o
-  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz
+  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz -lm -lws2_32
 
 razip.o:razf.h
 bam.o:bam.h razf.h bam_endian.h kstring.h sam_header.h

* Getting Bio::DB::Sam to compile and install

Bio::DB::Sam requires not the samtools.exe, but the bam library
created during the samtools build, as well as all the samtools header
files. Create a symlink in /lib to libbam.a in the build directory (or
copy libbam.a up to /lib), and create symlinks or copy *.h into
/usr/include. Then in cygwin bash shell

$ cpan
cpan> install Bio::DB::Sam

should fly. 

Hope someone finds this useful. These mods led me to a successful
Bio::DB::Sam install--have not yet checked original code based on
Bio::DB::Sam. If they don't work for you, reply to the list.
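
Before starting the cpan install it can save a failed build to confirm that the library and a header are where the notes above put them. A minimal sketch; missing_sam_build_inputs is a hypothetical helper, and checking only libbam.a and bam.h is an assumption about what is representative, not a full list of what the build needs:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the expected build inputs that are absent.
# -e follows symlinks, so a link to a missing file also counts as absent.
sub missing_sam_build_inputs {
    my ($lib_dir, $include_dir) = @_;
    my @wanted = (
        File::Spec->catfile($lib_dir,     'libbam.a'),
        File::Spec->catfile($include_dir, 'bam.h'),
    );
    return grep { !-e $_ } @wanted;
}

my @missing = missing_sam_build_inputs('/lib', '/usr/include');
print @missing
    ? "missing: @missing\n"
    : "libbam.a and bam.h found\n";
```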

cheers, 
MAJ 


From jcline at ieee.org  Mon Nov 23 14:13:26 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 23 Nov 2009 13:13:26 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
References: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
Message-ID: <4B0ADED6.8040901@ieee.org>

Dreamhost has terrible reliability.  I have stats going back years on a
standard Dreamhost hosting account (non-dedicated server), and on some
days the web server doesn't respond.  Dreamhost service is OK for a
hobby blog; however, it is definitely *not* suitable for anything real.
Add in latency, arbitrary account limits/restrictions, etc., and it is a
bad idea to host a project there.  Although some users apparently get
lucky with server allocation and end up on a "good server", the provider
can change this at any time as well.  I think, more typically, account
users don't notice, since most are simple bloggers.

Here's a data snip that illustrates the problem with a typical dreamhost
account:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2008-08-05     91.40     0.000     0.528     0.528     2.257     1.619
2008-08-04     89.13     0.002     0.301     0.301     1.302     0.971
2008-08-03     94.62     0.000     0.567     0.567     1.506     0.913
2008-08-02    100.00     0.000     0.335     0.335     1.475     1.079
2008-08-01    100.00     0.000     0.310     0.310     1.587     0.825
2008-07-31     93.55     0.023     0.386     0.386     1.280     0.759
2008-07-30    100.00     0.000     0.345     0.345     1.373     0.860
2008-07-29    100.00     0.000     0.358     0.358     1.335     0.757
2008-07-28    100.00     0.000     0.327     0.327     1.462     0.896
2008-07-27    100.00     0.000     0.292     0.292     1.410     0.966
2008-07-26    100.00     0.000     0.283     0.283     1.280     0.815
2008-07-25    100.00     0.000     0.297     0.297     1.231     0.853
2008-07-24    100.00     0.000     0.362     0.362     1.258     0.699
2008-07-23    100.00     0.000     0.339     0.339     1.270     0.785

----------------------------------------------------------------------
minimum        89.13     0.000     0.283     0.283     1.231     0.699
maximum       100.00     0.023     0.567     0.567     2.257     1.619
average        97.76     0.002     0.359     0.359     1.430     0.914
----------------------------------------------------------------------


Or this month:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2009-11-11    100.00     0.011     0.097     0.097     1.260     1.638
2009-11-10    100.00     0.008     0.094     0.094     1.285     1.647
2009-11-09    100.00     0.008     0.094     0.094     1.494     1.872
2009-11-08    100.00     0.015     0.101     0.101     1.509     1.894
2009-11-07    100.00     0.006     0.092     0.092     1.453     1.831
2009-11-06    100.00     0.011     0.097     0.097     1.500     1.882
2009-11-05     97.80     0.012     0.097     0.097     1.445     1.806
2009-11-04    100.00     0.010     0.096     0.096     1.235     1.605
2009-11-03     95.65     0.007     0.093     0.093     1.266     1.612
2009-11-02    100.00     0.010     0.096     0.096     1.267     1.637
2009-11-01    100.00     0.007     0.093     0.093     1.311     1.692
2009-10-31    100.00     0.009     0.095     0.095     1.225     1.594
2009-10-30    100.00     0.009     0.095     0.095     1.364     1.739
2009-10-29    100.00     0.017     0.103     0.103     1.121     1.505

----------------------------------------------------------------------
minimum        95.65     0.006     0.092     0.092     1.121     1.505
maximum       100.00     0.017     0.103     0.103     1.509     1.894
average        99.53     0.010     0.096     0.096     1.338     1.711
----------------------------------------------------------------------


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From cjfields at illinois.edu  Mon Nov 23 22:19:02 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 23 Nov 2009 21:19:02 -0600
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
Message-ID: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>

Okay, so I think it's feasible to add this into trunk.  I like the idea of optionally having a log class; if someone comes up with a nice way of adding it in, I would be for it.

chris

On Nov 20, 2009, at 4:15 AM, Dave Messina wrote:

> Chris, I took a look at how you implemented this in Biome -- very nice!
> 
> 
>> I like this verbose/strict separability a lot. Should we go for it?
> 
> Me too. So yes, I think so.
> 
> 
>> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
> 
> 
> Perhaps this is a job for Log::Log4Perl or Log::Dispatch?
> http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
> http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm
> 
> 
> That might be overkill, though.
> 
> Dave
> 


From David.Messina at sbc.su.se  Tue Nov 24 11:18:22 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 24 Nov 2009 17:18:22 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
	<167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
Message-ID: <3FD2086D-062F-4706-9DC8-2A53224C4913@sbc.su.se>

> I like the idea of optionally having a log class, if someone comes up with a nice way of adding it in I would be for it.

My suggestion of the logging modules was actually to handle the various levels of verbose output -- I think both of the ones I mentioned "log" to STDERR by default.

But of course a nice side effect of using such a logging module is that it would allow optional logging to a file, too.

Dave
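
To make the verbose/strict split concrete, here is a minimal sketch in plain Perl, with no logging module. Sketch::Reporter and its debug and problem methods are hypothetical names for illustration only; this is not how Biome or BioPerl implements it:

```perl
use strict;
use warnings;

package Sketch::Reporter;   # hypothetical class, not part of BioPerl

sub new {
    my ($class, %arg) = @_;
    my $self = {
        verbose => $arg{verbose} || 0,   # how much chatter to print
        strict  => $arg{strict}  || 0,   # whether problems are fatal
    };
    return bless $self, $class;
}

# Print $msg to STDERR only if verbosity is at least $level.
# Returns 1 if the message was printed, 0 otherwise.
sub debug {
    my ($self, $level, $msg) = @_;
    return 0 if $level > $self->{verbose};
    print STDERR "[debug $level] $msg\n";
    return 1;
}

# A problem is fatal only when strict is set; verbosity plays no part.
sub problem {
    my ($self, $msg) = @_;
    die "EXCEPTION: $msg\n" if $self->{strict};
    warn "WARNING: $msg\n";
    return;
}

package main;

my $reporter = Sketch::Reporter->new(verbose => 1, strict => 0);
$reporter->debug(1, 'shown because verbose >= 1');
$reporter->debug(2, 'hidden unless verbose >= 2');
$reporter->problem('only a warning because strict is off');
```

Swapping the print in debug for a call into Log::Log4perl or Log::Dispatch would add the optional file logging without touching the strict side.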


From paolo.pavan at gmail.com  Tue Nov 24 14:28:09 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Tue, 24 Nov 2009 20:28:09 +0100
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
Message-ID: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>

Dear all,
I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
As documented in the POD, the run(@seqs) method returns the cap3 report file,
while I expected it to return a Bio::Assembly object, consistent with other
Bio::Tools::Run classes.
I worked around this by getting the location and the names of the temp
output files from the factory object (admittedly by accessing a private
property) and reading them via the Assembly::IO system.
I was just wondering what the properly designed way to do this job is.

Thank you for lighting the way!
Paolo


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:04:31 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:04:31 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>

Is there any way to pass a filename to Bio::DB::Fasta for the location of where to write the directory.index?
It's writing in the same dir as the fasta but I'd rather have it write in /tmp as it's part of a web app.

Thanx,

Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 17:21:52 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:21:52 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>

That's what I ended up doing.
Also, there's no "obvious" way to index a single file, so I ended up putting the filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
> 
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
> 
> 
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell


From maj at fortinbras.us  Tue Nov 24 17:18:51 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 17:18:51 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
Message-ID: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>

The code (method index_dir() ) seems to expect all the fasta files to be 
contained in that directory. Looks hairy; what about creating symlinks to your 
fasta files in a /tmp subdir and calling new() with that subdir?
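
A minimal sketch of that symlink approach, assuming a platform where symlink() works; link_fasta_into_tmp is a hypothetical helper and the file path in the usage comment is a placeholder:

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: create a fresh directory under the system temp dir
# and symlink each FASTA file into it, so that an index written next to
# the files lands there instead of beside the originals.
sub link_fasta_into_tmp {
    my (@fasta_files) = @_;
    my $dir = tempdir('fasta_index_XXXXXX', TMPDIR => 1, CLEANUP => 0);
    for my $file (@fasta_files) {
        my $target = File::Spec->rel2abs($file);
        my $link   = File::Spec->catfile($dir, basename($file));
        symlink($target, $link)
            or die "cannot link $target to $link: $!";
    }
    return $dir;
}

# Usage (needs BioPerl; not run here):
#   my $dir = link_fasta_into_tmp('/data/test.faa');
#   my $db  = Bio::DB::Fasta->new($dir);
```

CLEANUP is off so the index survives between requests; a web app would want its own policy for removing stale directories.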
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'bioperl-l'" <bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:04 PM
Subject: [Bioperl-l] Bio::DB::Fasta


> Is there any way to pass a filename to Bio::DB::Fasta for the location of 
> where to write the directory.index?
> It's writing in the same dir as the fasta but I'd rather have it write in /tmp 
> as it's part of a web app.
>
> Thanx,
>
> Russell


From florent.angly at gmail.com  Tue Nov 24 17:54:48 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 24 Nov 2009 14:54:48 -0800
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
In-Reply-To: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
References: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
Message-ID: <4B0C6438.8070405@gmail.com>

Hi Paolo,

It turns out that there is no standard for what is passed to the 
Bio::Tools::Run wrappers and returned by them. I noticed the 
inconsistency between the assembly wrappers recently while implementing 
support for new wrappers. I implemented initial support for additional de 
novo assembly programs in BioPerl (454 Newbler and Minimo) a couple of 
weeks ago, and Mark Jensen added support for Maq, a program that 
assembles reads against a reference. In the process, all the assembly 
wrappers were changed to take the same type of input data (a FASTA 
sequence or an array reference of sequence objects) and return one of 
the following:
    * a Bio::Assembly::Scaffold object (the default), or
    * a Bio::Assembly::IO object, or
    * the name of a file for the output of the assembler
Use the out_type method to set up which output you want, e.g.:
    $factory->out_type('Bio::Assembly::IO');
or
    $factory->out_type('cap3_results.ace');
You'll have to use the code in the bioperl-run Subversion repository if 
you want to use these new features.
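
Because the return value now depends on out_type, calling code may want to check what it received. A minimal sketch; assembly_output_kind is a hypothetical helper, and the run(\@seqs) call in the usage comment is an assumption based on the description above rather than a checked signature:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: classify whatever an assembly wrapper handed back.
sub assembly_output_kind {
    my ($out) = @_;
    if (blessed $out) {
        return 'scaffold' if $out->isa('Bio::Assembly::Scaffold');
        return 'io'       if $out->isa('Bio::Assembly::IO');
        return 'unknown';
    }
    # A plain defined scalar is taken to be the name of the output file.
    return (defined $out && !ref $out) ? 'filename' : 'unknown';
}

# Usage (needs bioperl-run; not run here):
#   $factory->out_type('Bio::Assembly::IO');
#   my $out  = $factory->run(\@seqs);
#   my $kind = assembly_output_kind($out);
```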

Cheers,

Florent


Paolo Pavan wrote:
> Dear all,
> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
> As documented in the POD, the run(@seqs) method returns the cap3 report file,
> while I expected it to return a Bio::Assembly object, consistent with other
> Bio::Tools::Run classes.
> I worked around this by getting the location and the names of the temp
> output files from the factory object (admittedly by accessing a private
> property) and reading them via the Assembly::IO system.
> I was just wondering what the properly designed way to do this job is.
>
> Thank you for lighting the way!
> Paolo


From roychu at gmail.com  Tue Nov 24 18:00:58 2009
From: roychu at gmail.com (Roy)
Date: Tue, 24 Nov 2009 15:00:58 -0800
Subject: [Bioperl-l] Remote Blast - same script but different results
Message-ID: <4d7f3e450911241500y7df305acq1d03819ea1ec7d3e@mail.gmail.com>

Hi bioperl community,

I've searched the list archives to see if this topic has been
covered, and perhaps this question arises from my own lack of
familiarity with BLAST, but the Perl script below gives me different
results from remote BLAST on different runs: I either get hits or no
hits at all.  I'll call the script once and get no hits, then call it
again with the same parameters and get the same several hits I had
seen on an earlier run.  I use a subroutine to parse the BLAST report,
and a boolean to indicate whether any results were returned.  Any
insight into what I may have missed would be appreciated.  Short
question: is this behavior typical?  My understanding of how BLAST
works is that it shouldn't be...


Thanks in advance,
Roy

#!/usr/bin/perl -w

use strict;
use warnings;
use Carp;
use Bio::Perl;
use CGI;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;
use Bio::Restriction::Analysis;
use Bio::Tools::Run::RemoteBlast;

use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::LocatableSeq;

my $five_seqobj = Bio::Seq->new(
		-seq		=>	'ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGCCAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCGAGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG',
		-display_id	=>	'genomic_a',
		-alphabet 	=>	'dna',
	);
my $three_seqobj = Bio::Seq->new(
		-seq		=>	'GTGAGTGCGCGGCCGCTCTGCGGGCGCAGAGGGAGCGGGAGGGAGCCGGCGGCACGAGGTTGGCCGGGGCAGCCTGGGCCTAGGCCAGAGGGAGGGCAGCCACAGGGTCCAGGGCGAGTGGGGGGATTGGACCAGCTGGCGGCCCCTGCAGGCTCAGGATGGGGGGCGCGGGATGGAGGGGCTGAGGAGGGGGTCTCCGGAGCCTGCCTC',
		-display_id	=>	'genomic_b',
		-alphabet 	=>	'dna',
	);

my @params = (
'-program' => 'blastn',
'-database' => 'refseq_genomic',
'-expect' => '10',
'-readmethod' => 'blastxml'
);
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$Bio::Tools::Run::RemoteBlast::HEADER{'PERC_IDENT'} = 75;
$Bio::Tools::Run::RemoteBlast::HEADER{'FORMAT_TYPE'} = 'XML';
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'HITLIST_SIZE'} = 100; # Put: limit number of hits

my $factory_a = Bio::Tools::Run::RemoteBlast->new(@params);
$factory_a->retrieve_parameter('FORMAT_TYPE', 'XML');

my $hits_a;
my $hits_b;

my $r;
my $bool_hit;
print "Submitting BLAST query - 5' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
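$r = $factory_a->submit_blast($five_seqobj);
# list assignment: the sub returns (flag, hits); in scalar context only the last value would arrive
($bool_hit, $hits_a) = fetch_blast_report($factory_a);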
unless ($bool_hit) {
	print "\nNo hits\n";
	print "Re-submitting BLAST query - 5' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_a->submit_blast($five_seqobj);
	($bool_hit, $hits_a) = fetch_blast_report($factory_a);
	if ($bool_hit == 0) { print "No hits\n"; }
	sleep 5;
}

my $factory_b = Bio::Tools::Run::RemoteBlast->new(@params);
print "\n--------------------------------------------------\n\n";
print "Submitting BLAST query - 3' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
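$r = $factory_b->submit_blast($three_seqobj);
# list assignment: the sub returns (flag, hits); in scalar context only the last value would arrive
($bool_hit, $hits_b) = fetch_blast_report($factory_b);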
unless ($bool_hit) {
	print " No hits\n";
	print "Re-submitting BLAST query - 3' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_b->submit_blast($three_seqobj);
	($bool_hit, $hits_b) = fetch_blast_report($factory_b);
	if ($bool_hit == 0) { print " No hits\n"; }
	sleep 5;
}

print "\nbye\n\n";

print "$hits_a\n$hits_b\n";

exit;

sub fetch_blast_report {
	my ($factory) = @_;
	my $v = 1;
	my $bool_hit = 0;
	my $hits = '';
	
	print STDERR "waiting...";
	while (my @rids = $factory->each_rid) {
		foreach my $rid (@rids) {
			print STDERR ".";
			my $rc = $factory->retrieve_blast($rid);
			# retrieves blast report from remote blast queue,
			# returns -1 on error, 0 on 'job not finished', Bio::SearchIO object
			# args, remote blast id (rid)
			if (!ref($rc)) {
				# if not empty string, ref EXPR returns a non-empty string if EXPR is a reference
				if ($rc < 0) {
					$factory->remove_rid($rid);
				}
				print STDERR "." if ($v > 0);
				# is this printing out as multiple dots? when and why?
				sleep 5;
			} else {
				$bool_hit = 1;
				my $result = $rc->next_result();
				unless ($result->num_hits > 0) {
					$bool_hit = 0;
				}
				# returns: Bio::Search::Result::ResultI object
				$factory->remove_rid($rid);
				print "\ndatabase:\t", $result->database_name,"\n";
				print "query name:\t", $result->query_name,"\n";
				print "query length\t", $result->query_length,"\n";
				print "num hits\t", $result->num_hits,"\n";
				if ($result->num_hits) {
					# $result->hits returns an array of hits
					# $results->no_hits_found, boolean vs $#{@hits} ie. filtering\
					while (my $hit = $result->next_hit) {
					
					print "\nhit name:\t", $hit->name,"\n";	
					print "description:\t", $hit->description,"\n";	
					print "locus:\t", $hit->locus,"\n";	
					print "algorithm: ", $hit->algorithm,"\thit length: ", $hit->length,"\thit ranking: ", $hit->rank,"\n";
					while (my $hsp = $hit->next_hsp) {
						print "evalue: ", $hsp->evalue,"\tscore: ", $hsp->score,"\tpercent_id: ", $hsp->percent_identity,"\n";
						print "query_start: ", $hsp->query->start,"\tquery_end: ", $hsp->query->end;
						print "\tquery_length: ", $hsp->query->length,"\tquery_strand: ", $hsp->strand('query'), "\n";
						print "subject_start: ", $hsp->subject->start,"\tsubject_end: ", $hsp->subject->end;
						print "\tsubject_length: ", $hsp->subject->length,"\tsubject_strand: ", $hsp->strand('subject'), "\n\n";
						my $aln = $hsp->get_aln;
						if ($aln->is_flush) {
							foreach my $seq ($aln->each_seq) {
								print $seq->seq,"\n";
							}
							print $aln->gap_line, "\n";
							print $aln->consensus_string(95), "\n\n";
						}

						$hits .= $hit->name."\t".$hsp->subject->start."\t".$hsp->subject->end."\t".$hsp->strand('subject')."\n";
					}
				}		
			}
		}
	}
	return ($bool_hit, $hits);
}
}
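
One thing worth checking in the sub as posted: counting braces, the return sits inside the while loop, so the sub hands back its result after the first pass over the request IDs. If retrieve_blast returned 0 ("job not finished") on that pass, the caller would see a false flag and print "No hits" even though the search was still running, which may explain why the outcome varies between runs. A minimal sketch of a polling loop that returns only once the queue is empty; wait_for_reports and Mock::Factory are hypothetical names, and the mock merely stands in for the remote service so the loop can be run offline:

```perl
use strict;
use warnings;

# Poll until every request ID has either produced a report or failed.
# $factory must provide each_rid, retrieve_blast and remove_rid.
sub wait_for_reports {
    my ($factory, $pause) = @_;
    $pause = 5 unless defined $pause;

    my @reports;
    while (my @rids = $factory->each_rid) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (ref $rc) {            # finished: a report object came back
                push @reports, $rc;
                $factory->remove_rid($rid);
            }
            elsif ($rc < 0) {         # error: drop the request
                $factory->remove_rid($rid);
            }
            else {                    # 0: not finished yet, ask again later
                sleep $pause;
            }
        }
    }
    return @reports;                  # reached only when no RIDs are left
}

# Hypothetical stand-in that answers "not finished" twice, then a report.
package Mock::Factory;

sub new {
    my ($class) = @_;
    return bless({ rids => { R1 => 2 } }, $class);
}
sub each_rid   { my ($self) = @_; return keys %{ $self->{rids} } }
sub remove_rid { my ($self, $rid) = @_; delete $self->{rids}{$rid}; return }
sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{rids}{$rid}-- > 0;
    return +{ rid => $rid };
}

package main;

my @done = wait_for_reports(Mock::Factory->new, 0);
print scalar(@done), " report(s) retrieved\n";
```

With the real factory, the hit parsing would then run on each returned report, and "no hits" would be decided from num_hits alone rather than from whether the job had finished.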


From maj at fortinbras.us  Tue Nov 24 23:12:13 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 23:12:13 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
Message-ID: <3ECFA0236D1B467181EE63C8C6BE7E1F@NewLife>

I seem to be able to do
$db = Bio::DB::Fasta->new("$tmp/test.faa");
without a problem- something in the mixing of named and unnamed parameters?
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'bioperl-l'" 
<bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:21 PM
Subject: RE: [Bioperl-l] Bio::DB::Fasta


That's what I ended up doing.
Also, there's no "obvious" way to index a single file, so I ended up putting
the filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
>
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
>
>
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell
> >
> >
> > =======================================================================
> > Attention: The information contained in this message and/or attachments
> > from AgResearch Limited is intended only for the persons or entities
> > to which it is addressed and may contain confidential and/or privileged
> > material. Any review, retransmission, dissemination or other use of, or
> > taking of any action in reliance upon, this information by persons or
> > entities other than the intended recipients is prohibited by AgResearch
> > Limited. If you have received this message in error, please notify the
> > sender immediately.
> > =======================================================================
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >


From maj at fortinbras.us  Wed Nov 25 12:25:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 12:25:30 -0500
Subject: [Bioperl-l] question for all regarding a sam-based Bio::Assembly::IO
Message-ID: <1E72D5B0A190448FA27545DB5B68638D@NewLife>

Short-readers, 

I'm working on an Assembly::IO class for sam alignments.
I'm currently making a decision about handling multiple reference sequences:
would you prefer that next_assembly() return an assembly that covers all reference
sequences, or that next_assembly iterates over each reference sequence?
(Or both?)

thanks for your input-
MAJ


From timbourine81 at gmail.com  Wed Nov 25 12:40:52 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:40:52 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
	new file
Message-ID: <4B0D6C24.2080308@gmail.com>

Dear bioperl users,

I am a real newbie and have a (maybe very trivial) question.

I searched the mailing list archive and many howtos but I have not found
a concrete answer to my problem. So hopefully you can help me :)

Background: I use the latest BioPerl version (installed two weeks
ago).
When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
including different sequences, I get a BLAST output with many queries
each having several hits / sbjcts.

My problem is how to parse *all* hits of *one* query into a single new
file. And this for all the queries I have in my BLAST output file.

Or is it better the other way round: first make fasta files with only
single sequences inside and BLAST each file? But how can I automate that
using BioPerl?

I tried Bio::SearchIO but can only parse all queries and their
respective hits in only one file...
I think iteration is also necessary here, but I do not really know how
to include that into Bio::SearchIO.
Or do I have to use Module:Bio::Index::Blast?

I can index a file (see below), but I have no idea what comes next...

###How I index a file...

#!/usr/bin/perl -w

$ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";

use Bio::Index::Fasta;


$file_name = "8_to_BLAST_two_seq_index.fasta";
$id = "48882";
$inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
-write_flag => 1);
$inx->make_index($file_name);


Hopefully, you can give me at least hints what to look for.

A big THANKS in advance!

Cheers,

Tim


From timbourine81 at gmail.com  Wed Nov 25 12:53:34 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:53:34 +0100
Subject: [Bioperl-l] How to parse different (fasta) files
Message-ID: <4B0D6F1E.8@gmail.com>

Hey everybody,

another question from me...if you do not mind :)

My situation is like this: I have parsed a standalone BLAST output using
SearchIO and kept only the hit names. Now I have a second fasta file with
the same sequences as in the BLAST database, but including an alignment
(meaning "." and "-"). (Unfortunately, there is no way to make a BLAST
database from fasta files that include the alignment...)
My intention is now to take the names of the hit sequences (BLAST output),
fetch the corresponding aligned sequences (fasta file incl. alignment),
and put them in a new file.

Is anybody out there who has tried that before?

Again, I am an absolute greenhorn in using (Bio)perl. Maybe it is very
simple :D

Looking forward to your answer.

All the best,

Tim
-- 
Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999
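
One way to approach the task described above, as a hedged sketch in plain
Perl rather than a BioPerl recipe: collect the hit names in a hash, then
copy only the matching records from the aligned fasta file. It assumes the
hit names are identical to the first word after ">" in the aligned file;
filter_fasta() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Copy every fasta record whose ID (first word after '>') is a key of
# %$wanted from the input handle to the output handle.
# Returns the number of records copied.
sub filter_fasta {
    my ( $in, $out, $wanted ) = @_;
    my ( $keep, $copied ) = ( 0, 0 );
    while ( my $line = <$in> ) {
        if ( $line =~ /^>(\S+)/ ) {
            $keep = exists $wanted->{$1};
            $copied++ if $keep;
        }
        # Sequence lines are copied untouched, so "." and "-" survive.
        print {$out} $line if $keep;
    }
    return $copied;
}
```

The %wanted hash could be filled while parsing the BLAST report, e.g. with
$wanted{ $hit->name } = 1 inside the usual SearchIO hit loop.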


From maj at fortinbras.us  Wed Nov 25 13:20:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 13:20:03 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
	innew file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>

hey Tim--

Sounds like you need to go about collecting your queries inside out:

my %hits_by_query;
for my $hit ($result->hits) {
  push @{$hits_by_query{$result->query_name}}, $hit;
}

I believe now each hash element, keyed by the query name, will contain
an arrayref to the set of hits assoc with that query.
From here, I believe

use Bio::Search::Result::BlastResult;
use Bio::SearchIO;

foreach my $qid ( keys %hits_by_query ) {
  my $result = Bio::Search::Result::BlastResult->new();
  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
  $blio->write_result($result);
}

will do what you want.

hope this helps -
Mark



From Russell.Smithies at agresearch.co.nz  Wed Nov 25 14:07:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 26 Nov 2009 08:07:26 +1300
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
 in new file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085701@exchsth.agresearch.co.nz>

Hi Tim,
Here's some code for a job I'm working on at the moment that contains all the bits you'll probably need.
It's extracting 2 species-specific databases from nr (based on tax ids), doing a blast, then parsing the results and creating a substitution matrix. I was initially using Bio::DB::Eutilities to query and retrieve sequences but I kept getting errors and time-outs from NCBI when pulling back large numbers of sequences.
It should give you a rough idea of how to run Bio::Tools::Run::StandAloneBlast, Bio::DB::Fasta and Bio::SearchIO.

Email me directly if you want further explanation, as it's not well commented ;-)

Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz

=======================================

#!/usr/local/bin/perl

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::DB::Fasta;

use Storable;

# Parameters: <query> <subject> <number or percentage of searches>
# Percentage can be specified as either 20p, 20P or 20%
# So for 20% of rice sequences blasted against oil palm:
#    4530 51953 20p   (4530=rice,51953=oil_palm, 20p=20%)
# Or for 20 searches:
#      4530 51953 20
#
my ( $q, $s, $c ) = @ARGV;

my $nr = "/data/databases/flatfile/illuminati_blastdata/nr";
my $tax_file = "/data/anonftp/pub/mirror/taxonomy/gi_taxid_prot.dmp.gz";
my $tmp = "/tmp/tax";


my %stats      = ();
my $total_subs = 0;

my $min_hsp_len      = 0;
my $min_hsp_identity = 0;
my $num_searches     = $c || 10;
my $blast_e          = '1e-6';
my $count            = 0;

# check if all the fasta and blast files exist
# if not, extract new fasta and re-formatdb the database
foreach my $t ( $q, $s ) {
  foreach ( map { "$tmp/$t.$_" } qw(faa list phr pin psq) ) {
    unless ( -e $_ ) {
      print "Creating database for $t\n";
      &create_database($t);
      last;
    }
  }
}

my @params = (
               -database => "$tmp/$q",
               -program  => 'blastp',
               -e        => $blast_e,
               -outfile  => "$tmp/blast.out",
               -v        => '1',
               -b        => '1'
);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params) or die $!;

# load the query sequences into a db
# makes it easier to randomly access them
my $db = Bio::DB::Fasta->new( "$tmp", -glob => "$s.faa", -reindex => 1 );

my @ids      = $db->ids;
my $id_count = scalar @ids;
die "No sequences\n" unless $id_count;

# if a percentage is requested, calculate
# the required number of searches
if ( $num_searches =~ m/(\d+)[pP%]/ ) {
  $num_searches = int( ( $1 / 100 ) * $id_count );
  warn
"Searching random $1 percent ($num_searches) of $id_count sequences from taxid $q\n";
}

my $summary_file = "$tmp/".$$."_summary.txt";
open( OUT, ">", $summary_file ) or die $!;
print OUT
"#Summary of $num_searches random blast searches from taxid $q against taxid $s.\n";
print OUT "#Parameters used were:\n";
print OUT "#blast_e: $blast_e\n";
print OUT "#min_hsp_len: $min_hsp_len\n";
print OUT "#min_hsp_identity: $min_hsp_identity\n";
print OUT "\n";

while ( my $seq = $db->get_Seq_by_id( $ids[ rand @ids ] ) ) {
  next unless $seq;

  warn "Processing ", $seq->id, "\n";
  eval {
    my $blast_report = $factory->blastall($seq);
    sleep 5;
  };

  my $blast_in = new Bio::SearchIO( -format => "blast", -file => "$tmp/blast.out" );

  while ( my $result = $blast_in->next_result ) {
    if ( $result->num_hits <= 0 ) {
      warn "No hits for ", $result->query_accession, "\n";
      print OUT "No hits for ", $result->query_accession, "\n";
      next;
    }
    $count++;
    while ( my $hit = $result->next_hit ) {
      while ( my $hsp = $hit->next_hsp ) {
        warn sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );
        print OUT sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );

        # http://www.bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods
        if ( $hsp->length('total') > $min_hsp_len ) {
          if ( $hsp->percent_identity >= $min_hsp_identity ) {
            my @query_string = split '', $hsp->query_string;
            my @homol_string = split '', $hsp->homology_string;
            my @hit_string   = split '', $hsp->hit_string;
            for ( my $i = 0; $i <= $#query_string; $i++ ) {
              next unless $homol_string[$i] =~ /\+/;
              $stats{ $query_string[$i] }{ $hit_string[$i] }++;
              $total_subs++;
            }
          }
        }
      }
    }
  }
  unlink "$tmp/blast.out" if -e "$tmp/blast.out";
  last if $count >= $num_searches;
}


# create summary frequency list
my %summary = ();
for my $query ( keys %stats ) {
  for my $hit ( keys %{ $stats{$query} } ) {
    $summary{"$query->$hit"} =
      sprintf( "%6f", $stats{$query}{$hit} / $total_subs );
  }
}

print OUT "\n";

# sort by descending frequencies and print to summary file
foreach my $k ( sort { $summary{$b} <=> $summary{$a} } keys %summary ) {
  print OUT "$k\t", $summary{$k}, "\n" unless $k =~ /TOTAL/;
}

print OUT "\n\n";

# print substitution matrix
my $i     = 0;
my @prots = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $sep   = "\t";

print OUT sprintf( "%7s %s", $_, $sep ) foreach ( "       ", @prots );
print OUT "\n";

foreach my $x (@prots) {
  print OUT sprintf( "%7s|%s", $prots[ $i++ ], $sep );
  foreach my $y (@prots) {
    my $val =
      defined( $stats{$x}{$y} )
      ? sprintf( "%0.6f", $stats{$x}{$y} / $total_subs )
      : "--------";
    print OUT sprintf( "%s%s", $val, $sep );
  }
  print OUT "\n";
}
close OUT;


open(IN, $summary_file) or die $!;
print $_ while(<IN>);
close IN;


# extract sequences from nr database based on taxid.
sub create_database {
  my $txid      = shift;
  my %hash      = ();
  my $gi_stored = "/tmp/gi.dat";

  if ( -e $gi_stored ) {
    %hash = %{ retrieve($gi_stored) };
  }
  else {
    open( TXID, "zcat $tax_file | " ) or die $!;
    while (<TXID>) {
      chomp;
      my ( $gi, $tx ) = split( "\t", $_ );
      push( @{ $hash{$tx} }, $gi );
    }
    close TXID;

    store( \%hash, $gi_stored );
  }

  my $txlist = "$tmp/$txid.list";
  my $txseq  = "$tmp/$txid.faa";
	
  die "No sequences found for taxid $txid\n"
    unless $hash{$txid} && @{ $hash{$txid} };
  my $num_seqs = scalar @{ $hash{$txid} };
  warn "Found $num_seqs sequences for taxid $txid in $tax_file\n";

  open OUT, ">", $txlist or die $!;
  print OUT "$_\n" foreach ( @{ $hash{$txid} } );
  close OUT;

  my $cmd = "fastacmd -d $nr -i $txlist -t T -o $txseq 2>/dev/null";
  system $cmd;

  my $count = `grep -c '>' $txseq`;
  $count =~ s/\n//;
	warn "Could only extract $count sequences from $nr\n";

  $cmd = "formatdb -p T -i $tmp/$txid.faa -n $tmp/$txid -l $tmp/formatdb.log";
  system $cmd;

  $cmd = "fastacmd -d $tmp/$txid -I";
  system $cmd;

  warn "Check the formatdb.log for any errors\n";
}


=======================================




From maj at fortinbras.us  Wed Nov 25 14:21:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 14:21:27 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
Message-ID: <815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>

whoops: change the following line:
my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );

to

my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );

(I always forget that...)
MAJ
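
Putting the pieces of this thread together, here is a hedged sketch of
writing one file per query. It relies on SearchIO returning one result per
query from next_result, and it uses Bio::SearchIO::Writer::TextResultWriter
because, as far as I know, write_result needs a writer object; the output is
that writer's text rendering, not a byte-for-byte copy of the BLAST report.
report_file_for(), split_blast_report() and the file naming are
hypothetical.

```perl
use strict;
use warnings;

# Turn a query name into a safe file name (hypothetical naming scheme).
sub report_file_for {
    my ($query_name) = @_;
    ( my $safe = $query_name ) =~ s/[^\w.-]+/_/g;
    return "$safe.bls";
}

# Write each result (one per query) of a multi-query BLAST report to its
# own file. BioPerl is loaded at run time, inside the sub.
sub split_blast_report {
    my ($report) = @_;
    require Bio::SearchIO;
    require Bio::SearchIO::Writer::TextResultWriter;

    my $in = Bio::SearchIO->new( -format => 'blast', -file => $report );
    while ( my $result = $in->next_result ) {
        my $out = Bio::SearchIO->new(
            -format => 'blast',
            -writer => Bio::SearchIO::Writer::TextResultWriter->new(),
            -file   => '>' . report_file_for( $result->query_name ),
        );
        $out->write_result($result);
    }
    return;
}

# Example call (file name is made up):
#   split_blast_report('all_queries.bls');
```

Since every result already carries all hits for its query, no regrouping by
hit is needed in this variant.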



From alden.huang at gmail.com  Thu Nov 26 05:54:30 2009
From: alden.huang at gmail.com (Alden Huang)
Date: Thu, 26 Nov 2009 02:54:30 -0800
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
Message-ID: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>

Hey rob,

Sorting Intolerant from Tolerant
http://sift.jcvi.org/

~alden

...a bit late, I know; I just read your post now while cleaning the inbox

On Fri, Nov 6, 2009 at 9:35 AM, Robert Bradbury
<robert.bradbury at gmail.com> wrote:
> Is there a function in the library (or has someone written one) that can
> take a genbank entry and determine which mutations are harmful?
>
> It would be used to produce a table summary of:
>  GENE          # SNP      # BadSNP
>
> One kind of gets this from NCBI if you lookup in the "GENE" db a gene name
> and then go to the "GeneView" on the dbSNP page; it has the information I want
> but largely in a graphical format while I simply want numbers I can dump
> into a spreadsheet.
>
> I don't think it would be hard: fetch the gene, run through the features for
> the SNP database, figure out whether they are good or bad SNPs, accumulate
> the statistics and dump it.  I think the functions available are flexible
> enough to do it but I can't believe nobody has already done it.  It could be
> a bit more complex in that one could do an analysis to see if the mutations
> are in a conserved domain or mutations that code for Cysteine or Methionine
> (or other potentially "critical" amino acids) but since "critical" is in the
> eye of the beholder there would have to be some kind of callback to a
> scoring function.
>
> Thanks,
> Robert
>


From robert.bradbury at gmail.com  Thu Nov 26 06:27:50 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 06:27:50 -0500
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
	<9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
Message-ID: <deaa866a0911260327j5b57d16erfcbe5b996e1a6e64@mail.gmail.com>

On Thu, Nov 26, 2009 at 5:54 AM, Alden Huang <alden.huang at gmail.com> wrote:
>
> Sorting Intolerant from Tolerant
> http://sift.jcvi.org/
>
>
Ah yes, thank you very much.  This looks very much like a tool that can be
adapted for various uses.

Robert


From jason at bioperl.org  Thu Nov 26 12:16:17 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Nov 2009 09:16:17 -0800
Subject: [Bioperl-l] question about a Bio::Tree::Tree method
In-Reply-To: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
References: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
Message-ID: <14F4B8C9-A1F4-436B-813F-50E139932D3D@bioperl.org>

Emilio - please ask your questions on the list - many people there can  
help answer questions.

get_nodes returns all the nodes in the tree, the options specify the  
order they are returned in.  Depending on your question the order  
probably won't matter so you can just call it without any arguments  
like in the examples and the HOWTO.

The documentation for the method says:
  Title   : get_nodes
  Usage   : my @nodes = $tree->get_nodes()
  Function: Return list of Bio::Tree::NodeI objects
  Returns : array of Bio::Tree::NodeI objects
  Args    : (named values) hash with one value
            order => 'b|breadth' first order or 'd|depth' first order

So you can provide no arguments and get the default (breadth-first I  
believe) or you can specify
-order => 'd'
or
-order => 'depth'

to get the nodes in depth-first order.
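
To make the difference between the two orders concrete, here is a small
plain-Perl illustration on a toy tree. It is not BioPerl's implementation,
and the exact order get_nodes() returns (for instance pre- versus
post-order within depth-first) is not established by this message; the
nested-hash layout and sub names are made up for the example.

```perl
use strict;
use warnings;

# Toy tree as nested hashes: ((A,B)x,C)root
my $tree = {
    id       => 'root',
    children => [
        { id => 'x', children => [ { id => 'A' }, { id => 'B' } ] },
        { id => 'C' },
    ],
};

# Depth-first (pre-order): visit a node, then each child subtree in turn.
sub depth_first {
    my ($node) = @_;
    return ( $node->{id},
             map { depth_first($_) } @{ $node->{children} || [] } );
}

# Breadth-first: visit nodes level by level using a queue.
sub breadth_first {
    my ($root) = @_;
    my @queue = ($root);
    my @ids;
    while ( my $node = shift @queue ) {
        push @ids, $node->{id};
        push @queue, @{ $node->{children} || [] };
    }
    return @ids;
}

print join( ' ', depth_first($tree) ),   "\n";    # root x A B C
print join( ' ', breadth_first($tree) ), "\n";    # root x C A B
```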

-jason
On Nov 26, 2009, at 7:19 AM, miglio83 at libero.it wrote:

> Hi Jason,
> I'm Emilio Siena, a PhD student of the University of Perugia.
> I have
> a question about the method "get_nodes" of the  "Bio::Tree::Tree"  
> class.
> In
> particular I didn't understand which type of arguments it accepts  
> and in which
> format an argument should be given.
>
> Thank you in advance!
>
> Emilio

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Thu Nov 26 12:40:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 26 Nov 2009 12:40:45 -0500
Subject: [Bioperl-l] Bio::Assembly::IO::sam is alpha
Message-ID: <599F8BABCD2848EFA98FB24A4419674E@NewLife>

in bioperl-live/trunk with plenty pod; bravehearts can (please!) test on .bam files
cheers, MAJ


From mauricio at open-bio.org  Thu Nov 26 16:45:43 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Thu, 26 Nov 2009 15:45:43 -0600
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <4B0EF707.6080202@open-bio.org>

Hi Jonathan,

Any chance it can be webcasted? I'm sure it would attract a lot of 
remote attendees ;)

Regards,
Mauricio.


Jonathan Warren wrote:
> We are considering running a Distributed Annotation System workshop here 
> at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If 
> you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last years (1st 
> day for beginners, 2nd for both beginners and advanced users, 3rd day 
> for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what 
> you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> 


From robert.bradbury at gmail.com  Thu Nov 26 21:06:40 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 21:06:40 -0500
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
Message-ID: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>

I'm currently running near my process limit and running sequence fetches
from swissprot (I've also had this happen with getting gi's from NCBI) and
am running out of processes about halfway through the set I'm trying to
fetch [1].

Now, is there someplace in the bioperl documentation that documents where
one is supposed to wait() for defunct processes after each sequence fetch?
I'm encountering the problem both when the sequence fetches succeed as well
as when they fail.

Thanks in advance.
Robert

1. This is due to a bug in chromium's use of flash that involves it leaving
many defunct processes that are uncollected and therefore counting towards
one's "process limit".
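
For reference, the generic Perl idiom for collecting exited children without
blocking is sketched below. Whether the BioPerl fetch code forks at all is
not established in this message, so this is only the general technique;
reap_children() is a hypothetical helper name. Note that a process can only
reap its own children, so defunct processes left by another program (such
as the browser mentioned in the footnote) are out of a script's reach.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

# Reap every child of this process that has already exited, without
# blocking. Returns the number of children collected.
sub reap_children {
    my $reaped = 0;
    # waitpid returns a pid (> 0) for each exited child, 0 if children are
    # still running, and -1 if there are no children at all.
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $reaped++;
    }
    return $reaped;
}
```

Calling this after each fetch keeps any zombies created by the script itself
from accumulating against the per-user process limit.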


From kanzure at gmail.com  Thu Nov 26 21:12:46 2009
From: kanzure at gmail.com (Bryan Bishop)
Date: Thu, 26 Nov 2009 20:12:46 -0600
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
In-Reply-To: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
References: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
Message-ID: <55ad6af70911261812q583277d5l71df0d66e756f617@mail.gmail.com>

On Thu, Nov 26, 2009 at 8:06 PM, Robert Bradbury wrote:
> I'm currently running near my process limit and running sequence fetches
> from swissprot (I've also had this happen with getting gi's from NCBI) and
> am running out of processes about halfway through the set I'm trying to
> fetch [1].

Hey Robert, sorry for the off-topic question, but I was wondering if
you're the same Robert Bradbury from the extropy-chat list. Hi?

- Bryan
http://heybryan.org/
1 512 203 0507


From paolo.pavan at gmail.com  Fri Nov 27 06:35:03 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Fri, 27 Nov 2009 12:35:03 +0100
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
	Bio::Tools::Run::Cap3 usage question)
Message-ID: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>

Dear Florent,
Thank you for your kind answer and for the effort you have put into this
module. Since you are working on these topics, I would like to seize the
opportunity and ask you some questions about a few doubts I have, if you
agree, of course :-)
Some time ago I tried to work with bioperl, loading the data from an ACE
file produced by Newbler; I needed to extract part of the contig as an
alignment of reads, and I thought I could do it with a slice() method, since
I saw that Bio::Assembly::Contig implements the Bio::AlignI interface.
Unfortunately I realized that this interface is inherited but not
implemented.
I tried to hack it by adding a slice method which would act on a
Bio::Alignment created from the array of LocatableSeqs representing the
reads.

This is the question:
If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig class
stores read information in:
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{_align_clipping:READ_NAME}
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{_aligned_coord:READ_NAME}
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{_quality_clipping:READ_NAME}

Each of these three features (_align_clipping, _aligned_coord,
_quality_clipping) contains a Bio::SeqFeature::Generic. Which of them is
most suitable for the purpose described above, the slice method?
One more question, if you will forgive me for going on so long, follows from
the previous one: the purpose of these three features per read is not entirely
clear to me. Can you explain it?

Thank you very much for your time.
Bye bye,
Paolo


2009/11/24 Florent Angly <florent.angly at gmail.com>

> Hi Paolo,
>
> It turns out that there is no standard for what is to be passed to the
> Bio::Tools::Run wrappers and returned by them. I noticed the inconsistency
> between the assembly wrappers recently while implementing support for a new
> wrapper. I implemented initial support for additional de novo assembly
> programs in BioPerl (454 Newbler and Minimo) a couple of weeks ago and Mark
> Jensen added support for Maq, a program that assembles reads against a
> reference. In the process, all the assembly wrappers were changed to take
> the same type of input data (a FASTA sequence or an array reference of
> sequence objects) and return one of the following:
>   * a Bio::Assembly::Scaffold object (the default), or
>   * a Bio::Assembly::IO object, or
>   * the name of a file for the output of the assembler
> Use the out_type method to set up which output you want, e.g.:
>   $factory->out_type('Bio::Assembly::IO');
> or
>   $factory->out_type('cap3_results.ace');
> You'll have to use the code in the bioperl-run subversion if you want to
> use these new features.
>
> Cheers,
>
> Florent
>
>
>
>
> Paolo Pavan wrote:
>
>> Dear all,
>> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
>> As documented in the pod, the run(@seqs) method returns the cap3 report
>> file,
>> while I expected it to return a Bio::Assembly object, consistent with other
>> Bio::Tools::Run classes.
>> However, I worked around this by getting from the factory object the
>> location
>> and the names of the temp output files (admittedly by accessing a private
>> property) and reading them via the Assembly::IO system.
>> I was just wondering what the proper, intended way to do this job is.
>>
>> Thank you for enlightening me!
>> Paolo


From jw12 at sanger.ac.uk  Thu Nov 26 09:57:35 2009
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 26 Nov 2009 14:57:35 +0000
Subject: [Bioperl-l] DAS workshop 7th-9th April 2010
Message-ID: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>

We are considering running a Distributed Annotation System workshop  
here at the Sanger/EBI in the UK subject to decent demand.
The workshop will be held from Wednesday 7th-Friday 9th April 2010. If  
you would be interested in attending either to present or just take part
then please email me jw12 at sanger.ac.uk

The format of the workshop is likely to be similar to last year's (1st  
day for beginners, 2nd for both beginners and advanced users, 3rd day  
for advanced), information for which can be found here:
http://www.dasregistry.org/course.jsp

If you would like to present then please send a short summary of what  
you would like to talk about.

Thanks

Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From timbourine81 at googlemail.com  Thu Nov 26 11:02:30 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Thu, 26 Nov 2009 17:02:30 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <4B0EA44D.2050507@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
Message-ID: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>

Oops, sent too early...

Hey Mark,

thanks for the answer. But I am still struggling, especially with where to put
your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; if there is no comma, there was the
error: Scalar found where operator expected at
12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
###        (Missing operator before  $hit?)
###Useless use of push with no values at
12_BLAST_two_sequence_each_query_one_file.PL line 7.
###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
"} $hit"
###BEGIN not safe after errors--compilation aborted at
12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
###here, every time this error appears: Name "main::result" used only once:
possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
###error: Can't call method "hits" on an undefined value at
12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Should I need to use a module? are the commands here at the right
position? errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

###where are the files stored? what are their names? Sorry, but I cannot
figure that out :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

> Hey Mark,
>
> thanks for the answer
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sounds like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round: first make fasta files with only
> >>> single sequences inside and BLAST each file? But how can I automate that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
>
> Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From rtbio.2009 at gmail.com  Sat Nov 28 02:53:43 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Sat, 28 Nov 2009 08:53:43 +0100
Subject: [Bioperl-l] Linking of two cgi scripts
Message-ID: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>

Hello everyone,

I have a small question.

I would like to link two CGI scripts.

I have an input sequence entered in a text area, for example:

ex: >gi|at442323|...
ATGCCCCCTTGGAACCAAAAAAA....

I would like to compare this with the query sequences. These query
sequences would come from a BLAST script in the module blast.pm.
Once I enter the input sequence and request a BLAST search using the submit
button, my request should go to a program which performs the BLAST search.
After this, the sequences obtained from BLAST have to be returned to a program,
Roopa.pm, which compares the input sequence with the sequences obtained from
BLAST.

But I am unable to provide this link between the CGI scripts (i.e., one
script to run BLAST, the other script to compare the sequences and send the
results to the browser).

Could anyone help me in this regard?

Regards,
Roopa.


From s.denaxas at gmail.com  Sat Nov 28 05:56:15 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Sat, 28 Nov 2009 10:56:15 +0000
Subject: [Bioperl-l] Linking of two cgi scripts
In-Reply-To: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
References: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
Message-ID: <bba689ec0911280256u602b8f9dpffe9483189c56536@mail.gmail.com>

Hello,

Why do they both have to be CGI scripts? Can't all the processing
happen server side, i.e. both the BLAST search and the comparison of the
returned results?

If that is strictly a requirement, you could:

a) get input from user on script A, i.e. the input sequence
b) do an HTTP request from the CGI to the other script B using LWP::UserAgent
c) get results from script B, pass on to comparison module
d) return results to user

This will be clunky, so either do everything in one go or
consider AJAX.
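
A minimal sketch of the "everything in one go" option, in plain Perl. The
subs run_blast() and count_mismatches() are hypothetical stand-ins for
whatever blast.pm and Roopa.pm actually provide (their real interfaces are
not known here), and the hits returned are dummy data; in a real CGI script
the text would come from the form parameter rather than a literal string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the BLAST step (the real code would live in
# blast.pm); it returns a list of hit sequences for the query.
sub run_blast {
    my ($query_seq) = @_;
    return ('ATGCCCCCTTGGAACC', 'ATGCACCCTTGGTACC');    # dummy hits
}

# Hypothetical stand-in for the comparison step (Roopa.pm): counts the
# positions at which a hit differs from the query over their shared length.
sub count_mismatches {
    my ($query_seq, $hit_seq) = @_;
    my $len = length($query_seq) < length($hit_seq)
            ? length($query_seq)
            : length($hit_seq);
    my $mismatches = 0;
    for my $i (0 .. $len - 1) {
        $mismatches++
            if substr($query_seq, $i, 1) ne substr($hit_seq, $i, 1);
    }
    return $mismatches;
}

# Strip the FASTA header line and any whitespace from the text-area input.
sub parse_fasta_input {
    my ($text) = @_;
    my @lines = grep { !/^>/ } split /\r?\n/, $text;
    my $seq = join '', @lines;
    $seq =~ s/\s+//g;
    return uc $seq;
}

# One request: parse the input, run BLAST, compare, build the response body.
sub handle_request {
    my ($text) = @_;
    my $query = parse_fasta_input($text);
    my @rows;
    for my $hit (run_blast($query)) {
        push @rows, sprintf("%s\t%d", $hit, count_mismatches($query, $hit));
    }
    return join("\n", @rows) . "\n";
}

# Emit a plain-text response; the input here is a literal for illustration.
print "Content-type: text/plain\n\n";
print handle_request(">gi|example\nATGCCCCCTTGGAACC\n");
```

Keeping both steps behind one handle_request() call means the browser makes
a single request and no second CGI script has to be contacted over HTTP.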

hope this helps
Spiros

On Sat, Nov 28, 2009 at 7:53 AM, Roopa Raghuveer <rtbio.2009 at gmail.com> wrote:
> hello everyone,
>
> I have a small question.
>
> I would like to link two cgi scripts i.e.,
>
> I have an input sequence being entered in a text area
>
> ex:->gi|at442323|...
> ATGCCCCCTTGGAACCAAAAAAA....
>
> So I would like to compare this with the query sequences.These query
> sequences would be from a BLAST script in the module blast.pm
> So once I enter the input sequence and request for BLAST using submit
> button,my request should go to a program which performs BLAST search.After
> this, the sequences obtained from BLAST have to be returned to a program
> Roopa.pm which compares the input sequence and the sequences obtained from
> blast.
>
> But I am unable to provide this link between the cgi scripts.(i.e.,one
> script to use BLAST,the other script to compare the sequences and send the
> results to the browser)
>
> Could any one help me in this regard?
>
> Regards,
> Roopa.


From maj at fortinbras.us  Sat Nov 28 11:23:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 11:23:53 -0500
Subject: [Bioperl-l] Run wrappers for BWA and Samtools
Message-ID: <7F56A6EEEB0E4EE291D5340F27DF7D3A@NewLife>

Hi All, 

Run wrappers for the bwa assembler and the samtools suite
are now available as beta in the bioperl-run/trunk. The bwa 
wrapper allows you to run a canned assembly pipeline, or 
to execute individual bwa components. The assembly pipeline
can return a Bio::Assembly::Scaffold object via the new 
Bio::Assembly::IO::sam module in bioperl-live/trunk
(this requires lstein's Bio::DB::Sam, from CPAN). Details at

http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_BWA

and, of course, in the pod. 

Cheers, 
MAJ


From maj at fortinbras.us  Sat Nov 28 21:55:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 21:55:42 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
Message-ID: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>

Hi Tim--
There's a bug in my code; should be
for my $hit ($result->hits) {
...
}
and you're right about the comma. My bad.

But I don't think you need this-- you're already looping over your
query sequences and doing blastn on each one. So in the middle of
your loop, you can simply write the blast result that you got:

my $blio = Bio::SearchIO->new( -file => 
">".$query->id.".bls", -format=>"blast" );
$blio->write_result($result);

and forget about the foreach my $qid loop entirely.

The files should show up in the directory from which you're
running the script.
cheers, MAJ
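
A rough sketch of the per-query idea above, under some assumptions: $report
is taken to be the Bio::SearchIO object that blastall() returns in Tim's
script, results are pulled from it with next_result, and the output is
written through a Bio::SearchIO::Writer::TextResultWriter (the writer choice
is an assumption, not something stated in this thread). The helper names
result_filename and write_results_per_query are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a query ID into a safe file name, e.g. "gi|48882|ref" becomes
# "gi_48882_ref.bls".
sub result_filename {
    my ($query_id) = @_;
    (my $safe = $query_id) =~ s/[^A-Za-z0-9_.-]/_/g;
    return "$safe.bls";
}

# Write every result in a BLAST report to its own file and return the
# list of file names. $report is assumed to be a Bio::SearchIO object.
sub write_results_per_query {
    my ($report) = @_;

    # BioPerl is loaded at call time so that result_filename() can be
    # used (and checked) without BioPerl installed.
    require Bio::SearchIO;
    require Bio::SearchIO::Writer::TextResultWriter;

    my @files;
    while ( my $result = $report->next_result ) {
        my $file = result_filename( $result->query_name );
        my $out  = Bio::SearchIO->new(
            -format => 'blast',
            -writer => Bio::SearchIO::Writer::TextResultWriter->new(),
            -file   => ">$file",
        );
        $out->write_result($result);
        push @files, $file;
    }
    return @files;
}

print result_filename('gi|48882|ref'), "\n";
```

Inside the while loop over $Seq_in one would call
write_results_per_query($blast_report) right after blastall(); the files are
created relative to the current working directory.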




From maj at fortinbras.us  Sat Nov 28 22:32:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 22:32:42 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
Message-ID: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>

The HOWTOs appear to have a more restrictive copyright
than FDL-- in particular, the blurb at the bottom of the 
HOWTO page asks users to use the documents for personal 
use only. I'm for this; I think we should therefore have some 
explicit license for these that specifies this kind of restriction, 
and then express that on each howto and in BioPerl:Copyright.
Any thoughts on the right license and whether this is a good plan?
MAJ


From florent.angly at gmail.com  Sat Nov 28 22:47:45 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 28 Nov 2009 19:47:45 -0800
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
 Bio::Tools::Run::Cap3 usage question)
In-Reply-To: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
References: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
Message-ID: <4B11EEE1.8070907@gmail.com>

Hi Paolo,

The aligned reads of a contig are stored in 
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_seq}. To implement a slice() 
method, you could retrieve the reads using get_seq_ids(), 
get_seq_by_name() or get_seq_by_pos(). To retrieve the position of an 
aligned read in the contig, use get_seq_coord() which returns a 
Bio::SeqFeature::Generic object (from 
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{_aligned_coord:READ_NAME}) 
on which you can call the start() and end() methods.

I'm not entirely sure what 
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{_align_clipping:READ_NAME} 
and {_quality_clipping:READ_NAME} are. I believe that they represent the 
clear range of the read/contig.
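
A sketch of how a slice could be built from those calls, assuming a window
given in contig coordinates. It relies only on the methods named above
(get_seq_ids, get_seq_by_name, get_seq_coord, and start/end on the returned
feature); that get_seq_ids with no arguments returns all read names is an
assumption, and clip_to_window and reads_in_window are made-up helper names.

```perl
use strict;
use warnings;

# Intersect a read's [start, end] with the window [from, to] (contig
# coordinates, inclusive). Returns the overlapping range, or an empty list.
sub clip_to_window {
    my ($start, $end, $from, $to) = @_;
    ($start, $end) = ($end, $start) if $start > $end;    # tolerate swapped ends
    my $lo = $start > $from ? $start : $from;
    my $hi = $end   < $to   ? $end   : $to;
    return $lo <= $hi ? ($lo, $hi) : ();
}

# For each read of $contig overlapping [from, to], report the read name and
# the part of the window it covers. $contig is assumed to be a
# Bio::Assembly::Contig (any object with the same three methods will do).
sub reads_in_window {
    my ($contig, $from, $to) = @_;
    my @covered;
    for my $name ( $contig->get_seq_ids ) {
        my $read  = $contig->get_seq_by_name($name);
        my $coord = $contig->get_seq_coord($read) or next;
        my ($lo, $hi) =
            clip_to_window( $coord->start, $coord->end, $from, $to )
            or next;    # read lies entirely outside the window
        push @covered, { name => $name, start => $lo, end => $hi };
    }
    return @covered;
}
```

The returned ranges are still in contig coordinates; turning them into
read-local positions (to cut each LocatableSeq) would need the read's own
start subtracted, which is left out here.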

Hope it helps,

Florent


Paolo Pavan wrote:
> Dear Florent,
> Thank you for your kind answer and for your efforts spent in this module.
> Since you are working on these topics I would like to seize the day 
> and put you some questions about some doubts I have in mind, if you 
> agree, of course :-)
> Some times ago I tried to work with bioperl, loading the data from an 
> ACE file originated by Newbler; my need was to extract part of the 
> contig like an alignment of reads and I tought to do it with a slice() 
> method, since I saw Bio::Assembly::Contig implements Bio::AlignI 
> interface. Unfortunately I realize that this interface is inherited 
> but not implemented.
> I tried to hack it by adding a slice method which would act on a 
> Bio::Alignment created from the array of LocatableSeqs representing 
> the reads.
>
> This is the question:
> If I'm not wrong (please correct me if yes), Bio::Assembly::Contig 
> class stores reads informations in:
> Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{
>      _align_clipping:READ_NAME}
>      _aligned_coord:READ_NAME}
>      _quality_clipping:READ_NAME}
>
> Anyone of these 3 features _align_clipping, _aligned_coord, 
> _quality_clipping, contains a Bio::SeqFeature::Generic, which of them 
> is more suitable to the purpose expressed before, the slice method?
> And more, If you apologize me for being too long, is consequently to 
> the previous: I don't have perfectly clear the purpose of this 3 
> feature per read, can you explain it?
>
> Really thanks you for the time you would spend.
> Bye bye,
> Paolo


From bimber at wisc.edu  Sun Nov 29 00:31:25 2009
From: bimber at wisc.edu (Ben Bimber)
Date: Sat, 28 Nov 2009 23:31:25 -0600
Subject: [Bioperl-l] using bioperl to compare sequences
Message-ID: <9f985cdc0911282131l350bc525gd9ad4717c101ac63@mail.gmail.com>

Hello,

I have a couple years programming experience, but am reasonably new to
perl and extremely new to bioperl.  I have been reading through the
bioperl documentation and am trying to understand the best way to
approach a particular problem.  I'm hoping someone could offer some
tips and point me in the right direction.  If someone has solved this
sort of problem before, I'd prefer not to reinvent things.  Here's
what I'm trying to do:

  * Our lab generates mRNA sequence data, consisting of alleles of a given
    gene or genes.
  * I want to compare each of these sequences against a reference using
    BLAST or clustalw (I will need the ability to choose at run time).
  * Take the result of this alignment, then record the positions that differ
    between the experimental sequence and the reference sequence (SNPs).
  * Translate the corresponding AA change(s) associated with each SNP.
    There can be overlapping ORFs.

I see that BioPerl has modules for BLAST and clustal.  I've also been
looking at the modules under variation.  I haven't fully wrapped my
head around them, but they look like what I'd use for SNP detection.

Has anyone written code to do similar things and, if so, would
you be willing to share specific examples?  Anything concrete showing
exactly how these modules operate would be extremely helpful.

Thanks in advance for any tips or help.


From jason at bioperl.org  Sun Nov 29 10:54:53 2009
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 29 Nov 2009 07:54:53 -0800
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
Message-ID: <897A8DB4-AF29-4601-A1E5-9A04D9D8C151@bioperl.org>

or
while( my $hit = $result->next_hit ) {
}
On Nov 28, 2009, at 6:55 PM, Mark A. Jensen wrote:

> Hi Tim--
> There's a bug in my code; should be
> for my $hit ($result->hits) {
> ...
> }
> and you're right about the comma. My bad.
>
> But I don't think you need this-- you're already looping over your
> query sequences and doing blastn on each one. So in the middle of
> your loop, you can simply write the blast result that you got:
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls",
> -format=>"blast" );
> $blio->write_result($result);
>
> and forget about the foreach my $qid loop entirely.
>
> The files should show up in the directory from which you're
> running the script.
> cheers, MAJ
>
>
>
> ----- Original Message ----- From: "Tim Koehler" <timbourine81 at googlemail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Thursday, November 26, 2009 11:02 AM
> Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of  
> eachqueryinnew file
>
>
> ups, sent too early...
>
> Hey Mark,
>
> thanks for the answer. But I am still struggling, especially where  
> to put in
> your code.
>
> Here ist the code I have, so far:
>
> #!/usr/bin/perl -w
>
> ### should I put your code here as push is a perl command?
> my %hits_by_query;
> for ($result->hits) {
> ### I inserted a comma after name}}; if there is no comma, there was the
> ### error: Scalar found where operator expected at
> ### 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
> ###        (Missing operator before  $hit?)
> ###Useless use of push with no values at
> ### 12_BLAST_two_sequence_each_query_one_file.PL line 7.
> ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
> ### "} $hit"
> ###BEGIN not safe after errors--compilation aborted at
> ### 12_BLAST_two_sequence_each_query_one_file.PL line 13.
> push @{$hits_by_query{$hit->name}}, $hit;
> ###here, every time this error appears: Name "main::result" used only once:
> ### possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> ###error: Can't call method "hits" on an undefined value at
> ### 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> }
>
>
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
> use Bio::Search::Result::BlastResult;
>
> my $Seq_in = Bio::SeqIO->new (
> -file =>
> "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
> -format => 'fasta'
> );
> while (my $query = $Seq_in->next_seq()) {
> my $factory = Bio::Tools::Run::StandAloneBlast->new(
> 'program' => 'blastn',
> 'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
> _READMETHOD => "Blast"
> );
>
> my $blast_report = $factory->blastall($query);
>
> ### Should I need to use a module? are the commands here at the right
> ### position? errors, e.g., Global symbol "$hit" requires explicit package name
> #my %hits_by_query;
> #for ($result->hits) {
> ### inserted comma after name}}
> # push @{$hits_by_query{$hit->name}}, $hit;
> #}
>
> foreach my $qid ( keys %hits_by_query ) {
> my $result = Bio::Search::Result::BlastResult->new();
> $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> $blio->write_result($result);
> }
>
> ###where are the files stored? what is their name. Sorry, but I cannot get
> ### behind that :(
>
> while( my $result = $blast_report->next_result ) {
> ## $result is a Bio::Search::Result::ResultI compliant object
> while( my $hit = $result->next_hit ) {
>  ## $hit is a Bio::Search::Hit::HitI compliant object
>  while( my $hsp = $hit->next_hsp ) {
>   ## $hsp is a Bio::Search::HSP::HSPI compliant object
>   if( $hsp->length('total') > 50 ) {
>    if ( $hsp->percent_identity >= 75 ) {
>    print  "Query= ",        $result->query_name,
>       "Hit= ",        $hit->name,
>           "Length= ",     $hsp->length('total'),
>           "Percent_id= ", $hsp->percent_identity,
>       "Subject=",        $hsp->hit_string,"\n";
>    }
>   }
>  }
> }
> }
> }
>
> Again, a big thanks in advance :)
>
> All the best,
>
> Tim
>
>
> On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:
>
>> Hey Mark,
>>
>> thanks for the answer
>>
>> On 25.11.2009 20:21, Mark A. Jensen wrote:
>> > whoops: change the following line:
>> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
>> >
>> > to
>> >
>> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
>> >
>> > (I always forget that...)
>> > MAJ
>> >
>> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
>> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
>> > Sent: Wednesday, November 25, 2009 1:20 PM
>> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of  
>> each
>> > queryinnew file
>> >
>> >
>> >> hey Tim--
>> >>
>> >> Sound like you need to go about collecting your queries inside  
>> out:
>> >>
>> >> my %hits_by_query;
>> >> for ($result->hits) {
>> >>  push @{$hits_by_query{$hit->name}} $hit;
>> >> }
>> >>
>> >> I believe now each hash element, keyed by the query name, will  
>> contain
>> >> an arrayref to the set of hits assoc with that query.
>> >>> From here, I believe
>> >>
>> >> use Bio::Search::Result::BlastResult;
>> >> use Bio::SearchIO;
>> >>
>> >> foreach my $qid ( keys %hits_by_query ) {
>> >>  my $result = Bio::Search::Result::BlastResult->new();
>> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
>> >>  $blio->write_result($result);
>> >> }
>> >>
>> >> will do what you want.
>> >>
>> >> hope this helps -
>> >> Mark
>> >>
>> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
>> >> To: <bioperl-l at lists.open-bio.org>
>> >> Sent: Wednesday, November 25, 2009 12:40 PM
>> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
>> >> query innew file
>> >>
>> >>
>> >>> Dear bioperl users,
>> >>>
>> >>> I am a real newbie and have - maybe a very trivial - question.
>> >>>
>> >>> I searched the mailing list archive and many howtos but I have  
>> not
>> found
>> >>> a concrete answer to my problem. So hopefully you can help me :)
>> >>>
>> >>> Background: I use the latest Bioperl version (installed it two  
>> weeks
>> >>> before).
>> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta  
>> file
>> >>> including different sequences, I get a BLAST output with many  
>> queries
>> >>> each having several hits / sbjcts.
>> >>>
>> >>> My problem is how to parse *all* hits of *one* query into a  
>> single new
>> >>> file. And this for all the queries I have in my BLAST output  
>> file.
>> >>>
>> >>> Or is it better the other way round; first to make fasta files  
>> with
>> only
>> >>> single sequences inside and BLAST each file? But how can I  
>> automize
>> that
>> >>> using Bioperl?
>> >>>
>> >>> I tried Bio::SearchIO but can only parse all queries and their
>> >>> respective hits in only one file...
>> >>> I think iteration is also necessary here, but I do not really  
>> know how
>> >>> to include that into Bio::SearchIO.
>> >>> Or do I have to use Module:Bio::Index::Blast?
>> >>>
>> >>> I can index a file (see below), but I have no idea what comes  
>> next...
>> >>>
>> >>> ###How I index a file...
>> >>>
>> >>> #!/usr/bin/perl -w
>> >>>
>> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
>> >>>
>> >>> use Bio::Index::Fasta;
>> >>>
>> >>>
>> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
>> >>> $id = "48882";
>> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
>> >>> -write_flag => 1);
>> >>> $inx->make_index($file_name);
>> >>>
>> >>>
>> >>> Hopefully, you can give me at least hints what to look for.
>> >>>
>> >>> A big THANKS in advance!
>> >>>
>> >>> Cheers,
>> >>>
>> >>> Tim
>> >>> _______________________________________________
>> >>> Bioperl-l mailing list
>> >>> Bioperl-l at lists.open-bio.org
>> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>>
>> >>>
>> >>
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >>
>> >
>>
>> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From suzi at berkeleybop.org  Sun Nov 29 23:03:09 2009
From: suzi at berkeleybop.org (Suzanna Lewis)
Date: Sun, 29 Nov 2009 20:03:09 -0800
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <3AD3C819-4BAA-4D90-B141-9611F48C5CAD@berkeleybop.org>

I/we (Gregg) would be interested in attending. We'd present an update on the collaborative, web-based version of Apollo. We will be working with Ian Holmes and Mitch Skinner using JBrowse for basic display.

-S


On Nov 26, 2009, at 6:57 AM, Jonathan Warren wrote:

> We are considering running a Distributed Annotation System workshop here at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last year's (1st day for beginners, 2nd for both beginners and advanced users, 3rd day for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
> 


From maj at fortinbras.us  Mon Nov 30 09:31:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 09:31:27 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
Message-ID: <513F1C824EF84974993A76F0CC719CDF@NewLife>

Well, it has a history, Jason's point. So the question could
be: "is this still a valid issue"? A while back, a user on the wiki,
with natural and good intentions, removed the authorship and revision
info from a couple of the HOWTOs; it is more wiki-like,
after all. But Chris had some objections to that, which I
seconded, mainly on the basis of the special status that
seems implied by the copyright note on the HOWTO
page. I also think that the nature of the howto is somewhat
different from other info on the site -- that developers themselves
put a lot of time into explaining how to use their modules, and
that in this world where devs get paid by recognition, it is a reasonable
thing to allow this extra horn-tooting. Now, that is a policy
that could be completely separable from the issue of copyright.
However, devs may also get paid by using their materials in teaching
seminars. The dilemma would be that people who like to use the
wiki are people who like to share, and so it feels unnatural to
withhold from the community the materials they develop, but
people who like to share also like to eat and wear shoes...
so I'm interested in everyone's thoughts about it.
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" 
<jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
Sent: Monday, November 30, 2009 9:16 AM
Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki


> Mark,
>
> Let me ask you a question, and don't take this question as an implicit 
> criticism of your suggestion, it is not. Why would you want this more 
> restrictive copyright?
>
> Brian O.
>
> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>
>> The HOWTOs appear to have a more restrictive copyright
>> than FDL-- in particular, the blurb at the bottom of the
>> HOWTO page asks users to use the documents for personal
>> use only. I'm for this; I think we should therefore have some
>> explicit license for these that specifies this kind of restriction,
>> and then express that on each howto and in BioPerl:Copyright.
>> Any thoughts on the right license and whether this is a good plan?
>> MAJ
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> 


From bosborne11 at verizon.net  Mon Nov 30 10:15:32 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 10:15:32 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <513F1C824EF84974993A76F0CC719CDF@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
	<513F1C824EF84974993A76F0CC719CDF@NewLife>
Message-ID: <54671455-A02C-4139-8C39-AC17B50D5CE6@verizon.net>

Mark,

I have no objection to a more restrictive copyright, and I also have  
no objection to using FDL, or things like it.

Brian O.

On Nov 30, 2009, at 9:31 AM, Mark A. Jensen wrote:

> Well, it has a history, Jason's point. So the question could
> be: "is this still a valid issue"? A while back, a user on the wiki,
> with natural and good intentions, removed the authorship and revision
> info from a couple of the HOWTOs; it is more wiki-like,
> after all. But Chris had some objections to that, which I
> seconded, mainly on the basis of the special status that
> seems implied by the copyright note on the HOWTO
> page. I also think that the nature of the howto is somewhat
> different from other info on the site -- that developers themselves
> put a lot of time in to explaining how to use their modules, and
> that in this world where devs get paid by recognition, it is a  
> reasonable
> thing to allow this extra horn-tooting. Now, that is a policy
> that could be completely separable from the issue of copyright.
> However, devs may also get paid by using their materials in teaching
> seminars. The dilemma would be that people who like to use the
> wiki are people who like to share, and so it feels unnatural to
> withhold from the community the materials they develop,  but
> people who like to share also like to eat and wear shoes...
> so I'm interested in everyone's thoughts about it.
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" <jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
> Sent: Monday, November 30, 2009 9:16 AM
> Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
>
>
>> Mark,
>>
>> Let me ask you a question, and don't take this question as an  
>> implicit criticism of your suggestion, it is not. Why would you  
>> want this more restrictive copyright?
>>
>> Brian O.
>>
>> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>>
>>> The HOWTOs appear to have a more restrictive copyright
>>> than FDL-- in particular, the blurb at the bottom of the
>>> HOWTO page asks users to use the documents for personal
>>> use only. I'm for this; I think we should therefore have some
>>> explicit license for these that specifies this kind of restriction,
>>> and then express that on each howto and in BioPerl:Copyright.
>>> Any thoughts on the right license and whether this is a good plan?
>>> MAJ
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>


From bosborne11 at verizon.net  Mon Nov 30 09:16:07 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 09:16:07 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
Message-ID: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>

Mark,

Let me ask you a question, and don't take this question as an implicit  
criticism of your suggestion, it is not. Why would you want this more  
restrictive copyright?

Brian O.

On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:

> The HOWTOs appear to have a more restrictive copyright
> than FDL-- in particular, the blurb at the bottom of the
> HOWTO page asks users to use the documents for personal
> use only. I'm for this; I think we should therefore have some
> explicit license for these that specifies this kind of restriction,
> and then express that on each howto and in BioPerl:Copyright.
> Any thoughts on the right license and whether this is a good plan?
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Mon Nov 30 12:41:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 12:41:44 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
	<c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
Message-ID: <8C288FEF9CEB4055B0CDD19267FBA26C@NewLife>

thanks Tim! corrected (I hope) in r16432... 
MAJ
  ----- Original Message ----- 
  From: Tim Koehler 
  To: Smithies, Russell 
  Cc: Mark A. Jensen ; bioperl-l at lists.open-bio.org 
  Sent: Monday, November 30, 2009 12:23 PM
  Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


  Hello everybody,

  thanks a lot for the overwhelming number of answers! The posted scripts are different flavors of the same idea, and all of them worked.

  For me, the code appended below works best. But I think I found a bug in ...Bio/SearchIO/blast.pm.
  There the DEFAULT_BLAST_... variable is set to Bio::Search::Writer::HitTableWriter instead of Bio::SearchIO::Writer::HitTableWriter. I also tried setting this variable to HTMLResultWriter and other writers.

  So again: THANKS for the support!

  Cheers, 
  Tim

  #!/usr/bin/perl -w

  use strict;
  use Bio::Tools::Run::StandAloneBlast;
  use Bio::SeqIO;
  use Bio::SearchIO;
  ### add here the writer you want
  use Bio::SearchIO::Writer::HitTableWriter;
  use Bio::Search::Result::BlastResult;

  use Data::Dumper;

  my $Seq_in = Bio::SeqIO->new( -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
                                -format => "fasta" );

  while ( my $query = $Seq_in->next_seq() ) {
    warn "Processing ",$query->id, "\n";
    my $factory =
      Bio::Tools::Run::StandAloneBlast->new(
                   program  => "blastn",
                   database => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
                   _READMETHOD => "Blast"
      );

    my $blast_report = $factory->blastall($query);
    sleep 5;

    # just write the result we got for this query into a
    # new blast-formatted file...named after the id of the query seq...
    my $result = $blast_report->next_result;
    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
    $blio->write_result($result);

    # below, just looking at the current blast result
    ###this does not appear in the output files
    while ( my $result = $blast_report->next_result ) {
      ## $result is a Bio::Search::Result::ResultI compliant object
      while ( my $hit = $result->next_hit ) {
        ## $hit is a Bio::Search::Hit::HitI compliant object
        while ( my $hsp = $hit->next_hsp ) {
          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if ( $hsp->length('total') > 50 ) {
            if ( $hsp->percent_identity >= 75 ) {
              print "Query= ", $result->query_name,
                "Hit= ",        $hit->name,
                "Length= ",     $hsp->length('total'),
                "Percent_id= ", $hsp->percent_identity,
                "Subject=",     $hsp->hit_string, "\n";
            }
          }
        }
      }
    }
  }
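
A possible reason the inner loop in the script above prints nothing: next_result behaves like an iterator, and the one result for the query has already been fetched by the earlier call, so the while loop starts with nothing left. The following is only a core-Perl sketch of that behaviour, not BioPerl code; make_iterator and the string 'result_for_query_1' are hypothetical stand-ins for the report object and its result.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a report object: hands out each item once,
# then returns undef (this is NOT the BioPerl implementation).
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

# Pattern from the script above: one fetch outside the loop, then a loop.
my $next_a  = make_iterator('result_for_query_1');
my $written = $next_a->();          # consumed here (where write_result would go)
my @seen_in_loop;
while ( my $r = $next_a->() ) {     # nothing left, so the body never runs
    push @seen_in_loop, $r;
}

# Alternative: fetch inside the loop and use the same value for both steps.
my $next_b = make_iterator('result_for_query_1');
my ( @written_b, @inspected_b );
while ( my $r = $next_b->() ) {
    push @written_b,   $r;          # where write_result($r) would go
    push @inspected_b, $r;          # where the hit/HSP loops would go
}

print "first pattern saw ",  scalar(@seen_in_loop), " result(s) in the loop\n";
print "second pattern saw ", scalar(@inspected_b),  " result(s) in the loop\n";
```

Applied to the real script, this would mean doing the write_result call and the hit/HSP inspection on the same $result inside one loop. Whether the hit iterator needs resetting after writing is something I have not checked.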

   
  On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

    Changed it to a generic result and added a writer and it seems to work:


      foreach my $qid ( keys %hits_by_query ) {
        warn "qid = $qid\n";
        my $res = Bio::Search::Result::GenericResult->new(-algorithm => "blastn") or die $!;
        # print Dumper $res;
        foreach my $h ( @{ $hits_by_query{$qid} } ) {
          warn "adding hit ", $h->name, "\n";
          $res->add_hit($h) if defined($h);
        }
        my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
        my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file => ">$qid\.bls\.html", -format => "blast" ) or die $!;
        $blio->write_result($res);
      }
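
On the leading ">" in the -file argument (the thing corrected earlier in this thread): as far as I know it follows the convention of Perl's two-argument open, where the mode is the first character of the string. Here is a small core-Perl sketch of that convention only. The temp directory and the id 'query1' are made up for illustration, and nothing here calls BioPerl.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir  = tempdir( CLEANUP => 1 );   # scratch directory, removed at exit
my $qid  = 'query1';                  # hypothetical query id
my $path = File::Spec->catfile( $dir, "$qid.bls" );

# Two-argument form: a leading ">" means "create/truncate and write".
open( my $out, ">$path" ) or die "cannot write $path: $!";
print {$out} "report for $qid\n";
close $out or die "close failed: $!";

# Without a prefix, the same string means "open for reading".
open( my $in, $path ) or die "cannot read $path: $!";
my $line = <$in>;
close $in;

print $line;
```

Inside a double-quoted string the dot does not need a backslash, so ">$qid.bls" and ">$qid\.bls" name the same file. In new plain-Perl code the three-argument open is usually preferred; the two-argument form is used here only because it mirrors the -file string.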


    From: Mark A. Jensen [mailto:maj at fortinbras.us] 
    Sent: Monday, 30 November 2009 10:19 a.m.
    To: Smithies, Russell; 'Tim Koehler'


    Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


    My thought here was that since Tim's already going one at a time thru

    his queries, my scrap was not really necessary: 


    use strict;
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SeqIO;
    use Bio::SearchIO;
    use Bio::Search::Result::BlastResult;

    use Data::Dumper;

    my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
                                  -format => "fasta" );

    while ( my $query = $Seq_in->next_seq() ) {
      warn "Processing ",$query->id, "\n";
      my $factory =
        Bio::Tools::Run::StandAloneBlast->new(
                     program  => "blastn",
                     database => "/data/databases/flatfile/illuminati_blastdata/nt",
                     _READMETHOD => "Blast"
        );

      my $blast_report = $factory->blastall($query);
      sleep 5;

      # just write the result we got for this query into a
      # new blast-formatted file...named after the id of the query seq...
      my $result = $blast_report->next_result;
      my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
      $blio->write_result($result);

      # below, just looking at the current blast result
      while ( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object
        while ( my $hit = $result->next_hit ) {
          ## $hit is a Bio::Search::Hit::HitI compliant object
          while ( my $hsp = $hit->next_hsp ) {
            ## $hsp is a Bio::Search::HSP::HSPI compliant object
            if ( $hsp->length('total') > 50 ) {
              if ( $hsp->percent_identity >= 75 ) {
                print "Query= ", $result->query_name,
                  "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
                  "Subject=",     $hsp->hit_string, "\n";
              }
            }
          }
        }
      }
    }

      ----- Original Message ----- 

      From: Smithies, Russell 

      To: 'Tim Koehler' ; 'maj at fortinbras.us' 

      Sent: Sunday, November 29, 2009 3:58 PM

      Subject: RE: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hi Tim

      With various people writing the "howtos" and other docs, the examples are bound to have differing names for the variables used but as long as you're consistent, it should all fit together.


      I think I've almost got your code working, just getting errors from Bio::Search::Result::BlastResult  which I'm not entirely sure how to use. Perhaps Mark can get this bit going?


      --Russell

      ===============================


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;
      use Bio::Search::Result::BlastResult;

      use Data::Dumper;

      my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
                                    -format => "fasta" );

      while ( my $query = $Seq_in->next_seq() ) {
        warn "Processing ",$query->id, "\n";
        my $factory =
          Bio::Tools::Run::StandAloneBlast->new(
                       program  => "blastn",
                       database => "/data/databases/flatfile/illuminati_blastdata/nt",
                       _READMETHOD => "Blast"
          );

        my $blast_report = $factory->blastall($query);
        sleep 5;

        my %hits_by_query;

        while ( my $result = $blast_report->next_result ) {
          foreach my $hit ( $result->hits ) {
            warn "Pushed a hit for ",$hit->name, "\n";
            push( @{ $hits_by_query{ $hit->name } }, $hit );
          }
        }

        foreach my $qid ( keys %hits_by_query ) {
          warn "qid = $qid\n";
          my $res = Bio::Search::Result::BlastResult->new() or die $!;
          print Dumper $res;
          foreach my $h ( @{ $hits_by_query{$qid} } ) {
            warn "adding hit ", $h->name, "\n";
            $res->add_hit($h) if defined($h);
          }
          my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format => "blast" ) or die $!;
          $blio->write_result($res);
        }

        while ( my $result = $blast_report->next_result ) {
          ## $result is a Bio::Search::Result::ResultI compliant object
          while ( my $hit = $result->next_hit ) {
            ## $hit is a Bio::Search::Hit::HitI compliant object
            while ( my $hsp = $hit->next_hsp ) {
              ## $hsp is a Bio::Search::HSP::HSPI compliant object
              if ( $hsp->length('total') > 50 ) {
                if ( $hsp->percent_identity >= 75 ) {
                  print "Query= ", $result->query_name,
                    "Hit= ",        $hit->name,
                    "Length= ",     $hsp->length('total'),
                    "Percent_id= ", $hsp->percent_identity,
                    "Subject=",     $hsp->hit_string, "\n";
                }
              }
            }
          }
        }
      }

      ===============================
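
For anyone following the grouping step in the script above, here is a minimal core-Perl sketch of the hash-of-arrays idiom it relies on. It covers the two points that caused errors earlier in this thread: the loop needs a named variable, and push needs a comma. The records are invented stand-ins for BioPerl hit objects. Note that the script above keys the hash on $hit->name, so it appears to group by hit rather than by query; the sketch keys on a query id instead, which is my assumption about what was intended.

```perl
use strict;
use warnings;

# Hypothetical stand-in data: each record pairs a query id with a hit name.
my @records = (
    { query => 'seqA', hit => 'gi|111' },
    { query => 'seqB', hit => 'gi|222' },
    { query => 'seqA', hit => 'gi|333' },
);

my %hits_by_query;
for my $rec (@records) {    # a named, lexical loop variable keeps strict happy
    # The array ref is created on first use; note the comma after the array.
    push @{ $hits_by_query{ $rec->{query} } }, $rec->{hit};
}

# One output line per query, in a stable order.
for my $qid ( sort keys %hits_by_query ) {
    print $qid, ': ', join( ', ', @{ $hits_by_query{$qid} } ), "\n";
}
```

In the real script each array would hold hit objects rather than strings, and the body of the second loop is where a result object would be filled and written out.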


      From: Tim Koehler [mailto:timbourine81 at googlemail.com] 
      Sent: Friday, 27 November 2009 10:24 p.m.
      To: Smithies, Russell; maj at fortinbras.us
      Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hey guys,

      please do not get me wrong; I did not want to put the workload on you. So far I had only found the HOWTOs, but in them the variable names change from one example to the next (e.g. $in to $Seq_in), and some things I simply could not find.
      Now I got a tip where else to search: the scrapbook and deobfuscator.

      I immediately will have a look at that.

      This is my first time touching Linux / Perl commands; that's why, after several days of trial and many errors ;), I thought of asking the mailing list.

      I was very happy about your fast answers!

      Cheers and a nice weekend,

      Tim

      On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com> wrote:

      ups, sent too early...

      Hey Mark,

      thanks for the answer. But I am still struggling, especially where to put in your code.

      Here is the code I have so far:

      #!/usr/bin/perl -w

      ### should I put your code here as push is a perl command?


      my %hits_by_query;
      for ($result->hits) {

      ### I inserted a comma after name}}; if there is no comma, there was the error: Scalar found where operator expected at 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
      ###        (Missing operator before  $hit?)
      ###Useless use of push with no values at 12_BLAST_two_sequence_each_query_one_file.PL line 7.
      ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
      ###BEGIN not safe after errors--compilation aborted at 12_BLAST_two_sequence_each_query_one_file.PL line 13.


       push @{$hits_by_query{$hit->name}}, $hit;

      ###here, every time this error appears: Name "main::result" used only once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
      ###error: Can't call method "hits" on an undefined value at 12_BLAST_two_sequence_each_query_one_file.PL line 5.


      }


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;

      my $Seq_in = Bio::SeqIO->new (
      -file => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
      -format => 'fasta'
      );
      while (my $query = $Seq_in->next_seq()) {


      my $factory = Bio::Tools::Run::StandAloneBlast->new(

      'program' => 'blastn',
      'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
      _READMETHOD => "Blast"
      );

      my $blast_report = $factory->blastall($query);

      ### Should I need to use a module? are the commands here at the right position? errors, e.g., Global symbol "$hit" requires explicit package name
      #my %hits_by_query;
      #for ($result->hits) {
      ### inserted comma after name}}
      # push @{$hits_by_query{$hit->name}}, $hit;
      #}


      foreach my $qid ( keys %hits_by_query ) {
       my $result = Bio::Search::Result::BlastResult->new();
       $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
       my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
       $blio->write_result($result);
      } 

      ###where are the files stored? what is their name. Sorry, but I cannot get behind that :(

      while( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object


        while( my $hit = $result->next_hit ) {

         ## $hit is a Bio::Search::Hit::HitI compliant object


         while( my $hsp = $hit->next_hsp ) {

          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
           if ( $hsp->percent_identity >= 75 ) {
           print  "Query= ",        $result->query_name,
              "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
              "Subject=",        $hsp->hit_string,"\n";
           }
          }
         }
        }
      }
      }

      Again, a big thanks in advance :)

      All the best,

      Tim

      On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

      Hey Mark,

      thanks for the answer


      On 25.11.2009 20:21, Mark A. Jensen wrote:
      > whoops: change the following line:
      > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >
      > to
      >
      > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
      >
      > (I always forget that...)
      > MAJ
      >
      > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
      > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
      > Sent: Wednesday, November 25, 2009 1:20 PM
      > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
      > queryinnew file
      >
      >
      >> hey Tim--
      >>
      >> Sound like you need to go about collecting your queries inside out:
      >>
      >> my %hits_by_query;
      >> for ($result->hits) {
      >>  push @{$hits_by_query{$hit->name}} $hit;
      >> }
      >>
      >> I believe now each hash element, keyed by the query name, will contain
      >> an arrayref to the set of hits assoc with that query.
      >>> From here, I believe
      >>
      >> use Bio::Search::Result::BlastResult;
      >> use Bio::SearchIO;
      >>
      >> foreach my $qid ( keys %hits_by_query ) {
      >>  my $result = Bio::Search::Result::BlastResult->new();
      >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
      >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >>  $blio->write_result($result);
      >> }
      >>
      >> will do what you want.
      >>
      >> hope this helps -
      >> Mark
      >>
      >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
      >> To: <bioperl-l at lists.open-bio.org>
      >> Sent: Wednesday, November 25, 2009 12:40 PM
      >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
      >> query innew file
      >>
      >>
      >>> Dear bioperl users,
      >>>
      >>> I am a real newbie and have - maybe a very trivial - question.
      >>>
      >>> I searched the mailing list archive and many howtos but I have not found
      >>> a concrete answer to my problem. So hopefully you can help me :)
      >>>
      >>> Background: I use the latest Bioperl version (installed it two weeks
      >>> before).
      >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
      >>> including different sequences, I get a BLAST output with many queries
      >>> each having several hits / sbjcts.
      >>>
      >>> My problem is how to parse *all* hits of *one* query into a single new
      >>> file. And this for all the queries I have in my BLAST output file.
      >>>
      >>> Or is it better the other way round; first to make fasta files with only
      >>> single sequences inside and BLAST each file? But how can I automize that
      >>> using Bioperl?
      >>>
      >>> I tried Bio::SearchIO but can only parse all queries and their
      >>> respective hits in only one file...
      >>> I think iteration is also necessary here, but I do not really know how
      >>> to include that into Bio::SearchIO.
      >>> Or do I have to use Module:Bio::Index::Blast?
      >>>
      >>> I can index a file (see below), but I have no idea what comes next...
      >>>
      >>> ###How I index a file...
      >>>
      >>> #!/usr/bin/perl -w
      >>>
      >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
      >>>
      >>> use Bio::Index::Fasta;
      >>>
      >>>
      >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
      >>> $id = "48882";
      >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
      >>> -write_flag => 1);
      >>> $inx->make_index($file_name);
      >>>
      >>>
      >>> Hopefully, you can give me at least hints what to look for.
      >>>
      >>> A big THANKS in advance!
      >>>
      >>> Cheers,
      >>>
      >>> Tim
      >>> _______________________________________________
      >>> Bioperl-l mailing list
      >>> Bioperl-l at lists.open-bio.org
      >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>>
      >>>
      >>
      >> _______________________________________________
      >> Bioperl-l mailing list
      >> Bioperl-l at lists.open-bio.org
      >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
      >>
      >>
      >

      Tim Köhler
      MPI for Terrestrial Microbiology
      Karl-von-Frisch-Straße
      D-35043 Marburg / Germany

      Email: koehlerd at mpi-marburg.mpg.de
      Phone: +49 6421 178-740
      Fax:   +49 6421 178-999



From timbourine81 at googlemail.com  Mon Nov 30 12:23:58 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Mon, 30 Nov 2009 18:23:58 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
Message-ID: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>

Hello everybody,

thanks a lot for the overwhelming answers! All of the code variants you
sent are different flavors of the same idea, and all of them worked.

For me the code added below works best. But I think I found a bug in
...Bio/SearchIO/blast.pm: there the DEFAULT_BLAST_... variable is set to
Bio::Search::Writer::HitTableWriter instead of
Bio::SearchIO::Writer::HitTableWriter. I also tried changing this variable
to HTMLResultWriter and other writers.

So again: THANKS for the support!

Cheers,
Tim

#!/usr/bin/perl -w

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;

### add here the writer you want
use Bio::SearchIO::Writer::HitTableWriter;
use Bio::Search::Result::BlastResult;

use Data::Dumper;

my $Seq_in = Bio::SeqIO->new(
    -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
    -format => "fasta" );

while ( my $query = $Seq_in->next_seq() ) {
  warn "Processing ",$query->id, "\n";

  my $factory =
    Bio::Tools::Run::StandAloneBlast->new(
                 program  => "blastn",
                 database => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
                 _READMETHOD => "Blast"
    );

  my $blast_report = $factory->blastall($query);
  sleep 5;

  # just write the result we got for this query into a
  # new blast-formatted file...named after the id of the query seq...
  my $result = $blast_report->next_result;
  my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls",
                                 -format => "blast" ) or die $!;
  $blio->write_result($result);

  # below, just looking at the current blast result
  ###this does not appear in the output files
  while ( my $result = $blast_report->next_result ) {
    ## $result is a Bio::Search::Result::ResultI compliant object
    while ( my $hit = $result->next_hit ) {
      ## $hit is a Bio::Search::Hit::HitI compliant object
      while ( my $hsp = $hit->next_hsp ) {
        ## $hsp is a Bio::Search::HSP::HSPI compliant object
        if ( $hsp->length('total') > 50 ) {
          if ( $hsp->percent_identity >= 75 ) {
            print "Query= ", $result->query_name,
              "Hit= ",        $hit->name,
              "Length= ",     $hsp->length('total'),
              "Percent_id= ", $hsp->percent_identity,
              "Subject=",     $hsp->hit_string, "\n";
          }
        }
      }
    }
  }
}
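
A likely reason the last loop in the script above prints nothing: each query is BLASTed on its own, so the report probably holds a single result, and the earlier call to next_result has already consumed it. The sketch below shows that behaviour without BioPerl. make_iterator and the string 'result_for_query_1' are hypothetical stand-ins, not part of any BioPerl API.

```perl
use strict;
use warnings;

# Stand-in for a report object: hands out each queued item once, then
# undef, which is how a next_result-style iterator behaves when exhausted.
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

# Assumption: one query per BLAST run, so one result per report.
my $next_result = make_iterator('result_for_query_1');

# The first call consumes the only result (the step before write_result).
my $result = $next_result->();

# A later while loop on the same iterator never enters its body.
my $loop_runs = 0;
while ( my $r = $next_result->() ) {
    $loop_runs++;
}

print "first call returned: $result\n";
print "loop iterations afterwards: $loop_runs\n";
```

One option would be to drop the outer while loop and run the hit/HSP loops on the $result that was already fetched. If the writer has already walked through the hits, a call to $result->rewind may be needed first; that part is untested here.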


On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

>  Changed it to a generic result and added a writer and it seems to work:
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>     warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::GenericResult->new(-algorithm =>
> "blastn") or die $!;
>
>    # print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();
>
>     my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file =>
> ">$qid\.bls\.html", -format => "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>
>
> *From:* Mark A. Jensen [mailto:maj at fortinbras.us]
> *Sent:* Monday, 30 November 2009 10:19 a.m.
> *To:* Smithies, Russell; 'Tim Koehler'
>
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> My thought here was that since Tim's already going one at a time thru
>
> his queries, my scrap was not really necessary:
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>   # just write the result we got for this query into a
>
>    #new blast-formatted file...named after the id of the query seq...
>
>  my $result = $blast_report->next_result;
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format =>
> "blast" ) or die $!;
>
>   $blio->write_result($result);
>
>
>
>   # below, just looking at the current blast result
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
>  ----- Original Message -----
>
> *From:* Smithies, Russell <Russell.Smithies at agresearch.co.nz>
>
> *To:* 'Tim Koehler' <timbourine81 at googlemail.com>; 'maj at fortinbras.us'
>
> *Sent:* Sunday, November 29, 2009 3:58 PM
>
> *Subject:* RE: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hi Tim
>
> With various people writing the "howtos" and other docs, the examples are
> bound to have differing names for the variables used but as long as you're
> consistent, it should all fit together.
>
>
>
> I think I've almost got your code working, just getting errors from
> Bio::Search::Result::BlastResult  which I'm not entirely sure how to use.
> Perhaps Mark can get this bit going?
>
>
>
> --Russell
>
> ===============================
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>
>
>   my %hits_by_query;
>
>
>
>        while ( my $result = $blast_report->next_result ) {
>
>          foreach my $hit ( $result->hits ) {
>
>                      warn "Pushed a hit for ",$hit->name, "\n";
>
>            push( @{ $hits_by_query{ $hit->name } }, $hit );
>
>          }
>
>        }
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>               warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::BlastResult->new() or die $!;
>
>     print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format =>
> "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
> ===============================
>
>
>
> *From:* Tim Koehler [mailto:timbourine81 at googlemail.com]
> *Sent:* Friday, 27 November 2009 10:24 p.m.
> *To:* Smithies, Russell; maj at fortinbras.us
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hey guys,
>
> please do not get me wrong: I did not want to put the workload on you. So
> far I had only found the HowTos, but in them the naming changed over time
> (e.g. $in to $Seq_in), and some things I simply could not find.
> Now I got a tip where else to search: the scrapbook and deobfuscator.
>
> I will have a look at those immediately.
>
> This is the first time I am touching linux / perl commands; that's why,
> after several days of trial and many errors ;), I thought of asking the
> mailing list.
>
> I was very happy about your fast answers!
>
> Cheers and a nice weekend,
>
> Tim
>
> On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com>
> wrote:
>
> oops, sent too early...
>
> Hey Mark,
>
> thanks for the answer. But I am still struggling, especially where to put
> in your code.
>
> Here is the code I have so far:
>
> #!/usr/bin/perl -w
>
> ### should I put your code here as push is a perl command?
>
>
> my %hits_by_query;
> for ($result->hits) {
>
> ### I inserted a comma after name}}; if there is no comma, there was the
> error: Scalar found where operator expected at
> 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
> ###        (Missing operator before  $hit?)
> ###Useless use of push with no values at
> 12_BLAST_two_sequence_each_query_one_file.PL line 7.
> ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7,
> near "} $hit"
> ###BEGIN not safe after errors--compilation aborted at
> 12_BLAST_two_sequence_each_query_one_file.PL line 13.
>
>
>  push @{$hits_by_query{$hit->name}}, $hit;
>
> ###here, every time this error appears: Name "main::result" used only
> once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> ###error: Can't call method "hits" on an undefined value at
> 12_BLAST_two_sequence_each_query_one_file.PL line 5.
>
>
> }
>
>
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
> my $Seq_in = Bio::SeqIO->new (
> -file =>
> "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
> -format => 'fasta'
> );
> while (my $query = $Seq_in->next_seq()) {
>
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(
>
> 'program' => 'blastn',
> 'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
> _READMETHOD => "Blast"
> );
>
> my $blast_report = $factory->blastall($query);
>
> ### Should I need to use a module? are the commands here at the right
> position? errors, e.g., Global symbol "$hit" requires explicit package name
> #my %hits_by_query;
> #for ($result->hits) {
> ### inserted comma after name}}
> # push @{$hits_by_query{$hit->name}}, $hit;
> #}
>
>
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> ###where are the files stored? what is their name. Sorry, but I cannot get
> behind that :(
>
> while( my $result = $blast_report->next_result ) {
>   ## $result is a Bio::Search::Result::ResultI compliant object
>
>
>   while( my $hit = $result->next_hit ) {
>
>    ## $hit is a Bio::Search::Hit::HitI compliant object
>
>
>    while( my $hsp = $hit->next_hsp ) {
>
>     ## $hsp is a Bio::Search::HSP::HSPI compliant object
>     if( $hsp->length('total') > 50 ) {
>      if ( $hsp->percent_identity >= 75 ) {
>      print  "Query= ",        $result->query_name,
>         "Hit= ",        $hit->name,
>             "Length= ",     $hsp->length('total'),
>             "Percent_id= ", $hsp->percent_identity,
>         "Subject=",        $hsp->hit_string,"\n";
>      }
>     }
>    }
>   }
> }
> }
>
> Again, a big thanks in advance :)
>
> All the best,
>
> Tim
>
> On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:
>
> Hey Mark,
>
> thanks for the answer
>
>
>
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sound like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round; first to make fasta files with only
> >>> single sequences inside and BLAST each file? But how can I automize that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >
>
> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
>


From maj at fortinbras.us  Mon Nov  2 04:47:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 1 Nov 2009 23:47:15 -0500
Subject: [Bioperl-l] annotations
Message-ID: <5150801225E0484D95DC51B2D00AE519@NewLife>

I'm cogitating on features and annotations. For a RichSeq, one gets the set of annotations by

$seq->annotation->get_Annotations

while getting features by 

$seq->get_Features

Is there a reason not to have a method in SeqI 

sub get_Annotations { shift->annotation->get_Annotations }

to allow a user to do what seems natural from a user's perspective, viz. $seq->get_Annotations? I imagine this might save hundreds of hours of frustration, integrated over all newbies.
MAJ


From cjfields at illinois.edu  Mon Nov  2 13:08:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Nov 2009 07:08:54 -0600
Subject: [Bioperl-l] annotations
In-Reply-To: <5150801225E0484D95DC51B2D00AE519@NewLife>
References: <5150801225E0484D95DC51B2D00AE519@NewLife>
Message-ID: <6920A9E1-D221-4CF8-9866-0ADBDB254C19@illinois.edu>

On Nov 1, 2009, at 10:47 PM, Mark A. Jensen wrote:

> I'm cogitating on features and annotations. For a RichSeq, one gets  
> the set of annotations by
>
> $seq->annotation->get_Annotations
>
> while getting features by
>
> $seq->get_Features
>
> Is there a reason not to have a method in SeqI
>
> sub get_Annotations { shift->annotation->get_Annotations }
>
> to allow a user to do what seems natural from a user's perspective,  
> viz. $seq->get_Annotations? I imagine this might save hundreds of  
> hours of frustration, integrated over all newbies.
> MAJ

One could add the methods to delegate to annotation() (that's  
essentially what I'm planning on doing for Biome).
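
The delegation idea can be sketched in plain Perl as follows. Demo::Seq and Demo::AnnotationCollection are hypothetical stand-ins written for this illustration, not the real Bio::SeqI or Bio::Annotation::Collection classes. The only point is the forwarding method, which also passes along any arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an annotation collection; not a BioPerl class.
package Demo::AnnotationCollection;
sub new { my ( $class, @anns ) = @_; return bless { anns => [@anns] }, $class }
sub get_Annotations { my $self = shift; return @{ $self->{anns} } }

# Hypothetical stand-in for a sequence object that holds a collection.
package Demo::Seq;
sub new {
    my ( $class, %args ) = @_;
    return bless { annotation => $args{annotation} }, $class;
}
sub annotation { return $_[0]->{annotation} }

# The proposed convenience method: forward the call, arguments included.
sub get_Annotations {
    my $self = shift;
    return $self->annotation->get_Annotations(@_);
}

package main;

my $seq = Demo::Seq->new(
    annotation => Demo::AnnotationCollection->new( 'comment', 'reference' ),
);
my @direct    = $seq->annotation->get_Annotations;
my @delegated = $seq->get_Annotations;
print "direct: @direct\n";
print "delegated: @delegated\n";
```

Forwarding @_ rather than calling the inner method bare keeps any optional arguments (such as an annotation key) working through the shortcut.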

chris


From kiekyon.huang at gmail.com  Tue Nov  3 15:14:39 2009
From: kiekyon.huang at gmail.com (Kie Kyon Huang)
Date: Tue, 3 Nov 2009 23:14:39 +0800
Subject: [Bioperl-l] render_blast problem
Message-ID: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>

Hi,

I was trying to follow the HOWTO:Graphics at
http://www.bioperl.org/wiki/HOWTO:Graphics

When running the command line in cygwin

$ perl render_blast1.pl data1.txt | display -

I get the following error line,

bash: display: command not found

I also tried

$ perl render_blast1.pl data1.txt > data1.png

however, I was unable to open the data1.png file using Microsoft
Office Picture Manager or windows Photo Gallery

Thanks

Huang


From biopython at maubp.freeserve.co.uk  Tue Nov  3 15:45:37 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 15:45:37 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
Message-ID: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>

On Tue, Nov 3, 2009 at 3:14 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
> Hi,
>
> I was trying to follow the HOWTO:Graphics at
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> When running the command line in cygwin
>
> $ perl render_blast1.pl data1.txt | display -
>
> I get the following error line,
>
> bash: display: command not found

That makes sense on Windows, since display is a Unix
command line tool.

> I also tried
>
> $ perl render_blast1.pl data1.txt > data1.png

Based on the wiki, I think that ought to have worked.

> however, I was unable to open the data1.png file using Microsoft
> Office Picture Manager or windows Photo Gallery

Did you do this step?:
>> Important!  If you are on a Windows platform, you need to put
>> STDOUT into binary mode so that the PNG file does not go
>> through Window's carriage return/linefeed transformations.
>> Before the final print statement, put the statement
>> binmode(STDOUT). This advice also applies to certain older
>> versions of RedHat, which ship with a patched (and possibly
>> broken) version of Perl.

(BioPerl devs - couldn't that be added to the default
render_blast1.pl script with an if statement checking for
Windows?)
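
A minimal sketch of such a check, assuming the platform names 'MSWin32' and 'cygwin' in $^O are the ones that matter here. needs_binmode is a hypothetical helper name and not something in render_blast1.pl.

```perl
use strict;
use warnings;

# True when the platform name (as reported in $^O) suggests that output
# may go through CR/LF translation. The list of names is an assumption.
sub needs_binmode {
    my ($os) = @_;
    return $os =~ /^(?:MSWin32|cygwin)$/ ? 1 : 0;
}

# Put STDOUT into binary mode before any image data is printed.
binmode(STDOUT) if needs_binmode($^O);

print "binary mode requested: ", ( needs_binmode($^O) ? "yes" : "no" ), "\n";
```

Since binmode does no harm on Unix-like systems, calling it unconditionally before the final print would be a simpler alternative.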

Peter


From biopython at maubp.freeserve.co.uk  Tue Nov  3 16:04:59 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Nov 2009 16:04:59 +0000
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
	<a54400840911030755s725229f7ib679d67932535753@mail.gmail.com>
Message-ID: <320fb6e00911030804r62e50da6w373bbb61e9823f28@mail.gmail.com>

Mailing list CC'd - solved :)

On Tue, Nov 3, 2009 at 3:55 PM, Kie Kyon Huang <kiekyon.huang at gmail.com> wrote:
>
> ok, that fixed it
> I sometimes forget what platform I am on.
> thanks

Great.

Peter


From amackey at virginia.edu  Tue Nov  3 17:09:00 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Tue, 3 Nov 2009 12:09:00 -0500
Subject: [Bioperl-l] svn errors?
Message-ID: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>

[ajm6q at lc4 bioperl-live]$ svn update
svn: Decompression of svndiff data failed


I'll admit to not having run svn update in a while; a clean, anonymous svn co
failed with the same message:

[...]
A    bioperl-live/Bio/Structure/StructureI.pm
A    bioperl-live/Bio/Structure/IO
svn: Decompression of svndiff data failed

-Aaron

P.S. I used this command:
svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live


From cjfields at illinois.edu  Tue Nov  3 17:17:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:17:10 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <8C5FC42D-F957-45AC-9AAC-876ACC9D77E0@illinois.edu>

Aaron,

Yep, this was reported to support (a couple of users on #bioperl  
reported the same problem).  Chris D. is looking into it.

I'm wondering if it's worth setting up a second mirror to github for  
this purpose.

chris

On Nov 3, 2009, at 11:09 AM, Aaron Mackey wrote:

> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
>
>
> I'll admit to not having svn updated in awhile; A clean, anonymous  
> svn co
> failed with the same message:
>
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
>
> -Aaron
>
> P.S. I used this command:
> svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Tue Nov  3 17:19:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 11:19:56 -0600
Subject: [Bioperl-l] render_blast problem
In-Reply-To: <320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
References: <a54400840911030714q15c0a601t6882beaff68a45f5@mail.gmail.com>
	<320fb6e00911030745s68331ef7n729505f460863e21@mail.gmail.com>
Message-ID: <8336341C-C7B4-4740-A7C3-E2DE5FDAF651@illinois.edu>


On Nov 3, 2009, at 9:45 AM, Peter wrote:

> ...
> Did you do this step?:
>>> Important!  If you are on a Windows platform, you need to put
>>> STDOUT into binary mode so that the PNG file does not go
>>> through Window's carriage return/linefeed transformations.
>>> Before the final print statement, put the statement
>>> binmode(STDOUT). This advice also applies to certain older
>>> versions of RedHat, which ship with a patched (and possibly
>>> broken) version of Perl.
>
> (BioPerl devs - couldn't that be added to the default
> render_blast1.pl script with an if statement checking for
> Windows?)
>
> Peter

Yes, that should be added.  I'll work on it.

chris


From mauricio at open-bio.org  Tue Nov  3 17:20:52 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Tue, 03 Nov 2009 11:20:52 -0600
Subject: [Bioperl-l] svn errors?
In-Reply-To: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
References: <24c96eca0911030909p7cfbf858h4de5a345cf8a0782@mail.gmail.com>
Message-ID: <4AF06674.30506@open-bio.org>

Hi Aaron,

This was reported a few days ago. Chris Dagdigian is working today on a 
fix for it.

Mauricio.

Aaron Mackey wrote:
> [ajm6q at lc4 bioperl-live]$ svn update
> svn: Decompression of svndiff data failed
> 
> 
> I'll admit to not having svn updated in awhile; A clean, anonymous svn co
> failed with the same message:
> 
> [...]
> A    bioperl-live/Bio/Structure/StructureI.pm
> A    bioperl-live/Bio/Structure/IO
> svn: Decompression of svndiff data failed
> 
> -Aaron
> 
> P.S. I used this command:
> svn co svn://code.open-bio.org/bioperl/bioperl-live/trunk bioperl-live
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From rachitasharma at gmail.com  Tue Nov  3 22:12:11 2009
From: rachitasharma at gmail.com (Rachita Sharma)
Date: Tue, 3 Nov 2009 14:12:11 -0800
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>

I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = Bio::SearchIO->new(
    -format => 'blast',
    -file   => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt",
);

while ( my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit ) {
        $sth->execute( $result->query_name, $hit->name, $hit->significance );
        print "Query executed!\n";
    }
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


From cjfields at illinois.edu  Wed Nov  4 03:42:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 3 Nov 2009 21:42:55 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
References: <48f9c0d0911031412v26935097ib06d13c2266cfd8a@mail.gmail.com>
Message-ID: <DD8E7843-7181-45AD-95B1-FD877D0A5D4E@illinois.edu>

Rachita,

You'll have to give us more to go on than this.  The best thing to do  
is file a bug report and attach an example PSI-BLAST report and code  
that causes the problem.  The $sth->execute(...) is a bit odd, but  
that shouldn't cause the error in question.

Also, make sure to stipulate the OS, version of BioPerl, and perl  
version.
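
A small sketch for collecting those details before filing the report. It assumes the BioPerl release number is available as $Bio::Root::Version::VERSION and reports "not found" when the module cannot be loaded; the %env hash is only an illustrative name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Details worth quoting in a bug report.
my %env = (
    os      => $^O,                    # operating system name as perl sees it
    perl    => sprintf( '%vd', $^V ),  # perl version, e.g. 5.8.8
    bioperl => 'not found',
);

# Load Bio::Root::Version lazily so the script still runs
# on a machine where BioPerl is not installed.
if ( eval { require Bio::Root::Version; 1 } ) {
    my $v = $Bio::Root::Version::VERSION;
    $env{bioperl} = defined $v ? $v : 'unknown';
}

printf "%-8s %s\n", $_, $env{$_} for sort keys %env;
```

Pasting the three lines it prints into the report, together with the PSI-BLAST output file, should cover what is asked for above.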

chris



From alexl at users.sourceforge.net  Wed Nov  4 07:30:21 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Wed, 04 Nov 2009 02:30:21 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
Message-ID: <msd43yycfm.fsf@allele2.localdomain>

Does the version of ExtUtils::Manifest really need to be strictly
greater than or equal to 1.52?

Currently this blocks me updating the Fedora package of BioPerl to
1.6.1, because the version of perl that Fedora ships is on 1.51 and
hence the build fails with:

Checking prerequisites...
 - ERROR: ExtUtils::Manifest (1.51_01) is installed, but we need version >= 1.52

Full logs are here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1787483
http://koji.fedoraproject.org/koji/getfile?taskID=1787483&name=build.log

This is true even with the version of Perl in rawhide/F-12 etc.
(ExtUtils::Manifest is in the base perl package).

If it really is necessary, I would like to be armed with a good argument
why it needs to be updated, since the Perl package maintainer would have
to update the entire Perl package simply to get a more recent version of
one small subpackage.

Regards,
Alex


From jluis.lavin at unavarra.es  Wed Nov  4 08:43:35 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 09:43:35 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in a
 single list query
Message-ID: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>


Hello all,

I'm a newbie who is having terrible trouble trying to retrieve a list of
multiple sequences from the NCBI and write them to a single file in Fasta
format.
The code I've written seems to read mylist and retrieve the sequences, but
it kinda overwrites them so that I only get the last sequence on the list.
I've been told to ask the people on this mailing list for help, since you
may have come across this problem also or at least will know how to solve
it...

Here is my code, which basically consists of an STDIN prompt for the list to be
read into an array and a loop that reads each ID (stopping when the
list ends) and retrieves a sequence each time the loop runs,
writing that sequence to a fasta file. I only get one sequence back
although it seems to perform the retrieval for each of the
sequences on the list...


#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
print "Enter your list name:";
my $archivo=<STDIN>;
chomp $archivo;
die ("Can't open input\n") unless (open(INFILE, $archivo));
my @lista = <INFILE>;
foreach my $seq (@lista) {
    if ($seq eq '') {
        die ("empty list")
        }
    else {
my $db = new Bio::DB::GenPept("-format" => "Fasta");
my $seqobj = $db->get_Seq_by_acc($seq);
my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;


An example list of sequences can be this one:

YP_003107578.1
YP_003106103.1
YP_003106552.1
YP_003106560.1
YP_003107053.1
YP_003107450.1
YP_003108000.1
YP_003105023.1
YP_003105264.1

Thanks in advance for your help ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From e.osimo at gmail.com  Wed Nov  4 09:54:52 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Wed, 4 Nov 2009 10:54:52 +0100
Subject: [Bioperl-l] Bio::Graphics and picture format
Message-ID: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>

Hello everyone,
do you know if it is possible to generate an image with Bio::Graphics in a
vector format? Is there a list of available formats?
Thanks
Emanuele


From David.Messina at sbc.su.se  Wed Nov  4 09:52:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 10:52:53 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>

>
> The code I've written seems to read mylist and retrieve the sequences, but
> it kinda overwrites them so that I only get the last sequence on the list.
>

With this line

my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta", -format =>
'fasta');


you are opening the filehandle for the output file inside your loop, so each
time it is writing over the previous file with an empty file. Then, you
write a single sequence to that file with this line

$out->write_seq($seqobj);


So when you are done, you just have the last sequence in the output file.

If you move the opening of the output filehandle outside the loop (it needs
to be done only once), then it should work as you expect.

Also, I notice the newline characters are not being removed from your
sequence IDs  (actually I'm a little surprised that the sequences are being
retrieved). Just to be safe, you may want to add the line

chomp @lista;


after

my @lista = <INFILE>;
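
Putting both suggestions together, a corrected script might look roughly like the sketch below. It keeps the file and variable names from the original post; the read_ids and fetch_to_fasta helpers and the command-line argument are additions for illustration only, and the retrieval step needs BioPerl plus network access to NCBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one accession per line, stripping newlines and skipping blank lines.
sub read_ids {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Can't open $path: $!\n";
    my @ids;
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;    # removes the newline and stray blanks
        push @ids, $line if length $line;
    }
    close $fh;
    return @ids;
}

# Fetch every accession and write all of them through ONE output stream.
sub fetch_to_fasta {
    my ( $ids, $outfile ) = @_;
    require Bio::DB::GenPept;
    require Bio::SeqIO;

    my $db = Bio::DB::GenPept->new( -format => 'Fasta' );

    # Opened once, before the loop, so earlier sequences are not overwritten.
    my $out = Bio::SeqIO->new( -file => ">$outfile", -format => 'fasta' );

    foreach my $acc (@$ids) {
        my $seqobj = $db->get_Seq_by_acc($acc);
        if   ($seqobj) { $out->write_seq($seqobj) }
        else           { warn "No sequence returned for $acc\n" }
    }
    $out->close;
    return;
}

# Runs only when a list file is given on the command line.
if (@ARGV) {
    my @lista = read_ids( $ARGV[0] );
    die "empty list\n" unless @lista;
    fetch_to_fasta( \@lista, 'extracted_seqs.fasta' );
}
```

Checking for an empty list once, before the loop, replaces the per-line test in the original, which could never fire because every line still carried its newline.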


Dave


From jluis.lavin at unavarra.es  Wed Nov  4 10:14:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:14:40 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
Message-ID: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>

Thank you very very much Dave,
I've had a really frustrating time trying to find out what I was doing
wrong; it has been so frustrating that I was about to quit BioPerl.
Now I can try to focus on BLAST parsing for my comparative genomic analysis.

You're great on this mailing list, because you give fast and neat advice
on all the questions asked here by newbies like me ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From hrh at fmi.ch  Wed Nov  4 10:05:17 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Wed, 04 Nov 2009 11:05:17 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 a single list query
In-Reply-To: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
Message-ID: <C717106D.54F2%hrh@fmi.ch>

Hi

try

my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
                                     ^

this way you no longer overwrite your existing file, but append the next
sequence.
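
The difference between the two modes can be seen with plain Perl filehandles, which is (as far as I know) what the leading ">" or ">>" in the -file argument ends up selecting. A minimal sketch; the write_in_loop helper and the temporary files are only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Re-open the file for every record, as the original loop did, using the
# given mode: '>' truncates on each open, '>>' appends to what is there.
sub write_in_loop {
    my ( $mode, $path, @records ) = @_;
    for my $rec (@records) {
        open( my $fh, $mode, $path ) or die "Can't open $path: $!\n";
        print {$fh} "$rec\n";
        close $fh;
    }
    open( my $in, '<', $path ) or die "Can't read $path: $!\n";
    chomp( my @lines = <$in> );
    close $in;
    return @lines;
}

my $dir = tempdir( CLEANUP => 1 );
my @ids = qw(YP_003107578.1 YP_003106103.1 YP_003106552.1);

my @truncated = write_in_loop( '>',  "$dir/overwrite.txt", @ids );
my @appended  = write_in_loop( '>>', "$dir/append.txt",    @ids );

print "'>'  kept: @truncated\n";    # only the last record survives
print "'>>' kept: @appended\n";     # all three records, in order
```

One caveat with ">>": running the script a second time adds the same sequences to the file again, so the old output file has to be deleted between runs.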

Regards, Hans




From jluis.lavin at unavarra.es  Wed Nov  4 10:25:38 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 4 Nov 2009 11:25:38 +0100 (CET)
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
 asingle list query
In-Reply-To: <C717106D.54F2%hrh@fmi.ch>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<C717106D.54F2%hrh@fmi.ch>
Message-ID: <1834.130.206.164.153.1257330338.squirrel@webmail.unavarra.es>

Thank you very much for your answer Hans!!!
It works perfectly, and it is also a neat and fast solution, like Dave's.

Blessings to you all ;)



-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From scott at scottcain.net  Wed Nov  4 13:26:02 2009
From: scott at scottcain.net (Scott Cain)
Date: Wed, 4 Nov 2009 08:26:02 -0500
Subject: [Bioperl-l] Bio::Graphics and picture format
In-Reply-To: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
References: <2ac05d0f0911040154h4eed4a1j8108f78e6e4761f3@mail.gmail.com>
Message-ID: <0FB17FBC-16BE-4A9F-AC75-983D3B4ECE7D@scottcain.net>

Hi Emanuele,

It is possible to use GD::SVG instead of GD to generate SVG graphics.   
To use it, you provide an argument of "-image_class  GD::SVG" to the  
constructor of Bio::Graphics::Panel.  See the perldoc of  
Bio::Graphics::Panel for more info.
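
A rough sketch of what that looks like in practice, assuming Bio::Graphics and GD::SVG are both installed. The feature coordinates, panel size, and the render_svg helper name are made up for illustration; the parts taken from the Panel documentation are the -image_class argument and the svg method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Draw a single feature on a panel and return the image as SVG text.
# Needs Bio::Graphics::Panel, Bio::SeqFeature::Generic and GD::SVG.
sub render_svg {
    require Bio::Graphics::Panel;
    require Bio::SeqFeature::Generic;

    # Illustrative feature; real code would use features from a sequence.
    my $feature = Bio::SeqFeature::Generic->new(
        -start        => 100,
        -end          => 400,
        -display_name => 'example',
    );

    my $panel = Bio::Graphics::Panel->new(
        -length      => 500,
        -width       => 600,
        -image_class => 'GD::SVG',    # vector output instead of a bitmap
    );
    $panel->add_track( $feature, -glyph => 'generic', -label => 1 );

    return $panel->svg;               # SVG document as a string
}

# Runs only when an output file name is given on the command line.
if (@ARGV) {
    open( my $out, '>', $ARGV[0] ) or die "Can't write $ARGV[0]: $!\n";
    print {$out} render_svg();
    close $out;
}
```

Without the -image_class argument the same panel would be drawn with GD, and the image would be written out with the png method instead.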

Scott


On Nov 4, 2009, at 4:54 AM, Emanuele Osimo wrote:

> Hello everyone,
> do you know if it is possible to generate an image with  
> Bio::Graphics in a
> vector format? Is there a list of available formats?
> Thanks
> Emanuele

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From b3sn7 at UNB.ca  Tue Nov  3 17:30:24 2009
From: b3sn7 at UNB.ca (Sharma, Rachita)
Date: Tue,  3 Nov 2009 13:30:24 -0400
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
Message-ID: <1257269424.4af068b045434@webmail.unb.ca>


I am having trouble parsing PSI-BLAST results. Please help.

The code is:
my $in = new Bio::SearchIO(	-format => 'blast',
				-file => "BS_XFpsiRblastoutputs/e${ev}/bloutput${i}.txt");


while( my $result = $in->next_result ) {
while( my $hit = $result->next_hit ) {

$sth->execute($result->query_name, $hit->name, $hit->significance);
print "Query executed!\n";  

}
}

The error is:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: no data for midline  ***** No hits found ******
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:359
STACK: Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO/blast.pm:1813
STACK: BSubVCpsiRblast.pl:92
-----------------------------------------------------------


*******************************
Rachita Sharma
Research Assistant (PhD Student)
University of New Brunswick, NB, CANADA
email: Rachita.Sharma at unb.ca
Phone no: 503-895-3619
*******************************


From cjfields at illinois.edu  Wed Nov  4 13:53:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:53:35 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <msd43yycfm.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
Message-ID: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>

Alex,

Not sure why ExtUtils::Manifest can't be bundled as a separate perl  
package alone.  It is part of perl core but it's also available on  
CPAN separately from perl itself:

http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

The commit that introduced the requirement is linked below, BTW.  It allows
spaces in file names in the MANIFEST.  v1.52 is a bug fix and is required.

http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
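
To see what a given perl actually provides, a quick check along these lines may help. It is a sketch: the 1.52 threshold is the one discussed in this thread, and the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use ExtUtils::Manifest ();

my $required  = '1.52';
my $installed = ExtUtils::Manifest->VERSION;

# Calling VERSION with an argument dies when the installed module is
# older than the requested version, so eval turns that into a flag.
my $new_enough = eval { ExtUtils::Manifest->VERSION($required); 1 } ? 1 : 0;

print "ExtUtils::Manifest $installed is installed; it ",
    ( $new_enough ? 'meets' : 'does NOT meet' ),
    " the >= $required requirement\n";
```

If the check fails, installing the standalone ExtUtils-Manifest distribution from CPAN updates just this module without touching the rest of perl.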

chris



From cjfields at illinois.edu  Wed Nov  4 13:55:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 4 Nov 2009 07:55:34 -0600
Subject: [Bioperl-l] Trouble parsing PSI-BLAST
In-Reply-To: <1257269424.4af068b045434@webmail.unb.ca>
References: <1257269424.4af068b045434@webmail.unb.ca>
Message-ID: <70E34111-4E70-463D-86EE-06926EA57073@illinois.edu>

Rachita,

Asked and answered yesterday.  Please submit as a bug.

chris



From David.Messina at sbc.su.se  Wed Nov  4 14:11:43 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 4 Nov 2009 15:11:43 +0100
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI in
	a single list query
In-Reply-To: <1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es> 
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com> 
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
Message-ID: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>

Aw shucks, José, glad I could be of help. There are plenty of people who
answer questions around here, but my timezone sometimes gives me an
advantage for the European ones. :)


Dave


From daniel.gaston at gmail.com  Wed Nov  4 14:45:04 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 10:45:04 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040645j1b28e727p5d7bf47a04db160b@mail.gmail.com>

Hi Everyone,

I have recently been playing around with SwissProt format flatfiles and want
to extract sequences based on subcellular localization. I notice in going
through the code for swiss.pm and swissdriver.pm that in both (more so in
swissdriver.pm) there are several steps where organelle information based on
the OG line could be extracted and added to the data structure but isn't. It
seems that in both cases the OG line is lumped in with the data from the
OC, OS, and OX lines in order to extract species names and taxonomy
information, and everything else is discarded. Is there
a particular reason for this or is it just a simple oversight? On the surface at
least it looks like a relatively simple modification to make, although I
admit that I am not terribly adept at manipulating these SeqIO
data structures.

Thanks for your time,

Dan


From daniel.gaston at gmail.com  Wed Nov  4 17:12:10 2009
From: daniel.gaston at gmail.com (Daniel Gaston)
Date: Wed, 4 Nov 2009 13:12:10 -0400
Subject: [Bioperl-l] SwissProt and Subcellular localization information
Message-ID: <50c615ba0911040912pfd2483fwe44cd098beed73c7@mail.gmail.com>

Sorry folks, it appears I was just being a bonehead and didn't look closely
enough into the Bio::Annotation and Bio::Species objects that store all of this
data.
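
For anyone searching the archive later, here is a rough sketch of where to look. It assumes the OG line ends up in the organelle slot of the Bio::Species object and that the CC lines are stored as 'comment' annotations; I have not checked this against every BioPerl version, and the helper names are mine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if a comment looks like a SwissProt
# "SUBCELLULAR LOCATION" topic.
sub mentions_location {
    my ($text) = @_;
    return ( defined($text) && $text =~ /SUBCELLULAR LOCATION/i ) ? 1 : 0;
}

# Print the organelle and any subcellular-location comments per entry.
sub describe_entries {
    my ($file) = @_;
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new( -file => $file, -format => 'swiss' );

    while ( my $seq = $in->next_seq ) {
        my $species   = $seq->species;
        my $organelle = $species ? $species->organelle : undef;
        print $seq->display_id, "\torganelle: ",
            ( defined $organelle ? $organelle : '-' ), "\n";

        for my $comment ( $seq->annotation->get_Annotations('comment') ) {
            # Some versions may keep the topic in a separate 'type' field.
            my $type = $comment->can('type') ? $comment->type : '';
            my $text = join ' ', grep { defined } $type, $comment->text;
            print "\t$text\n" if mentions_location($text);
        }
    }
    return;
}

# Runs only when a SwissProt flatfile is given on the command line.
describe_entries( $ARGV[0] ) if @ARGV;
```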

Dan



From jluis.lavin at unavarra.es  Thu Nov  5 15:28:23 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:28:23 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
Message-ID: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>


Hello to all,

I'm trying to write a script to retrieve a list of sequences from a local
FASTA file (for example, a FASTA archive where all the protein models of an
organism are stored). I would use this file as a kind of "local
database" (sorry if I mix up a few concepts...)
I've been reading the BioPerl HOWTOs and I came across the
Bio::Index::Fasta tool.
If I didn't misunderstand what I read (which is quite possible given my low
level of programming), this indexing tool should do the job.
I wrote a couple of scripts based on the documentation I read about this
tool, but I don't seem to be able to create the index file to be used
later (to retrieve the sequences from).
-First of all, I want to ask the people in this forum if
Bio::Index::Fasta is the right one to choose for this task.
-Then I'll beg you to take a look at my scripts, because I can't seem to
catch the bug...

Best wishes to you all and thanks in advance ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 15:39:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:39:05 -0500
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <A28922858F64480ABD8A6696E269023C@NewLife>

José -- It looks like this is a good solution to your problem. Please send your
script so we can look at it-
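
For reference, the usual pattern with Bio::Index::Fasta is roughly the sketch below: build the index once, then open it read-only and fetch by ID. The helper names, the ".idx" suffix, and the three command-line arguments are illustrative choices, not anything the module requires.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Read one ID per line, stripping newlines and skipping blank lines.
sub read_id_list {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Can't open $path: $!\n";
    my @ids;
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;
        push @ids, $line if length $line;
    }
    close $fh;
    return @ids;
}

# Step 1: create (or update) the index for one or more FASTA files.
sub build_index {
    my ( $index_file, @fasta_files ) = @_;
    require Bio::Index::Fasta;
    my $inx = Bio::Index::Fasta->new(
        -filename   => $index_file,
        -write_flag => 1,
    );
    # Absolute paths keep the index usable from other directories.
    $inx->make_index( map { File::Spec->rel2abs($_) } @fasta_files );
    return;
}

# Step 2: look each ID up in the index and write the hits to one file.
sub extract_ids {
    my ( $index_file, $ids, $outfile ) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;
    my $inx = Bio::Index::Fasta->new( -filename => $index_file );
    my $out = Bio::SeqIO->new( -file => ">$outfile", -format => 'fasta' );
    for my $id (@$ids) {
        my $seq = $inx->fetch($id);
        if   ($seq) { $out->write_seq($seq) }
        else        { warn "$id not found in index\n" }
    }
    return;
}

# Usage: script.pl genome.fasta ids.txt output.fasta
if ( @ARGV == 3 ) {
    my ( $fasta, $list, $outfile ) = @ARGV;
    build_index( "$fasta.idx", $fasta );
    extract_ids( "$fasta.idx", [ read_id_list($list) ], $outfile );
}
```
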
cheers Mark


From jluis.lavin at unavarra.es  Thu Nov  5 15:46:36 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 16:46:36 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct
 use]
Message-ID: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>


---------------------------- Mensaje original ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Fecha:   Jue, 5 de Noviembre de 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts: the first one creates the index and
the second one retrieves the sequence list from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2)And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not a total scum...

Thanks in advance ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN




From maj at fortinbras.us  Thu Nov  5 15:37:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 10:37:53 -0500
Subject: [Bioperl-l] Trouble retrieving multiple sequences from NCBI ina
	single list query
In-Reply-To: <628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
References: <1386.130.206.164.153.1257324215.squirrel@webmail.unavarra.es>
	<628aabb70911040152r19ed79dfnbc54f346295d28a8@mail.gmail.com>
	<1791.130.206.164.153.1257329680.squirrel@webmail.unavarra.es>
	<628aabb70911040611q56b441c8o6888f326d0b314d@mail.gmail.com>
Message-ID: <49075FDFF6764EE48E932D95EB994221@NewLife>

True, Dave, you compete only with crazed east coast core developers who're doing 
"just one more thing" at 2am....
----- Original Message ----- 
From: "Dave Messina" <David.Messina at sbc.su.se>
To: <jluis.lavin at unavarra.es>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 04, 2009 9:11 AM
Subject: Re: [Bioperl-l] Trouble retrieving multiple sequences from NCBI ina 
single list query


> Aw shucks, José, glad I could be of help. There are plenty of people who
> answer questions around here, but my timezone sometimes gives me an
> advantage for the European ones. :)
>
>
> Dave
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l 


From hrh at fmi.ch  Thu Nov  5 16:02:48 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 05 Nov 2009 17:02:48 +0100
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
Message-ID: <C718B5B8.5561%hrh@fmi.ch>


Jluis

> -Then I'll beg you to take a look at my scripts, because I don't seem to
> catch the bug...

you haven't attached/included any scripts, have you?


Anyway, have you considered using BLAST indices (created with the additional
flag "-o") together with the tool 'fastacmd' (which is also included in the
NCBI blast binaries) as a simple (and very fast) alternative for fetching
sequences?


Regards, Hans


From maj at fortinbras.us  Thu Nov  5 16:02:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:02:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
Message-ID: <1984ED07F36C446284B25F617964B6C6@NewLife>

Hey José,
The first thing that jumps out is the index file name. Looks
like you create it as
PC9.fasta.idx
But you read it as
PC9.fasta
Not an unusual mistake. Do
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and see if it works.
MAJ
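
Besides the file name, two input-handling details in the scripts quoted
below may also be worth checking (this is an untested reading, not something
confirmed in this thread): a name read from <STDIN> keeps its trailing
newline unless it is stripped, and <lCS.txt> appears to be parsed by Perl as
a file glob rather than as a read from a filehandle, so the loop would see
the file name instead of the IDs. A minimal pure-Perl sketch of cleaning the
typed name and reading the ID list follows; clean_name and read_ids are
hypothetical helper names and the IDs are made up. The resulting values
would then be handed to Bio::Index::Fasta as in the scripts.

```perl
use strict;
use warnings;

# Hypothetical helper: strip surrounding whitespace (including the
# newline left by <STDIN>) from a name typed at a prompt.
sub clean_name {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;
    return $name;
}

# Hypothetical helper: read one ID per line from an open filehandle,
# dropping line endings (LF or CRLF) and skipping blank lines.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;
        next if $line eq '';
        push @ids, $line;
    }
    return @ids;
}

# Demo with an in-memory list standing in for the ID file (made-up IDs).
my $list = "jgi|1234\n\njgi|5678\r\n";
open my $fh, '<', \$list or die "Cannot open in-memory list: $!";
print "$_\n" for read_ids($fh);
close $fh;
```

With a real list file, the handle would come from open(my $fh, '<', $path)
instead of the in-memory string.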
----- Original Message ----- 
From: <jluis.lavin at unavarra.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, November 05, 2009 10:46 AM
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its correct 
use]


---------------------------- Original Message ----------------------------
Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct use
From:    jluis.lavin at unavarra.es
Date:    Thu, 5 November 2009, 16:46
To:      "Mark A. Jensen" <maj at fortinbras.us>
--------------------------------------------------------------------------

Hi Mark,

I've actually got two scripts, the first one is to create the index and
the second one is to retrieve the sequence list from the indexed file.

1)Here is the Index creation script:

#!/c:/Perl -w
use strict;
use Bio::Index::Fasta;
use strict;

print "Enter file for indexing: \n";
my $Index_File_Name = <STDIN>;
my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
    -write_flag => 1);
$inx->make_index(my $File_Name);

2)And here is the sequence retrieval script:

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new($Index_File_Name);
#LCS.txt is my sequences list
@ARGV = <lCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}

I hope this code is not a total scum...

Thanks in advance ;)




From jluis.lavin at unavarra.es  Thu Nov  5 16:21:57 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 17:21:57 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
 correct use]
In-Reply-To: <1984ED07F36C446284B25F617964B6C6@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
Message-ID: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>

Thank you very much Mark, that's a good point :$
I guess your correction refers to the second script, doesn't it?

If so, there is still a problem with the first script: it doesn't
create the PC9.fasta.idx file, instead it creates two files named:
-PC9.fasta.idx.pag
-PC9.fasta.idx.dir

which seem to be clearly related to some kind of indexing process...but,
unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
find it anywhere...
Forgive me if I'm talking nonsense...

Thank you very much again for your help ;)




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Thu Nov  5 16:39:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 5 Nov 2009 11:39:09 -0500
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
	correct use]
In-Reply-To: <2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es>
	<1984ED07F36C446284B25F617964B6C6@NewLife>
	<2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
Message-ID: <A1ACC4B552514872B77208248B31977C@NewLife>

Yes, these are files created by the SDBM, Perl's internal db manager. You should
be able to open the index by simply
$inx = Bio::Index::Fasta->new('PC9.fasta.idx');
and the dbm will know what to do--
cheers MAJ
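
To illustrate the point about the two files, here is a small sketch that
uses the core SDBM_File module directly (not BioPerl). The index name is
borrowed from this thread only as an example, and the stored key and value
are made up. On most platforms, tying a hash to a base name creates
base.dir and base.pag rather than a single file called base:

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use SDBM_File;                  # core DBM implementation
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory so nothing is left behind.
my $dir  = tempdir( CLEANUP => 1 );
my $base = File::Spec->catfile( $dir, 'PC9.fasta.idx' );

# Tie a hash to the base name; SDBM adds its own file extensions.
tie( my %index, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666 )
    or die "Cannot tie $base: $!";
$index{example_id} = 'made-up value';
untie %index;

# Report which files exist for the base name.
for my $suffix ( '', '.dir', '.pag' ) {
    my $state = -e "$base$suffix" ? 'exists' : 'missing';
    print "PC9.fasta.idx$suffix: $state\n";
}
```

The same base name is what gets passed when reopening, which matches the
advice above to open 'PC9.fasta.idx' and let the dbm layer find its files.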


From jluis.lavin at unavarra.es  Thu Nov  5 17:48:12 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Thu, 5 Nov 2009 18:48:12 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <C718B5B8.5561%hrh@fmi.ch>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>
	<C718B5B8.5561%hrh@fmi.ch>
Message-ID: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>

Thanks a lot for your help Hans,
The awesome information you've just given me is a little too hard for me to
understand and turn into a script right now...I hope I can use it in the near
future anyway ;)
The issue here is that the sequences I'm indexing are not generated by the
NCBI nor stored there...although I believe you're just referring to the tool
itself and not to a retrieval from the NCBI.

Thanks again, you're all great at giving advice to newbies like me ;)

Best wishes to you all




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From florent.angly at gmail.com  Thu Nov  5 18:00:19 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 05 Nov 2009 10:00:19 -0800
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<C718B5B8.5561%hrh@fmi.ch>
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
Message-ID: <4AF312B3.9060009@gmail.com>

Hans-Rudolf was talking about a way to retrieve sequences from a BLAST 
database. If you use BLAST locally, then your database is local too.
More info here: 
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/formatdb_fastacmd.html
Florent




From valiente at lsi.upc.edu  Fri Nov  6 08:06:48 2009
From: valiente at lsi.upc.edu (valiente at lsi.upc.edu)
Date: Fri, 6 Nov 2009 09:06:48 +0100 (CET)
Subject: [Bioperl-l] Bio::SeqIO::genbank.pm
Message-ID: <45737.147.83.59.225.1257494808.squirrel@webmail.lsi.upc.edu>


There is a line in Bio::SeqIO::genbank.pm to convert data in classification
lines into a classification array by splitting only on ';' or '.' so that a
classification that is 2 or more words will still get matched,

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;\.]+/, $class_lines;

but this will break organism names that have a dot inside, such as
"Salmonella enterica subsp. enterica serovar Typhimurium", which is now
being broken into "Salmonella enterica subsp" and "enterica serovar
Typhimurium".

Changing [;\.] to [;] solves this issue,

my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?<!subgen)[;]+/, $class_lines;

Does anybody want to further test it before I commit this change?

Thanks,
Gabriel
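
For anyone who wants to try the two patterns outside of genbank.pm, here is
a small standalone sketch. The wrapper split_classification and the sample
classification string are assumptions made for illustration; only the
map/split expression is taken from the message above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted expression: split a classification
# string on the given separator class, then trim and squeeze whitespace.
sub split_classification {
    my ( $class_lines, $separators ) = @_;
    return map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_ }
        split /(?<!subgen)$separators+/, $class_lines;
}

# Made-up classification line containing a name with an internal dot.
my $lines = 'Bacteria; Proteobacteria; '
          . 'Salmonella enterica subsp. enterica serovar Typhimurium';

my @current  = split_classification( $lines, qr/[;\.]/ );   # existing pattern
my @proposed = split_classification( $lines, qr/[;]/ );     # proposed pattern

print 'split on [;\.]: ', join( ' | ', @current ),  "\n";
print 'split on [;]:   ', join( ' | ', @proposed ), "\n";
```

One thing this sketch does not cover: with [;] alone, a period at the very
end of the classification string would stay attached to the last element
unless it is stripped separately. Whether genbank.pm already handles that
elsewhere has not been checked here.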


From jluis.lavin at unavarra.es  Fri Nov  6 08:44:45 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Fri, 6 Nov 2009 09:44:45 +0100 (CET)
Subject: [Bioperl-l] A question about iBio::Index: and its correct use
In-Reply-To: <4AF312B3.9060009@gmail.com>
References: <2120.130.206.164.153.1257434903.squirrel@webmail.unavarra.es>	<
	C718B5B8.5561%hrh@fmi.ch> 
	<3313.130.206.164.153.1257443292.squirrel@webmail.unavarra.es>
	<4AF312B3.9060009@gmail.com>
Message-ID: <1222.130.206.164.153.1257497085.squirrel@webmail.unavarra.es>

Thank you for the info Florent!
I'll try to read all the information at the link you provided and try to
figure out how to make it work and whether it is worthwhile for me. I mean, I
work with several sequence files that come from multiple databases (JGI, BROAD,
Genolevures or NCBI). Protein IDs from each of those databases are
different from NCBI's. Maybe it could be easier to write a script that
allows me to enter a fasta file with all the protein models of a single
organism, parse it and then extract the sequences of a given list (using
the "ID style" of the particular database) than creating a BLAST index for
each organism I need to work with...Did I explain the issue correctly?
Anyway, since I don't know anything about this tool Hans and you pointed
me to, I can easily be wrong...
Thank you for showing me the local BLAST index tool, I'll read the
documentation carefully and study all its possibilities.

Best wishes

JL




-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From maj at fortinbras.us  Fri Nov  6 12:45:01 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 07:45:01 -0500
Subject: [Bioperl-l] Bioperl
In-Reply-To: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
References: <16842715.26316.1257510446095.JavaMail.root@durga.amrita.ac.in>
Message-ID: <AE7A03CA8F45495C9F8D940AC0EC6D69@NewLife>

Hi Resmi-
You should look at http://bioperl.org/ under "Installation" for 
information on getting and installing BioPerl. An introduction 
to working with trees in BioPerl is at this link:
http://www.bioperl.org/wiki/HOWTO:Trees
cheers, 
Mark

----- Original Message ----- 
  From: Resmi S. 
  To: maj at fortinbras.us 
  Sent: Friday, November 06, 2009 7:27 AM
  Subject: Bioperl


  Respected Sir,
  I am Resmi S, studying II MSc Bioinformatics. I am now doing my project on phylogenetic tree construction using BioPerl. I am not very familiar with the BioPerl modules, so could you please send me the names of the BioPerl modules needed for my project? I also need to know where I can get these modules. If they are on CPAN, please send me the location or link. I kindly request you to send me the details soon.

  Yours Sincerely,
     Resmi S,
     II MSc Bioinformatics,
     School of Biotechnology,
     Amrita Vishwa Vidyapeetham,
      Email : amm08bi019 at students.amrita.ac.in




From robert.bradbury at gmail.com  Fri Nov  6 17:35:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 6 Nov 2009 12:35:22 -0500
Subject: [Bioperl-l] Function that determines serious mutations
Message-ID: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>

Is there a function in the library (or has someone written one) that can
take a genbank entry and determine which mutations are harmful?

It would be used to produce a table summary of:
  GENE          # SNP      # BadSNP

One kind of gets this from NCBI if you look up a gene name in the "GENE" db
and then go to the "GeneView" in dbSNP page; it has the information I want
but largely in a graphical format, while I simply want numbers I can dump
into a spreadsheet.

I don't think it would be hard, fetch the gene, run through the features for
the SNP database, figure out whether they are good or bad SNPs, accumulate
the statistics and dump it.  I think the functions available are flexible
enough to do it but I can't believe nobody has already done it.  It could be
a bit more complex in that one could do an analysis to see if the mutations
are in a conserved domain or mutations that code for Cysteine or Methionine
(or other potentially "critical" amino acids) but since "critical" is in the
eye of the beholder there would have to be some kind of callback to a
scoring function.

Thanks,
Robert


From nevoband at igb.uiuc.edu  Fri Nov  6 20:58:05 2009
From: nevoband at igb.uiuc.edu (kleenix)
Date: Fri, 6 Nov 2009 12:58:05 -0800 (PST)
Subject: [Bioperl-l]  StandAloneBlast Unallowed parameter
Message-ID: <26230896.post@talk.nabble.com>


I'm not sure if I'm doing this wrong. I am trying to use the -m parameter in
blastall using the StandAloneBlast BioPerl class.
When I add 'm'=>0 to @params I get an "Unallowed parameter:" error.
Am I adding the parameter wrong? I'm using StandAloneBlast version 1.51

Thanks

-Nevo


From veronica.xiaoyu at gmail.com  Fri Nov  6 22:25:04 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 6 Nov 2009 17:25:04 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change the
	description's name of each hit?
Message-ID: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>

Hi,

I'm using Bio::SearchIO::Writer::HTMLResultWriter to help me convert a BLAST
output file into HTML.

Does anybody know how to parse and change the description of each hit?

Calling $hit->description returns a hit's description, but it does not appear
to allow the description to be modified.

Thank you very much,
Xiaoyu


From maj at fortinbras.us  Sat Nov  7 00:40:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 6 Nov 2009 19:40:17 -0500
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change
	thedescription's name of each hit?
In-Reply-To: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
References: <a6a926b50911061425m67e47a3bx1853f21e1de5bd95@mail.gmail.com>
Message-ID: <11592B31D9924FA7A8638D90AE4A3F4A@NewLife>

Xiaoyu-
That method should work to change the description; are you doing

$hit->description('This is my new description');

This method returns the old description when you change the value:

$hit->description('old');
$str = $hit->description('new'); # $str eq 'old'
$str = $hit->description;            # $str eq 'new'

MAJ
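
The get/set behaviour described above can be sketched with a toy class.
ToyHit is a hypothetical stand-in written for this note, not BioPerl's hit
class; it only mimics the pattern of a combined accessor that returns the
previous value when a new one is supplied.

```perl
use strict;
use warnings;

package ToyHit;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ( $class, %args ) = @_;
    return bless { description => $args{description} }, $class;
}

# Combined getter/setter: always returns the value held before the call,
# and stores the argument if one was given.
sub description {
    my ( $self, $new ) = @_;
    my $previous = $self->{description};
    $self->{description} = $new if defined $new;
    return $previous;
}

package main;

my $hit      = ToyHit->new( description => 'old' );
my $returned = $hit->description('new');    # value before the change
my $current  = $hit->description;           # value after the change

print "returned by setter: $returned\n";
print "current value:      $current\n";
```

The practical consequence is that the return value of a setting call should
not be taken as confirmation of the new value; a second call without
arguments shows what is actually stored.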

----- Original Message ----- 
From: "Xiaoyu Liang" <veronica.xiaoyu at gmail.com>
To: <Bioperl-l at lists.open-bio.org>
Sent: Friday, November 06, 2009 5:25 PM
Subject: [Bioperl-l] Parsing BLAST out file to HTML. How to change 
thedescription's name of each hit?


> Hi,
>
> I'm using Bio::SearchIO::Writer::HTMLResultWriter to convert a BLAST output
> file into HTML.
>
> Does anybody know how to parse and change the description of each hit?
>
> Calling $hit->description returns a hit's description, but it does not seem
> to allow the description to be modified.
>
> Thank you very much,
> Xiaoyu


From Daniel.Lang at biologie.uni-freiburg.de  Sun Nov  8 14:50:48 2009
From: Daniel.Lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Sun, 08 Nov 2009 15:50:48 +0100
Subject: [Bioperl-l] arguments to call back functions in GBrowse2
Message-ID: <4AF6DAC8.8070204@biologie.uni-freiburg.de>

Hi Lincoln,

a while back (May 29, 2009; 09:08pm) you replied to an even older thread
("Re: Access the parent of a Bio::DB::SeqFeature within a gbrowse config
callback function").

I missed your reply and did not follow it up back then, sorry!

I'm currently facing the same issue again with gbrowse2. I have a
callback function for "balloon click". Following your last reply I
expected 5 arguments, but I am getting only three: $feature,$panel,$track.

In principle, I am using the latest releases/checkouts...
Which modules do I need to look at/update for this functionality?

Furthermore, is there a possibility to share global variables between
gbrowse2 and slaves? Should this work via init_code?
Should modules initialized in a conf be in the scope of a slave?

If not, can I introduce modules via the slave config files, or do I need
to alter the slave scripts?


Thanks, again!

Cheers,
Daniel


PS: gbrowse2 rocks!


From maj at fortinbras.us  Sun Nov  8 16:09:43 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:09:43 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
Message-ID: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>

Hi All- 
Any plans in the works for a _possibly_fastq sequence guesser?
MAJ


From maj at fortinbras.us  Sun Nov  8 16:20:55 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 8 Nov 2009 11:20:55 -0500
Subject: [Bioperl-l] GuessSeqFormat: fastq?
In-Reply-To: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
References: <B5891A5A7F2D482DB093E01A7A8DA9A1@NewLife>
Message-ID: <E2407ED235C24BFF9A03377416109318@NewLife>

Never mind; got it covered-- MAJ
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "bioperl-l" <bioperl-l at lists.open-bio.org>
Sent: Sunday, November 08, 2009 11:09 AM
Subject: [Bioperl-l] GuessSeqFormat: fastq?


> Hi All- 
> Any plans in the works for a _possibly_fastq sequence guesser?
> MAJ


From saikari78 at gmail.com  Mon Nov  9 15:47:10 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 15:47:10 +0000
Subject: [Bioperl-l] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090747p6702c62fibd7e8310d3a72dae@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From saikari78 at gmail.com  Mon Nov  9 16:05:57 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:05:57 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from PubChem
Message-ID: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>

Hi,

I'm using Bioperl to retrieve records from PubChem.
I'm trying to find a way-but have been unsuccessful- to retrieve from a
compound record, the reference to the protein(s) that can synthesize the
compound.
Thanks very much.

saikari


From cjfields at illinois.edu  Mon Nov  9 16:27:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 10:27:10 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
Message-ID: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>

On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:

> Hi,
>
> I'm using Bioperl to retrieve records from PubChem.
> I'm trying to find a way-but have been unsuccessful- to retrieve  
> from a
> compound record, the reference to the protein(s) that can synthesize  
> the
> compound.
> Thanks very much.
>
> saikari

The below bioperl script returns the GI for proteins that correspond  
to the substance passed on the command line; invoke using 'perl  
pc_substance.pl substance_requested'.  It probably needs more fiddling  
to catch everything but it should get you started.

For other bits and pieces (such as how to retrieve the raw sequence  
files), please see the EUtilities HOWTO:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

----------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $substance = shift;

my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                      -db => 'pcsubstance',
                                      -term => $substance,
                                      -usehistory => 'y');

my $hist = $eutil->next_History || die;

$eutil->reset_parameters(-eutil => 'elink',
                        -history => $hist,
                        -db      => 'protein',
                        -dbfrom  => 'pcsubstance',
                        -retmax  => 1000);

say join(',',$eutil->get_ids);


From saikari78 at gmail.com  Mon Nov  9 16:41:20 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Mon, 9 Nov 2009 16:41:20 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
Message-ID: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>

Fabulous! Huge help.
saikari

On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu> wrote:

>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>
> Hi,
>>
>> I'm using Bioperl to retrieve records from PubChem.
>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>> compound record, the reference to the protein(s) that can synthesize the
>> compound.
>> Thanks very much.
>>
>> saikari
>>
>
> The below bioperl script returns the GI for proteins that correspond to the
> substance passed on the command line; invoke using 'perl pc_substance.pl
> substance_requested'.  It probably needs more fiddling to catch everything
> but it should get you started.
>
> For other bits and pieces (such as how to retrieve the raw sequence files),
> please see the EUtilities HOWTO:
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> chris
>
> ----------------------------------------
>
> #!/usr/bin/perl -w
>
> use 5.010;
> use strict;
> use warnings;
> use Bio::DB::EUtilities;
>
> my $substance = shift;
>
> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                     -db => 'pcsubstance',
>                                     -term => $substance,
>                                     -usehistory => 'y');
>
> my $hist = $eutil->next_History || die;
>
> $eutil->reset_parameters(-eutil => 'elink',
>                       -history => $hist,
>                       -db      => 'protein',
>                       -dbfrom  => 'pcsubstance',
>                       -retmax  => 1000);
>
> say join(',',$eutil->get_ids);
>


From gc11song at gmail.com  Mon Nov  9 18:08:48 2009
From: gc11song at gmail.com (Guangchun Song)
Date: Mon, 9 Nov 2009 12:08:48 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
Message-ID: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>

Hello,

I'm a new BioPerl user.  I'm working on a project to determine the
status of all putative SNPs (non-synonymous vs. synonymous) and to
predict the translational effect of non-synonymous mutations as benign
or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
translate it to protein sequence for the SNPs that are in a gene's coding
region.  Could someone tell me how to do it?

Thanks,

-Guangchun Song


From robert.bradbury at gmail.com  Mon Nov  9 21:15:33 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 9 Nov 2009 16:15:33 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
Message-ID: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>

On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>
> I'm a new BioPerl user.  I'm working on a project to determine the
> status of all putative SNPs (non-synonymous vs. synonymous) and to
> predict the translational effect of non-synonymous mutations as benign
> or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
> translate it to protein sequence for the SNPs that are in a gene's coding
> region.  Could someone tell me how to do it?
>
>
I too would like to know if this information is available.  I've recently
been working with the dbSNP results from NCBI, but they display the results
in a graphical format rather than as data that one can play with and ask
questions of, such as "What is the most disease-causing gene in the Human
Genome?" or "What are the critical proteins damaged by gene defects in the
Human Genome?" ... "In terms of premature deaths, extended health care
requirements, loss of quality of life, etc.?"

The same types of questions can be applied to the dog and cat genomes, where
there is emotional value, or the cow, horse, pig, etc. genomes, where there is
economic value.

The value of BioPerl would increase significantly if there were
functionality that would allow easy access to "these mutations may have
negative/positive impact" (which means you need a function that qualifies
mutations by degree) and allow for impact to be subjectively determined
(implying there must be some callback function to provide a user
quality/impact rating).

For example:
   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
@critical_domain, $callback)
Where $callback could "rate" differences about the protein and position and
the "type of interest" (e.g. metal binding amino acids, structural changing
amino acids, critical catalysis amino acids, etc.).

A default callback would be based on some evolving definition of "critical"
changes which result in human disease for example.
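
A minimal sketch of the callback idea described above, in plain Perl. It
assumes two already-aligned protein strings of equal length and a simplified
argument list (the critical positions live inside the callback rather than
being passed separately). The name protein_compare, the hash keys, and the
toy sequences are hypothetical; none of this is an existing BioPerl API.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: compare two aligned protein
# strings of equal length and let a caller-supplied callback rate each
# difference.
sub protein_compare {
    my ($mine, $ref, $callback) = @_;
    die "sequences must have equal length\n"
        unless length($mine) == length($ref);
    my @differences;
    for my $i (0 .. length($ref) - 1) {
        my ($ref_aa, $my_aa) = (substr($ref, $i, 1), substr($mine, $i, 1));
        next if $ref_aa eq $my_aa;
        push @differences, {
            position => $i + 1,    # 1-based residue position
            ref      => $ref_aa,
            observed => $my_aa,
            rating   => $callback->($i + 1, $ref_aa, $my_aa),
        };
    }
    return @differences;
}

# Toy callback: flag any change at a position listed as critical.
my %critical = map { $_ => 1 } (3, 7);
my $rate = sub {
    my ($pos, $ref_aa, $my_aa) = @_;
    return $critical{$pos} ? 'critical' : 'neutral';
};

# Toy sequences, made up for illustration only.
my @diffs = protein_compare('MKHLAAC', 'MKCLAGC', $rate);
printf "%d %s->%s %s\n", @{$_}{qw(position ref observed rating)} for @diffs;
```

A real rating function would need far more than a position list (domain
annotations, substitution matrices, known disease variants); the point here
is only that the comparison loop and the judgment of impact can be separated.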

This is a "required" capability to be able to determine things like the
"adaptability" of a species -- those with fewest critical mutation points
may have better adaptability to mutation increasing circumstances.

Please pardon any errors in Perl syntax/usage; it's been a while since I've
written Perl and I'd really rather be coding in C.

Robert


From maj at fortinbras.us  Mon Nov  9 21:56:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 9 Nov 2009 16:56:24 -0500
Subject: [Bioperl-l] how to get the protein sequences from DNA
	sequencesaround novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <3ED3D387B5DE4248A218D42882369925@NewLife>

I agree that BioPerl would significantly increase in value with
such a module; in fact, the BioTeam would probably buy us out.
My opinion is that the entire GWAS enterprise is the search for
such a callback function, for humans anyway. For those engaged
in this quest, if BioPerl doesn't provide a Maserati, it at least provides
good Italian-made (among others) parts.
MAJ
----- Original Message ----- 
From: "Robert Bradbury" <robert.bradbury at gmail.com>
To: "Guangchun Song" <gc11song at gmail.com>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Monday, November 09, 2009 4:15 PM
Subject: Re: [Bioperl-l] how to get the protein sequences from DNA 
sequencesaround novel SNPs?


> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com> wrote:
>>
>> I'm a new BioPerl user.  I'm working on a project to determine the
>> status of all putative SNPs (non-synonymous vs. synonymous) and to
>> predict the translational effect of non-synonymous mutations as benign
>> or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
>> translate it to protein sequence for the SNPs that are in a gene's coding
>> region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've recently
> been working with the dbSNP results from NCBI but they display the results
> in a graphical format rather than data that one can play with and ask
> questions of like "What is the most disease causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat genomes where
> there is emotional value or the cow, horse, pig, etc. genomes where there is
> economic value?
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may have
> negative/positive impact" (which means you need a function that qualifies
> mutations by degree) and allow for impact to be subjectively determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene, @critical_aa,
> @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and position and
> the "type of interest" (e.g. metal binding amino acids, structural changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of "critical"
> changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like the
> "adaptability" of a species -- those with fewest critical mutation points
> may have better adaptability to mutation increasing circumstances.
>
> Please pardon any errors in perl syntax/usage its been a while since I've
> written perl and I'd really rather be coding in C.
>
> Robert


From alexl at users.sourceforge.net  Mon Nov  9 23:44:07 2009
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Mon, 09 Nov 2009 18:44:07 -0500
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu> (Chris
	Fields's message of "Wed, 4 Nov 2009 07:53:35 -0600")
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
Message-ID: <nmocnbuuuw.fsf@allele2.localdomain>

>>>>> Chris Fields  writes:

> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
> perl package alone.  It is part of perl core but it's also available
> on CPAN separately from perl itself:

> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm

Hi Chris,

Yes, in principle it would be possible to have this split out as a
separate package (currently it's a "subpackage" under the main perl
package), unfortunately that's just not the way it's currently done in
Fedora (probably because it's part of the core set and they like to
update all relevant packages in one step) and I have little control over
that.

As I suspected, the perl maintainer is not at all enthusiastic for
updating the whole of perl just for that package (except for rawhide
which would mean that bioperl 1.6.1 would not be available until F-13,
about 6 months from now).  See:

http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1

Obviously I am not happy with this situation either, because it will
freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
recommend any temporary workarounds in the meantime?

> This is the commit message for that BTW.  This allows spaces in file
> names for the MANIFEST.  v1.52 is a bug fix and is required.

> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673

Perhaps I could create a patch that renamed files with spaces in them to
ones with no spaces and then rename them again upon installation.

Can you point me to which files are the problematic ones that triggered
the dependency for 1.52?  Perhaps I can figure a workaround.

Meanwhile I will press the maintainer of perl in Fedora to perhaps
reconsider his position (e.g. if another update for perl is going out
for another reason, like a security update, perhaps he could roll in the
1.52 update at the same time).

Cheers,
Alex


From cjfields at illinois.edu  Tue Nov 10 00:50:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 18:50:00 -0600
Subject: [Bioperl-l] version of ExtUtils::Manifest too strict?
In-Reply-To: <nmocnbuuuw.fsf@allele2.localdomain>
References: <msd43yycfm.fsf@allele2.localdomain>
	<1D9E943F-2EDC-49AB-83DE-78DED5A8AC23@illinois.edu>
	<nmocnbuuuw.fsf@allele2.localdomain>
Message-ID: <29EA2398-F60B-48F2-AFE7-39A44011C451@illinois.edu>

On Nov 9, 2009, at 5:44 PM, Alex Lancaster wrote:

>>>>>> Chris Fields  writes:
>
>> Alex, Not sure why ExtUtils::Manifest can't be bundled as a separate
>> perl package alone.  It is part of perl core but it's also available
>> on CPAN separately from perl itself:
>
>> http://search.cpan.org/~rkobes/ExtUtils-Manifest-1.57/lib/ExtUtils/Manifest.pm
>
> Hi Chris,
>
> Yes, in principle it would be possible to have this split out as a
> separate package (currently it's a "subpackage" under the main perl
> package), unfortunately that's just not the way it's currently done in
> Fedora (probably because it's part of the core set and they like to
> update all relevant packages in one step) and I have little control  
> over
> that.
>
> As I suspected, the perl maintainer is not at all enthusiastic for
> updating the whole of perl just for that package (except for rawhide
> which would mean that bioperl 1.6.1 would not be available until F-13,
> about 6 months from now).  See:
>
> http://bugzilla.redhat.com/show_bug.cgi?id=533562#c1
>
> Obviously I am not happy with this situation either, because it will
> freeze bioperl on Fedora at 1.6.0 for about 6 months, so can you
> recommend any temporary workarounds in the meantime?

Well, if you don't absolutely require the MANIFEST for the final  
package you can forego the requirement.  The file in question that  
triggered the requirement is a data file used only for testing:

t/data/test 2.txt

>> This is the commit message for that BTW.  This allows spaces in file
>> names for the MANIFEST.  v1.52 is a bug fix and is required.
>
>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=15673
>
> Perhaps I could create a patch that renamed files with spaces in  
> them to
> ones with no spaces and then rename them again upon installation.
>
> Can you point me to which files are the problematic ones that  
> triggered
> the dependency for 1.52?  Perhaps I can figure a workaround.
>
> Meanwhile I will press the maintainer of perl in Fedora to perhaps
> reconsider his position (e.g. if another update for perl is going out
> for another reason, like a security update, perhaps he could roll in  
> the
> 1.52 update at the same time).
>
> Cheers,
> Alex

I would point out that this is a fairly significant bug fix for  
ExtUtils::Manifest.  A newer point release of perl is now available  
(5.10.1) that contains the fix and has a fix for a performance  
regression that popped up in 5.10.0.

chris


From jay at jays.net  Tue Nov 10 00:05:51 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 9 Nov 2009 18:05:51 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
Message-ID: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>

Many thanks to Ewan Birney et al. for Bio::Index::*

I can throw away my awful grep based index-by-accession stuff.   :)

Any chance someone has also written an organism based index mechanism?  
Something like...

while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
    print $seq->display_id . "\n";
}

Thanks,

j


From cjfields at illinois.edu  Tue Nov 10 03:55:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 21:55:01 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
Message-ID: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>

On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:

> Many thanks to Ewan Birney et al. for Bio::Index::*
>
> I can throw away my awful grep based index-by-accession stuff.   :)
>
> Any chance someone has also written an organism based index  
> mechanism? Something like...
>
> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>   print $seq->display_id . "\n";
> }
>
> Thanks,
>
> j

It should work via id_parser(); from Bio::Index::GenBank:

    $inx->id_parser(\&get_id);
    # make the index
    $inx->make_index($file_name);

    # here is where the retrieval key is specified
    sub get_id {
       my $line = shift;
       $line =~ /clone="(\S+)"/;
       $1;
    }

Change the code ref to deal with the line you want and parse the name  
out.  Caveat: this may not be absolutely perfect (it only passes in one  
line at a time, and some species lines will wrap).  I'm also not sure how  
this would work in cases where multiple sequences from the same  
species are present.

The other option is to preparse everything and tie a hash to store a  
species->UID map, then use that along with your Bio::Index index to  
grab what you need.
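
A rough sketch of that second option in plain Perl, assuming a GenBank flat
file where each record has a LOCUS line followed by an ORGANISM line. The
species_map name and the two toy records are made up for illustration, a
plain in-memory hash stands in for the tied hash, and a wrapped organism
name would only be captured up to the end of its first line.

```perl
use strict;
use warnings;

# Build a species => [LOCUS names] map from a GenBank flat-file handle.
sub species_map {
    my ($fh) = @_;
    my (%by_species, $locus);
    while (my $line = <$fh>) {
        if ($line =~ /^LOCUS\s+(\S+)/) {
            $locus = $1;
        }
        elsif (defined $locus && $line =~ /^\s+ORGANISM\s+(.+?)\s*$/) {
            push @{ $by_species{$1} }, $locus;
            undef $locus;    # record at most one organism per entry
        }
    }
    return \%by_species;
}

# Toy two-record input standing in for a real GenBank file.
my $genbank = <<'END';
LOCUS       AB000001     100 bp    DNA     linear   BCT 01-JAN-2000
  ORGANISM  Xanthomonas campestris
//
LOCUS       AB000002     200 bp    DNA     linear   BCT 01-JAN-2000
  ORGANISM  Escherichia coli
//
END
open my $fh, '<', \$genbank or die "cannot open in-memory file: $!";
my $map = species_map($fh);

# List the IDs for every species whose name matches a pattern.
for my $species (grep { /Xanthomonas/ } sort keys %$map) {
    print "$species: @{ $map->{$species} }\n";
}
```

Each ID collected this way could then be handed to $inx->fetch($id) on the
Bio::Index::GenBank index, provided the index is keyed on the same
identifier that the map stores.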

chris


From cjfields at illinois.edu  Tue Nov 10 04:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Nov 2009 22:58:32 -0600
Subject: [Bioperl-l] how to get the protein sequences from DNA sequences
	around novel SNPs?
In-Reply-To: <deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
References: <794eafc20911091008g1f98b944ncbd66ac4962a85a3@mail.gmail.com>
	<deaa866a0911091315l6a0782e8i24bfdc60065eb648@mail.gmail.com>
Message-ID: <435BA1A8-2CCB-4D7A-8909-84F8135C439F@illinois.edu>

On Nov 9, 2009, at 3:15 PM, Robert Bradbury wrote:

> On Mon, Nov 9, 2009 at 1:08 PM, Guangchun Song <gc11song at gmail.com>  
> wrote:
>>
>> I'm a new BioPerl user.  I'm working on a project to determine the
>> status of all putative SNPs (non-synonymous vs. synonymous) and to
>> predict the translational effect of non-synonymous mutations as benign
>> or deleterious.  I'm trying to use BioPerl to get the DNA sequence and
>> translate it to protein sequence for the SNPs that are in a gene's coding
>> region.  Could someone tell me how to do it?
>>
>>
> I too would like to know if this information is available.  I've  
> recently
> been working with the dbSNP results from NCBI but they display the  
> results
> in a graphical format rather than data that one can play with and ask
> questions of like "What is the most disease causing gene in the Human
> Genome?" or "What are the critical proteins damaged by gene defects  
> in the
> Human Genome?" ... "In terms of premature deaths, extended health care
> requirements, loss of quality of life, etc.?"
>
> The same types of questions can be applied to the dog and cat  
> genomes where
> there is emotional value or the cow, horse, pig, etc. genomes where  
> there is
> economic value?
>
> The value of BioPerl would increase significantly if there were
> functionality that would allow easy access to "these mutations may  
> have
> negative/positive impact" (which means you need a function that  
> qualifies
> mutations by degree) and allow for impact to be subjectively  
> determined
> (implying there must be some callback function to provide a user
> quality/impact rating).
>
> For example:
>   $/@differences =  protein_compare($mygene, $refseq_gene,  
> @critical_aa,
> @critical_domain, $callback)
> Where $callback could "rate" differences about the protein and  
> position and
> the "type of interest" (e.g. metal binding amino acids, structural  
> changing
> amino acids, critical catalysis amino acids, etc.).
>
> A default callback would be based on some evolving definition of  
> "critical"
> changes which result in human disease for example.
>
> This is a "required" capability to be able to determine things like  
> the
> "adaptability" of a species -- those with fewest critical mutation  
> points
> may have better adaptability to mutation increasing circumstances.
>
> Please pardon any errors in perl syntax/usage its been a while since  
> I've
> written perl and I'd really rather be coding in C.
>
> Robert

I will say that most of the information from the SNP database is  
available in various formats (see following link under 'Retrieval  
Types'):

http://www.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html

You can access this information, as well as the full XML, using  
something like the following script.

chris

------------------------------------------------

#!/usr/bin/perl -w

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

my $term = shift;
my $eutil  = Bio::DB::EUtilities->new(-eutil    => 'esearch',
                                       -db       => 'snp',
                                       -term     => $term,
                                       -usehistory => 'y',
                                       -retmax   => 100);

my $hist = $eutil->next_History || die "No history returned";

# for SNP XML, change retmode to 'xml'
$eutil->set_parameters(-eutil   => 'efetch',
                        -history => $hist,
                        -retmode => 'text',
                        -rettype => 'flt');

# dumps to STDOUT
say $eutil->get_Response->content;


From jluis.lavin at unavarra.es  Tue Nov 10 10:43:40 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Tue, 10 Nov 2009 11:43:40 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
 itscorrect use]
In-Reply-To: <A1ACC4B552514872B77208248B31977C@NewLife>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
Message-ID: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>

Hello again,

I tried what Mark suggested, modifying the line of code he pointed out, but
there's still a problem that I believe must be due to the sequence names.
The sequence headers in my FASTA file have this format:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1

The part to the right of the pipe changes depending on the program used to
create the gene model, for example:

>PleosPC9_1_103820|fgenesh1_pg.3_#_1
>PleosPC9_1_123413|genemark.2731_g
>PleosPC9_1_52065|e_gw1.3.64.1

So I guess I need to parse my IDs somehow so that the program detects only
the first part of the FASTA header (the "protein name") and doesn't get
confused by the other side of the pipe...

This is the corrected code I wrote following Mark's indications, but I
still don't have any idea about the parsing issue...

#!/c:/Perl -w
use Bio::Index::Fasta;
use strict;
#PC9.fasta is my genomic file
my $Index_File_Name ="PC9.fasta";
my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
#LCS.txt is my sequences list
@ARGV = <LCS.txt>;
foreach  my $id (@ARGV) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = $inx->fetch($id);
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');
$out->write_seq($seqobj);
}
}
exit;
}
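
The header-parsing step described above could be sketched as follows. This is
a minimal example, assuming the key should be everything between '>' and the
first pipe; the get_id name is arbitrary. Bio::Index::Fasta accepts such a
routine through id_parser(), which must be set before make_index() is called.

```perl
use strict;
use warnings;

# Return the part of a FASTA header between '>' and the first '|'.
sub get_id {
    my ($header) = @_;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : undef;
}

# With BioPerl installed, the parser would be registered before indexing:
#   my $inx = Bio::Index::Fasta->new(-filename   => 'PC9.fasta.idx',
#                                    -write_flag => 1);
#   $inx->id_parser(\&get_id);
#   $inx->make_index('PC9.fasta');

print get_id($_), "\n" for (
    '>PleosPC9_1_103820|fgenesh1_pg.3_#_1',
    '>PleosPC9_1_123413|genemark.2731_g',
    '>PleosPC9_1_52065|e_gw1.3.64.1',
);
```

Once the index is rebuilt this way, the IDs in the sequence list only need to
contain the part before the pipe (for example PleosPC9_1_103820).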

Thanks in advance

P.S. Might there be a faster way of extracting those sequences using plain Perl?


On Thu, November 5, 2009, 17:39, Mark A. Jensen wrote:
> Yes, these are files created by the SDBM, Perl's internal db manager. You
> should
> be able to
> open the index by simply
> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
> and the dbm will know what to do--
> cheers MAJ
> ----- Original Message -----
> From: <jluis.lavin at unavarra.es>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
> Sent: Thursday, November 05, 2009 11:21 AM
> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
> correct
> use]
>
>
>> Thank you very much Mark, that's a good point :$
>> I guess your correction refers to the second script, doesn't it?
>>
>> If so, there is still a problem with the first script: it doesn't
>> create the PC9.fasta.idx file; instead it creates two files named:
>> -PC9.fasta.idx.pag
>> -PC9.fasta.idx.dir
>>
>> which seem to be clearly related to some kind of indexing
>> process... but,
>> unless the PC9.fasta.idx file is only virtual or remains hidden, I can't
>> find it anywhere...
>> Forgive me if I'm talking nonsense...
>>
>> Thank you very much again for your help ;)
>>
>>
>> On Thu, November 5, 2009, 17:02, Mark A. Jensen wrote:
>>> Hey José,
>>> The first thing that jumps out is the index file name. Looks
>>> like you create it as
>>> PC9.fasta.idx
>>> But you read it as
>>> PC9.fasta
>>> Not an unusual mistake. Do
>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>> and see if it works.
>>> MAJ
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 10:46 AM
>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and its
>>> correct
>>> use]
>>>
>>>
>>>
>>>
>>> ---------------------------- Original Message
>>> ----------------------------
>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its correct
>>> use
>>> From:    jluis.lavin at unavarra.es
>>> Date:    Thu, November 5, 2009, 16:46
>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>> --------------------------------------------------------------------------
>>>
>>> Hi Mark,
>>>
>>> I've actually got two scripts: the first one creates the index and
>>> the second one retrieves the sequence list from the indexed file.
>>>
>>> 1)Here is the Index creation script:
>>>
>>> #!/c:/Perl -w
>>> use strict;
>>> use Bio::Index::Fasta;
>>> use strict;
>>>
>>> print "Enter file for indexing: \n";
>>> my $Index_File_Name = <STDIN>;
>>> my $inx = Bio::Index::Fasta->new(-filename => $Index_File_Name.".idx",
>>>     -write_flag => 1);
>>> $inx->make_index(my $File_Name);
>>>
>>> 2)And here is the sequence retrieval script:
>>>
>>> #!/c:/Perl -w
>>> use Bio::Index::Fasta;
>>> use strict;
>>> #PC9.fasta is my genomic file
>>> my $Index_File_Name ="PC9.fasta";
>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>> #LCS.txt is my sequences list
>>> @ARGV = <lCS.txt>;
>>> foreach  my $id (@ARGV) {
>>> if ($id eq ''){
>>> die ("empty list")
>>> }
>>> else {
>>> my $seqobj = $inx->fetch($id);
>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>> -format => 'fasta');
>>> $out->write_seq($seqobj);
>>> }
>>> }
>>> exit;
>>> }
>>>
>>> I hope this code is not a total scum...
>>>
>>> Thanks in advance ;)
>>>
>>>
>>>
>>> On Thu, 5 November 2009, 16:39, Mark A. Jensen wrote:
>>>> José -- It looks like this is a good solution to your problem. Please
>>>> send your script so we can look at it-
>>>> cheers Mark
>>>> ----- Original Message -----
>>>> From: <jluis.lavin at unavarra.es>
>>>> To: <bioperl-l at lists.open-bio.org>
>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>> Subject: [Bioperl-l] A question about iBio::Index: and its correct use
>>>>
>>>>
>>>>
>>>> Hello to all,
>>>>
>>>> I'm trying to write a script to retrieve a list of sequences from a
>>>> local FASTA file (for example a fasta archive where all the protein
>>>> models of an organism are stored). This file would be used by me as
>>>> some kind of "local database" (sorry if I mix up a few concepts...)
>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>> Bio::Index::Fasta tool.
>>>> If I didn't misunderstand what I read (which can be easy because of my
>>>> low level of programming) this indexing tool should do the job.
>>>> I wrote a couple of scripts based on the documentation I read about
>>>> this tool, but I don't seem to be able to create the index file to be
>>>> used later (to retrieve the sequences from).
>>>> -First of all, I want to ask the people in this forum if
>>>> Bio::Index::Fasta is the right one to choose for this task.
>>>> -Then I'll beg you to take a look at my scripts, because I don't seem
>>>> to catch the bug...
>>>>
>>>> Best wishes to you all and thanks in advance ;)
>>>>
>>>> --
>>>> José Luis Lavín Trueba, PhD
>>>>
>>>> Dpto. de Producción Agraria
>>>> Grupo de Genética y Microbiología
>>>> Universidad Pública de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Dr. Jos? Luis Lav?n Trueba
>>>
>>> Dpto. de Producci?n Agraria
>>> Grupo de Gen?tica y Microbiolog?a
>>> Universidad P?blica de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>>
>>> --
>>> Dr. Jos? Luis Lav?n Trueba
>>>
>>> Dpto. de Producci?n Agraria
>>> Grupo de Gen?tica y Microbiolog?a
>>> Universidad P?blica de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
>>
>>
>> --
>> Dr. Jos? Luis Lav?n Trueba
>>
>> Dpto. de Producci?n Agraria
>> Grupo de Gen?tica y Microbiolog?a
>> Universidad P?blica de Navarra
>> 31006 Pamplona
>> Navarra
>> SPAIN
>>
>>
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN


From saikari78 at gmail.com  Tue Nov 10 11:41:11 2009
From: saikari78 at gmail.com (saikari keitele)
Date: Tue, 10 Nov 2009 11:41:11 +0000
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
Message-ID: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>

Thanks again very much for your help and the script.
I've been trying it, however I fail to find any protein record linked to a
record in the pcsubstance database.
Do you think that it is because no links have been defined between the 2
databases, or that I am just unlucky and that no link exists for the
particular records I'm testing?
Thanks again

saikari

On Mon, Nov 9, 2009 at 4:41 PM, saikari keitele <saikari78 at gmail.com> wrote:

> Fabulous!. Huge help.
> saikari
>
>   On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields at illinois.edu>wrote:
>
>>  On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:
>>
>> Hi,
>>>
>>> I'm using Bioperl to retrieve records from PubChem.
>>> I'm trying to find a way-but have been unsuccessful- to retrieve from a
>>> compound record, the reference to the protein(s) that can synthesize the
>>> compound.
>>> Thanks very much.
>>>
>>> saikari
>>>
>>
>> The below bioperl script returns the GI for proteins that correspond to
>> the substance passed on the command line; invoke using 'perl
>> pc_substance.pl substance_requested'.  It probably needs more fiddling to
>> catch everything but it should get you started.
>>
>> For other bits and pieces (such as how to retrieve the raw sequence
>> files), please see the EUtilities HOWTO:
>>
>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>>
>> chris
>>
>> ----------------------------------------
>>
>> #!/usr/bin/perl -w
>>
>> use 5.010;
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $substance = shift;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -db => 'pcsubstance',
>>                                     -term => $substance,
>>                                     -usehistory => 'y');
>>
>> my $hist = $eutil->next_History || die;
>>
>> $eutil->reset_parameters(-eutil => 'elink',
>>                       -history => $hist,
>>                       -db      => 'protein',
>>                       -dbfrom  => 'pcsubstance',
>>                       -retmax  => 1000);
>>
>> say join(',',$eutil->get_ids);
>>
>
>


From heyne at informatik.uni-freiburg.de  Tue Nov 10 12:55:06 2009
From: heyne at informatik.uni-freiburg.de (Steffen Heyne)
Date: Tue, 10 Nov 2009 13:55:06 +0100
Subject: [Bioperl-l] problem with alignments and sequence locations
Message-ID: <4AF962AA.7060908@informatik.uni-freiburg.de>

Hi,

I'm using Bioperl for my research and it is very useful! Thank you!

Currently I have a problem with the location tags of sequences. I read in 
seed alignments of Rfam (in stockholm format, but I think it is similar 
to other formats).

If the location is like:

AB194432.1/908-846

the start/end values are changed to

$seq->start = 846
$seq->end = 908

and therefore the new location (e.g.$seq->get_nse) is:

AB194432.1/846-908

The $seq->strand tag is correctly set to -1 in this case, but if the 
alignment is written out again (clustal, stockholm,...) this strand info 
is lost and the sequences have this "wrong" location. But this 
information is important with respect to the sequence accession number.

Is there a way to set the location back to the original one or is this 
behavior desired? Any manual setting with $seq->start($val) failed due 
to automatic checking.

I'm using bioperl 1.6.1

Thanks!

steffen


-- 
---
Steffen Heyne, Dipl.-Bioinf.
Lehrstuhl für Bioinformatik
Institut für Informatik
Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany

Tel: (+49) 761 203 8239
Fax: (+49) 761 203 7462
Mail: heyne at informatik.uni-freiburg.de


From cjfields at illinois.edu  Tue Nov 10 13:58:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 07:58:52 -0600
Subject: [Bioperl-l] problem with alignments and sequence locations
In-Reply-To: <4AF962AA.7060908@informatik.uni-freiburg.de>
References: <4AF962AA.7060908@informatik.uni-freiburg.de>
Message-ID: <DF72C01A-410F-4391-B33E-4884D7CB859E@illinois.edu>

On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with locations tags of sequences. I read  
> in seed alignments of Rfam (in stockholm format, but I think it is  
> similar to other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g.$seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the  
> alignment is written out again (clustal, stockholm,...) this strand  
> info is lost and the sequences have this "wrong" location. But this  
> information is important in respect to the sequence accession number.
>
> Is there a way to set the location back to the original one or is  
> this behavior desired? Any manually setting with $seq->start($val)  
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We recently discussed amending the NSE format  
due to this (the subject came up over the last few months or so); it's  
fallen through the cracks.  Fortunately it is very easy to fix (the  
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in?  It will change  
output for only those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
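
The mapping above can be sketched in plain Perl. This is only an
illustration of the proposed convention, not the LocatableSeq code;
parse_nse and format_nse are hypothetical helper names made up for this
example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split "name/from-to" into its
# parts, normalising so that start <= end and recording the strand.
sub parse_nse {
    my ($nse) = @_;
    my ($name, $from, $to) = $nse =~ m{^(.+)/(\d+)-(\d+)$}
        or die "cannot parse NSE string '$nse'\n";
    return $from <= $to
        ? { name => $name, start => $from, end => $to,   strand =>  1 }
        : { name => $name, start => $to,   end => $from, strand => -1 };
}

# Hypothetical helper: write the coordinates back, reversed for strand -1,
# so that the original Rfam-style name survives a round trip.
sub format_nse {
    my ($loc) = @_;
    my ($from, $to) = $loc->{strand} < 0
        ? ($loc->{end},   $loc->{start})
        : ($loc->{start}, $loc->{end});
    return "$loc->{name}/$from-$to";
}

for my $nse ('AB194432.1/908-846', 'AB194432.1/846-908') {
    my $loc = parse_nse($nse);
    printf "%s => start=%d end=%d strand=%d => %s\n",
        $nse, @{$loc}{qw(start end strand)}, format_nse($loc);
}
```

The point of the sketch is only that the strand decides the order in
which the two coordinates are written out again.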

chris


From cjfields at illinois.edu  Tue Nov 10 14:05:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 10 Nov 2009 08:05:51 -0600
Subject: [Bioperl-l] [bioperl newbie] Retrieving link to protein from
	PubChem
In-Reply-To: <a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
References: <a38167fa0911090805y76456205g3f42caa7e3471241@mail.gmail.com>
	<1ECC543A-F923-4D5E-A0C1-5BBD35ECAAE8@illinois.edu>
	<a38167fa0911090841o6a0a7bcdj5e2396f1a58dc4f2@mail.gmail.com>
	<a38167fa0911100341j45af6218r449270a7f4b530ec@mail.gmail.com>
Message-ID: <738F6320-B87A-4541-B9FA-20273ABA96B9@illinois.edu>

On Nov 10, 2009, at 5:41 AM, saikari keitele wrote:

> Thanks again very much for your help and the script.
> i've been trying it, however I fail to find any protein record  
> linked to a
> record in the pcsubstance database.
> Do you think that its is because  no links have been defined between  
> the 2
> databases, or that I am just unlucky and that no link exists for the
> particular records I'm testing?
> Thanks again
>
> saikari

It's probably that no links have been defined.  I have found similar  
problems in the past with pubchem, in that not all substances have  
proteins associated with them.  Most proteins linked to are those with  
a deposited structure.

There are a few other databases to check out; KEGG and the BioCyc dbs  
(like EcoCyc) come to mind.  I don't think we have a generic remote  
query engine set up for any of those unfortunately (unless there is  
one I'm unaware of), but I know BioCyc comes with its own set of  
tools (including perl- and java-based query tools) and can be set up  
locally, which is likely much faster and more in line with what you  
need.

chris

...


From jason at bioperl.org  Tue Nov 10 18:47:02 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:47:02 -0800
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
	itscorrect use]
In-Reply-To: <3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.squirrel@webmail.unavarra.es>
	<A1ACC4B552514872B77208248B31977C@NewLife>
	<3471.130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
Message-ID: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>

Page 44 has the custom ID info or look at documentation for  
Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if  
you read the perldoc for the module.

  http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf
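
For pipe-delimited headers like the ones quoted below, a custom ID
parser could be as small as the following sketch. The sub name get_id is
just a placeholder, and the regex assumes the wanted ID is everything
between the leading '>' and the first '|' or whitespace.

```perl
use strict;
use warnings;

# Hypothetical ID parser: receives one FASTA header line and returns the
# text between the leading '>' and the first '|' or whitespace.
sub get_id {
    my ($header) = @_;
    return $header =~ /^>\s*([^|\s]+)/ ? $1 : ();
}

# Headers taken from the message quoted below.
for my $header ('>PleosPC9_1_103820|fgenesh1_pg.3_#_1',
                '>PleosPC9_1_123413|genemark.2731_g',
                '>PleosPC9_1_52065|e_gw1.3.64.1') {
    print get_id($header), "\n";
}
```

With Bio::Index::Fasta such a sub would be registered via
$inx->id_parser(\&get_id) before calling make_index, as shown in the
module's perldoc; the index then has to be rebuilt so the new keys are
stored.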

Don't re-open SeqIO each time; just do it once at the beginning  
outside of the loop and then call write_seq within the loop.

One nuance of doing OO programming vs procedural is that there  
is some outside state information that can persist in an object, but  
conceptually, you want to open a filehandle once and just keep writing  
to it.
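
A minimal sketch of that pattern, assuming BioPerl is installed. The toy
FASTA file, the IDs and the file names are made up so the example runs
on its own; in this thread they would be PC9.fasta and the ID list.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Bio::Index::Fasta;
use Bio::SeqIO;

# Toy input (made up for this sketch) written to a temporary directory.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/toy.fasta";
open my $fh, '>', $fasta or die "cannot write $fasta: $!";
print {$fh} ">seqA\nMKVLA\n>seqB\nMTTRG\n>seqC\nMPLQS\n";
close $fh;
my @wanted = ('seqA', 'seqC');

# Build the index once.
my $inx = Bio::Index::Fasta->new(-filename   => "$dir/toy.fasta.idx",
                                 -write_flag => 1);
$inx->make_index($fasta);

# Open the output stream ONCE, before the loop ...
my $outfile = "$dir/extracted.fasta";
my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'fasta');

# ... and only fetch and write inside it.
my $written = 0;
for my $id (@wanted) {
    my $seq = $inx->fetch($id);
    unless ($seq) {
        warn "no entry for '$id' in the index\n";
        next;
    }
    $out->write_seq($seq);
    $written++;
}
$out->close;
print "wrote $written sequences\n";
```

Because the stream is opened once with '>', the output file is created
fresh on every run instead of growing with each '>>' append.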

-jason
On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:

> Hello again,
>
> I tried what Mark told me, modifying the line of code he pointed out, but  
> there's
> still a problem that I believe must be due to the sequence names.
> The sequence headers in my Fasta file have this format:
>
>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>
> The part on the right of the pipe changes depending on the program  
> used to
> create the gene model, for example:
>
>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>> PleosPC9_1_123413|genemark.2731_g
>> PleosPC9_1_52065|e_gw1.3.64.1
>
> So I guess I need to parse my ids somehow for the program to detect  
> only
> the first part of the fasta header (the "protein name") and not to get
> confused by the other side of the pipe...
>
> This is the corrected code I wrote following Mark's indications, but I
> still don't have any idea about the parsing issue...
>
> #!/c:/Perl -w
> use Bio::Index::Fasta;
> use strict;
> #PC9.fasta is my genomic file
> my $Index_File_Name ="PC9.fasta";
> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
> #LCS.txt is my sequences list
> @ARGV = <LCS.txt>;
> foreach  my $id (@ARGV) {
> if ($id eq ''){
> die ("empty list")
> }
> else {
> my $seqobj = $inx->fetch($id);
> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
> -format => 'fasta');
> $out->write_seq($seqobj);
> }
> }
> exit;
> }
>
> Thanks in advance
>
> P.S. Might there be a faster way of extracting those sequences using plain  
> Perl?
>
>
>
>
> On Thu, 5 November 2009, 17:39, Mark A. Jensen wrote:
>> Yes, these are files created by the SDBM, Perl's internal db  
>> manager. You
>> should
>> be able to
>> open the index by simply
>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> and the dbm will know what to do--
>> cheers MAJ
>> ----- Original Message -----
>> From: <jluis.lavin at unavarra.es>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>> Sent: Thursday, November 05, 2009 11:21 AM
>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index:  
>> and its
>> correct
>> use]
>>
>>
>>> Thank you very much Mark, that's a good point :$
>>> I guess your correction refers to the second script, doesn't it?
>>>
>>> If so, there is still a problem with the first script: it  
>>> doesn't
>>> create the PC9.fasta.idx file, instead it creates two files named:
>>> -PC9.fasta.idx.pag
>>> -PC9.fasta.idx.dir
>>>
>>> which seem to be clearly related to some kind of indexing
>>> process... but,
>>> unless the PC9.fasta.idx file is only virtual or remains hidden, I  
>>> can't
>>> find it anywhere...
>>> Forgive me if I'm talking nonsense...
>>>
>>> Thank you very much again for your help ;)
>>>
>>>
>>> El Jue, 5 de Noviembre de 2009, 17:02, Mark A. Jensen escribi?:
>>>> Hey Jos?,
>>>> The first thing that jumps out it the index file name. Looks
>>>> like you create it as
>>>> PC9.fasta.idx
>>>> But you read it as
>>>> PC9.fasta
>>>> Not an unusual mistake. Do
>>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>>> and see if it works.
>>>> MAJ
>>>> ----- Original Message -----
>>>> From: <jluis.lavin at unavarra.es>
>>>> To: <bioperl-l at lists.open-bio.org>
>>>> Sent: Thursday, November 05, 2009 10:46 AM
>>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and  
>>>> its
>>>> correct
>>>> use]
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------- Mensaje original
>>>> ----------------------------
>>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its  
>>>> correct
>>>> use
>>>> From:    jluis.lavin at unavarra.es
>>>> Fecha:   Jue, 5 de Noviembre de 2009, 16:46
>>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>>> --------------------------------------------------------------------------
>>>>
>>>> Hi Mark,
>>>>
>>>> I?ve actually got two scripts, the first one is to create the  
>>>> index and
>>>> the second one is to retrieve the sequence lis from the indexed  
>>>> file.
>>>>
>>>> 1)Here is the Index creation script:
>>>>
>>>> #!/c:/Perl -w
>>>> use strict;
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>>
>>>> print "Enter file for indexing: \n";
>>>> my $Index_File_Name = <STDIN>;
>>>> my $inx = Bio::Index::Fasta->new(-filename =>  
>>>> $Index_File_Name.".idx",
>>>>    -write_flag => 1);
>>>> $inx->make_index(my $File_Name);
>>>>
>>>> 2)And here is the sequence retrieval script:
>>>>
>>>> #!/c:/Perl -w
>>>> use Bio::Index::Fasta;
>>>> use strict;
>>>> #PC9.fasta is my genomic file
>>>> my $Index_File_Name ="PC9.fasta";
>>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>>> #LCS.txt is my sequences list
>>>> @ARGV = <lCS.txt>;
>>>> foreach  my $id (@ARGV) {
>>>> if ($id eq ''){
>>>> die ("empty list")
>>>> }
>>>> else {
>>>> my $seqobj = $inx->fetch($id);
>>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>>> -format => 'fasta');
>>>> $out->write_seq($seqobj);
>>>> }
>>>> }
>>>> exit;
>>>> }
>>>>
>>>> I hope this code is not a total scum...
>>>>
>>>> Thanks in advance ;)
>>>>
>>>>
>>>>
>>>> El Jue, 5 de Noviembre de 2009, 16:39, Mark A. Jensen escribi?:
>>>>> Jos? -- It looks like this is a good solution to your problem.  
>>>>> Please
>>>>> send
>>>>> you
>>>>> script so we can look at it-
>>>>> cheers Mark
>>>>> ----- Original Message -----
>>>>> From: <jluis.lavin at unavarra.es>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>>> Subject: [Bioperl-l] A question about iBio::Index: and its  
>>>>> correct use
>>>>>
>>>>>
>>>>>
>>>>> Hello to all,
>>>>>
>>>>> I?m trying to write a script to retrieve a list of sequences  
>>>>> from a
>>>>> local
>>>>> FASTA file (for example a fasta archive where all the protein  
>>>>> models
>>>>> of
>>>>> an
>>>>> organism are stored). This file would be used by me as some kind
>>>>> "local
>>>>> database" (sorry if I mistake a few concepts...)
>>>>> I?ve been reading the BioPerl HOWTOs and I came across the
>>>>> Bio::Index::Fasta tool.
>>>>> If I didn?t misunderstood what I read (which can be easy because  
>>>>> my
>>>>> low
>>>>> level on programming) this Indexing tool should do the job.
>>>>> I wrote a couple of scripts based on the documentation i read  
>>>>> about
>>>>> this
>>>>> tool, but I don?t seem to be able to create the index file to be  
>>>>> used
>>>>> later (to retrieve the sequences from).
>>>>> -First of all, I want to ask the people in this forum if the
>>>>> Bio::Index::Fasta is the right one to chose for this tasks.
>>>>> -Then I?ll beg you to take a look at my scripts, because I don?t  
>>>>> seem
>>>>> to
>>>>> catch the bug...
>>>>>
>>>>> Best wishes to you all and thanks in advance ;)
>>>>>
>>>>> --
>>>>> Jos? Luis Lav?n Trueba, PhD
>>>>>
>>>>> Dpto. de Producci?n Agraria
>>>>> Grupo de Gen?tica y Microbiolog?a
>>>>> Universidad P?blica de Navarra
>>>>> 31006 Pamplona
>>>>> Navarra
>>>>> SPAIN
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Dr. Jos? Luis Lav?n Trueba
>>>>
>>>> Dpto. de Producci?n Agraria
>>>> Grupo de Gen?tica y Microbiolog?a
>>>> Universidad P?blica de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>>
>>>> --
>>>> Dr. Jos? Luis Lav?n Trueba
>>>>
>>>> Dpto. de Producci?n Agraria
>>>> Grupo de Gen?tica y Microbiolog?a
>>>> Universidad P?blica de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Dr. Jos? Luis Lav?n Trueba
>>>
>>> Dpto. de Producci?n Agraria
>>> Grupo de Gen?tica y Microbiolog?a
>>> Universidad P?blica de Navarra
>>> 31006 Pamplona
>>> Navarra
>>> SPAIN
>>>
>>>
>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
> -- 
> Dr. Jos? Luis Lav?n Trueba
>
> Dpto. de Producci?n Agraria
> Grupo de Gen?tica y Microbiolog?a
> Universidad P?blica de Navarra
> 31006 Pamplona
> Navarra
> SPAIN
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Tue Nov 10 18:50:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 10 Nov 2009 10:50:00 -0800
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>

You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html

On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:

> On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:
>
>> Many thanks to Ewan Birney et. al. for Bio::Index::*
>>
>> I can throw away my awful grep based index-by-accession stuff.   :)
>>
>> Any chance someone has also written an organism based index  
>> mechanism? Something like...
>>
>> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>>  print $seq->display_id . "\n";
>> }
>>
>> Thanks,
>>
>> j
>
> It should work via id_parser(); from Bio::Index::GenBank:
>
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
>
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }
>
> Change the code ref to deal with the line you want and parse the name  
> out.  Caveat: this may not be absolutely perfect (it only passes in  
> a line at a time, and some species lines will wrap).  Also not sure  
> how this would work in cases where multiple sequences from the same  
> species are present.
>
> The other option is to preparse everything and tie a hash to store a  
> species->UID map, then use that along with your Bio::Index index to  
> grab what you need.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
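
The second option quoted above (pre-parsing everything into a
species->UID map) might look roughly like this in plain Perl. It is a
sketch only: organism_map is a hypothetical helper, the records are
made-up toy data, wrapped ORGANISM lines are not handled, and an
in-memory hash stands in for the tied hash.

```perl
use strict;
use warnings;

# Hypothetical helper: map each ORGANISM value to the LOCUS names that
# carry it, so several records per species are kept rather than overwritten.
sub organism_map {
    my ($fh) = @_;
    my (%ids_for, $locus);
    while (my $line = <$fh>) {
        if ($line =~ /^LOCUS\s+(\S+)/) {
            $locus = $1;
        }
        elsif (defined $locus && $line =~ /^\s+ORGANISM\s+(.+?)\s*$/) {
            push @{ $ids_for{$1} }, $locus;
        }
    }
    return \%ids_for;
}

# Three heavily abbreviated, made-up records.
my $genbank = <<'END';
LOCUS       TOY0001
  ORGANISM  Xanthomonas campestris
//
LOCUS       TOY0002
  ORGANISM  Escherichia coli
//
LOCUS       TOY0003
  ORGANISM  Xanthomonas campestris
//
END

open my $fh, '<', \$genbank or die "cannot open in-memory file: $!";
my $ids_for = organism_map($fh);
close $fh;

for my $organism (sort grep { /Xanthomonas/ } keys %$ids_for) {
    print "$organism: @{ $ids_for->{$organism} }\n";
}
```

Each ID found this way could then be handed to fetch() on an ordinary
Bio::Index::GenBank index.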

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jluis.lavin at unavarra.es  Wed Nov 11 15:01:18 2009
From: jluis.lavin at unavarra.es (jluis.lavin at unavarra.es)
Date: Wed, 11 Nov 2009 16:01:18 +0100 (CET)
Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
 anditscorrect use]
In-Reply-To: <E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
References: <2642.130.206.164.153.1257435996.squirrel@webmail.unavarra.es><1
	984ED07F36C446284B25F617964B6C6@NewLife><2969.130.206.164.153.1257438117.sq
	uirrel@webmail.unavarra.es><A1ACC4B552514872B77208248B31977C@NewLife><3471.
	130.206.164.153.1257849820.squirrel@webmail.unavarra.es>
	<E2D3B0C9-9294-4630-ACA0-FB0559E19E4C@bioperl.org>
Message-ID: <2979.130.206.164.153.1257951678.squirrel@webmail.unavarra.es>

Hi once again,
I have modified the script following the instructions Jason gave me (at
least what I understood; remember it is my first time trying to learn a
programming language... and I'm not the smartest guy in the class, hehe) but
it seems I didn't fix the problem...
Here's the new code I wrote:

#!/c:/Perl -w
	use strict;
        use Bio::Index::Fasta;
	use Bio::DB::Fasta;
	use Bio::SeqIO;
	use IO::File;

# assign files to scalars
my $index_file = 'PC91.fasta';
my $id_list = 'LCS2.txt';

# open index file
my $db = Bio::DB::Fasta->new($index_file) or die;

# open the id list
my $in = IO::File->new($id_list) or die;

# open FASTA to write
my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
-format => 'fasta');

# retrieve ids loop
foreach my $id ($in) {
if ($id eq ''){
die ("empty list")
}
else {
my $seqobj = my $inx->fetch($id);
$out->write_seq($seqobj);
}
}

# parse fasta headers
sub my_makeid {
my $id = shift;
if ( $id =~ /^>[^:]+:(\S+)/ ) {
return $1;
} elsif ($id =~ /^>(\S+)/) {
return $1;
} else {
warn("cannot parse ID for $id\n");
}
}
exit;

Would anyone, please take a look at it ...

Thanks in advance ;)
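
For reference, one reading of what goes wrong in the script above: the
loop iterates over the filehandle object rather than over the lines of
the ID list, fetch is called on a freshly declared (undefined) $inx
instead of on $db, and my_makeid is never handed to the constructor. A
minimal sketch of a possible restructuring follows, assuming BioPerl is
installed and that the wanted ID is the part of the header before the
pipe. The toy sequences and file names are made up so the example runs
on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Bio::DB::Fasta;
use Bio::SeqIO;

# Keep only the part of the header before the first '|' or whitespace.
sub my_makeid {
    my ($header) = @_;
    return $header =~ /^>?([^|\s]+)/ ? $1 : $header;
}

# Toy stand-ins for the FASTA file and the ID list (one ID per line).
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/toy.fasta";
open my $fh, '>', $fasta or die "cannot write $fasta: $!";
print {$fh} ">PleosPC9_1_103820|fgenesh1_pg.3_#_1\nMKVLA\n",
            ">PleosPC9_1_52065|e_gw1.3.64.1\nMTTRG\n";
close $fh;
my $id_list = "$dir/ids.txt";
open $fh, '>', $id_list or die "cannot write $id_list: $!";
print {$fh} "PleosPC9_1_52065\n\nPleosPC9_1_103820\n";
close $fh;

# Index the FASTA file, registering the custom ID parser.
my $db = Bio::DB::Fasta->new($fasta, -makeid => \&my_makeid);

# Open the output once, outside the loop.
my $outfile = "$dir/index_extracted.fasta";
my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'fasta');

# Read the ID list line by line, skipping empty lines.
open my $in, '<', $id_list or die "cannot open $id_list: $!";
my $written = 0;
while (my $id = <$in>) {
    chomp $id;
    next unless length $id;
    my $seq = $db->get_Seq_by_id($id);
    unless ($seq) {
        warn "no sequence found for '$id'\n";
        next;
    }
    $out->write_seq($seq);
    $written++;
}
close $in;
$out->close;
print "wrote $written sequences\n";
```

If an index built without the -makeid callback already exists next to
the FASTA file, it would need to be removed or rebuilt so the shortened
IDs are actually stored.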


On Tue, 10 November 2009, 19:47, Jason Stajich wrote:
> Page 44 has the custom ID info or look at documentation for
> Bio::DB::Fasta - there is a similar syntax for Bio::Index::Fasta if
> you read the perldoc for the module.
>
>   http://jason.open-bio.org/Bioperl_Tutorials/ProgrammingBiology2008/ProgBiology_BioPerl_I.pdf
>
> Don't re-opening SeqIO each time just do it once at the beginning
> outside of the loop and then call write_seq within the loop.
>
> This is one nuance of doing OO programming vs procedural is that there
> is some outside state information that can persist in an object, but
> conceptually, you want to open a filehandle once and just keep writing
> to it.
>
> -jason
> On Nov 10, 2009, at 2:43 AM, jluis.lavin at unavarra.es wrote:
>
>> Hello again,
>>
>> I tried what Mark told me modifying the code line he told me but
>> there?s
>> still a problem that I believe must be due to the sequences name.
>> My secuences header on the Fasta file have this format:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>
>> Th part on the right of the pipe changes depending on the program
>> used to
>> create the gene model, for example:
>>
>>> PleosPC9_1_103820|fgenesh1_pg.3_#_1
>>> PleosPC9_1_123413|genemark.2731_g
>>> PleosPC9_1_52065|e_gw1.3.64.1
>>
>> So I guess I need to parse my ids somehow for thr program to detect
>> only
>> the first part of the fasta header (the "protein name") and not to get
>> messed with the other side of the pipe...
>>
>> This is the corrected code I wrote following Mark?s indications, but I
>> still don?t have any idea about the parsing issue...
>>
>> #!/c:/Perl -w
>> use Bio::Index::Fasta;
>> use strict;
>> #PC9.fasta is my genomic file
>> my $Index_File_Name ="PC9.fasta";
>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>> #LCS.txt is my sequences list
>> @ARGV = <LCS.txt>;
>> foreach  my $id (@ARGV) {
>> if ($id eq ''){
>> die ("empty list")
>> }
>> else {
>> my $seqobj = $inx->fetch($id);
>> my $out = new Bio::SeqIO (-file => ">>index_extracted.fasta",
>> -format => 'fasta');
>> $out->write_seq($seqobj);
>> }
>> }
>> exit;
>> }
>>
>> Thanks in advance
>>
>> PD. May it be a faster way of extracting those sequences using plain
>> PERL?
>>
>>
>>
>>
>> El Jue, 5 de Noviembre de 2009, 17:39, Mark A. Jensen escribi?:
>>> Yes, these are files created by the SDBM, Perl's internal db
>>> manager. You
>>> should
>>> be able to
>>> open the index by simply
>>> $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>> and the dbm will know what to do--
>>> cheers MAJ
>>> ----- Original Message -----
>>> From: <jluis.lavin at unavarra.es>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: <jluis.lavin at unavarra.es>; <bioperl-l at lists.open-bio.org>
>>> Sent: Thursday, November 05, 2009 11:21 AM
>>> Subject: Re: [Bioperl-l] [Fwd: Re: A question about iBio::Index:
>>> and its
>>> correct
>>> use]
>>>
>>>
>>>> Thank you very much Mark, that?s a good point :$
>>>> I guess your correction is referred to the second script, isn?t it?
>>>>
>>>> If it is so, there is still a problem with the first script, it
>>>> doesn?t
>>>> create the PC9.fasta.idx file, instead it creates two files named:
>>>> -PC9.fasta.idx.pag
>>>> -PC9.fasta.idx.dir
>>>>
>>>> which seem to be clearly related with some kind of indexing
>>>> process...but,
>>>> unless the PC9.fasta.idx file is only virtual or remains hidden, I
>>>> can?t
>>>> find it anywhere...
>>>> Forgive me if I?m talking nosense...
>>>>
>>>> Thank you very much again for your help ;)
>>>>
>>>>
>>>> El Jue, 5 de Noviembre de 2009, 17:02, Mark A. Jensen escribi?:
>>>>> Hey Jos?,
>>>>> The first thing that jumps out it the index file name. Looks
>>>>> like you create it as
>>>>> PC9.fasta.idx
>>>>> But you read it as
>>>>> PC9.fasta
>>>>> Not an unusual mistake. Do
>>>>> my $inx = Bio::Index::Fasta->new('PC9.fasta.idx');
>>>>> and see if it works.
>>>>> MAJ
>>>>> ----- Original Message -----
>>>>> From: <jluis.lavin at unavarra.es>
>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>> Sent: Thursday, November 05, 2009 10:46 AM
>>>>> Subject: [Bioperl-l] [Fwd: Re: A question about iBio::Index: and
>>>>> its
>>>>> correct
>>>>> use]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------- Mensaje original
>>>>> ----------------------------
>>>>> Subject: Re: [Bioperl-l] A question about iBio::Index: and its
>>>>> correct
>>>>> use
>>>>> From:    jluis.lavin at unavarra.es
>>>>> Fecha:   Jue, 5 de Noviembre de 2009, 16:46
>>>>> To:      "Mark A. Jensen" <maj at fortinbras.us>
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> Hi Mark,
>>>>>
>>>>> I?ve actually got two scripts, the first one is to create the
>>>>> index and
>>>>> the second one is to retrieve the sequence lis from the indexed
>>>>> file.
>>>>>
>>>>> 1)Here is the Index creation script:
>>>>>
>>>>> #!/c:/Perl -w
>>>>> use strict;
>>>>> use Bio::Index::Fasta;
>>>>> use strict;
>>>>>
>>>>> print "Enter file for indexing: \n";
>>>>> my $Index_File_Name = <STDIN>;
>>>>> my $inx = Bio::Index::Fasta->new(-filename =>
>>>>> $Index_File_Name.".idx",
>>>>>    -write_flag => 1);
>>>>> $inx->make_index(my $File_Name);
>>>>>
>>>>> 2)And here is the sequence retrieval script:
>>>>>
>>>>> #!/c:/Perl -w
>>>>> use Bio::Index::Fasta;
>>>>> use strict;
>>>>> #PC9.fasta is my genomic file
>>>>> my $Index_File_Name ="PC9.fasta";
>>>>> my $inx = Bio::Index::Fasta->new($Index_File_Name);
>>>>> #LCS.txt is my sequences list
>>>>> @ARGV = <lCS.txt>;
>>>>> foreach  my $id (@ARGV) {
>>>>> if ($id eq ''){
>>>>> die ("empty list")
>>>>> }
>>>>> else {
>>>>> my $seqobj = $inx->fetch($id);
>>>>> my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
>>>>> -format => 'fasta');
>>>>> $out->write_seq($seqobj);
>>>>> }
>>>>> }
>>>>> exit;
>>>>> }
>>>>>
>>>>> I hope this code is not a total scum...
>>>>>
>>>>> Thanks in advance ;)
>>>>>
>>>>>
>>>>>
>>>>> El Jue, 5 de Noviembre de 2009, 16:39, Mark A. Jensen escribi?:
>>>>>> Jos? -- It looks like this is a good solution to your problem.
>>>>>> Please
>>>>>> send
>>>>>> you
>>>>>> script so we can look at it-
>>>>>> cheers Mark
>>>>>> ----- Original Message -----
>>>>>> From: <jluis.lavin at unavarra.es>
>>>>>> To: <bioperl-l at lists.open-bio.org>
>>>>>> Sent: Thursday, November 05, 2009 10:28 AM
>>>>>> Subject: [Bioperl-l] A question about iBio::Index: and its
>>>>>> correct use
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello to all,
>>>>>>
>>>>>> I'm trying to write a script to retrieve a list of sequences
>>>>>> from a local FASTA file (for example, a FASTA archive where all
>>>>>> the protein models of an organism are stored). I would use this
>>>>>> file as a kind of "local database" (sorry if I am mixing up a
>>>>>> few concepts...)
>>>>>> I've been reading the BioPerl HOWTOs and I came across the
>>>>>> Bio::Index::Fasta tool.
>>>>>> If I didn't misunderstand what I read (which is quite possible,
>>>>>> given my low level of programming), this indexing tool should do
>>>>>> the job.
>>>>>> I wrote a couple of scripts based on the documentation I read
>>>>>> about this tool, but I don't seem to be able to create the index
>>>>>> file to be used later (to retrieve the sequences from).
>>>>>> -First of all, I want to ask the people in this forum whether
>>>>>> Bio::Index::Fasta is the right tool to choose for this task.
>>>>>> -Then I'd ask you to take a look at my scripts, because I can't
>>>>>> seem to catch the bug...
>>>>>>
>>>>>> Best wishes to you all and thanks in advance ;)
>>>>>>
>>>>>> --
>>>>>> José Luis Lavín Trueba, PhD
>>>>>>
>>>>>> Dpto. de Producción Agraria
>>>>>> Grupo de Genética y Microbiología
>>>>>> Universidad Pública de Navarra
>>>>>> 31006 Pamplona
>>>>>> Navarra
>>>>>> SPAIN
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Dr. José Luis Lavín Trueba
>>>>>
>>>>> Dpto. de Producción Agraria
>>>>> Grupo de Genética y Microbiología
>>>>> Universidad Pública de Navarra
>>>>> 31006 Pamplona
>>>>> Navarra
>>>>> SPAIN
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Dr. José Luis Lavín Trueba
>>>>
>>>> Dpto. de Producción Agraria
>>>> Grupo de Genética y Microbiología
>>>> Universidad Pública de Navarra
>>>> 31006 Pamplona
>>>> Navarra
>>>> SPAIN
>>>>
>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>> --
>> Dr. José Luis Lavín Trueba
>>
>> Dpto. de Producción Agraria
>> Grupo de Genética y Microbiología
>> Universidad Pública de Navarra
>> 31006 Pamplona
>> Navarra
>> SPAIN
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Dr. José Luis Lavín Trueba

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN
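
For reference, here is a minimal corrected sketch of the two scripts quoted above, combined into one file. It is an illustration under stated assumptions, not the poster's final code: the file names are taken from the command line, and the helper names read_ids, build_index and extract_seqs are made up for this example. The changes from the quoted scripts are that make_index() is given the FASTA file name itself, the index is reopened with -filename, the ID list is read from the file line by line (with whitespace trimmed) rather than globbed, and the output stream is opened once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read sequence IDs from an open filehandle: one ID per line,
# surrounding whitespace trimmed, blank lines skipped.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;
        push @ids, $line if length $line;
    }
    return @ids;
}

# Build (or update) an index file for a FASTA file.
sub build_index {
    my ( $fasta_file, $index_file ) = @_;
    require Bio::Index::Fasta;    # loaded on demand; needs BioPerl installed
    my $inx = Bio::Index::Fasta->new(
        -filename   => $index_file,
        -write_flag => 1,
    );
    $inx->make_index($fasta_file);    # index the FASTA file itself
    return $index_file;
}

# Fetch each ID from the index and write the hits to one FASTA file.
sub extract_seqs {
    my ( $index_file, $out_file, @ids ) = @_;
    require Bio::Index::Fasta;
    require Bio::SeqIO;
    my $inx = Bio::Index::Fasta->new( -filename => $index_file );
    my $out = Bio::SeqIO->new( -file => ">$out_file", -format => 'fasta' );
    for my $id (@ids) {
        my $seq = $inx->fetch($id);
        if   ($seq) { $out->write_seq($seq) }
        else        { warn "ID '$id' not found in index\n" }
    }
    return;
}

# Usage (hypothetical): perl extract.pl PC9.fasta LCS.txt extracted_seqs.fasta
if ( @ARGV == 3 ) {
    my ( $fasta_file, $list_file, $out_file ) = @ARGV;
    open my $list_fh, '<', $list_file or die "Cannot open $list_file: $!";
    my @ids = read_ids($list_fh);
    close $list_fh;
    die "empty list\n" unless @ids;
    my $index_file = build_index( $fasta_file, "$fasta_file.idx" );
    extract_seqs( $index_file, $out_file, @ids );
}
```

The IDs in the list must match the keys stored in the index, which by default should be the first word after ">" in each FASTA header.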


From maj at fortinbras.us  Wed Nov 11 23:48:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 11 Nov 2009 18:48:33 -0500
Subject: [Bioperl-l] Maq assembly wrapper ready for beta testing
Message-ID: <4057E5A862B845EA8BB153888075590C@NewLife>

Hi All-

New modules are available in the core and in bioperl-run for
working with Heng Li's short read assembler "maq"
(http://maq.sourceforge.net/maq-man.shtml). Bio::Tools::Run::Maq
allows a quick assembly call with a canned maq pipeline, and also
allows individual maq commands to be called separately. 
It uses Bio::Assembly::IO::maq  (a read-only module) to deliver
a Bio::Assembly::Scaffold from maq output. 

If you're interested, see
http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_maq
and update your core and bioperl-run. The code inherits from Florent's
excellent new Bio::Tools::Run::AssemblerBase -- kudos to him!!

Tests are in bioperl-run/trunk/t/Maq.t; see them for myriad examples.
Send me the bugs.
MAJ


From clarsen at vecna.com  Thu Nov 12 17:22:26 2009
From: clarsen at vecna.com (Chris Larsen)
Date: Thu, 12 Nov 2009 12:22:26 -0500
Subject: [Bioperl-l] Polyproteins, ribo slippage,
	and mat_peptide in  viruses?
In-Reply-To: <320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
References: <B0218AEF-3CEB-4E06-B8DF-7B302D024797@vecna.com>
	<320fb6e00910271029m26f07564l727fb78adae81c11@mail.gmail.com>
Message-ID: <7BBAE077-4D76-46C2-BF66-363F5A017278@vecna.com>

All,

This is a short followup on the prior thread of discussion, regarding
computing mature peptide sequences for viruses. The topic has gone
underwater for the time being as we solve some problems with source
data. The biopython effort and contributors on this board have
given good guidance, and we now have scripts that function (thanks
mostly to pcock); however, the source data on which everything relies
is suspect:

   mat_peptide	15118..16914	<===
		/product="nsp13"	
		/note="helicase"
I can tell you the virus community does not want to rely heavily on
those position numbers. Furthermore, we have found fewer complete source
genomes for viruses than for bacteria, and more virus-to-virus variation in
the data fields annotated in the GBK file (Gene, CDS, ORF, Protein,
Polyprotein, mat_peptide, db_xref). In fact, the community will have
to come together significantly on how these molecules are defined in
public repositories before a mature scripting effort becomes
reliable, public and well received. Because of the variation in
viruses, it's not even clear at this point what a 'gene' is. I will
let you know how we proceed when more sequence data has been fully
analyzed, and we can think about making any perl based solution a new
viral protein module.

Thanks,

Chris

-- 

Christopher Larsen, Ph.D.
Sr. Scientist / Grants Manager
Vecna Technologies
6404 Ivy Lane #500
Greenbelt, MD 20770
Phone: (240) 965-4525
Fax: (240) 547-6133
240-737-4525


From David.Messina at sbc.su.se  Thu Nov 12 19:20:54 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 12 Nov 2009 20:20:54 +0100
Subject: [Bioperl-l] highest PAML version supported?
Message-ID: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>

Hi everyone,

What is the latest version of PAML (specifically codeml) that I can use with
bioperl-live and bioperl-run?

I looked around and couldn't find where (or if) this is documented.


With PAML version 4.3a against the current trunk of both -live and -run I
see this:
------------- EXCEPTION Bio::Root::NotImplemented -------------
MSG: Unknown format of PAML output did not see seqtype
STACK Bio::Tools::Phylo::PAML::_parse_summary
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
STACK Bio::Tools::Phylo::PAML::next_result
/Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
STACK toplevel ../bin/cluster_kaks:251
---------------------------------------------------------------

...which I suspect (but haven't confirmed) is due to a change in the file
format.


Dave


From jason at bioperl.org  Thu Nov 12 19:29:22 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 12 Nov 2009 11:29:22 -0800
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
Message-ID: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>

prolly 3.15 or so.

it really needs a maintainer!!!

On Nov 12, 2009, at 11:20 AM, Dave Messina wrote:

> Hi everyone,
>
> What is the latest version of PAML (specifically codeml) that I can  
> use with
> bioperl-live and bioperl-run?
>
> I looked around and couldn't find where (or if) this is documented.
>
>
> With PAML version 4.3a against the current trunk of both -live and - 
> run I
> see this:
> ------------- EXCEPTION Bio::Root::NotImplemented -------------
> MSG: Unknown format of PAML output did not see seqtype
> STACK Bio::Tools::Phylo::PAML::_parse_summary
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:461
> STACK Bio::Tools::Phylo::PAML::next_result
> /Users/dave/src/bioperl-live/Bio/Tools/Phylo/PAML.pm:270
> STACK toplevel ../bin/cluster_kaks:251
> ---------------------------------------------------------------
>
> ...which I suspect (but haven't confirmed) is due to a change in the  
> file
> format.
>
>
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From scott at scottcain.net  Fri Nov 13 14:48:43 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 13 Nov 2009 09:48:43 -0500
Subject: [Bioperl-l] January GMOD meeting announcement
Message-ID: <4536f7700911130648j40eb2d82g2594adaccf476d73@mail.gmail.com>

Hello,

I am pleased to announce that the January GMOD meeting will be taking
place on January 14 and 15 in San Diego at the Best Western Seven Seas
(the same location as last year).  Please see this page for
registration information:

  http://gmod.org/wiki/January_2010_GMOD_Meeting

When you go to that page, please take a moment to add suggestions for
the agenda.  There is no registration fee for this meeting, however
there is limited space, so please register early.

The proprietors of the Best Western have given us an excellent room
rate, and extended it to the previous week, so that people attending
the GMOD meeting and the Plant and Animal Genome meeting before it may
stay at the Best Western the entire time.

Please direct follow up questions to the gmod-devel mailing list:
https://lists.sourceforge.net/lists/listinfo/gmod-devel

Thanks and I look forward to seeing you in San Diego!
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From j.inoue at ucl.ac.uk  Sat Nov 14 19:20:29 2009
From: j.inoue at ucl.ac.uk (Jun Inoue)
Date: Sat, 14 Nov 2009 19:20:29 +0000
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
Message-ID: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>

Dear All,

I just started to learn BioPerl for phylogenetics.
I usually use perl v5.10.0 on Mac OS 10.5.8.
I would like to ask for a hint on how to calculate the branch lengths
from root to tip for all species in a tree in Newick format.

Please see the following web site, where
I explain what I want to do and
show my simple script (not completed).
http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html

Thank you for your help.

Best,
Jun Inoue
http://www.geocities.jp/ancientfishtree/index_eng.html


From maj at fortinbras.us  Sat Nov 14 21:47:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 14 Nov 2009 16:47:37 -0500
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths
In-Reply-To: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
References: <F05F4263-D708-492A-A536-74059687279C@ucl.ac.uk>
Message-ID: <3BC179984D5E49868C4F12D181D82B8D@NewLife>

Hi Jun,

Some hints: incorporate

@leaves = $tree->get_leaf_nodes;

and

use Bio::Tree::TreeFunctionsI;
$distance = $tree->distance( $node_a, $node_b );

cheers, Mark
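
A minimal sketch of how these hints could fit together, assuming a Newick file given on the command line. The helper name root_to_tip_length and the script name are made up for this example. Instead of calling distance(), it sums branch lengths while walking from each leaf up to the root with the node methods ancestor() and branch_length(); an undefined branch length is counted as 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum branch lengths from a node up to the root by following ancestor().
# Works with any object offering ancestor() and branch_length(), as
# Bio::Tree::NodeI objects do; an undefined branch length counts as 0.
sub root_to_tip_length {
    my ($node) = @_;
    my $total = 0;
    while ( defined( my $parent = $node->ancestor ) ) {
        $total += $node->branch_length || 0;
        $node = $parent;
    }
    return $total;
}

# Usage (hypothetical): perl root_to_tip.pl tree.nwk
if ( @ARGV == 1 ) {
    require Bio::TreeIO;    # loaded on demand; needs BioPerl installed
    my $in = Bio::TreeIO->new( -file => $ARGV[0], -format => 'newick' );
    while ( my $tree = $in->next_tree ) {
        for my $leaf ( $tree->get_leaf_nodes ) {
            my $name = defined $leaf->id ? $leaf->id : '(unnamed)';
            printf "%s\t%g\n", $name, root_to_tip_length($leaf);
        }
    }
}
```

Calling $tree->distance( -nodes => [ $tree->get_root_node, $leaf ] ) for each leaf should give the same numbers; the explicit walk is shown only because it makes the calculation visible.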

----- Original Message ----- 
From: "Jun Inoue" <j.inoue at ucl.ac.uk>
To: <bioperl-l at lists.open-bio.org>
Cc: "?? ?" <j.inoue at ucl.ac.uk>
Sent: Saturday, November 14, 2009 2:20 PM
Subject: [Bioperl-l] Bio::TreeIO, Root-tip branch lengths


> Dear All,
>
> I just started to learn BioPerl for phylogenetics.
> Usually I am using perl v5.10.0 on my Mac OS 10.5.8.
> I would like to ask you a hint to calculate the Branch lengths
> from root to tip for all species in NEWICK TREE format.
>
> Please see the following web site.
> I am explaining what I want to do and
> showing my easy script (not completed).
> http://www.geocities.jp/ancientfishtree/BioPerl_BLRootTip.html
>
> Thank you for your help.
>
> Best,
> Jun Inoue
> http://www.geocities.jp/ancientfishtree/index_eng.html
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From jay at jays.net  Mon Nov 16 01:23:38 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 15 Nov 2009 19:23:38 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
Message-ID: <F8052B51-85FB-44B9-9254-9AD1E964FA7B@jays.net>

On Nov 9, 2009, at 9:55 PM, Chris Fields wrote:
> It should work via id_parser(); from Bio::Index::GenBank:
> 
>   $inx->id_parser(\&get_id);
>   # make the index
>   $inx->make_index($file_name);
> 
>   # here is where the retrieval key is specified
>   sub get_id {
>      my $line = shift;
>      $line =~ /clone="(\S+)"/;
>      $1;
>   }

This worked great for me today (tackling a different problem than the original).  Thanks!!

j
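
A small variation on the quoted get_id, as a sketch only. It assumes, as the quoted example implies, that the parser is handed one line of the GenBank file at a time. The helper name make_qualifier_parser is made up for this example. It builds a parser for any qualifier name and returns an empty list when the line does not match, so a leftover $1 from an earlier match is never returned as a key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an id_parser callback for a given feature qualifier, e.g. 'clone'.
# The callback returns the quoted value found on the line, or an empty
# list when the line does not carry that qualifier.
sub make_qualifier_parser {
    my ($qualifier) = @_;
    my $re = qr{/\Q$qualifier\E="([^"]+)"};
    return sub {
        my ($line) = @_;
        return $line =~ $re ? $1 : ();
    };
}

# Usage (hypothetical): perl index_by_clone.pl records.gbk records.idx
if ( @ARGV == 2 ) {
    my ( $genbank_file, $index_file ) = @ARGV;
    require Bio::Index::GenBank;    # loaded on demand; needs BioPerl installed
    my $inx = Bio::Index::GenBank->new(
        -filename   => $index_file,
        -write_flag => 1,
    );
    $inx->id_parser( make_qualifier_parser('clone') );
    $inx->make_index($genbank_file);
}
```

Unlike the \S+ in the quoted version, this pattern accepts values containing spaces, but it will still miss a value that is wrapped across two lines. A key shared by several records (an organism name, for instance) may not be enough to retrieve all of them, so unique keys are the safer choice.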


From veronica.xiaoyu at gmail.com  Fri Nov 13 20:35:48 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Fri, 13 Nov 2009 15:35:48 -0500
Subject: [Bioperl-l] Bio::Graphics::Panel question
Message-ID: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>

Hi,

I'm using Bio::Graphics to parse the blast result and generate images. But
sometimes, in the middle of the output image, the hit's color is white,
even though I set it to other colors. I attached the picture here as an
example. This doesn't occur all the time; usually, it works well. I'm
wondering if I did something wrong, or whether it depends on the blast result?

Thank you,
Xiaoyu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BLAST_problem.jpg
Type: image/jpeg
Size: 51888 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091113/57550aa9/attachment-0004.jpg>

From ryan_bogard at hms.harvard.edu  Mon Nov 16 03:30:22 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Sun, 15 Nov 2009 19:30:22 -0800 (PST)
Subject: [Bioperl-l]  Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
Message-ID: <26366421.post@talk.nabble.com>


In advance, any advice would be greatly appreciated! I have installed
bioperl-588pm via fink but I am having difficulties calling the modules in
a script. The following is added to .profile (bash):
PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB

If I change this to /sw/lib/perl5 then I get an @INC error, as use Bio::PERL
cannot be located.

The environment variables are as follows:

MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
INFOPATH=/sw/share/info:/sw/info:/usr/share/info


This is the perl script I'm attempting to run:
#!/sw/bin/perl5.8.8
use strict;
use Bio::Perl;
my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

Here is the error output:

dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

dyld: Symbol not found: _Perl_Tstack_sp_ptr
  Referenced from:
/sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
  Expected in: dynamic lookup

Trace/BPT trap

I have looked through many forum postings and attempted the solutions
offered in those instances, but none seem to work in my case. I'm not sure
if it's because I have perl 5.10.0 installed while attempting to call
bioperl 5.8.8; however, others seem to have it working just fine.

Thank you, Ryan 
-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From e.osimo at gmail.com  Mon Nov 16 07:04:40 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 16 Nov 2009 08:04:40 +0100
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>

Hello Ryan,
unfortunately, if you upgraded to 10.6 without formatting, I have to tell
you that you'll be in big trouble with perl and with everything you
installed from the command line, because in the upgrade process everything
in the system folders (perl and bioperl included) is
erased without being uninstalled, so you'll find a lot of folders with the
same name but no contents.
I suggest that you, as I did, format your computer and reinstall 10.6 from
scratch. Then you'll be able to install mysql (I had to install
mysql-5.4.3-beta-osx10.5, the only one that works on 10.6), and, working with
the perl 5.10 that is already installed, you'll install bioperl with no effort.
Bye
Emanuele

On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu> wrote:

>
> In advance, any advice would be greatly appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL
> cannot be located.
>
> The environment variables are as follows:
>
>
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>
>
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
>
> Here is the error output:
>
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>  Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>  Expected in: dynamic lookup
>
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>  Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>  Expected in: dynamic lookup
>
> Trace/BPT trap
>
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
>
> Thank you, Ryan
> --
> View this message in context:
> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From ryan_bogard at hms.harvard.edu  Mon Nov 16 13:43:19 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 05:43:19 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <26372079.post@talk.nabble.com>


The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if I
will have the same issues, but it's worth a shot as I have little on my
computer and reinstalling to start over wouldn't be too difficult. What
method did you use to install bioperl? I used fink and I am not sure the
available stable version is the one I need. I will install from the command
line this time around, and let you know how it turns out.

Thank you!


Emanuele Osimo wrote:
> 
> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process
> everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.
> I suggest you, as I did, to format your pc and reinstall 10.6 from
> scratch.
> Then you'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with
> perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele
> 
> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu>
> wrote:
> 
>>
>> In advance, any advice would be greatly appreciated! I have installed
>> bioperl-588pm via fink but I am having difficulties calling the modules
>> in
>> script. The following is added to .profile (bash):
>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>>
>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>> Bio::PERL
>> cannot be located.
>>
>> The environment variables are as follows:
>>
>>
>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>>
>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>>
>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>>
>>
>> This is the perl script I'm attempting to run:
>> #!/sw/bin/perl5.8.8
>> use strict;
>> use Bio::Perl;
>> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>>
>> Here is the error output:
>>
>> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>>  Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>  Expected in: dynamic lookup
>>
>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>>  Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>  Expected in: dynamic lookup
>>
>> Trace/BPT trap
>>
>> I have looked through many forum postings and attempted the solutions
>> offered in those instances, but none seem to work in my case. I'm not
>> sure
>> if it's because I have perl 5.10.0 installed while attempting to call
>> bioperl 5.8.8; however, others seem to have it working just fine.
>>
>> Thank you, Ryan
>> --
>> View this message in context:
>> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26372079.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From maj at fortinbras.us  Mon Nov 16 13:48:17 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 16 Nov 2009 08:48:17 -0500
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26372079.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
Message-ID: <8D822081B13F49C2A37677D3A47F38B4@NewLife>

Ryan,
I'm not a mac person, but Koen has said (see 
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you 
want.
cheers
Mark
----- Original Message ----- 
From: "rbogard" <ryan_bogard at hms.harvard.edu>
To: <Bioperl-l at lists.open-bio.org>
Sent: Monday, November 16, 2009 8:43 AM
Subject: Re: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)


>
> The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if I
> will have the same issues, but it's worth a shot as I have little on my
> computer and reinstalling to start over wouldn't be too difficult. What
> method did you use to install bioperl? I used fink and I am not sure the
> available stable version is the one I need. I will install from the command
> line this time around, and let you know how it turns out.
>
> Thank you!
>
>
>
> Emanuele Osimo wrote:
>>
>> Hello Ryan,
>> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
>> you that you'll be in big trouble with perl and with everything you
>> installed from the commandline... Because in the upgrade process
>> everything
>> in the system folders, perl and bioperl being some of these things, is
>> erased without being uninstalled, so you'll find a lot of folders with the
>> same name but no contents.
>> I suggest you, as I did, to format your pc and reinstall 10.6 from
>> scratch.
>> Then you'll be able to install mysql (I had to install
>> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with
>> perl
>> 5.10 that is already installed, you'll install bioperl with no effort.
>> Bye
>> Emanuele
>>
>> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu>
>> wrote:
>>
>>>
>>> In advance, any advice would be greatly appreciated! I have installed
>>> bioperl-588pm via fink but I am having difficulties calling the modules
>>> in
>>> script. The following is added to .profile (bash):
>>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>>>
>>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>>> Bio::PERL
>>> cannot be located.
>>>
>>> The environment variables are as follows:
>>>
>>>
>>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>>>
>>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>>>
>>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>>>
>>>
>>> This is the perl script I'm attempting to run:
>>> #!/sw/bin/perl5.8.8
>>> use strict;
>>> use Bio::Perl;
>>> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>>>
>>> Here is the error output:
>>>
>>> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>>>  Referenced from:
>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>  Expected in: dynamic lookup
>>>
>>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>>>  Referenced from:
>>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>>>  Expected in: dynamic lookup
>>>
>>> Trace/BPT trap
>>>
>>> I have looked through many forum postings and attempted the solutions
>>> offered in those instances, but none seem to work in my case. I'm not
>>> sure
>>> if it's because I have perl 5.10.0 installed while attempting to call
>>> bioperl 5.8.8; however, others seem to have it working just fine.
>>>
>>> Thank you, Ryan
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
>>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> -- 
> View this message in context: 
> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26372079.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From cjfields at illinois.edu  Mon Nov 16 15:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:00:09 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
Message-ID: <49681E01-E95D-4FC6-AE42-6E57ED43AAA2@illinois.edu>

On Nov 16, 2009, at 1:04 AM, Emanuele Osimo wrote:

> Hello Ryan,
> unfortunately, if you upgraded to 10.6 without formatting, I have to tell
> you that you'll be in big trouble with perl and with everything you
> installed from the commandline... Because in the upgrade process everything
> in the system folders, perl and bioperl being some of these things, is
> erased without being uninstalled, so you'll find a lot of folders with the
> same name but no contents.

> I suggest you, as I did, to format your pc and reinstall 10.6 from scratch.
> Then you'll be able to install mysql (I had to install
> mysql-5.4.3-beta-osx10.5, the only to work on 10.6), and, working with perl
> 5.10 that is already installed, you'll install bioperl with no effort.
> Bye
> Emanuele

Just starting from scratch isn't always the best solution (though it is the cleanest).  In this case I don't think anything you mention applies, as there are conflicting symbols being reported.  My guess is conflicting perl builds, probably between your system 5.10.0 (snow leopard) and your fink-installed perl 5.8.8 (they are binary incompatible).  Also, remember that snow leopard is primarily 64-bit, so it might be best to try working out whether your fink is attempting to compile 64- vs 32-bit.  

In this case, I would just uninstall the fink-based perl and either use the system one (snow leopard = 5.10.0), or roll your own and install 5.10.1 locally or in /usr/local.  Do NOT replace the system one, as that will likely break your OS.

In my experience, and not to bash on fink or MacPorts, I never had much luck with their perl installs.  Unless I plan on only using fink or macports for my OS (not likely in my case), I find they tend to cause problems in the long term unless one uses them to install packages with very few dependencies, and even then you need to make sure fink is configured to compile the correct binary.  For instance, they're fairly good for gd, libxml2, etc., but beyond that one may get into issues with odd, version-specific dependencies with some packages, such as relying on perl 5.8.8 (but not perl 5.10.x), db42 (instead of db44), etc.  I've ended up in the past with 2-3 different perl versions, berkeley db versions, etc.

chris

> On Mon, Nov 16, 2009 at 04:30, rbogard <ryan_bogard at hms.harvard.edu> wrote:
> 
>> 
>> In advance, any advice would be greatly appreciated! I have installed
>> bioperl-588pm via fink but I am having difficulties calling the modules in
>> script. The following is added to .profile (bash):
>> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
>> 
>> If I change this to /sw/lib/perl5 then I get an @INC error, as use
>> Bio::PERL
>> cannot be located.
>> 
>> The environment variables are as follows:
>> 
>> 
>> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
>> 
>> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
>> 
>> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
>> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
>> 
>> 
>> This is the perl script I'm attempting to run:
>> #!/sw/bin/perl5.8.8
>> use strict;
>> use Bio::Perl;
>> my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
>> write_sequence(">roa1.fasta",'fasta',$seq_object);
>> 
>> Here is the error output:
>> 
>> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>> Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>> Expected in: dynamic lookup
>> 
>> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>> Referenced from:
>> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>> Expected in: dynamic lookup
>> 
>> Trace/BPT trap
>> 
>> I have looked through many forum postings and attempted the solutions
>> offered in those instances, but none seem to work in my case. I'm not sure
>> if it's because I have perl 5.10.0 installed while attempting to call
>> bioperl 5.8.8; however, others seem to have it working just fine.
>> 
>> Thank you, Ryan
>> --
>> View this message in context:
>> http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26366421.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 


From cjfields at illinois.edu  Mon Nov 16 15:01:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 16 Nov 2009 09:01:01 -0600
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <8D822081B13F49C2A37677D3A47F38B4@NewLife>
References: <26366421.post@talk.nabble.com><2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
Message-ID: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>

Actually, why not just install via CPAN?  Any particular reason?

chris

On Nov 16, 2009, at 7:48 AM, Mark A. Jensen wrote:

> Ryan,
> I'm not a mac person, but Koen has said (see http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink )
> to use the unstable tree to get BioPerl 1.6.1, which is likely to be what you want.
> cheers
> Mark
> ----- Original Message ----- From: "rbogard" <ryan_bogard at hms.harvard.edu>
> To: <Bioperl-l at lists.open-bio.org>
> Sent: Monday, November 16, 2009 8:43 AM
> Subject: Re: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
> 
> 
>> 
>> The Mac OS X 10.6 was a fresh install on a new Mac Book Pro. Not sure if I
>> will have the same issues, but it's worth a shot as I have little on my
>> computer and reinstalling to start over wouldn't be too difficult. What
>> method did you use to install bioperl? I used fink and I am not sure the
>> available stable version is the one I need. I will install from the command
>> line this time around, and let you know how it turns out.
>> 
>> Thank you!


From Kevin.M.Brown at asu.edu  Mon Nov 16 15:49:13 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 08:49:13 -0700
Subject: [Bioperl-l] Bio::Graphics::Panel question
In-Reply-To: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
References: <a6a926b50911131235p3b04b59fs634593f275300ed6@mail.gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B40663EDB9@EX02.asurite.ad.asu.edu>

To really be able to tell if this was a bug, I (and probably the real
devs) would need to see that part of your code and the Blast file that
is having this issue as it could be your callback for color choice vs
the blast object (e.g. your color picker is missing an option that the
data comes in with and so returns with a blank value).

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Xiaoyu Liang
Sent: Friday, November 13, 2009 1:36 PM
To: Bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Bio::Graphics::Panel question

Hi,

I'm using Bio::Graphics to parse BLAST results and generate images.
But sometimes, in the middle of the output image, a hit's color is
white, even though I set it to another color. I attached a picture here
as an example. This doesn't happen all the time; usually it works well.
I'm wondering whether I did something wrong, or whether it depends on the BLAST result?

Thank you,
Xiaoyu


From ryan_bogard at hms.harvard.edu  Mon Nov 16 16:57:16 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 08:57:16 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
References: <26366421.post@talk.nabble.com>
	<2ac05d0f0911152304v58985cb5x6ea0501bff7a41ab@mail.gmail.com>
	<26372079.post@talk.nabble.com>
	<8D822081B13F49C2A37677D3A47F38B4@NewLife>
	<58912861-CD59-4AFC-8F30-B0AA2E77AECB@illinois.edu>
Message-ID: <26375418.post@talk.nabble.com>


I read that posting by Koen and used the unstable tree after the first
attempt; however, the errors persisted. I just finished a fresh
install and I will follow Mr. Fields's advice and use CPAN.
Thank you all for the help!


Chris Fields-5 wrote:
> 
> Actually, why not just install via CPAN?  Any particular reason?
> 
> chris

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26375418.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From krishna.aneesh at gmail.com  Mon Nov 16 07:00:15 2009
From: krishna.aneesh at gmail.com (Aneesh K)
Date: Mon, 16 Nov 2009 12:30:15 +0530
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
Message-ID: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>

Hi,

I just started using BioPerl modules; they are really useful and interesting.
Now I am stuck on "Tree objects and phylogenetic trees".
I couldn't find any documentation or examples about reading/parsing phylip tree
files.

Please tell me where I can get some sample code for this.

Waiting for your reply.

Thanks
Aneesh.K
Mob. 09646181517


From David.Messina at sbc.su.se  Mon Nov 16 17:33:36 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 16 Nov 2009 18:33:36 +0100
Subject: [Bioperl-l] highest PAML version supported?
In-Reply-To: <D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
References: <628aabb70911121120w4c609056v50204b9bd9e5c3fb@mail.gmail.com>
	<D44511CE-D632-4CBE-9DD1-1B68EFD70124@bioperl.org>
Message-ID: <B0AEE42A-A40A-4BB9-9A1C-98381CBB4CA9@sbc.su.se>

Hi everyone,

I just committed support for parsing codeml 4.3a (August 2009) to bioperl-live. I added new tests and all PAML-related tests pass, but please report any problems you have to the list.

Note that I haven't tested the other PAML 4.3a executables to see if there are format changes with those. If you get the chance to try any and it doesn't work, let me know and I'll try to add support for them.

(Note that these changes are only to the PAML parsing code; Bio::Tools::Run already appears to handle 4.3a just fine.)


Dave


From jason at bioperl.org  Mon Nov 16 17:34:57 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Nov 2009 09:34:57 -0800
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <D1D4E0B9-4741-4D45-84B6-6BB57B6E2B1E@bioperl.org>

Is this at all helpful for your question?
http://www.bioperl.org/wiki/HOWTO:Trees

The trees are in 'newick' (or New Hampshire) format, though; I don't think
there is a phylip format for trees.
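
As a rough sketch of what reading such a tree looks like with Bio::TreeIO (this assumes BioPerl is installed; the helper name leaf_ids_from_newick and the example tree string are made up for illustration, and the HOWTO above remains the real reference):

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Hypothetical helper: return the leaf (tip) labels of every tree found
# in a Newick-formatted string.
sub leaf_ids_from_newick {
    my ($newick) = @_;

    # Read from an in-memory filehandle so the example needs no file on disk.
    open my $fh, '<', \$newick or die "Cannot open in-memory handle: $!";
    my $in = Bio::TreeIO->new(-format => 'newick', -fh => $fh);

    my @ids;
    while (my $tree = $in->next_tree) {
        # get_leaf_nodes returns the tip nodes; id is the taxon label.
        push @ids, map { $_->id } $tree->get_leaf_nodes;
    }
    return @ids;
}

# A made-up three-taxon tree with branch lengths.
print join(',', leaf_ids_from_newick('((A:0.1,B:0.2):0.05,C:0.3);')), "\n";
```

To read from a file instead, passing -file => 'your_tree.nwk' (a placeholder name) in place of -fh should work the same way.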

-jason
On Nov 15, 2009, at 11:00 PM, Aneesh K wrote:

> Hi,
>
> I just started to use Bioperl modules. It's really useful and  
> interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing  
> phylip tree
> files.
>
> Please tell me from where I can get some sample codes for this.
>
> Waiting for your reply.
>
> Thanks
> Aneesh.K
> Mob. 09646181517
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From roy.chaudhuri at gmail.com  Mon Nov 16 17:31:49 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Mon, 16 Nov 2009 17:31:49 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
Message-ID: <4B018C85.6020801@gmail.com>

Hi Aneesh,

See the Bioperl trees howto:
http://www.bioperl.org/wiki/HOWTO:Trees

Roy.

Aneesh K wrote:
> Hi,
> 
> I just started to use Bioperl modules. It's really useful and interesting.
> Now I have in stuck with "Tree objects and phylogenetic trees".
> I couldn't get any documentation/examples about reading/parsing phylip tree
> files.
> 
> Please tell me from where I can get some sample codes for this.
> 
> Waiting for your reply.
> 
> Thanks
> Aneesh.K
> Mob. 09646181517


-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From Kevin.M.Brown at asu.edu  Mon Nov 16 18:22:07 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Nov 2009 11:22:07 -0700
Subject: [Bioperl-l] FW:  Bio::Graphics::Panel question
Message-ID: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>

Please keep your responses on the list for more timely help.
 

Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University 

 
________________________________

From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com] 
Sent: Monday, November 16, 2009 9:34 AM
To: Kevin Brown
Subject: Re: [Bioperl-l] Bio::Graphics::Panel question


Hi Kevin, 

Thank you for your quick response. I attached the BLAST .out file here,
and the relevant part of my code follows. I have an array keeping the color for
each hit, and when I printed the array out, nothing was missing.

my $track = $panel->add_track(
                              -glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                $feature = shift;
                                #print "--".$feature."\n";
                                return unless $feature->has_tag('description');
                                my ($description) = $feature->each_tag_value('description');
                                my ($id) = $feature->display_name;
                                my @records= split(/\|/,$description);
                                my $score = $feature->score;
                                #print $id.":".$score."\n";
                                if($score >=200){
                                        push (@color_array,1);
                                }elsif($score >=80){
                                        push (@color_array,2);
                                }elsif($score >=50){
                                        push (@color_array,3);
                                }elsif($score >= 40){
                                        push (@color_array,4);
                                }else{
                                        push (@color_array,5);
                                }

                                if($type == 1){
                                        "Species:Arabidopsis TF Family:$records[1] Score=$score";
                                }elsif($type == 2){
                                        if(scalar(@records)==5){
                                                "Species:$records[1] TF Family:$records[2] Accepted Name:$records[3] Score=$score";
                                        }else{
                                                "Species:$records[1] TF Family:$records[2] Score=$score";
                                        }
                                }else{
                                        "";
                                }
                               },
                               -bgcolor => sub{
                                        return unless $feature->has_tag('description');
                                        if($color_array[$index] == 1 ){
                                                $color = 'red';
                                        }
                                        if($color_array[$index]== 2){
                                                $color = 'orange';
                                        }
                                        if($color_array[$index]== 3){
                                                $color = 'green';
                                        }
                                        if($color_array[$index]== 4){
                                                $color = 'blue';
                                        }
                                        if($color_array[$index]== 5){
                                                $color = 'black';
                                        }
                                        #if ($index == 20){
                                        #        $color = 'black';
                                        #}
                                        #print $index."--".$color_array[$index]."\n";
                                        $index++;

                                        #print $feature."\n";
                                        #print $feature->display_name."\n";
                                        return $color;
                               },
                             );


Best regards,
Xiaoyu


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1258388779.out
Type: application/octet-stream
Size: 32599 bytes
Desc: 1258388779.out
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20091116/cb23e40d/attachment-0004.obj>

From paolo.pavan at gmail.com  Mon Nov 16 19:06:06 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 16 Nov 2009 20:06:06 +0100
Subject: [Bioperl-l] bioperl-ext installation issue
Message-ID: <56be91b60911161106w69e20fd9k133a465e8d4f8a3f@mail.gmail.com>

Hi everybody,
I have problems installing the bioperl-ext package, any help is much
appreciated.
1)

   - I start trying with cpan i /bioperl-ext/ the only resource available is
   /B/BI/BIRNEY/bioperl-ext-1.4 (is it ok?)
   - I install Inline::MakeMaker and Inline::C then
   - i/BIRNEY/bioperl-ext-1.4/ fails bacause I don't have staden package

2) I try to install io_lib-1.8.10.tar as suggested by the README (
ftp://ftp.mrc-lmb.cam.ac.uk/pub/staden/io_lib/), installation fails after:
...
gcc -g -O2 -o makeSCF makeSCF.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o extract_seq.o `test -f extract_seq.c || echo './'`extract_seq.c
/bin/sh ../libtool --mode=link gcc  -g -O2   -o extract_seq  extract_seq.o
../read/libread.la
gcc -g -O2 -o extract_seq extract_seq.o ../read/.libs/libread.a -lz -lm
../read/.libs/libread.a(compress.o): In function `fopen_compressed':
/root/Download/staden/io_lib-1.8.10/utils/compress.c:321: warning: the use
of `tempnam' is dangerous, better use `mkstemp'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I../read -I../alf -I../abi -I../ctf
-I../ztr -I../plain -I../scf -I../exp_file -I../utils  -I/usr/local/include
-g -O2 -c -o index_tar.o `test -f index_tar.c || echo './'`index_tar.c
index_tar.c: In function 'main':
index_tar.c:12: error: two or more data types in declaration specifiers
make[2]: *** [index_tar.o] Error 1
make[2]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10/progs'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/root/Download/staden/io_lib-1.8.10'
make: *** [all-recursive-am] Error 2

3) I gave up on staden, because what I actually need is pSW, and tried to install from
Makefile.PL in Bio/Ext/Align, but the installation fails after:
...
Align.xs:18: warning: 'not_here' defined but not used
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
gcc  -shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic Align.o  -o
../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a    \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local
symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory
`/home/root/.cpan/sources/authors/id/B/BI/BIRNEY/bioperl-ext-1.4/Bio/Ext/Align'
make: *** [subdirs] Error 2

I have also tried other things, such as a force install of Bio::Ext::Align, without
success, but I'm sure I'm missing something trivial that I can't spot.
Can someone help me?

Thank you,
Paolo


From lincoln.stein at gmail.com  Mon Nov 16 20:08:20 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 16 Nov 2009 15:08:20 -0500
Subject: [Bioperl-l] FW: Bio::Graphics::Panel question
In-Reply-To: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B40663EE37@EX02.asurite.ad.asu.edu>
Message-ID: <6dce9a0b0911161208q2f826d83s319184f0cacca097@mail.gmail.com>

Hi,

I think you should modify your color selection code as follows:


                                       if($color_array[$index] == 1 ){
                                               $color = 'red';
                                       }
                                       elsif($color_array[$index]== 2){
                                               $color = 'orange';
                                       }
                                       elsif($color_array[$index]== 3){
                                               $color = 'green';
                                       }
                                       elsif($color_array[$index]== 4){
                                               $color = 'blue';
                                       }
                                       elsif($color_array[$index]== 5){
                                               $color = 'black';
                                       }
                                       else { die "unexpected color array value $color_array[$index]" }
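
A different approach, sketched here only as an illustration and not as part of the change proposed above, is to compute the color directly from each feature's score, so that the -bgcolor callback no longer depends on @color_array and $index staying in step with the -description callback. This assumes the white hits come from those two callbacks getting out of step, which has not been confirmed. The helper name color_for_score is made up; the thresholds are the ones from the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: map a BLAST hit score to a color name using the
# same thresholds as the posted -description callback.
sub color_for_score {
    my ($score) = @_;
    $score = 0 unless defined $score;    # treat a missing score as lowest
    return 'red'    if $score >= 200;
    return 'orange' if $score >= 80;
    return 'green'  if $score >= 50;
    return 'blue'   if $score >= 40;
    return 'black';
}

# Standalone demonstration with made-up scores.
print "$_ => ", color_for_score($_), "\n" for (250, 80, 55, 40, 10);

# In the track definition the callback would then look something like:
#   -bgcolor => sub { my $feature = shift; color_for_score($feature->score) },
# Bio::Graphics passes the feature being drawn as the first argument.
```

Because every call returns a defined color name, a hit can no longer end up with an empty color, whatever order the callbacks are invoked in.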

Lincoln

On Mon, Nov 16, 2009 at 1:22 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:

> Please keep your responses on the list for more timely help.
>
>
> Kevin Brown
> Center for Innovations in Medicine
> Biodesign Institute
> Arizona State University
>
>
>
> ________________________________
>
> From: Xiaoyu Liang [mailto:veronica.xiaoyu at gmail.com]
> Sent: Monday, November 16, 2009 9:34 AM
> To: Kevin Brown
> Subject: Re: [Bioperl-l] Bio::Graphics::Panel question
>
>
> Hi Kevin,
>
> Thank you for ur quick response. I attached the BLAST .out file here.
> And the follow is my code part. I have an array keeping the color for
> each hit, and I printed it out the array, there is no missing.
>
> my $track = $panel->add_track(
>                              -glyph       => 'graded_segments',
>                              -label       => 1,
>                              -connector   => 'dashed',
>                              -font2color  => 'red',
>                              -sort_order  => 'high_score',
>                              -description => sub {
>                                $feature = shift;
>                                #print "--".$feature."\n";
>                                return unless
> $feature->has_tag('description');
>                                my ($description) =
> $feature->each_tag_value('description');
>                                my ($id) = $feature->display_name;
>                                my @records= split(/\|/,$description);
>                                my $score = $feature->score;
>                                #print $id.":".$score."\n";
>                                if($score >=200){
>                                        push (@color_array,1);
>                                }elsif($score >=80){
>                                        push (@color_array,2);
>                                }elsif($score >=50){
>                                        push (@color_array,3);
>                                }elsif($score >= 40){
>                                        push (@color_array,4);
>                                }else{
>                                        push (@color_array,5);
>                                }
>
>                                if($type == 1){
>                                        "Species:Arabidopsis TF
> Family:$records[1] Score=$score";
>                                }elsif($type == 2){
>                                        if(scalar(@records)==5){
>                                                "Species:$records[1] TF
> Family:$records[2] Accepted Name:$records[3] Score=$score";
>                                        }else{
>                                                "Species:$records[1] TF
> Family:$records[2] Score=$score";
>                                        }
>                                }else{
>                                        "";
>                                }
>                               },
>                               -bgcolor => sub{
>                                        return unless
> $feature->has_tag('description');
>                                        if($color_array[$index] == 1 ){
>                                                $color = 'red';
>                                        }
>                                        if($color_array[$index]== 2){
>                                                $color = 'orange';
>                                        }
>                                        if($color_array[$index]== 3){
>                                                $color = 'green';
>                                        }
>                                        if($color_array[$index]== 4){
>                                                $color = 'blue';
>                                        }
>                                        if($color_array[$index]== 5){
>                                                $color = 'black';
>                                        }
>                                        #if ($index == 20){
>                                        #        $color = 'black';
>                                        #}
>                                        #print
> $index."--".$color_array[$index]."\n";
>                                        $index++;
>
>                                        #print $feature."\n";
>                                        #print
> $feature->display_name."\n";
>                                        return $color;
>                               },
>                             );
>
>
> Best regards,
> Xiaoyu
>
>
> On Mon, Nov 16, 2009 at 10:49 AM, Kevin Brown <Kevin.M.Brown at asu.edu>
> wrote:
>
>
>        To really be able to tell if this was a bug, I (and probably the real
>        devs) would need to see that part of your code and the Blast file that
>        is having this issue as it could be your callback for color choice vs
>        the blast object (e.g. your color picker is missing an option that the
>        data comes in with and so returns with a blank value).
>
>
>        -----Original Message-----
>        From: bioperl-l-bounces at lists.open-bio.org
>        [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Xiaoyu Liang
>        Sent: Friday, November 13, 2009 1:36 PM
>        To: Bioperl-l at lists.open-bio.org
>        Subject: [Bioperl-l] Bio::Graphics::Panel question
>
>        Hi,
>
>        I'm using Bio::Graphics to parse BLAST results and generate images.
>        But sometimes, in the middle of the output image, a hit's color is
>        white, even though I set it to another color. I attached a picture
>        here as an example. This doesn't occur all the time; usually it
>        works well. I'm wondering whether I did something wrong, or whether
>        it depends on the BLAST result?
>
>        Thank you,
>        Xiaoyu
>
>
>        _______________________________________________
>        Bioperl-l mailing list
>        Bioperl-l at lists.open-bio.org
>        http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From ryan_bogard at hms.harvard.edu  Mon Nov 16 21:44:25 2009
From: ryan_bogard at hms.harvard.edu (rbogard)
Date: Mon, 16 Nov 2009 13:44:25 -0800 (PST)
Subject: [Bioperl-l] Problems with bioperl in Mac OS X 10.6 (Perl 5.10.0)
In-Reply-To: <26366421.post@talk.nabble.com>
References: <26366421.post@talk.nabble.com>
Message-ID: <26379710.post@talk.nabble.com>


Thank you all for your help! I was able to get bioperl working via manual
download and install. It was a combination of permissions issues and X86_64
vs. X86_32 compatibility issues. Using fink to download and install seems to
have given me a combination of 32 and 64 associated files (I probably did
something wrong in config). 


rbogard wrote:
> 
> In advance, any advice would be greatly appreciated! I have installed
> bioperl-588pm via fink but I am having difficulties calling the modules in
> script. The following is added to .profile (bash):
> PERL5LIB=/sw/lib/perl5/5.8.8:$PERL5LIB
> 
> If I change this to /sw/lib/perl5 then I get an @INC error, as use
> Bio::PERL cannot be located.
> 
> The environment variables are as follows:
> 
> MANPATH=/sw/share/man:/usr/share/man:/usr/X11/man:/sw/lib/perl5/5.10.0/man:/usr/X11R6/man:/sw/lib/perl5-core/5.8.8/man:/sw/lib/perl5/5.8.8/man
> PERL5LIB=/sw/lib/perl5/5.8.8:/sw/lib/perl5:/sw/lib/perl5/darwin:/sw/lib/perl5/5.8.8
> PATH=/sw/bin:/sw/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/X11R6/bin
> INFOPATH=/sw/share/info:/sw/info:/usr/share/info
> 
> 
> This is the perl script I'm attempting to run:
> #!/sw/bin/perl5.8.8
> use strict;
> use Bio::Perl;
> my $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> Here is the error output:
> 
> dyld: lazy symbol binding failed: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> dyld: Symbol not found: _Perl_Tstack_sp_ptr
>   Referenced from:
> /sw/lib/perl5/5.8.8/darwin-thread-multi-2level/auto/IO/IO.bundle
>   Expected in: dynamic lookup
> 
> Trace/BPT trap
> 
> I have looked through many forum postings and attempted the solutions
> offered in those instances, but none seem to work in my case. I'm not sure
> if it's because I have perl 5.10.0 installed while attempting to call
> bioperl 5.8.8; however, others seem to have it working just fine.
> 
> Thank you, Ryan 
> 

-- 
View this message in context: http://old.nabble.com/Problems-with-bioperl-in-Mac-OS-X-10.6-%28Perl-5.10.0%29-tp26366421p26379710.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From jay at jays.net  Mon Nov 16 22:02:10 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 16 Nov 2009 16:02:10 -0600
Subject: [Bioperl-l] Bio::Index::GenBank - by organism?
In-Reply-To: <2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
References: <3B01A09C-198E-4691-B807-7ED3250BB81A@jays.net>
	<12DFD22E-42DC-4626-9873-0DE3EBB5CFBD@illinois.edu>
	<2BA451B1-6E18-483E-B655-74D1146772CC@bioperl.org>
Message-ID: <60ADD3A9-D38B-4A39-A5CE-C8118DEC1242@jays.net>

On Nov 10, 2009, at 12:50 PM, Jason Stajich wrote:
> You might also look at what mygenbank does:
> http://homepage.mac.com/iankorf/mygenbank.html

It appears, perhaps, that BioSQL can provide *foo* searching like so:

http://www.biosql.org/wiki/Schema_Overview#TAXON.2C_TAXON_NAME

 SELECT DISTINCT include.ncbi_taxon_id FROM taxon
    INNER JOIN taxon AS include ON
      (include.left_value BETWEEN taxon.left_value
        AND taxon.right_value)
 WHERE taxon.taxon_id IN
   (SELECT taxon_id FROM taxon_name
    WHERE name LIKE '%fungi%')

So I think we're going to chase that for a while.

I didn't see a *foo* search in MyGenBank?

Thanks,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From roy.chaudhuri at gmail.com  Tue Nov 17 11:24:07 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 17 Nov 2009 11:24:07 +0000
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com>
	<4B018C85.6020801@gmail.com>
	<9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
Message-ID: <4B0287D7.5050702@gmail.com>

Hi Aneesh,

Please keep your replies on the mailing list, that way someone else can 
respond, which would be particularly useful in this case since I know 
nothing about MapIO.

Roy.

Aneesh K wrote:
> Thanks for your reply. 
> 
> I would also like to know about "Genetic Maps". I would like to
> use the MapIO object,
> but I'm not familiar with genetic maps or the mapmaker format.
>
> Please tell me where I can find some examples of the mapmaker format
> and some example scripts that use the MapIO object.
>
> Hoping for your reply.
> 
> Aneesh.K
> Mob. 09646181517
> 
> 
> 
> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
> <mailto:roy.chaudhuri at gmail.com>> wrote:
> 
>     Hi Aneesh,
> 
>     See the Bioperl trees howto:
>     http://www.bioperl.org/wiki/HOWTO:Trees
> 
>     Roy.
> 
> 
>     Aneesh K wrote:
> 
>         Hi,
> 
>         I just started to use Bioperl modules. They are really useful and
>         interesting.
>         Now I am stuck on "Tree objects and phylogenetic trees".
>         I couldn't find any documentation/examples about reading/parsing
>         phylip tree files.
>
>         Please tell me where I can get some sample code for this.
> 
>         Waiting for your reply.
> 
>         Thanks
>         Aneesh.K
>         Mob. 09646181517
> 
> 
> 
>     -- 
>     Dr. Roy Chaudhuri
>     Department of Veterinary Medicine
>     University of Cambridge, U.K.
> 
> 


From maj at fortinbras.us  Tue Nov 17 12:50:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 17 Nov 2009 07:50:06 -0500
Subject: [Bioperl-l] Regarding Bio::TreeIO Object
In-Reply-To: <4B0287D7.5050702@gmail.com>
References: <9cb9dfd70911152300y34789f88qc69dd14bf505f57d@mail.gmail.com><4B018C85.6020801@gmail.com><9cb9dfd70911162117nfac0e52gea3d638e34337b16@mail.gmail.com>
	<4B0287D7.5050702@gmail.com>
Message-ID: <394F62D51F15405BBCF8BB50DA0FF336@NewLife>

Aneesh, 
Have a look in the t/Map directory of the BioPerl distribution. These
are test scripts that are also examples of usage. The t/data directory
will contain the datafiles that the tests use; these will provide example data.
cheers 
Mark 
----- Original Message ----- 
From: "Roy Chaudhuri" <roy.chaudhuri at gmail.com>
To: "Aneesh K" <krishna.aneesh at gmail.com>; <bioperl-l at bioperl.org>
Sent: Tuesday, November 17, 2009 6:24 AM
Subject: Re: [Bioperl-l] Regarding Bio::TreeIO Object


> Hi Aneesh,
> 
> Please keep your replies on the mailing list, that way someone else can 
> respond, which would be particularly useful in this case since I know 
> nothing about MapIO.
> 
> Roy.
> 
> Aneesh K wrote:
>> Thanks for your reply. 
>> 
>> I would like to know about "Genetic Maps" also. I would like to 
>> use MapIO object. 
>> But I'm not aware about genetic maps and the mapmaker format. 
>> 
>> Please tell me from where I can get some examples for mapmaker format 
>> and some example scripts to use MapIO object. 
>> 
>> Hoping your reply.
>> 
>> Aneesh.K
>> Mob. 09646181517
>> 
>> 
>> 
>> On Mon, Nov 16, 2009 at 11:01 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com 
>> <mailto:roy.chaudhuri at gmail.com>> wrote:
>> 
>>     Hi Aneesh,
>> 
>>     See the Bioperl trees howto:
>>     http://www.bioperl.org/wiki/HOWTO:Trees
>> 
>>     Roy.
>> 
>> 
>>     Aneesh K wrote:
>> 
>>         Hi,
>> 
>>         I just started to use Bioperl modules. It's really useful and
>>         interesting.
>>         Now I have in stuck with "Tree objects and phylogenetic trees".
>>         I couldn't get any documentation/examples about reading/parsing
>>         phylip tree
>>         files.
>> 
>>         Please tell me from where I can get some sample codes for this.
>> 
>>         Waiting for your reply.
>> 
>>         Thanks
>>         Aneesh.K
>>         Mob. 09646181517
>> 
>> 
>> 
>>     -- 
>>     Dr. Roy Chaudhuri
>>     Department of Veterinary Medicine
>>     University of Cambridge, U.K.
>> 
>> 
> 


From veronica.xiaoyu at gmail.com  Wed Nov 18 17:18:33 2009
From: veronica.xiaoyu at gmail.com (Xiaoyu Liang)
Date: Wed, 18 Nov 2009 12:18:33 -0500
Subject: [Bioperl-l] how to visualize multiple sequences alignments
Message-ID: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>

Hi,

I'm wondering whether there are any modules that can be used for visualizing
multiple sequence alignments, like the results from ClustalW?

Thank you very much,
Xiaoyu


From jason at bioperl.org  Wed Nov 18 18:23:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Nov 2009 10:23:05 -0800
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
Message-ID: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>

Try Jalview: http://www.jalview.org/

On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:

> Hi,
>
> I'm wondering Is there any modules that can be used for visualizing  
> multiple
> sequences alignments? like the result from ClustalW?
>
> Thank you very much,
> Xiaoyu

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From andrew.j.grimm at gmail.com  Thu Nov 19 02:52:31 2009
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Thu, 19 Nov 2009 13:52:31 +1100
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
Message-ID: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>

Caution: read the whole email before visiting the bioperl wiki

I was doing some bioinformatics-related searching using google, and
one of the hits was to the bio dot perl dot org wiki (the FAQ in
particular).

When I did that, I was redirected to a ferdax dot com web site (a
typo-squatting of fedex?).

Some people reckon that ferdax hacks web sites and redirects google
hits from the victim web site to their own web site. For example, this
thread at google's webmaster central
http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
(it's talking about zencart, but presumably they've since found other
victims)

Just going to the website without using google may not trigger the redirect.

Apologies if this is a false alarm, but I don't think it is.

I won't be in contact between Friday and Monday Australian time (I'll
be at railscamp 6 in Melbourne), so I won't be able to answer any
replies.

Thanks,

Andrew Grimm


From maj at fortinbras.us  Thu Nov 19 03:14:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 18 Nov 2009 22:14:44 -0500
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
Message-ID: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>

Andrew-- thanks!! We're on it.
MAJ
----- Original Message ----- 
From: "Andrew Grimm" <andrew.j.grimm at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 18, 2009 9:52 PM
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?


> Caution: read the whole email before visiting the bioperl wiki
>
> I was doing some bioinformatics-related searching using google, and
> one of the hits was to the bio dot perl dot org wiki (the FAQ in
> particular).
>
> When I did that, I was redirected to a ferdax dot com web site (a
> typo-squatting of fedex?).
>
> Some people reckon that ferdax hacks web sites and redirects google
> hits from the victim web site to their own web site. For example, this
> thread at google's webmaster central
> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all
> (it's talking about zencart, but presumably they've since found other
> victims)
>
> Just going to the website without using google may not trigger the redirect.
>
> Apologies if this is a false alarm, but I don't think it is.
>
> I won't be in contact between Friday and Monday Australian time (I'll
> be at railscamp 6 in Melbourne), so I won't be able to answer any
> replies.
>
> Thanks,
>
> Andrew Grimm


From sandipan.chowdhury at physiology.wisc.edu  Thu Nov 19 06:49:45 2009
From: sandipan.chowdhury at physiology.wisc.edu (Sandipan Chowdhury)
Date: Thu, 19 Nov 2009 00:49:45 -0600
Subject: [Bioperl-l] accessing EMBL database
Message-ID: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>

Hi,
 
I have 3 questions, all related to the retrieval of sequences from online databases.
 
(1) I have been trying to download a protein sequence from the EMBL database and write the sequence into a text file as a string. I am using the following code:
 
use Bio::DB::EMBL;
open b,">","s.txt";
$em_obj = Bio::DB::EMBL->new;
  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
  $s_str = $seq_obj->seq;
  print b "$s_str\n";
close b;
 
The script is not working and gives the message:
"MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl"
 
I am not sure what this means. A similar version of the script works for the Swissprot, GenBank and RefSeq databases but not for EMBL. What is the way around this so that I can download the EMBL sequence?

(2) Also, is there any way I can download sequences from DDBJ (the DNA Data Bank of Japan)?

(3) Can GI numbers be used to retrieve the sequences? If so, how?

Answers to these questions would be greatly appreciated. I am very new to Perl/Bioperl and am not really familiar with the advanced programming features, so I would need your help to find my way out of this situation.
 
Many Thanks
Sandipan
 

From maj at fortinbras.us  Thu Nov 19 13:10:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 08:10:07 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>

Sandipan-- That id (CAB95729) returns "No entries" from EMBL.
I would agree that the error message is not really informative.
The module documentation warns:

      # remember that EMBL_ID does not equal GenBank_ID!
so I would check that.
MAJ
----- Original Message ----- 
From: "Sandipan Chowdhury" <sandipan.chowdhury at physiology.wisc.edu>
To: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 1:49 AM
Subject: [Bioperl-l] accessing EMBL database


> Hi,
>
> I have 3 questions all related to the retreival of sequences from online 
> databases.
>
> (1) I have been trying to download a protein sequence from the EMBL database 
> and trying to write the sequence into a text file, as a string. I am using the 
> following code:
>
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>  $s_str = $seq_obj->seq;
>  print b "$s_str\n";
> close b;
>
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>
> I am not sure what this means. A similar version of the script works for the 
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way 
> around this so that I can download the embl sequence?
>
> (2) Also, is there anyway I can download sequences from DDBJ (database of 
> Japan)?
>
> (3) Can GI numbers be used to retreive the sequences? If so then how?
>
> Answers to these questions would be greatly appreciated. I am very new to 
> Perl/Bioperl and am not really familiar with the advanced programming 
> features, so I would need to your help to find my way out of this situation.
>
> Many Thanks
> Sandipan
>
>


From hrh at fmi.ch  Thu Nov 19 13:23:29 2009
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Thu, 19 Nov 2009 14:23:29 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
Message-ID: <C72B0561.5887%hrh@fmi.ch>


Sandipan


> I have 3 questions all related to the retreival of sequences from online
> databases.
>  
> (1) I have been trying to download a protein sequence from the EMBL database
> and trying to write the sequence into a text file, as a string. I am using the
> following code: 
>  
> use Bio::DB::EMBL;
> open b,">","s.txt";
> $em_obj = Bio::DB::EMBL->new;
>   $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>   $s_str = $seq_obj->seq;
>   print b "$s_str\n";
> close b;
>  
> The script is not working and gives the messege:
> "MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl"
>  
> I am not sure what this means. A similar version of the script works for the
> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
> around this so that I can download the embl sequence?

"CAB95729" is a protein sequence, i.e. a translation of the CDS of
'AJ277028.1'.

As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, i.e.
nucleotide sequences.


> (2) Also, is there anyway I can download sequences from DDBJ (database of
> Japan)?

Unless it is for network/speed reasons, why do you want to download data from
DDBJ? It contains the same data as GenBank and EMBL. Those three databases
exchange their data on a daily basis.
  
> (3) Can GI numbers be used to retreive the sequences? If so then how?

Have you looked at Bio::DB::EUtilities? See the 'HOWTOs' page in the
Bioperl Wiki.


Regards, Hans


> Answers to these questions would be greatly appreciated. I am very new to
> Perl/Bioperl and am not really familiar with the advanced programming
> features, so I would need to your help to find my way out of this situation.
>  
> Many Thanks
> Sandipan
>  
> 


From cjfields at illinois.edu  Thu Nov 19 13:47:16 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 07:47:16 -0600
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <C72B0561.5887%hrh@fmi.ch>
References: <C72B0561.5887%hrh@fmi.ch>
Message-ID: <95D416ED-7630-40A1-ABA5-A3C3525D25B1@illinois.edu>


On Nov 19, 2009, at 7:23 AM, Hotz, Hans-Rudolf wrote:

> 
> Sandipan
> 
> 
>> I have 3 questions all related to the retreival of sequences from online
>> databases.
>> 
>> (1) I have been trying to download a protein sequence from the EMBL database
>> and trying to write the sequence into a text file, as a string. I am using the
>> following code: 
>> 
>> use Bio::DB::EMBL;
>> open b,">","s.txt";
>> $em_obj = Bio::DB::EMBL->new;
>>  $seq_obj = $em_obj->get_Seq_by_acc("CAB95729");
>>  $s_str = $seq_obj->seq;
>>  print b "$s_str\n";
>> close b;
>> 
>> The script is not working and gives the messege:
>> "MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl"
>> 
>> I am not sure what this means. A similar version of the script works for the
>> Swissprot, GenBank and RefSeq databases but not for the EMBL. What is the way
>> around this so that I can download the embl sequence?
> 
> "CAB95729" is a protein sequence, ie a translation of the CDS of
> 'AJ277028.1'.
> 
> As far as I know, Bio::DB::EMBL is only designed to get EMBL entries, ie the
> nucleotides sequence
> 
> 
> 
>> (2) Also, is there anyway I can download sequences from DDBJ (database of
>> Japan)?
> 
> Unless, for network/speed reason, why do you want to download data from
> DDBJ? It contains the same data as GenBank and EMBL. Those three databases
> exchange their data on a daily basis.
> 
>> (3) Can GI numbers be used to retreive the sequences? If so then how?
> 
> Have you looked at Bio::DB::Eutilities ? See the 'HOWTOs'  page in the
> Bioperl Wiki
> 
> 
> 
> Regards, Hans
> 
> 
> 
>> Answers to these questions would be greatly appreciated. I am very new to
>> Perl/Bioperl and am not really familiar with the advanced programming
>> features, so I would need to your help to find my way out of this situation.
>> 
>> Many Thanks
>> Sandipan

To add to that, if you want the protein sequences as a Bio::Seq you can use Bio::DB::GenPept (Bio::DB::EUtilities will retrieve raw data only).

chris
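
A minimal sketch of that suggestion, assuming Bio::DB::GenPept is installed and the remote service is reachable. The helper names fetch_protein_string and write_string are hypothetical; the output file name s.txt and the accession are taken from the original question. It also uses a lexical filehandle with a checked open instead of the bareword handle.

```perl
use strict;
use warnings;

# Fetch a protein record by accession and return its sequence as a string.
# Bio::DB::GenPept is loaded on demand so the script compiles without it.
sub fetch_protein_string {
    my ($accession) = @_;
    require Bio::DB::GenPept;
    my $db  = Bio::DB::GenPept->new;
    my $seq = $db->get_Seq_by_acc($accession)
        or die "No GenPept entry found for $accession\n";
    return $seq->seq;
}

# Write a string to a file using a lexical filehandle and a checked open.
sub write_string {
    my ($path, $string) = @_;
    open my $out, '>', $path or die "Cannot open $path: $!\n";
    print {$out} "$string\n";
    close $out or die "Cannot close $path: $!\n";
    return;
}

# Runs only when an accession is given, e.g.: perl fetch.pl CAB95729
if (@ARGV) {
    my $accession = shift @ARGV;
    write_string('s.txt', fetch_protein_string($accession));
}
```

The network call is kept inside one subroutine so that a failed lookup produces a single readable message rather than leaving an undefined sequence object to fail later.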


From David.Messina at sbc.su.se  Thu Nov 19 14:04:55 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 19 Nov 2009 15:04:55 +0100
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
Message-ID: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>

> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave
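
Until something like that is the default, a script can print just the message itself. This is only a sketch, assuming the error text has the MSG:/STACK: layout shown above; the helper name short_error is hypothetical.

```perl
use strict;
use warnings;

# Return only the text of the MSG: line from a BioPerl-style error string,
# or the whole string unchanged if no MSG: line is present.
sub short_error {
    my ($error_text) = @_;
    return $1 if $error_text =~ /^MSG:[ \t]*(.*)$/m;
    return $error_text;
}

# Example input copied from the stack dump above (shortened).
my $dump = join "\n",
    'MSG: EMBL stream with no ID. Not embl in my book',
    'STACK: Error::throw',
    'STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368',
    'STACK: trial2.pl';

print 'ERROR: ', short_error($dump), "\n";
```

In a real script the input would be the stringified $@ after wrapping the failing call in eval { ... }, which assumes the exception stringifies to the same layout.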


From maj at fortinbras.us  Thu Nov 19 14:17:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 09:17:05 -0500
Subject: [Bioperl-l] accessing EMBL database
In-Reply-To: <B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
Message-ID: <FADF827A6CE34C959062F2D93849E15A@NewLife>

I'm inclined to agree. Lots of responses to questions here that begin
"Well, as the error message said, you need to check...", which means
people tend towards "I broke it! Write the list!". I do find it hairy when
my errors are way down in the object tree.
----- Original Message ----- 
From: "Dave Messina" <David.Messina at sbc.su.se>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 9:04 AM
Subject: Re: [Bioperl-l] accessing EMBL database


> I would agree that the error message is not really informative.

Agreed that it could be better, but I wonder whether part of the problem with 
BioPerl error messages is the stack dump.

I think a lot of eyes just glaze right over when they see a big wad of 
complicated stuff, with colons and slashes and line numbers, spewing out at 
them.

Perhaps the stack dump should be turned off by default?

Wouldn't this:

ERROR: EMBL stream with no ID. Not embl in my book


Be a lot clearer than this?:

MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
STACK: trial2.pl


Just a thought. This has probably been discussed before.
Dave


From rtbio.2009 at gmail.com  Thu Nov 19 14:55:27 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Thu, 19 Nov 2009 15:55:27 +0100
Subject: [Bioperl-l] Remote blast
Message-ID: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>

Hello everybody,

I have a problem. I would like to use remote BLAST to find sequences
matching an input sequence.

Example: I would like to find sequences that match a Trypanosoma brucei
sequence.

I want the output to contain only Trypanosoma brucei sequences matching my
query. When I tried to run RemoteBlast against the nr database, I got
sequences from different organisms such as E. coli, Pseudomonas, etc.

Could you please tell me how this can be solved?

My code is as follows.

use Bio::Tools::Run::RemoteBlast;
  use strict;
  my $prog = 'blastn';
  my $db   = 'nr';
  my $e_val= '1e-10';
 my $organism= 'Trypanosoma Brucei';

  my @params = ( '-prog' => $prog,
         '-data' => $db,
         '-expect' => $e_val,
         '-readmethod' => 'SearchIO',
         '-Organism'   => $organism );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

  #change a parameter
  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma brucei[ORGN]'

  #remove a parameter
  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

  my $v = 1;
  #$v is just to turn on and off the messages

  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
'-organism' => 'Trypanosoma Brucei' );

  while (my $input = $str->next_seq()){
    #Blast a sequence against a database:
   my $r = $factory->submit_blast($input);
    #my $r = $factory->submit_blast('amino.fa');

    print STDERR "waiting..." if( $v > 0 );
    while ( my @rids = $factory->each_rid ) {
      foreach my $rid ( @rids ) {
        my $rc = $factory->retrieve_blast($rid);
        if( !ref($rc) ) {
          if( $rc < 0 ) {
            $factory->remove_rid($rid);
          }
          print STDERR "." if ( $v > 0 );
         sleep 5;
        }
     else {
          my $result = $rc->next_result();
          #save the output
          my $filename = $result->query_name()."\.out";
          $factory->save_output($filename);
          $factory->remove_rid($rid);
          print "\nQuery Name: ", $result->query_name(), "\n";
          while ( my $hit = $result->next_hit ) {
            next unless ( $v > 0);
            print "\thit name is ", $hit->name, "\n";
            while( my $hsp = $hit->next_hsp ) {
              print "\t\tscore is ", $hsp->score, "\n";
            }
          }
        }
      }
    }
  }

My input sequence is

>ref|NC_009512.1|:385-1902
GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA

Please mail me regarding any queries.

Regards,
Roopa.


From cjfields at illinois.edu  Thu Nov 19 15:30:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 19 Nov 2009 09:30:34 -0600
Subject: [Bioperl-l] verbosity and error stack, was  accessing EMBL database
In-Reply-To: <FADF827A6CE34C959062F2D93849E15A@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu>
	<E10EF917D9914031BCFDF8BDDBFA4F13@NewLife>
	<B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se>
	<FADF827A6CE34C959062F2D93849E15A@NewLife>
Message-ID: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>

Mark, Dave,

This could be based on verbose(). 

          Level      w     t     d    st
verbose   < 0        -     +     -    -/+
verbose     0        +     +     -    -/+
verbose     1        +     +     +    +/+
verbose   > 1        +* -> +     +    +/+
* converts to throw()
w = warn
t = throw
d = debug
st = stack trace

warn() is set up that way now: you don't get a stack trace unless verbose() is > 0.  throw() could work the same way; it would be a simple fix, really.

My only problem with the current state of things (I think we've been down this path before) is that verbosity level is tied to exception strictness, as seen above, and they're really two separate concepts, at least to me.  A verbosity of 1 or more doesn't necessarily mean I want an elevated level of strictness along with it.  For instance, one might want very strict exceptions w/o the noise, or (conversely) lots of debugging output but no warnings.

(aside: another small nit, but I haven't exactly liked that the global level of strictness is designated by an environment variable with DEBUG in the name, but that's just me).

I've been thinking it would be nice to have simple separate verbose/strict switches (this is the way it's implemented in Biome).  This would allow some finer grained control over output:

          Level      d    st
verbose     0        -    -
verbose     1        +    +
Default = BIOPERLDEBUG || 0 # current situation

          Level      w     t
strict      -1       -     +
strict      0        +     +
strict      1        +* -> +
* converts to throw()
Default = BIOPERLSTRICT || 0

We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
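
To make the split concrete, here is a minimal sketch, not the Biome or Bio::Root::Root implementation. The package name Sketch::Root and its methods are made up for illustration; BIOPERLDEBUG is the existing variable, and BIOPERLSTRICT is only the one proposed in the table above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a root class; this is not Bio::Root::Root.
package Sketch::Root;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Defaults follow the tables above: environment variable, else 0.
    $self->{verbose} = defined $args{-verbose} ? $args{-verbose} : ($ENV{BIOPERLDEBUG}  || 0);
    $self->{strict}  = defined $args{-strict}  ? $args{-strict}  : ($ENV{BIOPERLSTRICT} || 0);
    return $self;
}

# Build a message; stack lines are appended only when verbose >= 1.
sub _format {
    my ($self, $label, $msg) = @_;
    my $out = "$label: $msg\n";
    if ($self->{verbose} >= 1) {
        my $i = 1;
        while (my @frame = caller($i++)) {
            $out .= "STACK: $frame[3] $frame[1]: $frame[2]\n";
        }
    }
    return $out;
}

# throw always dies, whatever the strict level.
sub throw {
    my ($self, $msg) = @_;
    die $self->_format('ERROR', $msg);
}

sub warn {
    my ($self, $msg) = @_;
    return if $self->{strict} < 0;                 # strict -1: warnings silenced
    $self->throw($msg) if $self->{strict} >= 1;    # strict  1: warn converts to throw
    print STDERR $self->_format('WARNING', $msg);  # strict  0: ordinary warning
    return;
}

sub debug {
    my ($self, $msg) = @_;
    print STDERR "DEBUG: $msg\n" if $self->{verbose} >= 1;
    return;
}

package main;

# Strict exceptions without the noise: warn becomes fatal, no stack lines.
my $quiet_strict = Sketch::Root->new(-verbose => 0, -strict => 1);
eval { $quiet_strict->warn('EMBL stream with no ID. Not embl in my book') };
print "caught: $@" if $@;
```

The point of the design is that _format only looks at verbose and warn only looks at strict, so either switch can be changed without touching the other.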

chris

On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:

> I'm inclined to agree. Lots of responses to questions here that begin
> "Well, as the error message said, you need to check...", which means
> people tend towards "I broke it! Write the list!". I do find it hairy when
> my errors are way down in the object tree.
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: <bioperl-l at bioperl.org>
> Sent: Thursday, November 19, 2009 9:04 AM
> Subject: Re: [Bioperl-l] accessing EMBL database
> 
> 
>> I would agree that the error message is not really informative.
> 
> Agreed that it could be better, but I wonder whether part of the problem with BioPerl error messages is the stack dump.
> 
> I think a lot of eyes just glaze right over when they see a big wad of complicated stuff, with colons and slashes and line numbers, spewing out at them.
> 
> Perhaps the stack dump should be turned off by default?
> 
> Wouldn't this:
> 
> ERROR: EMBL stream with no ID. Not embl in my book
> 
> 
> 
> Be a lot clearer than this?:
> 
> MSG: EMBL stream with no ID. Not embl in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
> STACK: trial2.pl
> 
> 
> 
> Just a thought. This has probably been discussed before.
> Dave
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Thu Nov 19 16:10:28 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Thu, 19 Nov 2009 16:10:28 +0000
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
Message-ID: <4B056DF4.2030502@gmail.com>

Hi Roopa,

I think that the -Organism parameter that you specify for 
Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to 
it in the documentation:
http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm

You have the correct approach in your code - limiting the search to the 
Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. 
If you uncomment the line (and add a semicolon afterwards), the program 
runs correctly, but no hits are reported below your threshold e-value. 
If you change the value of $e_val to 10 then some T.brucei hits are 
reported.
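
For reference, a rough sketch of the setup with both changes applied. The helper entrez_organism_query is made up for illustration and is not part of BioPerl; the constructor parameters and the HEADER hash are the ones already used in your script. Nothing is submitted here, and the RemoteBlast part is skipped if BioPerl is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): build an Entrez organism limit.
sub entrez_organism_query {
    my ($organism) = @_;
    return $organism . '[ORGN]';
}

my $e_val = 10;    # relaxed threshold, as suggested above
my $query = entrez_organism_query('Trypanosoma brucei');

# Only touch RemoteBlast when BioPerl is available, so the sketch loads either way.
if (eval { require Bio::Tools::Run::RemoteBlast; 1 }) {
    no warnings 'once';
    my $factory = Bio::Tools::Run::RemoteBlast->new(
        '-prog'       => 'blastn',
        '-data'       => 'nr',
        '-expect'     => $e_val,
        '-readmethod' => 'SearchIO',
    );
    # Restrict the search to one organism; note the trailing semicolon.
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = $query;
    # $factory->submit_blast($input) and the retrieval loop follow as in the original script.
}

print "ENTREZ_QUERY = $query\n";
```

The ignored -Organism parameter is simply left out; the organism filter lives entirely in the ENTREZ_QUERY header.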

Roy.

Roopa Raghuveer wrote:
> Hello everybody,
> 
> I have a problem. I would like to use remote blast to find sequences
> matching for an input sequence.
> 
> Ex:-I would like to search sequences which match Trypanosoma Brucei
> sequence.
> 
> I want the output to be only Trypanosoma Brucei sequences matching with my
> query.When i tried to use remoteblast to nr database,I got sequences from
> different organisms like E.coli,Pseudomonas etc.,
> 
> Could you please tell me how can this be solved...?
> 
> My code is as follows.
> 
> use Bio::Tools::Run::RemoteBlast;
>   use strict;
>   my $prog = 'blastn';
>   my $db   = 'nr';
>   my $e_val= '1e-10';
>  my $organism= 'Trypanosoma Brucei';
> 
>   my @params = ( '-prog' => $prog,
>          '-data' => $db,
>          '-expect' => $e_val,
>          '-readmethod' => 'SearchIO',
>          '-Organism'   => $organism );
> 
>   my $factory = Bio::Tools::Run::RemoteBlast->
> new(@params);
> 
>   #change a paramter
>   #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
> brucei[ORGN]'
> 
>   #remove a parameter
>   #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
> 
>   my $v = 1;
>   #$v is just to turn on and off the messages
> 
>   my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
> '-organism' => 'Trypanosoma Brucei' );
> 
>   while (my $input = $str->next_seq()){
>     #Blast a sequence against a database:
>    my $r = $factory->submit_blast($input);
>     #my $r = $factory->submit_blast('amino.fa');
> 
>     print STDERR "waiting..." if( $v > 0 );
>     while ( my @rids = $factory->each_rid ) {
>       foreach my $rid ( @rids ) {
>         my $rc = $factory->retrieve_blast($rid);
>         if( !ref($rc) ) {
>           if( $rc < 0 ) {
>             $factory->remove_rid($rid);
>           }
>           print STDERR "." if ( $v > 0 );
>          sleep 5;
>         }
>      else {
>           my $result = $rc->next_result();
>           #save the output
>           my $filename = $result->query_name()."\.out";
>           $factory->save_output($filename);
>           $factory->remove_rid($rid);
>           print "\nQuery Name: ", $result->query_name(), "\n";
>           while ( my $hit = $result->next_hit ) {
>             next unless ( $v > 0);
>             print "\thit name is ", $hit->name, "\n";
>             while( my $hsp = $hit->next_hsp ) {
>               print "\t\tscore is ", $hsp->score, "\n";
>             }
>           }
>         }
>       }
>     }
>   }
> 
> My input sequence is
> 
>> ref|NC_009512.1|:385-1902
> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
> 
> Please mail me regarding any queries.
> 
> Regards,
> Roopa.


From clements at nescent.org  Thu Nov 19 17:46:32 2009
From: clements at nescent.org (Dave Clements)
Date: Thu, 19 Nov 2009 18:46:32 +0100
Subject: [Bioperl-l] how to visualize multiple sequences alignments
In-Reply-To: <FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
References: <a6a926b50911180918m6b6c31b2g191f9d78325e68de@mail.gmail.com>
	<FC65BC44-AC7A-4AE6-8DF0-3F7969CD5B4C@bioperl.org>
Message-ID: <f135c01c0911190946t7488718brfed76b975f6d2b2@mail.gmail.com>

Hi Xiaoyu,

I would also take a look at GBrowse_syn, a perl based solution built with
the GBrowse genome browser framework.

See http://gmod.org/wiki/GBrowse_syn.

Cheers,

Dave C.

On Wed, Nov 18, 2009 at 7:23 PM, Jason Stajich <jason at bioperl.org> wrote:

> try jalview http://www.jalview.org/
>
>
> On Nov 18, 2009, at 9:18 AM, Xiaoyu Liang wrote:
>
>  Hi,
>>
>> I'm wondering Is there any modules that can be used for visualizing
>> multiple
>> sequences alignments? like the result from ClustalW?
>>
>> Thank you very much,
>> Xiaoyu
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>


-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/January_2010_GMOD_Meeting


From maj at fortinbras.us  Thu Nov 19 23:37:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 19 Nov 2009 18:37:05 -0500
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
Message-ID: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>

I like this verbose/strict separability a lot. Should we go for it?
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at bioperl.org>
Sent: Thursday, November 19, 2009 10:30 AM
Subject: [Bioperl-l] verbosity and error stack, was accessing EMBL database


> Mark, Dave,
>
> This could be based on verbose().
>
>          Level      w     t     d    st
> verbose   < 0        -     +     -    -/+
> verbose     0        +     +     -    -/+
> verbose     1        +     +     +    +/+
> verbose   > 1        +* -> +     +    +/+
> * converts to throw()
> w = warn
> t = throw
> d = debug
> st = stack trace
>
> warn() is set up that way now, you don't get a stack trace unless verbose() is 
>  > 0.  throw() could be the same; would be a simple fix, really.
>
> My only problem with the current state of things is (I think we've delved down 
> this path before) verbosity level is tied to exception strictness as seen 
> above, and they're really two separate concepts, at least to me.  Verbosity of 
> 1 or more doesn't necessarily mean I want an elevated level of strictness 
> along with it.  For instance, one might want very strict exceptions w/o the 
> noise, or (conversely) lots of debugging output but no warnings.
>
> (aside: another small nit, but I haven't exactly liked that the global level 
> of strictness is designated by a env. variable with DEBUG in the name, but 
> that's just me).
>
> I've been thinking it would be nice to have simple separate verbose/strict 
> switches (this is the way it's implemented in Biome).  This would allow some 
> finer grained control over output:
>
>          Level      d    st
> verbose     0        -    -
> verbose     1        +    +
> Default = BIOPERLDEBUG || 0 # current situation
>
>          Level      w     t
> strict      -1       -     +
> strict      0        +     +
> strict      1        +* -> +
> * converts to throw()
> Default = BIOPERLSTRICT || 0
>
> We could even allow finer-grained control of verbosity (states which cover all 
> combinations) w/o affecting strictness.
>
> chris
>
> On Nov 19, 2009, at 8:17 AM, Mark A. Jensen wrote:
>
>> I'm inclined to agree. Lots of responses to questions here that begin
>> "Well, as the error message said, you need to check...", which means
>> people tend towards "I broke it! Write the list!". I do find it hairy when
>> my errors are way down in the object tree.
>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: <bioperl-l at bioperl.org>
>> Sent: Thursday, November 19, 2009 9:04 AM
>> Subject: Re: [Bioperl-l] accessing EMBL database
>>
>>
>>> I would agree that the error message is not really informative.
>>
>> Agreed that it could be better, but I wonder whether part of the problem with 
>> BioPerl error messages is the stack dump.
>>
>> I think a lot of eyes just glaze right over when they see a big wad of 
>> complicated stuff, with colons and slashes and line numbers, spewing out at 
>> them.
>>
>> Perhaps the stack dump should be turned off by default?
>>
>> Wouldn't this:
>>
>> ERROR: EMBL stream with no ID. Not embl in my book
>>
>>
>>
>> Be a lot clearer than this?:
>>
>> MSG: EMBL stream with no ID. Not embl in my book
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm: 368
>> STACK: Bio::SeqIO::embl::next_seq C:/Perl/site/lib/Bio\SeqIO\embl.pm: 203
>> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc 
>> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm: 194
>> STACK: trial2.pl
>>
>>
>>
>> Just a thought. This has probably been discussed before.
>> Dave
>>
>>
>>
>
> 


From michael.watson at bbsrc.ac.uk  Fri Nov 20 10:07:10 2009
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 20 Nov 2009 10:07:10 +0000
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>

Hello

I was just wondering if anyone had had time to look into this?

I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937

Thanks
Mick

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
Sent: 27 October 2009 09:01
To: 'Jason Stajich'
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output

Hi Jason

They both print 0 also.

A bug report it is

Mick

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: 26 October 2009 18:46
To: michael watson (IAH-C)
Cc: bioperl-l
Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output


Is this -m9 -d 0 output or standard default?  I think the strand is  
parsed in the HSP parsing.

Can you double check what $hsp->query->strand and $hsp->hit->strand  
prints?

A full example report as a bug request will be next step if that  
doesn't resolve.

-jason
On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:

> Dear all
>
> Where does this go?  Perhaps I am doing something wrong.
>
> Fasta35 output puts the strand in the hit list at the top:
>
> cluster_99033:3                                (  23) [r]  115 37.9   
> 0.0011
> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
> 0.963   27
>
> The [r] stands for reverse and the [f] stands for forward.
>
> There is also the text "rev-comp" after the hit line further down.
>
> However, when I parse fasta35 output using SearchIO and output the  
> strand of the HSP:
>
> print $hsp->strand('hit'), ",";
> print $hsp->strand('query'), "\n";
>
> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
> for "I don't know which strand it's on").
>
> So the information is there, but it's not getting parsed.   
> Alternatively, I've missed something and will feel a bit foolish.
>
> Currently using BioPerl 1.6.0
>
> Thanks
> Mick
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From David.Messina at sbc.su.se  Fri Nov 20 10:15:11 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 11:15:11 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <D72A208491F04DBF9B3F7F10D86A9931@NewLife>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
Message-ID: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>

Chris, I took a look at how you implemented this in Biome -- very nice!


> I like this verbose/strict separability a lot. Should we go for it?

Me too. So yes, I think so.


> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.


Perhaps this is a job for Log::Log4perl or Log::Dispatch?
http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm


That might be overkill, though.

Dave


From roychu at gmail.com  Fri Nov 20 10:21:54 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 02:21:54 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
Message-ID: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>

Hi,

Does anyone use dreamhost as a web hosting service?  I'm just curious
if anyone has had any luck installing the module as their daemon seems
to kill my process whenever I try to install it.  Dreamhost tech
support attributes it to either exceeding the allocated memory cache
or exceeding the processing time.  I tried to nice the process, but
that didn't help for me.  Any luck or experience in resolving this
would be much appreciated.  I suppose my next attempt would be to try
installing it directly and hope I don't need root...

Thanks,
Roy


From s.denaxas at gmail.com  Fri Nov 20 10:27:42 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Fri, 20 Nov 2009 11:27:42 +0100
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <bba689ec0911200227g1a8d717elce0daebf6a96c6aa@mail.gmail.com>

Hello,

normally you don't need to be root -
http://sial.org/howto/perl/life-with-cpan/non-root/
Kind of disturbing that their tech support cannot give you a straight
answer on why they are killing the process.

Good luck
Spiros

On Fri, Nov 20, 2009 at 11:21 AM, Chu, Roy <roychu at gmail.com> wrote:

> I suppose my next attempt would be to try
> installing it directly and hope I don't need root...
>
> Thanks,
> Roy
>


From charles-listes+bioperl at plessy.org  Fri Nov 20 10:44:45 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Fri, 20 Nov 2009 19:44:45 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
Message-ID: <20091120104445.GG31318@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
> 
> Does anyone use dreamhost as a web hosting service?  I'm just curious
> if anyone has had any luck installing the module as their daemon seems
> to kill my process whenever I try to install it.  Dreamhost tech
> support attributes it to either exceeding the allocated memory cache
> or exceeding the processing time.  I tried to nice the process, but
> that didn't help for me.  Any luck or experience in resolving this
> would be much appreciated.  I suppose my next attempt would be to try
> installing it directly and hope I don't need root...

Dear Roy,

DreamHost uses Debian, so you can suggest that they install the Debian package.
If you are in contact with their tech support, do not hesitate to tell them to
contact me if they are interested in a backport of the 1.6.0 package. For
version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

PS: if you propose installing BioPerl as a feature in the Dreamhost panel, I
will vote for it :)

Have a nice day,

--  
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Fri Nov 20 12:51:39 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 06:51:39 -0600
Subject: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
In-Reply-To: <8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501487319AE@iahcexch1.iah.bbsrc.ac.uk>
	<9994F70B-AE92-4425-9AAC-E9A2DC26964E@bioperl.org>
	<8D08960C647E64438CE5740657CBBDC501487319B6@iahcexch1.iah.bbsrc.ac.uk>
	<8D08960C647E64438CE5740657CBBDC50148731CEB@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <E9D5435B-07D6-46A9-AA84-C9667FA0CEDE@illinois.edu>

Mick,

Short answer, no.  It was in the queue to be fixed at some point in 1.6.x, but that queue is quite long.  I'm pushing it into the queue specifically for 1.6.2, so it should be addressed soon.
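
In the meantime, a possible stopgap is to read the flag straight from the hit list. This is only a sketch: parse_fasta_hit_line is a made-up helper, not part of Bio::SearchIO, and the regex is based solely on the two hit-list lines quoted below, so other fasta35 output layouts may need adjusting.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull name, length and strand out of one hit-list line.
# Returns an empty list when the line carries no [f]/[r] flag.
sub parse_fasta_hit_line {
    my ($line) = @_;
    return unless $line =~ /^(\S+)\s+\(\s*(\d+)\)\s+\[([fr])\]/;
    my ($name, $length, $flag) = ($1, $2, $3);
    # Strand values as used by BioPerl: 1 = forward, -1 = reverse.
    return ($name, $length, $flag eq 'f' ? 1 : -1);
}

# The two example lines from the original report, with the mail wrapping undone.
my @lines = (
    'cluster_99033:3                                (  23) [r]  115 37.9   0.0011',
    'cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963 0.963   27',
);

for my $line (@lines) {
    my ($name, $length, $strand) = parse_fasta_hit_line($line) or next;
    print "$name\t$length\t$strand\n";
}
```

The strands collected this way could be kept in a hash keyed by hit name and looked up while iterating over the SearchIO hits.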

chris

On Nov 20, 2009, at 4:07 AM, michael watson (IAH-C) wrote:

> Hello
> 
> I was just wondering if anyone had had time to look into this?
> 
> I posted a bug: http://bugzilla.open-bio.org/show_bug.cgi?id=2937
> 
> Thanks
> Mick
> 
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of michael watson (IAH-C)
> Sent: 27 October 2009 09:01
> To: 'Jason Stajich'
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> Hi Jason
> 
> They both print 0 also.
> 
> A bug report it is
> 
> Mick
> 
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
> Sent: 26 October 2009 18:46
> To: michael watson (IAH-C)
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] strand in Bio::SearchIO when parsing fasta35 output
> 
> 
> Is this -m9 -d 0 output or standard default?  I think the strand is  
> parsed in the HSP parsing.
> 
> Can you double check what $hsp->query->strand and $hsp->hit->strand  
> prints?
> 
> A full example report as a bug request will be next step if that  
> doesn't resolve.
> 
> -jason
> On Oct 26, 2009, at 10:04 AM, michael watson (IAH-C) wrote:
> 
>> Dear all
>> 
>> Where does this go?  Perhaps I am doing something wrong.
>> 
>> Fasta35 output puts the strand in the hit list at the top:
>> 
>> cluster_99033:3                                (  23) [r]  115 37.9   
>> 0.0011
>> cluster_79238:1                 (  27) [f]  126 38.0 0.00097 0.963  
>> 0.963   27
>> 
>> The [r] stands for reverse and the [f] stands for forward.
>> 
>> There is also the text "rev-comp" after the hit line further down.
>> 
>> However, when I parse fasta35 output using SearchIO and output the  
>> strand of the HSP:
>> 
>> print $hsp->strand('hit'), ",";
>> print $hsp->strand('query'), "\n";
>> 
>> This simply prints out 0, 0 (I assume 0 is the default in BioPerl  
>> for "I don't know which strand it's on").
>> 
>> So the information is there, but it's not getting parsed.   
>> Alternatively, I've missed something and will feel a bit foolish.
>> 
>> Currently using BioPerl 1.6.0
>> 
>> Thanks
>> Mick
>> 
>> 
>> 
> 
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> 
> 


From cjfields at illinois.edu  Fri Nov 20 13:00:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 07:00:45 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <20091120104445.GG31318@kunpuu.plessy.org>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
Message-ID: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>


On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:

> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>> 
>> Does anyone use dreamhost as a web hosting service?  I'm just curious
>> if anyone has had any luck installing the module as their daemon seems
>> to kill my process whenever I try to install it.  Dreamhost tech
>> support attributes it to either exceeding the allocated memory cache
>> or exceeding the processing time.  I tried to nice the process, but
>> that didn't help for me.  Any luck or experience in resolving this
>> would be much appreciated.  I suppose my next attempt would be to try
>> installing it directly and hope I don't need root...
> 
> Dear Roy,
> 
> DreamHost uses Debian, so you can suggest them to install the Debian package.
> If you are in contact with the tech service, do not hesitate to tell them to
> contact me if they are interested by a backport of the 1.6.0 package. For
> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.

Any reason why this is so?  We specify compatibility back to 5.6.1.

Alex mentioned the reliance on the specific ExtUtils::Manifest version.  The version requested has an important bug fix, is present on CPAN, and is backwards-compatible to 5.6.1.  It should be fairly easy to request that as a separate package.  

A strict requirement for perl 5.10.1 doesn't make sense in that light, unless said perl maintainer can enlighten us as to why this is an issue?  This one may require a ranty blog post.

> PS: if you propse to install BioPerl as a feature in the Dreamhost panel, I
> will vote for it :)
> 
> Have a nice day,
> 
> --  
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan

chris


From rtbio.2009 at gmail.com  Fri Nov 20 15:52:09 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Fri, 20 Nov 2009 16:52:09 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
Message-ID: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>

Hello everybody,

I have tried to use Remote blast on Trypanosoma brucei sequences and got
some hits, but I am unable to retrieve the complete sequences of the hits,
i.e., I am unable to parse the BLAST output file to get the complete
sequences of the hits. Here is my code.

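#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Usage: script.pl <blast_report> <min_fraction_identical>
my $blast_report = Bio::SearchIO->new(-format => 'blast',
                                      -file   => $ARGV[0]);
my $result = $blast_report->next_result;
my $level  = $ARGV[1];

my (@hit_names, @hit_strings);
while (my $hit = $result->next_hit) {
    print $hit->name;
    push @hit_names, $hit->name;
    while (my $hsp = $hit->next_hsp) {
        if ($hsp->frac_identical >= $level) {
            # hit_string holds only the aligned part of the hit sequence
            #print $hsp->hit_string, "\n";
            push @hit_strings, $hsp->hit_string;
        }
    }
}

# Split names such as "ref|XM_822286.1|" on a literal pipe. An unescaped |
# is regex alternation and would split the name into single characters.
my @name_fields;
for my $name (@hit_names) {
    push @name_fields, split(/\|/, $name);
}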

Here, I am trying to use the BLAST output file to get the complete sequence
of each hit, but I could not get the complete sequence.

Input:


             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  661   TTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCC
720

Query  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  721   AATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACG
780

Query  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  781   AAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGT
840

Query  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  841   GGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTG
900

Query  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  901   AAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCT
960

Query  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005
             |||||||||||||||||||||||||||||||||||||||||||||
Sbjct  961   CCTCCACTAACCCCTTCGCAACAGGTTGCATTCCGTGGTTTTTAG  1005

>ref|XM_822286.1| Trypanosoma brucei TREU927 protein kinase A catalytic
subunit
isoform 2 (Tb09.211.2360) partial mRNA
Length=1011

 Score = 1622 bits (1798),  Expect = 0.0
 Identities = 944/974 (96%), Gaps = 0/974 (0%)
 Strand=Plus/Plus

Query  32    TGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
91
             |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  38    TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGC
97

Query  92    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
151
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  98    TAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATT
157

Query  152   ATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGA
211
             |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Sbjct  158   ATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGA
217

Query  212   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
271
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  218   ACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTT
277

Query  272   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
331
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  278   CCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTAT
337

Query  332   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
391
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  338   TTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGG
397

Query  392   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
451
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  398   AGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAAC
457

Query  452   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
511
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  458   CTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTA
517

Query  512   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
571
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  518   AGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGG
577

Query  572   TAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGT
631
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

It continues like this.

The output I got is:
ATGACGACAACTCCCACTGGTGATGGCCAACTGTTTACCAAGCCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCATGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGCTTAAATTCCCCAATTGGTTTGATGAGCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATAACGCCCCCATTGCCGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGAGATAAGGGTTCTCCTCCACTAACCCCTTCGCAACAGG
TTGCATTCCGTGGTTTTTAG

TGTTTACCAAACCTGACACATCGGGATGGAAGCTGAGTGACTTTGAAATGGGTGACACGCTAGGGACCGGCTCGTTTGGTCGCGTGCGCATTGCAAAACTGAAGAGCAGGGGGGAGTATTATGCAATAAAATGTCTAAAGAAGCGTGAGATACTAAAGATGAAGCAGGTACAACACCTGAACCAAGAGAAGCAAATTCTAATGGAGTTGTCACACCCCTTCATTGTGAACATGATGTGTTCCTTCCAGGATGAGAACCGCGTCTACTTTGTTCTAGAATTTGTGGTAGGTGGTGAGGTATTTACTCACCTTCGTTCCGCAGGCCGTTTCCCGAATGACGTAGCGAAGTTCTATCATGCGGAGCTTGTGTTGGCCTTTGAATATTTACACTCGAAGGACATTATCTACCGTGACTTGAAACCTGAGAATCTGCTACTTGATGGGAAGGGACACGTCAAGGTGACTGATTTTGGTTTTGCTAAGAAGGTGACGGATCGTACCTATACGTTATGTGGGACACCTGAGTATCTTGCACCTGAGGTAATTCAGAGCAAAGGACATGGGAAGGCTGTGGATTGGTGGACGATGGGTGTTTTGCTGTATGAATTCATAGCTGGCCATCCTCCCTTTTTTGATGAAACCCCAATTCGGACGTATGAAAAGATTCTTGCGGGCCGGTTCAAATTCCCCAATTGGTTTGACTCCCGTGCGCGGGATCTCGTAAAGGGTTTATTGCAAACGGATCACACGAAACGGTTGGGCACGCTGAAGGATGGCGTAGCTGATGTGAAGAATCACCCATTCTTCCGTGGTGCGAATTGGGAGAAACTCTATGGACGTCATTATCACGCTCCCATTCCTGTAAAAGTGAAGAGCCCCGGCGACACAAGTAACTTTGAGTCGTATCCCGAGAGTGGGGATAAGCGGTTGCCCCCGTTAGCACCATCACAACAATTGGAGTTCCGTGGGTTTTAG
GGATGATGACCGATTGTACCTCCTCCTCGAGTATGTGGTGGGTGGCGAGCTGT

TCTCCCACCTCCGGAAGGCGGGAAAATTCCCTAATGATGTAGCCAAGTTCTACTCCGCAGAAGTGGTTTTGGCGTTTGAATATATTCATGAGTGCGGCATCGTATACCGTGACTTGAAGCCAGAAAATGTGCTTTTGGACAAGCAGGGAAACATTAAGATTACGGACTTTGGGTTCGCGAAACGCGTTAGGGACAGAACGTACACGCTATGTGGGACTCCAGAGTATCTTGCGCCGGAGATAATCCAAAGTAAAGGTCACGATCGGGCTGTGGATTGGTGGACACTCGGAATTCTTCTCTATGAGATGCTTGTCGGTTATCCTCCTTTTTTCGACGAGAGTCCTTTTAGAACATACGAAAAAATTTTAGAGGGGAAACTTCAGTTTCCAAAGTGGGTGGAGATGCGGGCGAAGGACCTCATAAAGAGTTTTTTAACAATTGAACCAACGAAACG

i.e., it only gives the region where it could find the best alignment,
i.e., the best-hit regions.

I want the complete sequences, i.e., the sequences corresponding to the
accession numbers
XM_822292.1
XM_822286.1
XM_822694.1

The database used in Remote blast was RefSeq (refseq_rna); the organism
used was Trypanosoma brucei.

Can anyone please help me solve this problem?

Regards,
Roopa.
On Fri, Nov 20, 2009 at 12:30 PM, Roopa Raghuveer <rtbio.2009 at gmail.com> wrote:

>
> Hello Roy,
>
> Thanks a lot for your reply. My code is working for my sequence now.
>
> Thanks a lot.
>
> Regards,
> Roopa.
>
> On Thu, Nov 19, 2009 at 5:10 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>
>> Hi Roopa,
>>
>> I think that the -Organism parameter that you specify for
>> Bio::Tools::Run::RemoteBlast is ignored - I can't find any reference to it
>> in the documentation:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/Bio/Tools/Run/RemoteBlast.pm
>>
>> You have the correct approach in your code - limiting the search to the
>> Entrez query "Trypanosoma brucei[ORGN]", but the line is commented out. If
>> you uncomment the line (and add a semicolon afterwards), the program runs
>> correctly, but no hits are reported below your threshold e-value. If you
>> change the value of $e_val to 10 then some T.brucei hits are reported.
>>
>> Roy.
>>
>> Roopa Raghuveer wrote:
>>
>>> Hello everybody,
>>>
>>> I have a problem. I would like to use remote blast to find sequences
>>> matching an input sequence.
>>>
>>> Ex: I would like to search for sequences which match a Trypanosoma
>>> brucei sequence.
>>>
>>> I want the output to be only Trypanosoma brucei sequences matching
>>> my query. When I tried to use remote blast against the nr database, I
>>> got sequences from different organisms like E. coli, Pseudomonas, etc.
>>>
>>> Could you please tell me how this can be solved?
>>>
>>> My code is as follows.
>>>
>>> use Bio::Tools::Run::RemoteBlast;
>>>  use strict;
>>>  my $prog = 'blastn';
>>>  my $db   = 'nr';
>>>  my $e_val= '1e-10';
>>>  my $organism= 'Trypanosoma Brucei';
>>>
>>>  my @params = ( '-prog' => $prog,
>>>         '-data' => $db,
>>>         '-expect' => $e_val,
>>>         '-readmethod' => 'SearchIO',
>>>         '-Organism'   => $organism );
>>>
>>>  my $factory = Bio::Tools::Run::RemoteBlast->
>>> new(@params);
>>>
>>>  #change a parameter
>>>  #$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Trypanosoma
>>> brucei[ORGN]'
>>>
>>>  #remove a parameter
>>>  #delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};
>>>
>>>  my $v = 1;
>>>  #$v is just to turn on and off the messages
>>>
>>>  my $str = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' ,
>>> '-organism' => 'Trypanosoma Brucei' );
>>>
>>>  while (my $input = $str->next_seq()){
>>>    #Blast a sequence against a database:
>>>   my $r = $factory->submit_blast($input);
>>>    #my $r = $factory->submit_blast('amino.fa');
>>>
>>>    print STDERR "waiting..." if( $v > 0 );
>>>    while ( my @rids = $factory->each_rid ) {
>>>      foreach my $rid ( @rids ) {
>>>        my $rc = $factory->retrieve_blast($rid);
>>>        if( !ref($rc) ) {
>>>          if( $rc < 0 ) {
>>>            $factory->remove_rid($rid);
>>>          }
>>>          print STDERR "." if ( $v > 0 );
>>>         sleep 5;
>>>        }
>>>     else {
>>>          my $result = $rc->next_result();
>>>          #save the output
>>>          my $filename = $result->query_name()."\.out";
>>>          $factory->save_output($filename);
>>>          $factory->remove_rid($rid);
>>>          print "\nQuery Name: ", $result->query_name(), "\n";
>>>          while ( my $hit = $result->next_hit ) {
>>>            next unless ( $v > 0);
>>>            print "\thit name is ", $hit->name, "\n";
>>>            while( my $hsp = $hit->next_hsp ) {
>>>              print "\t\tscore is ", $hsp->score, "\n";
>>>            }
>>>          }
>>>        }
>>>      }
>>>    }
>>>  }
>>>
>>> My input sequence is
>>>
>>> >ref|NC_009512.1|:385-1902
>>> GTGTCAGTGGAACTTTGGCAGCAGTGCGTGGAGCTTCTGCGCGATGAACTGCCTGCCCAGCAATTCAACA
>>> CCTGGATCCGTCCGCTACAGGTCGAAGCCGAAGGCGACGAGTTGCGCGTCTATGCGCCTAACCGTTTCGT
>>> TCTCGATTGGGTCAATGAAAAGTACCTGGGTCGTTTGCTCGAGCTGTTGGGTGAGAACGGTAGCGGCATT
>>> GCACCAGCCCTTTCCTTATTAATAGGTAGCCGCCGCAGCTCGGCCCCAAGGGCTGCACCCAACGCGCCGG
>>> TCAGCGCTGCCGTTGCGGCTTCGCTGGCGCAGACTCAGGCGCACAAGACGGCCCCGGCAGCAGCGGTTGA
>>> ACCCGTTGCCGTGGCCGCGGCCGAGCCTGTATTGGTCGAGACGTCTTCGCGTGACAGCTTTGATGCCATG
>>> GCCGAGCCTGCTGCTGCGCCGCCCAGTGGTGGCCGGGCTGAACAGCGCACCGTGCAGGTTGAAGGTGCGC
>>> TCAAGCACACCAGTTACCTGAACCGGACCTTTACCTTTGACACCTTCGTCGAAGGTAAGTCGAACCAGCT
>>> CGCCCGCGCGGCTGCCTGGCAGGTTGCGGACAACCCTAAGCATGGCTACAACCCACTGTTCCTTTATGGC
>>> GGTGTGGGTTTGGGTAAAACCCACCTTATGCATGCTGTGGGTAACCATCTGCTGAAGAAGAATCCGAACG
>>> CCAAGGTGGTGTACCTGCATTCGGAGCGCTTCGTCGCGGACATGGTCAAAGCGTTGCAACTCAACGCCAT
>>> CAACGAATTCAAGCGCTTCTACCGCTCGGTGGACGCGTTGCTGATCGACGATATCCAGTTCTTCGCTCGC
>>> AAAGAGCGCTCGCAAGAAGAGTTTTTCCACACCTTCAACGCCTTGCTTGAGGGTGGCCAGCAGGTAATCC
>>> TTACCTCTGACCGCTATCCCAAGGAAATCGAAGGCCTGGAAGAGCGTCTGAAGTCGCGCTTTGGTTGGGG
>>> CCTGACGGTGGCTGTCGAGCCGCCAGAGCTGGAGACCCGCGTAGCGATCCTGATGAAGAAGGCCGACCAG
>>> GCCAAAGTCGAGCTCCCGCATGACGCAGCCTTTTTCATCGCTCAGCGCATCCGGTCCAACGTCCGTGAGC
>>> TGGAAGGTGCACTGAAGCGAGTTATTGCTCACTCGCACTTCATGGGGCGTGACATCACCATCGAGCTGAT
>>> TCGTGAATCGCTCAAGGATCTGTTGGCGCTGCAAGACAAACTGGTCAGTGTGGATAACATTCAGCGTACC
>>> GTCGCTGAGTACTACAAGATCAAGATCTCCGATCTGTTGTCCAAGCGTCGTTCGCGTTCTGTCGCGCGCC
>>> CGCGTCAGGTAGCCATGGCCCTGTCCAAGGAGTTGACCAACCACAGTCTGCCGGAAATCGGCGACATGTT
>>> CGGTGGTCGCGACCATACGACCGTGCTGCACGCCTGCCGCAAAATCAATGAACTGAAGGAATCCGACGCG
>>> GACATCCGCGAGGACTACAAGAACCTGCTGCGGACGCTGACGACCTGA
>>>
>>> Please mail me regarding any queries.
>>>
>>> Regards,
>>> Roopa.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>
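
Roy's suggestion quoted above (uncomment the ENTREZ_QUERY line and end it
with a semicolon) could look roughly like the sketch below. The %HEADER
package hash is the one already used in the quoted script; the helper names
entrez_organism_query and restrict_to_organism are made up for illustration,
and BioPerl is loaded lazily so the query helper can be tried without it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an Entrez organism filter such as "Trypanosoma brucei[ORGN]".
# (hypothetical helper, not part of BioPerl)
sub entrez_organism_query {
    my ($organism) = @_;
    return $organism . '[ORGN]';
}

# Restrict later RemoteBlast submissions to one organism by setting the
# ENTREZ_QUERY header, as in the commented-out line of the quoted script.
sub restrict_to_organism {
    my ($organism) = @_;
    require Bio::Tools::Run::RemoteBlast;    # needs BioPerl installed
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}
        = entrez_organism_query($organism);
    return $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'};
}

# Only touch BioPerl when an organism is given on the command line.
if (@ARGV) {
    print "ENTREZ_QUERY set to: ", restrict_to_organism("@ARGV"), "\n";
}
```

The header has to be set before submit_blast is called. As Roy notes, the
e-value threshold may also need relaxing before any hits are reported.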


From mauricio at open-bio.org  Fri Nov 20 16:15:22 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Fri, 20 Nov 2009 10:15:22 -0600
Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
In-Reply-To: <7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
References: <b9140daa0911181852n3a000da7x3fc7a5f0661f10f3@mail.gmail.com>
	<7761C2223DB54DE6B836F302D2FF6AC0@NewLife>
Message-ID: <4B06C09A.8060708@open-bio.org>

All OBF wikis and blogs have been upgraded and cleaned of the hack.
Thanks for the heads up!

Mauricio.

Mark A. Jensen wrote:
> Andrew-- thanks!! We're on it.
> MAJ
> ----- Original Message ----- From: "Andrew Grimm" 
> <andrew.j.grimm at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, November 18, 2009 9:52 PM
> Subject: [Bioperl-l] DANGER: hacking of bioperl wiki?
> 
> 
>> Caution: read the whole email before visiting the bioperl wiki
>>
>> I was doing some bioinformatics-related searching using google, and
>> one of the hits was to the bio dot perl dot org wiki (the FAQ in
>> particular).
>>
>> When I did that, I was redirected to a ferdax dot com web site (a
>> typo-squatting of fedex?).
>>
>> Some people reckon that ferdax hacks web sites and redirects google
>> hits from the victim web site to their own web site. For example, this
>> thread at google's webmaster central
>> http://www.google.com/support/forum/p/Webmasters/thread?tid=37a36c0d1ea99819&hl=en#all 
>>
>> (it's talking about zencart, but presumably they've since found other
>> victims)
>>
>> Just going to the website without using google may not trigger the 
>> redirect.
>>
>> Apologies if this is a false alarm, but I don't think it is.
>>
>> I won't be in contact between Friday and Monday Australian time (I'll
>> be at railscamp 6 in Melbourne), so I won't be able to answer any
>> replies.
>>
>> Thanks,
>>
>> Andrew Grimm


From David.Messina at sbc.su.se  Fri Nov 20 16:39:53 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 20 Nov 2009 17:39:53 +0100
Subject: [Bioperl-l] Remote blast
In-Reply-To: <c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
References: <c7cac1600911190655q7c4cf910x221e8a2ebac40f2a@mail.gmail.com>
	<4B056DF4.2030502@gmail.com>
	<c7cac1600911200330m5aafc21ehd5038707b4af9cdd@mail.gmail.com>
	<c7cac1600911200752h2dd48ca8r55877f5045948220@mail.gmail.com>
Message-ID: <7ECF627D-3DBF-4575-89CF-FA6348C88E8E@sbc.su.se>

Hi Roopa,

As far as I know, a BLAST report never contains the complete sequences of the hits. If it includes any part of the hit's sequence, it will be the part that matches the query.

You'll have to use the hit's ID or accession to get its complete sequence from somewhere else. You can use Bio::DB::GenBank to do that, for example.

See
http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
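
A minimal sketch of that approach, assuming BioPerl is installed and NCBI
is reachable. The accessions are the ones from the question; the script
name fetch_hits.pl and the helper names accession_from_hit_name and
fetch_full_sequences are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the accession.version field from an NCBI-style hit name such as
# "ref|XM_822286.1|". Returns undef when the name has no second field.
sub accession_from_hit_name {
    my ($name) = @_;
    my @fields = split /\|/, $name;    # escape the pipe: bare | is alternation
    return @fields >= 2 ? $fields[1] : undef;
}

# Fetch the full-length records and write them to a FASTA file.
# Needs BioPerl and network access, so the modules are loaded lazily.
sub fetch_full_sequences {
    my ($out_file, @accessions) = @_;
    require Bio::DB::GenBank;
    require Bio::SeqIO;
    my $gb  = Bio::DB::GenBank->new;
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');
    for my $acc (@accessions) {
        # A failed lookup may throw, so trap it and move on to the next one
        my $seq = eval { $gb->get_Seq_by_version($acc) };
        if ($seq) {
            $out->write_seq($seq);
        }
        else {
            warn "Could not retrieve $acc\n";
        }
    }
    return;
}

# Example: perl fetch_hits.pl hits.fa XM_822292.1 XM_822286.1 XM_822694.1
fetch_full_sequences(@ARGV) if @ARGV >= 2;
```

In a SearchIO loop, the hit name from $hit->name can be passed through
accession_from_hit_name to build the list of accessions to fetch.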


Dave


From alessandra.bilardi at gmail.com  Fri Nov 20 17:44:18 2009
From: alessandra.bilardi at gmail.com (Alessandra)
Date: Fri, 20 Nov 2009 18:44:18 +0100
Subject: [Bioperl-l] Bio::DB::EUtilities question
Message-ID: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>

Hi all,

I'm testing Bio::DB::EUtilities, the web agent which interacts with and
retrieves data from NCBI's eUtils. My perl script works, but only
if I call the get_Response function fewer than ~450 times; otherwise I
get this error message:

------------- EXCEPTION -------------
MSG: Response Error
Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
STACK Bio::DB::GenericWebAgent::get_Response
/usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
STACK toplevel ./wget4gbk.pl:77
-------------------------------------

wget4gbk.pl lines 76-77 are:
my $req = Bio::DB::EUtilities->new(-db => 'genome', -eutil =>
'esummary', -retmode => $mode, -rettype => $type, -id => $id);
my $entry = $req->get_Response;

I ran the perl script more than ten times, and this error appears at a
random point in the range of 300-600 requests. If I use another system to
request data, then I can do ~10000 requests without errors. Do I have to
set up the EUtilities object with particular parameters?

Can you help me with this random exception?

Best,

-- 
 Alessandra Bilardi, Ph. D.
----
 CRIBI, University of Padova, Italy
 http://www.linkedin.com/in/bilardi
----


From maj at fortinbras.us  Fri Nov 20 18:42:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 13:42:38 -0500
Subject: [Bioperl-l] gravatars on the wiki
Message-ID: <94431678F3764E8C9A49EA4D2FCD0DBD@NewLife>

Hi all, 
You can now reveal your Gravatar (http://www.gravatar.com) on the wiki, by including 
the following markup on the page:

 <winterPreWiki>
 {{#gravatar|youremail -at- yourplace -dot- tld}}
 </winterPreWiki>

You can do the antispam measure above, or use a regular email. Invalid emails throw an error.
http://bioperl.org/wiki/Gravatars 
Happy coding, 
MAJ


From roychu at gmail.com  Fri Nov 20 20:23:21 2009
From: roychu at gmail.com (Chu, Roy)
Date: Fri, 20 Nov 2009 12:23:21 -0800
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>

"sounds very much like you process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  Regarding asking their admins to
install, it seems to be a "they'll try to get to it but don't hold your
breath" situation.

Hmmm, I made some other attempts; installing 1.4.0 posed no problems.
I'm not a perl guru, so I tried to increase the build cache size from
the default, 10 MB, hoping that might be the problem, though I can't
imagine how, since I doubt the package versions differ that much in
size (though honestly, I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem, I guess after
the 'make' step, and lists the modules
BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
and typically gets killed here: '> Killed'
Next, I tried 1.6.0, then I get this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
Something EQUALLY appreciated would be a recommendation of a decent
enough hosting service where someone has had success installing
BioPerl.  I'd try to set up my Mac web sharing feature and then
set things up locally, but I haven't yet been able to get the port
forwarding feature working properly on the Apple AirPort Extreme,
which is perplexing.  Next, I might just try to install
via the Build.PL script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  Any heads-up about any false
assumptions on my part would be greatly appreciated.

Thanks in advance,
Roy



From cjfields at illinois.edu  Fri Nov 20 20:40:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 14:40:24 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <1D1B0987-3309-4281-BCE0-2737E4F0D0B1@illinois.edu>

BioPerl is pure perl.  If you believe all dependencies are installed, just unpack the dist to a specific directory and point PERL5LIB at it (for bash):

export PERL5LIB=/home/USER/bioperl/bioperl-live

Note that if you plan on doing the same for other bioperl-related modules (ex: bioperl-db) you'll need to add 'lib' to it, as they use a generic Module::Build now.

export PERL5LIB=/home/USER/bioperl/bioperl-db/lib

You can also add a 'use lib' directive in your scripts as well.  More at the following link:

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#USING_MODULES_NOT_INSTALLED_IN_THE_STANDARD_LOCATION
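
As a rough sketch of the 'use lib' variant: the path below simply mirrors
the PERL5LIB example above and is a placeholder, and module_available is a
made-up helper that only reports whether a module can be loaded from the
current search path.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder path: replace with wherever the distribution was unpacked.
use lib '/home/USER/bioperl/bioperl-live';

# Return 1 if the named module can be loaded from @INC, 0 otherwise.
# (hypothetical helper for checking the setup)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::Seq -> Bio/Seq.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print "Bio::Seq available: ",
    (module_available('Bio::Seq') ? 'yes' : 'no'), "\n";
```

Unlike PERL5LIB, the 'use lib' line travels with the script, which can be
convenient on a shared host where the shell environment is not under your
control.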

chris



From charles-listes+bioperl at plessy.org  Sat Nov 21 01:07:23 2009
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Sat, 21 Nov 2009 10:07:23 +0900
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com>
	<20091120104445.GG31318@kunpuu.plessy.org>
	<ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
Message-ID: <20091121010723.GA7786@kunpuu.plessy.org>

On Fri, Nov 20, 2009 at 07:00:45AM -0600, Chris Fields wrote:
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
> > 
> > DreamHost uses Debian, so you can suggest that they install the Debian
> > package.  If you are in contact with the tech service, do not hesitate to
> > tell them to contact me if they are interested in a backport of the 1.6.0
> > package. For version 1.6.1, it may be more difficult as it depends on perl
> > 5.10.1.
> 
> Any reason why this is so?  We specify compatibility back to 5.6.1.

Dear Chris,

you make a good point: although for building we need to either depend on perl
5.10.1 or package ExtUtils::Manifest separately, the resulting bioperl package
does not depend on such a high version. Therefore, there is no need for a
backport, and the latest Debian package can be installed on a Debian stable
(5.0/Lenny) system. I just checked the Dreamhost machine to which I happen to
have access, "waratahs", and it seems to be older, but it may nevertheless be
worth asking the admins (with the big drawback that they would have to
be asked for each update).

Have a nice week-end,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan


From robert.bradbury at gmail.com  Sat Nov 21 01:40:14 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 20 Nov 2009 20:40:14 -0500
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
Message-ID: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>

I run a Linux system which is in a gradual process of evolution from the
default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
Google's Chromium (IMO, perhaps the best so far).  Chromium allows one to
create a process per tab/URL so one can effectively track what it is doing.
 It also allows one to track the machine usage of these processes (through
the Developer > Task manager [shift-escape keyboard] option) which though
expensive in terms of overhead allows one to track offending windows (in
terms of memory or CPU use).  My processor recently jumped from a typical
700 MHz to 1.4 GHz speed (using the Linux Ondemand scheduler - which saves
~20 W at the wall outlet -- I've measured it) to the full tilt 2.8 GHz the
CPU is capable of.  Looking at the chrome task manager I was not surprised
to find the NY Times high on the list (they are pushing content, esp. using
Javascript) but much to my dismay the Jalview and Howto:Trees:Bioperl
appeared to be high on the list.  Now I am forced to ask myself *why* sites
which are simply distributing static information are eating up CPU on my
machine!  This is a fundamental flaw in the architecture of the sites --
wherein there should be conscious efforts to minimize user-CPU use (or avoid
Javascript entirely).  This would not be a problem if I were using Firefox
as I can easily use NoScript to block Javascript from non-approved sites.
 But it raises the question of when one should allow Javascript to run (one
would "normally" approve academic sites by default) when even the academic
sites are abusing my CPU.  There needs to be much greater awareness both on
the part of software distributors and software consumers that it is *MY* CPU
and *MY* Electricity and *MY* contribution to global warming.  And the
developers/distributors should not be sucking down those resources without
first saying "May I?" and I have the option of saying "No you may not."
 There is enough we can do productively (running low homology blast
searches) without engaging in endless wheel spinning of Javascripts or
looped GIFs.

Robert


From maj at fortinbras.us  Sat Nov 21 04:17:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:17:12 -0500
Subject: [Bioperl-l] ohlohers
Message-ID: <C003FAD20636489DBFB2D34F5955C68D@NewLife>

You can now add your Ohloh widgets and increase your carbon footprint with the less crufty:

 <winterPreWiki>
 {{#ohloh|acct_id|TYPE}}
 </winterPreWiki>

where TYPE is [Detailed|Rank|Tiny]. Taint checks aplenty.
MAJ


From maj at fortinbras.us  Sat Nov 21 04:33:02 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 20 Nov 2009 23:33:02 -0500
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
References: <4d7f3e450911200221m5f39ace2hb979712115fb9d78@mail.gmail.com><20091120104445.GG31318@kunpuu.plessy.org><ECA15E72-8026-4EA0-A4F6-7C09EA2BD040@illinois.edu>
	<4d7f3e450911201223w59cb308q5af7690a28697966@mail.gmail.com>
Message-ID: <9ECC66C2F23F47469AF0F07E3F9307FC@NewLife>

Maybe 'nightmarehost' is more appropriate. I've had no problems on AWS,
but this may not be exactly what you need. MAJ
----- Original Message ----- 
From: "Chu, Roy" <roychu at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Friday, November 20, 2009 3:23 PM
Subject: Re: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN


"sounds very much like your process was killed for prolonged execution
time, or memory usage. We have a daemon in place that monitors for
processes that take up too much of a shared web server's resources, and
this may have kicked in (and often does when trying to install packages
on a shared server)."

This was the explanation they had.  Regarding asking their admins to
install it, it seems to be a "they'll try to get to it, but don't hold
your breath" situation.

Hmmm, I tried some other things; installing 1.4.0 posed no problems.
I'm not a perl guru, so I tried increasing the build cache size from
the default, 10 MB, hoping that might be the problem, although I can't
imagine the package size differs that much between versions (honestly,
I haven't checked).
Whenever I try to install 1.6.1, it runs into a problem, I guess after
the 'make' step, and lists these files:
BioPerl-1.6.0/t/Variation/SeqDiff.t
BioPerl-1.6.0/t/Variation/SNP.t
BioPerl-1.6.0/t/Variation/Variation_IO.t
and typically gets killed here: '> Killed'

Next, I tried 1.6.0, then I get this:
"(I think you ran Build.PL directly, so will use CPAN to install
prerequisites on demand)
CPAN: Storable loaded ok (v2.12)
Going to read '/home/$username/.cpan/Metadata'
Killed" (everything prior works and it seems to get further along than
when I try to install 1.6.1)

Any insight into why this may be happening would be appreciated.
Something EQUALLY appreciated would be a recommendation of a decent
enough hosting service where someone has had success installing
Bio-Perl.  I'd try to set up my Mac web sharing feature and then try
to setup the stuff locally, but I haven't yet been able to
successfully get the port forwarding feature working properly on the
apple airport extreme--perplexing.  Next, I might just try to install
via the Build.PL script.

Hmm, checking the wiki, it seems I'll still be able to run remote
blast and use the basic seq modules, although some discrepancies and
idiosyncrasies may be expected?  Any heads-up about false
assumptions on my part would be greatly appreciated.

Thanks in advance,
Roy

On Fri, Nov 20, 2009 at 5:00 AM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Nov 20, 2009, at 4:44 AM, Charles Plessy wrote:
>
>> On Fri, Nov 20, 2009 at 02:21:54AM -0800, Chu, Roy wrote:
>>>
>>> Does anyone use dreamhost as a web hosting service? I'm just curious
>>> if anyone has had any luck installing the module as their daemon seems
>>> to kill my process whenever I try to install it. Dreamhost tech
>>> support attributes it to either exceeding the allocated memory cache
>>> or exceeding the processing time. I tried to nice the process, but
>>> that didn't help for me. Any luck or experience in resolving this
>>> would be much appreciated. I suppose my next attempt would be to try
>>> installing it directly and hope I don't need root...
>>
>> Dear Roy,
>>
>> DreamHost uses Debian, so you can suggest that they install the Debian package.
>> If you are in contact with the tech service, do not hesitate to tell them to
>> contact me if they are interested in a backport of the 1.6.0 package. For
>> version 1.6.1, it may be more difficult as it depends on perl 5.10.1.
>
> Any reason why this is so? We specify compatibility back to 5.6.1.
>
> Alex mentioned the reliance on the specific ExtUtils::Manifest version. The
> version requested has an important bug fix, is present on CPAN, and is 
> backwards-compatible to 5.6.1. It should be fairly easy to request that as a 
> separate package.
>
> A strict requirement for perl 5.10.1 doesn't make sense in that light, unless 
> said perl maintainer can enlighten us as to why this is an issue? This one may 
> require a ranty blog post.
>
>> PS: if you propose to install BioPerl as a feature in the Dreamhost panel, I
>> will vote for it :)
>>
>> Have a nice day,
>>
>> --
>> Charles Plessy
>> Debian Med packaging team,
>> http://www.debian.org/devel/debian-med
>> Tsurumi, Kanagawa, Japan
>
> chris

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Sat Nov 21 04:38:23 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 22:38:23 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
Message-ID: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>

Robert, 

Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in general) do not use JS, unless there is a specific addition I'm unaware of.  Now, the site wiki was recently 'parasited' for redirects, which may be the culprit, but this is now fixed.  Can you at least retest to see if this persists?

Anyone else know about this?

chris

On Nov 20, 2009, at 7:40 PM, Robert Bradbury wrote:

> I run a Linux system which is in a gradual process of evolution from the
> default Linux browsers (Galeon, Epiphany, etc.) through Firefox (better) to
> Google's Chromium (IMO, perhaps the best so far).  Chromium allows one to
> create a process per tab/URL so one can effectively track what it is doing.
> It also allows one to track the machine usage of these processes (through
> the Developer > Task manager [shift-escape keyboard] option) which though
> expensive in terms of overhead allows one to track offending windows (in
> terms of memory or CPU use).  My processor recently jumped from a typical
> 700 MHz to 1.4 GHz speed (using the Linux Ondemand scheduler - which saves
> ~20 W at the wall outlet -- I've measured it) to the full tilt 2.8 GHz the
> CPU is capable of.  Looking at the chrome task manager I was not surprised
> to find the NY Times high on the list (they are pushing content, esp. using
> Javascript) but much to my dismay the Jalview and Howto:Trees:Bioperl
> appeared to be high on the list.  Now I am forced to ask myself *why* sites
> which are simply distributing static information are eating up CPU on my
> machine!  This is a fundamental flaw in the architecture of the sites --
> wherein there should be conscious efforts to minimize user-CPU use (or avoid
> Javascript entirely).  This would not be a problem if I were using Firefox
> as I can easily use NoScript to block Javascript from non-approved sites.
> But it raises the question of when one should allow Javascript to run (one
> would "normally" approve academic sites by default) when even the academic
> sites are abusing my CPU.  There needs to be much greater awareness both on
> the part of software distributors and software consumers that it is *MY* CPU
> and *MY* Electricity and *MY* contribution to global warming.  And the
> developers/distributors should not be sucking down those resources without
> first saying "May I?" and I have the option of saying "No you may not."
> There is enough we can do productively (running low homology blast
> searches) without engaging in endless wheel spinning of Javascripts or
> looped GIFs.
> 
> Robert


From sdavis2 at mail.nih.gov  Sat Nov 21 05:11:34 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 20 Nov 2009 21:11:34 -0800
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
Message-ID: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>

On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Robert,
>
> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
> general) do not use JS, unless there is a specific addition I'm unaware of.
>  Now, the site wiki was recently 'parasited' for redirects, which may be the
> culprit, but this is now fixed.  Can you at least retest to see if this
> persists?
>
> Anyone else know about this?
>
>
The page in question does include javascript, it appears from the source.
 This is a function of using mediawiki, though, I believe and not something
specific to that page.

Sean


From cjfields at illinois.edu  Sat Nov 21 05:20:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 Nov 2009 23:20:37 -0600
Subject: [Bioperl-l] Excessive CPU use by various Bioperl sites
In-Reply-To: <264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
References: <deaa866a0911201740h63aeb09cma237064b7622f5ce@mail.gmail.com>
	<8163BC62-3F3E-4936-AAA9-61A4FB307C99@illinois.edu>
	<264855a00911202111u4b1f1020r4aa6e0e9b0ea61@mail.gmail.com>
Message-ID: <A7AC3865-3C9A-4C6E-85B5-349240C40680@illinois.edu>

On Nov 20, 2009, at 11:11 PM, Sean Davis wrote:

> On Fri, Nov 20, 2009 at 8:38 PM, Chris Fields <cjfields at illinois.edu> wrote:
> 
>> Robert,
>> 
>> Not sure why you're seeing that, but the HOWTO (and, AFAIK, the wiki in
>> general) do not use JS, unless there is a specific addition I'm unaware of.
>> Now, the site wiki was recently 'parasited' for redirects, which may be the
>> culprit, but this is now fixed.  Can you at least retest to see if this
>> persists?
>> 
>> Anyone else know about this?
>> 
>> 
> The page in question does include javascript, it appears from the source.
> This is a function of using mediawiki, though, I believe and not something
> specific to that page.
> 
> Sean

</sound of my hand slapping my forehead>

Sean, thanks for pointing that out.

chris


From robert.bradbury at gmail.com  Sat Nov 21 18:26:05 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Sat, 21 Nov 2009 13:26:05 -0500
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
Message-ID: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>

It sounds like NCBI may be counting frequency of requests, how much data
they send or something similar.  Are you delaying the time between fetches?
 The code I've seen typically sleeps for a few seconds each time around a
loop.  You might try longer delays between fetches and see if that gets you
any more data.
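
The "wait and try again" idea could be sketched roughly like this. It is
only an illustration: with_retry and its tries/delay options are made-up
names, not part of BioPerl, and the failing code ref stands in for a real
fetch such as a get_Response call.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper (not a BioPerl function): run $code, and if it dies,
# wait and try again, doubling the delay after each failure.
sub with_retry {
    my ($code, %opt) = @_;
    my $tries = defined $opt{tries} ? $opt{tries} : 3;
    my $delay = defined $opt{delay} ? $opt{delay} : 2;   # seconds
    for my $attempt (1 .. $tries) {
        my $result;
        my $ok = eval { $result = $code->(); 1 };
        return $result if $ok;
        die $@ if $attempt == $tries;   # out of attempts: rethrow the error
        sleep($delay);
        $delay *= 2;
    }
    return;
}

# Stand-in for a network fetch that fails twice, then succeeds.
my $calls = 0;
my $data  = with_retry(
    sub { die "Response Error\n" if ++$calls < 3; return "record" },
    tries => 5,
    delay => 0.1,
);
print "got '$data' after $calls calls\n";
```

BioPerl exceptions are ordinary die() calls underneath, so an eval block
like the one above should catch them; whether retrying actually helps
depends on why the server refused the connection.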

Alternatively perhaps the libraries aren't reusing the TCP/IP connection
properly.  Is there a difference between the amount of memory on the
machines?  Have you watched the size of the process to see if it grows over
time?  I think the bug which prevented me from fetching a not-so-large
genome from a few months ago (eating up 3GB of memory in the process) has
not been resolved.  If so that could be your problem.

Robert

On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
<alessandra.bilardi at gmail.com>wrote:
>
>
> I'm testing Bio::DB::EUtilities - webagent which interacts with and
> retrieves data from NCBI's eUtils. My perl script works but it works
> only if I request less than ~450 times get_Response function.. else I
> have got this error message:
>
> ------------- EXCEPTION -------------
> MSG: Response Error
> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
> STACK Bio::DB::GenericWebAgent::get_Response
> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
> STACK toplevel ./wget4gbk.pl:77
>


From cjfields at illinois.edu  Sat Nov 21 19:19:24 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 13:19:24 -0600
Subject: [Bioperl-l] Bio::DB::EUtilities question
In-Reply-To: <deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
References: <e0996aca0911200944r1162ceaew1bd846ade73d2841@mail.gmail.com>
	<deaa866a0911211026t2c8e1cafvac9440586dc32122@mail.gmail.com>
Message-ID: <837CE7E7-E625-4285-AD54-06FD168C0DF3@illinois.edu>

NCBI has specific rules about the repeated queries to its servers:

http://eutils.ncbi.nlm.nih.gov/#UserSystemRequirements

According to that, if you are making over 100 requests at peak times you will run into problems (they'll probably temp-block your IP), even though the allowed request rate is much higher now (it's 3 requests/second, whereas a year or two ago it was one request every 3 sec).  In general it's best to run something like this during off-hours.  

The actual limit on number of server requests is one specific part of Bio::DB::EUtilities that hasn't been added yet, but is tentatively planned.  
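
Until something like that exists, a script can pace itself. The following
is a minimal sketch that assumes the 3 requests/second figure mentioned
above; make_throttle is a made-up helper, not a Bio::DB::EUtilities
method, and the loop body is a placeholder for the real request.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: returns a closure that blocks until at least
# $min_interval seconds have passed since the previous call.
sub make_throttle {
    my ($min_interval) = @_;
    my $last = 0;
    return sub {
        my $wait = $last + $min_interval - time();
        sleep($wait) if $wait > 0;
        $last = time();
        return;
    };
}

# At most about 3 requests per second.
my $throttle = make_throttle(1 / 3);
for my $id (qw(id1 id2 id3)) {
    $throttle->();
    # The real eUtils request for $id would go here.
    print "would fetch $id\n";
}
```

Calling the closure right before each request keeps the pacing in one
place, so the interval can be raised for peak hours without touching the
rest of the script.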

chris

On Nov 21, 2009, at 12:26 PM, Robert Bradbury wrote:

> It sounds like NCBI may be counting frequency of requests, how much data
> they send or something similar.  Are you delaying the time between fetches?
> The code I've seen typically sleeps for a few seconds each time around a
> loop.  You might try longer delays between fetches and see if that gets you
> any more data.
> 
> Alternatively perhaps the libraries aren't reusing the TCP/IP connection
> properly.  Is there a difference between the amount of memory on the
> machines?  Have you watched the size of the process to see if it grows over
> time?  I think the bug which prevented me from fetching a not-so-large
> genome from a few months ago (eating up 3GB of memory in the process) has
> not been resolved.  If so that could be your problem.
> 
> Robert
> 
> On Fri, Nov 20, 2009 at 12:44 PM, Alessandra
> <alessandra.bilardi at gmail.com>wrote:
>> 
>> 
>> I'm testing Bio::DB::EUtilities - webagent which interacts with and
>> retrieves data from NCBI's eUtils. My perl script works but it works
>> only if I request less than ~450 times get_Response function.. else I
>> have got this error message:
>> 
>> ------------- EXCEPTION -------------
>> MSG: Response Error
>> Can't connect to eutils.ncbi.nlm.nih.gov:80 (connect: No route to host)
>> STACK Bio::DB::GenericWebAgent::get_Response
>> /usr/local/share/perl/5.10.0/Bio/DB/GenericWebAgent.pm:215
>> STACK toplevel ./wget4gbk.pl:77
>> 


From cjfields at illinois.edu  Sun Nov 22 02:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 21 Nov 2009 20:58:37 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
Message-ID: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>

Jason and I were recently interviewed (Wednesday!) about BioPerl for FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and Kirsten Sanford.  The interview is now available online, so get your favorite flavor (MP3, podcast) here:

http://twit.tv/floss96

Enjoy!

chris and jason


From adsj at novozymes.com  Sun Nov 22 12:37:40 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Sun, 22 Nov 2009 13:37:40 +0100
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu> (Chris
	Fields's message of "Sat, 21 Nov 2009 20:58:37 -0600")
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
Message-ID: <87aaye91m3.fsf@topper.koldfront.dk>

On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:

> Jason and I were recently interviewed (Wednesday!) about BioPerl for
> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
> Kirsten Sanford.

Great!

How about linking to it on bioperl.org?


  :-),

   Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From cjfields at illinois.edu  Sun Nov 22 20:30:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 22 Nov 2009 14:30:01 -0600
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <87aaye91m3.fsf@topper.koldfront.dk>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu>
	<87aaye91m3.fsf@topper.koldfront.dk>
Message-ID: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
> 
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
> 
> Great!
> 
> How about linking to it on bioperl.org?
> 
> 
>  :-),
> 
>   Adam
> 
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main page.  

Since this is the second such interview (Jason did one a few years back for PerlCast), I'm thinking we need a media page of some sort.

chris


From maj at fortinbras.us  Sun Nov 22 20:48:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 22 Nov 2009 15:48:39 -0500
Subject: [Bioperl-l] BioPerl on FLOSS Weekly
In-Reply-To: <2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
References: <05EB7AF4-8A20-4046-A585-FBF41EA8350A@illinois.edu><87aaye91m3.fsf@topper.koldfront.dk>
	<2F050081-8B44-4B4C-82D2-7AC71F156588@illinois.edu>
Message-ID: <247658CC6D9A4529B281F4482BD3E4BD@NewLife>

We do have http://www.bioperl.org/wiki/Category:BioPerl_Media --
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Adam Sjøgren" <adsj at novozymes.com>
Cc: <bioperl-l at bioperl.org>
Sent: Sunday, November 22, 2009 3:30 PM
Subject: Re: [Bioperl-l] BioPerl on FLOSS Weekly


On Nov 22, 2009, at 6:37 AM, Adam Sjøgren wrote:

> On Sat, 21 Nov 2009 20:58:37 -0600, Chris wrote:
>
>> Jason and I were recently interviewed (Wednesday!) about BioPerl for
>> FLOSS Weekly by Randal Schwartz, Leo Laporte, Marc Pelletier, and
>> Kirsten Sanford.
>
> Great!
>
> How about linking to it on bioperl.org?
>
>
>  :-),
>
>   Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com

Now posted via O|B|F News; I'll try to make that feed more prominent on the main 
page.

Since this is the second such interview (Jason did one a few years back for 
PerlCast), I'm thinking we need a media page of some sort.

chris


From jardim.rodrigo at gmail.com  Sun Nov 22 16:06:40 2009
From: jardim.rodrigo at gmail.com (Rodrigo Jardim)
Date: Sun, 22 Nov 2009 14:06:40 -0200
Subject: [Bioperl-l] Problems with Genbank Proteins File
Message-ID: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>

I am having a problem parsing a GenBank protein file. I think it is because
the fields in this file are in a different order. For example:

In most GenBank files:
========================
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi

In genbank protein files:
===================
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.

When I try to parse it, BioPerl aborts with an error message.

Any ideas?

Thanks all,

-- 
Atc,
Rodrigo Jardim
jardim.rodrigo at gmail.com


From biopython at maubp.freeserve.co.uk  Mon Nov 23 17:36:36 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 23 Nov 2009 17:36:36 +0000
Subject: [Bioperl-l] Problems with Genbank Proteins File
In-Reply-To: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
References: <cad526f60911220806td613367x255b8c6a0cebd1fc@mail.gmail.com>
Message-ID: <320fb6e00911230936ofb9d897rbd45abb73a361250@mail.gmail.com>

On Sun, Nov 22, 2009 at 4:06 PM, Rodrigo Jardim
<jardim.rodrigo at gmail.com> wrote:
> I am having a problem parsing a GenBank protein file. I think it is because
> the fields in this file are in a different order. For example:
>
> ...
>
> When I try to parse it, BioPerl aborts with an error message.
>
> Any ideas?

There are some important bits of information missing - what is the error
message, and what version of BioPerl are you using?
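
One way to collect both pieces of information is a small script along
these lines. It is a sketch that assumes a standard BioPerl install
(Bio::Root::Version and Bio::SeqIO); the two helper names are made up
for illustration.

```perl
use strict;
use warnings;

# Returns the installed BioPerl version, or undef if it cannot be loaded.
sub bioperl_version {
    my $v = eval { require Bio::Root::Version; $Bio::Root::Version::VERSION };
    return $v;
}

# Tries to read every record in a GenBank file.
# Returns (1, summary) on success or (0, error text) on failure.
sub try_parse_genbank {
    my ($file) = @_;
    my $count = 0;
    my $ok = eval {
        require Bio::SeqIO;
        my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
        while (my $seq = $in->next_seq) { $count++ }
        1;
    };
    return $ok ? (1, "parsed $count record(s)") : (0, "$@");
}

my $version = bioperl_version();
print "BioPerl version: ", (defined $version ? $version : "not found"), "\n";

# Pass the problem file as the first argument to capture the full error.
if (@ARGV) {
    my ($ok, $msg) = try_parse_genbank($ARGV[0]);
    print $ok ? "OK: $msg\n" : "FAILED:\n$msg\n";
}
```

Posting the version line and the complete FAILED text, together with the
offending record, usually gives the list enough to go on.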

Peter


From maj at fortinbras.us  Mon Nov 23 17:58:46 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 23 Nov 2009 12:58:46 -0500
Subject: [Bioperl-l] building samtools/Bio::DB::Sam on cygwin
Message-ID: <FD03906C0D074E1B8AFDB89A283E9FAB@NewLife>

Hi All--

I've had some hard-won success installing samtools and Lincoln's
Bio::DB::Sam under cygwin; thought some on the list would be able to
use my notes. (Yes, Jason, I'm working on Bio::Tools::Run::BWA...)


(To get the current samtools, ping
http://sourceforge.net/projects/samtools/files/samtools/0.1.7/samtools-0.1.7a.tar.bz2/download
)

* Getting samtools to make from scratch in cygwin

The following diff details the changes to the samtools Makefile I made
by hand. The key points are

-D_WIN32

and the additional variable LFLAGS and its interpolations. To get the
linker to see

libgcc libstdc++

I needed to add symlinks from /lib to the correct files in
/lib/gcc/i386-pc-cygwin/4.3.2/. Your gcc version may differ.


--- ../old/samtools-0.1.7a/Makefile 2009-11-16 10:13:43.000000000 -0500
+++ Makefile 2009-11-23 12:14:18.529000000 -0500
@@ -1,16 +1,18 @@
 CC=   gcc
 CFLAGS=  -g -Wall -O2 #-m64 #-arch ppc
-DFLAGS=  -D_FILE_OFFSET_BITS=64 -D_USE_KNETFILE -D_CURSES_LIB=1
+LFLAGS=         -lws2_32 -lgcc -lcygwin -lbz2 -lz -lstdc++
+DFLAGS=  -D_WIN32 -D_FILE_OFFSET_BITS=64 -D_CURSES_LIB=1
 LOBJS=  bgzf.o kstring.o bam_aux.o bam.o bam_import.o sam.o bam_index.o \
    bam_pileup.o bam_lpileup.o bam_md.o glf.o razf.o faidx.o knetfile.o \
    bam_sort.o sam_header.o
 AOBJS=  bam_tview.o bam_maqcns.o bam_plcmd.o sam_view.o \
    bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o \
    bamtk.o kaln.o

@@ -36,13 +38,13 @@
   $(AR) -cru $@ $(LOBJS)
 
 samtools:lib $(AOBJS)
-  $(CC) $(CFLAGS) -o $@ $(AOBJS) -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam
+  $(CC) $(CFLAGS) -o $@ $(AOBJS) -Xlinker --enable-auto-import -lm $(LIBPATH) $(LIBCURSES) -lz -L. -lbam $(LFLAGS)
 
 razip:razip.o razf.o knetfile.o
-  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz
+  $(CC) $(CFLAGS) -o $@ razf.o razip.o knetfile.o -lz -lm -lws2_32
 
 bgzip:bgzip.o bgzf.o
-  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz
+  $(CC) $(CFLAGS) -o $@ bgzf.o bgzip.o -lz -lm -lws2_32
 
 razip.o:razf.h
 bam.o:bam.h razf.h bam_endian.h kstring.h sam_header.h

* Getting Bio::DB::Sam to compile and install

Bio::DB::Sam requires not the samtools.exe, but the bam library
created during the samtools build, as well as all the samtools header
files. Create a symlink in /lib to libbam.a in the build directory (or
copy libbam.a up to /lib), and create symlinks or copy *.h into
/usr/include. Then in cygwin bash shell

$ cpan
cpan> install Bio::DB::Sam

should fly. 
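
For anyone who prefers to script the symlink step, here is a rough Perl
sketch. link_into is a made-up helper, the target directories are the
ones named above, and it assumes you have write access to them.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper: symlink each source file into $dest_dir, skipping
# names that already exist there. Returns the list of links created.
sub link_into {
    my ($dest_dir, @sources) = @_;
    my @made;
    for my $src (@sources) {
        my $dest = File::Spec->catfile($dest_dir, basename($src));
        next if -e $dest || -l $dest;
        symlink(File::Spec->rel2abs($src), $dest)
            or die "symlink $src -> $dest failed: $!";
        push @made, $dest;
    }
    return @made;
}

# Usage: perl link_sam.pl /path/to/samtools-build-dir
if (@ARGV) {
    my $build = shift @ARGV;
    link_into('/usr/include', glob("$build/*.h"));   # samtools headers
    link_into('/lib', "$build/libbam.a");            # the bam library
}
```

Existing files are left alone rather than overwritten, so rerunning it
after a rebuild in the same directory should be harmless.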

Hope someone finds this useful. These mods led me to a successful
Bio::DB::Sam install--have not yet checked original code based on
Bio::DB::Sam. If they don't work for you, reply to the list.

cheers, 
MAJ 


From jcline at ieee.org  Mon Nov 23 19:13:26 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 23 Nov 2009 13:13:26 -0600
Subject: [Bioperl-l] Installing Bio-perl on dreamhost via CPAN
In-Reply-To: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
References: <mailman.15.1258822805.21407.bioperl-l@lists.open-bio.org>
Message-ID: <4B0ADED6.8040901@ieee.org>

Dreamhost has terrible reliability.  I have stats going back years on a
standard dreamhost hosting account (non-dedicated server), and on some
days the web server doesn't respond.  Dreamhost service is OK for a
hobby blog; however, it is definitely *not* suitable for anything real.
Add in latency, arbitrary account limits/restrictions,  etc, and as a
hosting service, it is a bad idea to host a project there.   Although
some users apparently get lucky with server allocation and end up on a
"good server", the provider can change this at any time as well.  I
think more typically, the account users don't notice, since most are
simple bloggers.

Here's a data snip that illustrates the problem with a typical dreamhost
account:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2008-08-05     91.40     0.000     0.528     0.528     2.257     1.619
2008-08-04     89.13     0.002     0.301     0.301     1.302     0.971
2008-08-03     94.62     0.000     0.567     0.567     1.506     0.913
2008-08-02    100.00     0.000     0.335     0.335     1.475     1.079
2008-08-01    100.00     0.000     0.310     0.310     1.587     0.825
2008-07-31     93.55     0.023     0.386     0.386     1.280     0.759
2008-07-30    100.00     0.000     0.345     0.345     1.373     0.860
2008-07-29    100.00     0.000     0.358     0.358     1.335     0.757
2008-07-28    100.00     0.000     0.327     0.327     1.462     0.896
2008-07-27    100.00     0.000     0.292     0.292     1.410     0.966
2008-07-26    100.00     0.000     0.283     0.283     1.280     0.815
2008-07-25    100.00     0.000     0.297     0.297     1.231     0.853
2008-07-24    100.00     0.000     0.362     0.362     1.258     0.699
2008-07-23    100.00     0.000     0.339     0.339     1.270     0.785

----------------------------------------------------------------------
minimum        89.13     0.000     0.283     0.283     1.231     0.699
maximum       100.00     0.023     0.567     0.567     2.257     1.619
average        97.76     0.002     0.359     0.359     1.430     0.914
----------------------------------------------------------------------


Or this month:

----------------------------------------------------------------------
date          uptime       dns   connect   request      ttfb      ttlb

2009-11-11    100.00     0.011     0.097     0.097     1.260     1.638
2009-11-10    100.00     0.008     0.094     0.094     1.285     1.647
2009-11-09    100.00     0.008     0.094     0.094     1.494     1.872
2009-11-08    100.00     0.015     0.101     0.101     1.509     1.894
2009-11-07    100.00     0.006     0.092     0.092     1.453     1.831
2009-11-06    100.00     0.011     0.097     0.097     1.500     1.882
2009-11-05     97.80     0.012     0.097     0.097     1.445     1.806
2009-11-04    100.00     0.010     0.096     0.096     1.235     1.605
2009-11-03     95.65     0.007     0.093     0.093     1.266     1.612
2009-11-02    100.00     0.010     0.096     0.096     1.267     1.637
2009-11-01    100.00     0.007     0.093     0.093     1.311     1.692
2009-10-31    100.00     0.009     0.095     0.095     1.225     1.594
2009-10-30    100.00     0.009     0.095     0.095     1.364     1.739
2009-10-29    100.00     0.017     0.103     0.103     1.121     1.505

----------------------------------------------------------------------
minimum        95.65     0.006     0.092     0.092     1.121     1.505
maximum       100.00     0.017     0.103     0.103     1.509     1.894
average        99.53     0.010     0.096     0.096     1.338     1.711
----------------------------------------------------------------------


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From cjfields at illinois.edu  Tue Nov 24 03:19:02 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 23 Nov 2009 21:19:02 -0600
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
Message-ID: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>

Okay, so I think it's feasible to add this into trunk.  I like the idea of optionally having a log class; if someone comes up with a nice way of adding it in, I would be for it.

chris

On Nov 20, 2009, at 4:15 AM, Dave Messina wrote:

> Chris, I took a look at how you implemented this in Biome -- very nice!
> 
> 
>> I like this verbose/strict separability a lot. Should we go for it?
> 
> Me too. So yes, I think so.
> 
> 
>> We could even allow finer-grained control of verbosity (states which cover all combinations) w/o affecting strictness.
> 
> 
> Perhaps this is a job for Log::Log4perl or Log::Dispatch?
> http://search.cpan.org/~mschilli/Log-Log4perl-1.25/lib/Log/Log4perl.pm
> http://search.cpan.org/~drolsky/Log-Dispatch-2.26/lib/Log/Dispatch.pm
> 
> 
> That might be overkill, though.
> 
> Dave
> 


From David.Messina at sbc.su.se  Tue Nov 24 16:18:22 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 24 Nov 2009 17:18:22 +0100
Subject: [Bioperl-l] verbosity and error stack,
	was  accessing EMBL database
In-Reply-To: <167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
References: <475F74057A618245A773CD325E105D1E033334AC@phy-srv01.physiol.physiology.wisc.edu><E10EF917D9914031BCFDF8BDDBFA4F13@NewLife><B6E6CEFF-14E7-48D6-B9D8-9C114F166190@sbc.su.se><FADF827A6CE34C959062F2D93849E15A@NewLife>
	<B335150B-7479-4D31-BA0A-9F139E2CCE0E@illinois.edu>
	<D72A208491F04DBF9B3F7F10D86A9931@NewLife>
	<3277368F-615A-4DD3-B9B3-5D32A5EEEE98@sbc.su.se>
	<167D2408-653E-4DF5-BCD7-134CE2549E44@illinois.edu>
Message-ID: <3FD2086D-062F-4706-9DC8-2A53224C4913@sbc.su.se>

> I like the idea of optionally having a log class, if someone comes up with a nice way of adding it in I would be for it.

My suggestion of the logging modules was actually to handle the various levels of verbose output -- I think both of the ones I mentioned "log" to STDERR by default.

But of course a nice side effect of using such a logging module is that it would allow optional logging to a file, too.

Dave
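
A minimal sketch of the idea discussed in this thread, using only core Perl rather than Log::Log4perl or Log::Dispatch: verbosity levels are handled by one small closure that writes to STDERR by default and can optionally be pointed at a file handle. The names make_logger, level and fh are hypothetical and are not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl API: returns a logging closure.
# Messages at or below the chosen level are printed; others are dropped.
my %LEVEL = ( error => 0, warn => 1, info => 2, debug => 3 );

sub make_logger {
    my (%args) = @_;
    my $max = $LEVEL{ $args{level} // 'warn' };
    my $fh  = $args{fh} // \*STDERR;    # STDERR unless a handle is given
    return sub {
        my ( $level, $msg ) = @_;
        return 0 if $LEVEL{$level} > $max;
        print {$fh} "[$level] $msg\n";
        return 1;
    };
}

# Log to an in-memory buffer here; a real file handle works the same way.
my $buffer = '';
open( my $log_fh, '>', \$buffer ) or die "cannot open buffer: $!";
my $log = make_logger( level => 'info', fh => $log_fh );

$log->( 'warn',  'sequence has no accession' );
$log->( 'debug', 'this line is filtered out' );
close $log_fh;
print $buffer;
```

Because the level check lives in the logger, it stays independent of whatever decides whether a warning should be fatal, which is the verbose/strict separation mentioned above.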


From paolo.pavan at gmail.com  Tue Nov 24 19:28:09 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Tue, 24 Nov 2009 20:28:09 +0100
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
Message-ID: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>

Dear all,
I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
As documented in the pod, the run(@seqs) method returns the cap3 report file,
whereas I expected it to return a Bio::Assembly object, consistent with other
Bio::Tools::Run classes.
I worked around this by getting the location and names of the temp output
files from the factory object (admittedly by accessing a private property)
and reading them via the Assembly::IO system.
I was just wondering what the properly designed way to do this job is.

Thank you for enlightening me!
Paolo


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 22:04:31 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:04:31 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>

Is there any way to pass a filename to Bio::DB::Fasta for the location of where to write the directory.index?
It's writing in the same dir as the fasta but I'd rather have it write in /tmp as it's part of a web app.

Thanx,

Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Tue Nov 24 22:21:52 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 25 Nov 2009 11:21:52 +1300
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>

That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended putting the filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
> 
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
> 
> 
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >


From maj at fortinbras.us  Tue Nov 24 22:18:51 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 17:18:51 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
Message-ID: <4296CD1039FC44B89034A1FD3E6721F3@NewLife>

The code (method index_dir() ) seems to expect all the fasta files to be 
contained in that directory. Looks hairy; what about creating symlinks to your 
fasta files in a /tmp subdir and calling new() with that subdir?
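
A rough sketch of the symlink approach suggested above, assuming a platform where symlink() is available. The helper name link_into_tmpdir and the file names are made up for illustration; the Bio::DB::Fasta call is left as a comment because it needs BioPerl installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper: link each FASTA file into a fresh temporary
# directory and return that directory's path.
sub link_into_tmpdir {
    my (@fasta_files) = @_;
    my $dir = tempdir( 'fasta_index_XXXXXX', TMPDIR => 1, CLEANUP => 1 );
    for my $file (@fasta_files) {
        my $link = File::Spec->catfile( $dir, basename($file) );
        symlink( File::Spec->rel2abs($file), $link )
          or die "cannot symlink $file: $!";
    }
    return $dir;
}

# Demonstration with a throwaway FASTA file.
my $src_dir = tempdir( CLEANUP => 1 );
my $fasta   = File::Spec->catfile( $src_dir, 'test.faa' );
open( my $out, '>', $fasta ) or die "cannot write $fasta: $!";
print {$out} ">seq1\nMKV\n";
close $out;

my $index_dir = link_into_tmpdir($fasta);
print "index directory: $index_dir\n";

# With BioPerl installed, the index should then be created next to the links:
# my $db = Bio::DB::Fasta->new($index_dir);
```

CLEANUP => 1 removes the temporary directory when the program exits, so a long-running web app would probably want a fixed directory instead.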
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'bioperl-l'" <bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:04 PM
Subject: [Bioperl-l] Bio::DB::Fasta


> Is there any way to pass a filename to Bio::DB::Fasta for the location of 
> where to write the directory.index?
> It's writing in the same dir as the fasta but I'd rather have it write in /tmp 
> as it's part of a web app.
>
> Thanx,
>
> Russell


From florent.angly at gmail.com  Tue Nov 24 22:54:48 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Tue, 24 Nov 2009 14:54:48 -0800
Subject: [Bioperl-l] Bio::Tools::Run::Cap3 usage question
In-Reply-To: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
References: <56be91b60911241128s52613a56u99e5b1cb3ba8d19a@mail.gmail.com>
Message-ID: <4B0C6438.8070405@gmail.com>

Hi Paolo,

It turns out that there is no standard for what is to be passed to the 
Bio::Tools::Run wrappers and returned by them. I noticed the 
inconsistency between the assembly wrappers recently while implementing 
support for new wrappers. I implemented initial support for additional de 
novo assembly programs in BioPerl (454 Newbler and Minimo) a couple of 
weeks ago, and Mark Jensen added support for Maq, a program that 
assembles reads against a reference. In the process, all the assembly 
wrappers were changed to take the same type of input data (a FASTA 
sequence or an array reference of sequence objects) and return one of 
the following:
    * a Bio::Assembly::Scaffold object (the default), or
    * a Bio::Assembly::IO object, or
    * the name of a file for the output of the assembler
Use the out_type method to set up which output you want, e.g.:
    $factory->out_type('Bio::Assembly::IO');
or
    $factory->out_type('cap3_results.ace');
You'll have to use the code from the bioperl-run Subversion repository if you
want to use these new features.

Cheers,

Florent


Paolo Pavan wrote:
> Dear all,
> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
> As documented in the pod, the run(@seqs) method returns the cap3 report file,
> whereas I expected it to return a Bio::Assembly object, consistent with other
> Bio::Tools::Run classes.
> I worked around this by getting the location and names of the temp output
> files from the factory object (admittedly by accessing a private property)
> and reading them via the Assembly::IO system.
> I was just wondering what the properly designed way to do this job is.
>
> Thank you for enlightening me!
> Paolo


From roychu at gmail.com  Tue Nov 24 23:00:58 2009
From: roychu at gmail.com (Roy)
Date: Tue, 24 Nov 2009 15:00:58 -0800
Subject: [Bioperl-l] Remote Blast - same script but different results
Message-ID: <4d7f3e450911241500y7df305acq1d03819ea1ec7d3e@mail.gmail.com>

Hi bioperl community,

I've tried searching the old lists to see if this topic has been
covered, and perhaps this question arises from my own lack of
familiarity with BLAST, but with the perl script listed below I get
different results from remote blast from one run to the next (that is, I
will either get hits or no hits at all).  I'll call the script one
time and get no hits, then call it again (with the same
parameters) and get the same several hits that I may have gotten on an
earlier run.  I use a subroutine to parse the blast
report information, and a boolean to indicate whether
results are returned or not.  Any insight into what I may have missed
would be appreciated.  Short question: is this behavior typical?  My
understanding of how BLAST works is that it shouldn't be...


Thanks in advance,
Roy

#!/usr/bin/perl -w

use strict;
use warnings;
use Carp;
use Bio::Perl;
use CGI;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;
use Bio::Restriction::Analysis;
use Bio::Tools::Run::RemoteBlast;

use Bio::SimpleAlign;
use Bio::AlignIO;
use Bio::LocatableSeq;

my $five_seqobj = Bio::Seq->new(
		-seq		=>	'ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGCCAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCGAGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG',
		-display_id	=>	'genomic_a',
		-alphabet 	=>	'dna',
	);
my $three_seqobj = Bio::Seq->new(
		-seq		=>	'GTGAGTGCGCGGCCGCTCTGCGGGCGCAGAGGGAGCGGGAGGGAGCCGGCGGCACGAGGTTGGCCGGGGCAGCCTGGGCCTAGGCCAGAGGGAGGGCAGCCACAGGGTCCAGGGCGAGTGGGGGGATTGGACCAGCTGGCGGCCCCTGCAGGCTCAGGATGGGGGGCGCGGGATGGAGGGGCTGAGGAGGGGGTCTCCGGAGCCTGCCTC',
		-display_id	=>	'genomic_b',
		-alphabet 	=>	'dna',
	);

my @params = (
'-program' => 'blastn',
'-database' => 'refseq_genomic',
'-expect' => '10',
'-readmethod' => 'blastxml'
);
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$Bio::Tools::Run::RemoteBlast::HEADER{'PERC_IDENT'} = 75;
$Bio::Tools::Run::RemoteBlast::HEADER{'FORMAT_TYPE'} = 'XML';
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'HITLIST_SIZE'} = 100; # Put: limit number of hits

my $factory_a = Bio::Tools::Run::RemoteBlast->new(@params);
$factory_a->retrieve_parameter('FORMAT_TYPE', 'XML');

my $hits_a;
my $hits_b;

my $r;
my $bool_hit;
print "Submitting BLAST query - 5' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$r = $factory_a->submit_blast($five_seqobj);
$bool_hit = fetch_blast_report($factory_a);
unless ($bool_hit) {
	print "\nNo hits\n";
	print "Re-submitting BLAST query - 5' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_a->submit_blast($five_seqobj);
	($bool_hit, $hits_a) = fetch_blast_report($factory_a);
	if ($bool_hit == 0) { print "No hits\n"; }
	sleep 5;
}

my $factory_b = Bio::Tools::Run::RemoteBlast->new(@params);
print "\n--------------------------------------------------\n\n";
print "Submitting BLAST query - 3' end (MEGABLAST = YES)\n";
$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'YES';
$r = $factory_b->submit_blast($three_seqobj);
$bool_hit = fetch_blast_report($factory_b);
unless ($bool_hit) {
	print " No hits\n";
	print "Re-submitting BLAST query - 3' end (MEGABLAST = NO)\n";
	sleep 5;
	$Bio::Tools::Run::RemoteBlast::HEADER{'MEGABLAST'} = 'NO';
	$r = $factory_b->submit_blast($three_seqobj);
	($bool_hit, $hits_b) = fetch_blast_report($factory_b);
	if ($bool_hit == 0) { print " No hits\n"; }
	sleep 5;
}

print "\nbye\n\n";

print "$hits_a\n$hits_b\n";

exit;

sub fetch_blast_report {
	my ($factory) = @_;
	my $v = 1;
	my $bool_hit = 0;
	my $hits = '';
	
	print STDERR "waiting...";
	while (my @rids = $factory->each_rid) {
		foreach my $rid (@rids) {
			print STDERR ".";
			my $rc = $factory->retrieve_blast($rid);
			# retrieves blast report from remote blast queue,
			# returns -1 on error, 0 on 'job not finished', Bio::SearchIO object
			# args, remote blast id (rid)
			if (!ref($rc)) {
				# if not empty string, ref EXPR returns a non-empty string if EXPR is a reference
				if ($rc < 0) {
					$factory->remove_rid($rid);
				}
				print STDERR "." if ($v > 0);
				# is this printing out as multiple dots? when and why?
				sleep 5;
			} else {
				$bool_hit = 1;
				my $result = $rc->next_result();
				unless ($result->num_hits > 0) {
					$bool_hit = 0;
				}
				# returns: Bio::Search::Result::ResultI object
				$factory->remove_rid($rid);
				print "\ndatabase:\t", $result->database_name,"\n";
				print "query name:\t", $result->query_name,"\n";
				print "query length\t", $result->query_length,"\n";
				print "num hits\t", $result->num_hits,"\n";
				if ($result->num_hits) {
					# $result->hits returns an array of hits
					# $results->no_hits_found, boolean vs $#{@hits} ie. filtering\
					while (my $hit = $result->next_hit) {
					
					print "\nhit name:\t", $hit->name,"\n";	
					print "description:\t", $hit->description,"\n";	
					print "locus:\t", $hit->locus,"\n";	
					print "algorithm: ", $hit->algorithm, "\thit length: ", $hit->length,
						"\thit ranking: ", $hit->rank, "\n";
					while (my $hsp = $hit->next_hsp) {
						print "evalue: ", $hsp->evalue, "\tscore: ", $hsp->score,
							"\tpercent_id: ", $hsp->percent_identity, "\n";
						print "query_start: ", $hsp->query->start, "\tquery_end: ", $hsp->query->end;
						print "\tquery_length: ", $hsp->query->length,
							"\tquery_strand: ", $hsp->strand('query'), "\n";
						print "subject_start: ", $hsp->subject->start, "\tsubject_end: ", $hsp->subject->end;
						print "\tsubject_length: ", $hsp->subject->length,
							"\tsubject_strand: ", $hsp->strand('subject'), "\n\n";
						my $aln = $hsp->get_aln;
						if ($aln->is_flush) {
							foreach my $seq ($aln->each_seq) {
								print $seq->seq,"\n";
							}
							print $aln->gap_line, "\n";
							print $aln->consensus_string(95), "\n\n";
						}

						$hits .= $hit->name."\t".$hsp->subject->start."\t".$hsp->subject->end."\t".$hsp->strand('subject')."\n";
					}
				}		
			}
		}
	}
	return ($bool_hit, $hits);
}
}
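
As the script is posted, the closing braces put `return ($bool_hit, $hits)` inside the outer `while` loop, so the subroutine may return after a single pass over the RIDs, possibly before the job has finished; that would be one plausible reason for a run reporting no hits. Below is a sketch of a loop that keeps polling until every RID is resolved. MockFactory is a hypothetical stand-in for the remote factory and only imitates the return convention described in the script's comments (-1 on error, 0 while unfinished, a reference when done); wait_for_reports is likewise an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the remote factory: each RID reports
# "not finished" (0) a fixed number of times before yielding a result.
package MockFactory;
sub new {
    my ( $class, %pending ) = @_;
    return bless { pending => {%pending} }, $class;
}
sub each_rid   { return sort keys %{ $_[0]{pending} } }
sub remove_rid { delete $_[0]{pending}{ $_[1] } }
sub retrieve_blast {
    my ( $self, $rid ) = @_;
    return 0 if $self->{pending}{$rid}-- > 0;    # job still running
    return { rid => $rid, hits => 3 };           # finished: a report
}

package main;

sub wait_for_reports {
    my ( $factory, $delay ) = @_;
    my @reports;
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                $factory->remove_rid($rid) if $rc < 0;    # give up on errors
                sleep $delay if $delay;                   # not ready: wait
                next;
            }
            push @reports, $rc;
            $factory->remove_rid($rid);
        }
    }
    return @reports;    # reached only once no RIDs remain
}

my $factory = MockFactory->new( RID_A => 2, RID_B => 0 );
my @reports = wait_for_reports( $factory, 0 );
print scalar(@reports), " reports retrieved\n";
```

The key difference from the posted subroutine is that the return statement sits after the `while` loop, so a "not finished" answer leads to another polling pass instead of an early exit.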


From maj at fortinbras.us  Wed Nov 25 04:12:13 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 24 Nov 2009 23:12:13 -0500
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
References: <18DF7D20DFEC044098A1062202F5FFF32B63085409@exchsth.agresearch.co.nz>
	<4296CD1039FC44B89034A1FD3E6721F3@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B6308542C@exchsth.agresearch.co.nz>
Message-ID: <3ECFA0236D1B467181EE63C8C6BE7E1F@NewLife>

I seem to be able to do
$db = Bio::DB::Fasta->new("$tmp/test.faa");
without a problem- something in the mixing of named and unnamed parameters?
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'bioperl-l'" 
<bioperl-l at bioperl.org>
Sent: Tuesday, November 24, 2009 5:21 PM
Subject: RE: [Bioperl-l] Bio::DB::Fasta


That's what I ended up doing.
Also, there's no "obvious" way to index a single file so I ended putting the 
filename in the glob parameter.

my $db = Bio::DB::Fasta->new( "$tmp", -glob => "test.faa", -reindex => 1 );

--Russell


> -----Original Message-----
> From: Mark A. Jensen [mailto:maj at fortinbras.us]
> Sent: Wednesday, 25 November 2009 11:19 a.m.
> To: Smithies, Russell; 'bioperl-l'
> Subject: Re: [Bioperl-l] Bio::DB::Fasta
>
> The code (method index_dir() ) seems to expect all the fasta files to be
> contained in that directory. Looks hairy; what about creating symlinks to
> your
> fasta files in a /tmp subdir and calling new() with that subdir?
> ----- Original Message -----
> From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
> To: "'bioperl-l'" <bioperl-l at bioperl.org>
> Sent: Tuesday, November 24, 2009 5:04 PM
> Subject: [Bioperl-l] Bio::DB::Fasta
>
>
> > Is there any way to pass a filename to Bio::DB::Fasta for the location
> of
> > where to write the directory.index?
> > It's writing in the same dir as the fasta but I'd rather have it write
> in /tmp
> > as it's part of a web app.
> >
> > Thanx,
> >
> > Russell


From maj at fortinbras.us  Wed Nov 25 17:25:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 12:25:30 -0500
Subject: [Bioperl-l] question for all regarding a sam-based Bio::Assembly::IO
Message-ID: <1E72D5B0A190448FA27545DB5B68638D@NewLife>

Short-readers, 

I'm working on an Assembly::IO class for sam alignments.
I'm currently making a decision about handling multiple reference sequences:
would you prefer that next_assembly() return an assembly that covers all reference
sequences, or that next_assembly iterates over each reference sequence?
(Or both?)

thanks for your input-
MAJ


From timbourine81 at gmail.com  Wed Nov 25 17:40:52 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:40:52 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
	new file
Message-ID: <4B0D6C24.2080308@gmail.com>

Dear bioperl users,

I am a real newbie and have - maybe a very trivial - question.

I searched the mailing list archive and many howtos but I have not found
a concrete answer to my problem. So hopefully you can help me :)

Background: I use the latest Bioperl version (installed two weeks
ago).
When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
containing several sequences, I get a BLAST output with many queries,
each having several hits / sbjcts.

My problem is how to parse *all* hits of *one* query into a single new
file, and to do this for every query in my BLAST output file.

Or is it better the other way round: first make fasta files with only
a single sequence each and BLAST each file? But how can I automate that
using Bioperl?

I tried Bio::SearchIO but can only parse all queries and their
respective hits into a single file...
I think iteration is also necessary here, but I do not really know how
to include that in Bio::SearchIO.
Or do I have to use the module Bio::Index::Blast?

I can index a file (see below), but I have no idea what comes next...

###How I index a file...

#!/usr/bin/perl -w

$ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";

use Bio::Index::Fasta;


$file_name = "8_to_BLAST_two_seq_index.fasta";
$id = "48882";
$inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
-write_flag => 1);
$inx->make_index($file_name);


Hopefully, you can give me at least hints what to look for.

A big THANKS in advance!

Cheers,

Tim


From timbourine81 at gmail.com  Wed Nov 25 17:53:34 2009
From: timbourine81 at gmail.com (Tim)
Date: Wed, 25 Nov 2009 18:53:34 +0100
Subject: [Bioperl-l] How to parse different (fasta) files
Message-ID: <4B0D6F1E.8@gmail.com>

Hey everybody,

another question from me...if you do not mind :)

My situation is like this: I have parsed a standalone BLAST output using
SearchIO, keeping only the hit names. Now I have a second fasta file with
the same sequences as in the BLAST database, but including an alignment
(meaning "." and "-"). (Unfortunately, there is no way to make a BLAST
database from fasta files that include the alignment...)
My intention is now to take the names of the hit sequences (BLAST output),
get the corresponding aligned sequences (fasta file incl.
alignment) and put them in a new file.

Is there anybody out there who has tried that before?

Again, I am an absolute greenhorn in using (Bio)perl. Maybe it is very
simple :D

Looking forward to your answer.

All the best,

Tim
-- 
Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From maj at fortinbras.us  Wed Nov 25 18:20:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 13:20:03 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
	innew file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>

hey Tim--

Sounds like you need to go about collecting your queries inside out:

my %hits_by_query;
for my $hit ($result->hits) {
  push @{$hits_by_query{$result->query_name}}, $hit;
}

I believe now each hash element, keyed by the query name, will contain
an arrayref to the set of hits assoc with that query.
From here, I believe

use Bio::Search::Result::BlastResult;
use Bio::SearchIO;

foreach my $qid ( keys %hits_by_query ) {
  my $result = Bio::Search::Result::BlastResult->new();
  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
  my $blio = Bio::SearchIO->new( -file => ">$qid.bls", -format=>'blast' );
  $blio->write_result($result);
}

will do what you want.

hope this helps -
Mark
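
The grouping step can be illustrated without BioPerl. The sketch below assumes the (query, hit) pairs have already been pulled out of the report; the names in @pairs are placeholders, not real results.

```perl
use strict;
use warnings;

# Placeholder (query, hit) pairs standing in for parsed BLAST results.
my @pairs = (
    [ 'query1', 'hitA' ],
    [ 'query2', 'hitB' ],
    [ 'query1', 'hitC' ],
);

# Hash of array references: one list of hits per query name.
my %hits_by_query;
for my $pair (@pairs) {
    my ( $query, $hit ) = @$pair;
    push @{ $hits_by_query{$query} }, $hit;
}

# One report per query; a real script would open "$query.txt" instead
# of building strings.
my %report;
for my $query ( sort keys %hits_by_query ) {
    $report{$query} = join( "\n", @{ $hits_by_query{$query} } ) . "\n";
    print "== $query ==\n$report{$query}";
}
```

The push onto `@{ $hits_by_query{$query} }` relies on autovivification, so no list has to be created by hand the first time a query name is seen.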

----- Original Message ----- 
From: "Tim" <timbourine81 at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 25, 2009 12:40 PM
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query innew 
file


> Dear bioperl users,
>
> I am a real newbie and have - maybe a very trivial - question.
>
> I searched the mailing list archive and many howtos but I have not found
> a concrete answer to my problem. So hopefully you can help me :)
>
> Background: I use the latest Bioperl version (installed it two weeks
> before).
> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> including different sequences, I get a BLAST output with many queries
> each having several hits / sbjcts.
>
> My problem is how to parse *all* hits of *one* query into a single new
> file. And this for all the queries I have in my BLAST output file.
>
> Or is it better the other way round; first to make fasta files with only
> single sequences inside and BLAST each file? But how can I automize that
> using Bioperl?
>
> I tried Bio::SearchIO but can only parse all queries and their
> respective hits in only one file...
> I think iteration is also necessary here, but I do not really know how
> to include that into Bio::SearchIO.
> Or do I have to use Module:Bio::Index::Blast?
>
> I can index a file (see below), but I have no idea what comes next...
>
> ###How I index a file...
>
> #!/usr/bin/perl -w
>
> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
>
> use Bio::Index::Fasta;
>
>
> $file_name = "8_to_BLAST_two_seq_index.fasta";
> $id = "48882";
> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> -write_flag => 1);
> $inx->make_index($file_name);
>
>
> Hopefully, you can give me at least hints what to look for.
>
> A big THANKS in advance!
>
> Cheers,
>
> Tim


From Russell.Smithies at agresearch.co.nz  Wed Nov 25 19:07:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 26 Nov 2009 08:07:26 +1300
Subject: [Bioperl-l] How to parse BLAST output - all hits of each query
 in new file
In-Reply-To: <4B0D6C24.2080308@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B63085701@exchsth.agresearch.co.nz>

Hi Tim,
Here's some code for a job I'm working on at the moment that contains all the bits you'll probably need.
It's extracting 2 species-specific databases from nr (based on tax ids), doing a blast, then parsing the results and creating a substitution matrix. I was initially using Bio::DB::Eutilities to query and retrieve sequences but I kept getting errors and time-outs from NCBI when pulling back large numbers of sequences.
It should give you a rough idea of how to run Bio::Tools::Run::StandAloneBlast, Bio::DB::Fasta and Bio::SearchIO.

Email me direct if you want further explanation as it's not well commented ;-)

Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz

=======================================

#!/usr/local/bin/perl

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::DB::Fasta;

use Storable;

# Parameters: <query> <subject> <number or percentage of searches>
# Percentage can be specified as either 20p, 20P or 20%
# So for 20% of rice sequences blasted against oil palm:
#    4530 51953 20p   (4530=rice,51953=oil_palm, 20p=20%)
# Or for 20 searches:
#      4530 51953 20
#
my ( $q, $s, $c ) = @ARGV;

my $nr = "/data/databases/flatfile/illuminati_blastdata/nr";
my $tax_file = "/data/anonftp/pub/mirror/taxonomy/gi_taxid_prot.dmp.gz";
my $tmp = "/tmp/tax";


my %stats      = ();
my $total_subs = 0;

my $min_hsp_len      = 0;
my $min_hsp_identity = 0;
my $num_searches     = $c || 10;
my $blast_e          = '1e-6';
my $count            = 0;

# check if all the fasta and blast files exist
# if not, extract new fasta and re-formatdb the database
foreach my $t ( $q, $s ) {
  foreach ( map { "$tmp/$t.$_" } qw(faa list phr pin psq) ) {
    unless ( -e $_ ) {
      print "Creating database for $t\n";
      &create_database($t);
      last;
    }
  }
}

my @params = (
               -database => "$tmp/$q",
               -program  => 'blastp',
               -e        => $blast_e,
               -outfile  => "$tmp/blast.out",
               -v        => '1',
               -b        => '1'
);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params) or die $!;

# load the query sequences into a db
# makes it easier to randomly access them
my $db = Bio::DB::Fasta->new( "$tmp", -glob => "$s.faa", -reindex => 1 );

my @ids      = $db->ids;
my $id_count = scalar @ids;
die "No sequences\n" unless $id_count;

# if a percentage is requested, calculate
# the required number of searches
if ( $num_searches =~ m/(\d+)[pP%]/ ) {
  $num_searches = int( ( $1 / 100 ) * $id_count );
  warn
"Searching random $1 percent ($num_searches) of $id_count sequences from taxid $q\n";
}

my $summary_file = "$tmp/".$$."_summary.txt";
open( OUT, ">", $summary_file ) or die $!;
print OUT
"#Summary of $num_searches random blast searches from taxid $q against taxid $s.\n";
print OUT "#Parameters used were:\n";
print OUT "#blast_e: $blast_e\n";
print OUT "#min_hsp_len: $min_hsp_len\n";
print OUT "#min_hsp_identity: $min_hsp_identity\n";
print OUT "\n";

while ( my $seq = $db->get_Seq_by_id( $ids[ rand(@ids) ] ) ) {
  next unless $seq;

  warn "Processing ", $seq->id, "\n";
  eval {
    my $blast_report = $factory->blastall($seq);
    sleep 5;
  };

  my $blast_in = new Bio::SearchIO( -format => "blast", -file => "$tmp/blast.out" );

  while ( my $result = $blast_in->next_result ) {
    if ( $result->num_hits <= 0 ) {
      warn "No hits for ", $result->query_accession, "\n";
      print OUT "No hits for ", $result->query_accession, "\n";
      next;
    }
    $count++;
    while ( my $hit = $result->next_hit ) {
      while ( my $hsp = $hit->next_hsp ) {
        warn sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );
        print OUT sprintf( "%s had %s hsp%s\n",
                      $result->query_accession, $hit->num_hsps,
                      $hit->num_hsps > 1 ? "s" : "" );

        # http://www.bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods
        if ( $hsp->length('total') > $min_hsp_len ) {
          if ( $hsp->percent_identity >= $min_hsp_identity ) {
            my @query_string = split '', $hsp->query_string;
            my @homol_string = split '', $hsp->homology_string;
            my @hit_string   = split '', $hsp->hit_string;
            for ( my $i = 0; $i <= $#query_string; $i++ ) {
              next unless $homol_string[$i] =~ /\+/;
              $stats{ $query_string[$i] }{ $hit_string[$i] }++;
              $total_subs++;
            }
          }
        }
      }
    }
  }
  # double quotes so that $tmp is interpolated into the path
  unlink "$tmp/blast.out" if -e "$tmp/blast.out";
  last if $count >= $num_searches;
}


# create summary frequency list
my %summary = ();
for my $query ( keys %stats ) {
  for my $hit ( keys %{ $stats{$query} } ) {
    $summary{"$query->$hit"} =
      sprintf( "%6f", $stats{$query}{$hit} / $total_subs );
  }
}

print OUT "\n";

# sort by descending frequencies and print to summary file
foreach my $k ( sort { $summary{$b} <=> $summary{$a} } keys %summary ) {
  print OUT "$k\t", $summary{$k}, "\n" unless $k =~ /TOTAL/;
}

print OUT "\n\n";

# print substitution matrix
my $i     = 0;
my @prots = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $sep   = "\t";

print OUT sprintf( "%7s %s", $_, $sep ) foreach ( "       ", @prots );
print OUT "\n";

foreach my $x (@prots) {
  print OUT sprintf( "%7s|%s", $prots[ $i++ ], $sep );
  foreach my $y (@prots) {
    my $val =
      defined( $stats{$x}{$y} )
      ? sprintf( "%0.6f", $stats{$x}{$y} / $total_subs )
      : "--------";
    print OUT sprintf( "%s%s", $val, $sep );
  }
  print OUT "\n";
}
close OUT;


open(IN, $summary_file) or die $!;
print $_ while(<IN>);
close IN;


# extract sequences from nr database based on taxid.
sub create_database {
  my $txid      = shift;
  my %hash      = ();
  my $gi_stored = "/tmp/gi.dat";

  if ( -e $gi_stored ) {
    %hash = %{ retrieve($gi_stored) };
  }
  else {
    open( TXID, "zcat $tax_file | " ) or die $!;
    while (<TXID>) {
      chomp;
      my ( $gi, $tx ) = split( "\t", $_ );
      push( @{ $hash{$tx} }, $gi );
    }
    close TXID;

    store( \%hash, $gi_stored );
  }

  my $txlist = "$tmp/$txid.list";
  my $txseq  = "$tmp/$txid.faa";
	
  # defined(@array) is deprecated; test for a non-empty list instead
  die "No sequences found for taxid $txid\n"
    unless $hash{$txid} && @{ $hash{$txid} };
  my $num_seqs = scalar( @{ $hash{$txid} } );
  warn "Found $num_seqs sequences for taxid $txid in $tax_file\n";

  open OUT, ">", $txlist or die $!;
  print OUT "$_\n" foreach ( @{ $hash{$txid} } );
  close OUT;

  my $cmd = "fastacmd -d $nr -i $txlist -t T -o $txseq 2>/dev/null";
  system $cmd;

  my $count = `grep -c '>' $txseq`;
  $count =~ s/\n//;
  warn "Could only extract $count sequences from $nr\n";

  $cmd = "formatdb -p T -i $tmp/$txid.faa -n $tmp/$txid -l $tmp/formatdb.log";
  system $cmd;

  $cmd = "fastacmd -d $tmp/$txid -I";
  system $cmd;

  warn "Check the formatdb.log for any errors\n";
}
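
The core of the script above is the tally of conservative substitutions, i.e. the alignment columns marked '+' in the BLAST midline. A minimal sketch of just that step in plain Perl, without BioPerl, might look like the following. The function name count_substitutions and the toy alignment strings are made up for illustration; in the script the three strings come from $hsp->query_string, $hsp->homology_string and $hsp->hit_string.

```perl
use strict;
use warnings;

# Count conservative substitutions ('+' in the BLAST midline) between an
# aligned query string and hit string. Returns (\%counts, $total).
# Hypothetical helper, not part of BioPerl.
sub count_substitutions {
    my ( $query, $midline, $hit ) = @_;
    my @q = split //, $query;
    my @m = split //, $midline;
    my @h = split //, $hit;
    my %counts;
    my $total = 0;
    for my $i ( 0 .. $#q ) {    # inclusive range, so the last column is visited
        next unless defined $m[$i] && $m[$i] eq '+';
        $counts{ $q[$i] }{ $h[$i] }++;
        $total++;
    }
    return ( \%counts, $total );
}

# Toy alignment (made-up data, not real BLAST output).
my ( $counts, $total ) = count_substitutions( 'MKVLAS', 'MK+LA+', 'MKILAT' );
for my $q ( sort keys %$counts ) {
    for my $h ( sort keys %{ $counts->{$q} } ) {
        printf "%s->%s\t%.6f\n", $q, $h, $counts->{$q}{$h} / $total;
    }
}
```

Dividing each count by the running total gives the same kind of frequency that the summary file reports.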


=======================================


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Tim
> Sent: Thursday, 26 November 2009 6:41 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in
> new file
> 
> Dear bioperl users,
> 
> I am a real newbie and have - maybe a very trivial - question.
> 
> I searched the mailing list archive and many howtos but I have not found
> a concrete answer to my problem. So hopefully you can help me :)
> 
> Background: I use the latest Bioperl version (installed it two weeks
> before).
> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> including different sequences, I get a BLAST output with many queries
> each having several hits / sbjcts.
> 
> My problem is how to parse *all* hits of *one* query into a single new
> file. And this for all the queries I have in my BLAST output file.
> 
> Or is it better the other way round; first to make fasta files with only
> single sequences inside and BLAST each file? But how can I automize that
> using Bioperl?
> 
> I tried Bio::SearchIO but can only parse all queries and their
> respective hits in only one file...
> I think iteration is also necessary here, but I do not really know how
> to include that into Bio::SearchIO.
> Or do I have to use Module:Bio::Index::Blast?
> 
> I can index a file (see below), but I have no idea what comes next...
> 
> ###How I index a file...
> 
> #!/usr/bin/perl -w
> 
> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> 
> use Bio::Index::Fasta;
> 
> 
> $file_name = "8_to_BLAST_two_seq_index.fasta";
> $id = "48882";
> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> -write_flag => 1);
> $inx->make_index($file_name);
> 
> 
> Hopefully, you can give me at least hints what to look for.
> 
> A big THANKS in advance!
> 
> Cheers,
> 
> Tim
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Wed Nov 25 19:21:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 25 Nov 2009 14:21:27 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <53DE480F205E42CE8D2B9421592AAF0E@NewLife>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
Message-ID: <815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>

whoops: change the following line:
my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );

to

my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );

(I always forget that...)
MAJ
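
The two ideas in this thread (collect hits in a hash of arrays keyed by query, then open one output file per key with a leading '>' for writing) can be sketched in plain Perl without BioPerl. This is only an illustration under stated assumptions: the (query, hit) pairs are made-up strings standing in for hit objects, and group_hits_by_query and write_hits_per_query are hypothetical helper names.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Group (query, hit) pairs by query name into a hash of array references.
sub group_hits_by_query {
    my @pairs = @_;
    my %hits_by_query;
    for my $pair (@pairs) {
        my ( $query, $hit ) = @$pair;
        push @{ $hits_by_query{$query} }, $hit;    # note the comma before $hit
    }
    return \%hits_by_query;
}

# Write one file per query; returns the list of paths written.
sub write_hits_per_query {
    my ( $hits_by_query, $dir ) = @_;
    my @files;
    for my $qid ( sort keys %$hits_by_query ) {
        my $path = "$dir/$qid.txt";
        # '>' opens the file for writing; without it the file is opened for reading
        open( my $out, '>', $path ) or die "Cannot write $path: $!";
        print {$out} "$_\n" for @{ $hits_by_query->{$qid} };
        close $out or die "Cannot close $path: $!";
        push @files, $path;
    }
    return @files;
}

my $dir     = tempdir( CLEANUP => 1 );
my $grouped = group_hits_by_query(
    [ 'query1', 'hitA' ], [ 'query2', 'hitB' ], [ 'query1', 'hitC' ] );
print "$_\n" for write_hits_per_query( $grouped, $dir );
```

The files are named after the hash key, so each query ends up with its own file in the chosen directory.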

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 25, 2009 1:20 PM
Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew 
file


> hey Tim--
>
> Sounds like you need to go about collecting your queries inside out:
>
> my %hits_by_query;
> for ($result->hits) {
>  push @{$hits_by_query{$hit->name}} $hit;
> }
>
> I believe now each hash element, keyed by the query name, will contain
> an arrayref to the set of hits assoc with that query.
> From here, I believe
>
> use Bio::Search::Result::BlastResult;
> use Bio::SearchIO;
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> will do what you want.
>
> hope this helps -
> Mark
>


From alden.huang at gmail.com  Thu Nov 26 10:54:30 2009
From: alden.huang at gmail.com (Alden Huang)
Date: Thu, 26 Nov 2009 02:54:30 -0800
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
Message-ID: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>

Hey rob,

Sorting Intolerant from Tolerant
http://sift.jcvi.org/

~alden

...a bit late, I know; I just read your post now while cleaning the inbox

On Fri, Nov 6, 2009 at 9:35 AM, Robert Bradbury
<robert.bradbury at gmail.com> wrote:
> Is there a function in the library (or has someone written one) that can
> take a genbank entry and determine which mutations are harmful?
>
> It would be used to produce a table summary of:
>  GENE          # SNP      # BadSNP
>
> One kind of gets this from NCBI if you lookup in the "GENE" db a gene name
> and then go to the "GeneView" on dbSNP page it has the information I want
> but largely in a graphical format while I simply want numbers I can dump
> into a spreadsheet.
>
> I don't think it would be hard, fetch the gene, run through the features for
> the SNP database, figure out whether they are good or bad SNPs, accumulate
> the statistics and dump it.  I think the functions available are flexible
> enough to do it but I can't believe nobody has already done it.  It could be
> a bit more complex in that one could do an analysis to see if the mutations
> are in a conserved domain or mutations that code for Cysteine or Methionine
> (or other potentially "critical" amino acids) but since "critical" is in the
> eye of the beholder there would have to be some kind of callback to a
> scoring function.
>
> Thanks,
> Robert
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From robert.bradbury at gmail.com  Thu Nov 26 11:27:50 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 06:27:50 -0500
Subject: [Bioperl-l] Function that determines serious mutations
In-Reply-To: <9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
References: <deaa866a0911060935q5e0364f5v9a296162c42fd0ca@mail.gmail.com>
	<9e408d720911260254r1e85169lb92d944d88a1880c@mail.gmail.com>
Message-ID: <deaa866a0911260327j5b57d16erfcbe5b996e1a6e64@mail.gmail.com>

On Thu, Nov 26, 2009 at 5:54 AM, Alden Huang <alden.huang at gmail.com> wrote:
>
> Sorting Intolerant from Tolerant
> http://sift.jcvi.org/
>
>
Ah yes, thank you very much.  This looks very much like a tool that can be
adapted for various uses.

Robert


From jason at bioperl.org  Thu Nov 26 17:16:17 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Nov 2009 09:16:17 -0800
Subject: [Bioperl-l] question about a Bio::Tree::Tree method
In-Reply-To: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
References: <30960443.966281259248778372.JavaMail.defaultUser@defaultHost>
Message-ID: <14F4B8C9-A1F4-436B-813F-50E139932D3D@bioperl.org>

Emilio - please ask your questions on the list - many people there can  
help answer questions.

get_nodes returns all the nodes in the tree, the options specify the  
order they are returned in.  Depending on your question the order  
probably won't matter so you can just call it without any arguments  
like in the examples and the HOWTO.

The documentation for the method says:
         Title   : get_nodes
         Usage   : my @nodes = $tree->get_nodes()
         Function: Return list of Bio::Tree::NodeI objects
         Returns : array of Bio::Tree::NodeI objects
         Args    : (named values) hash with one value
                   order => 'b|breadth' first order or
                            'd|depth' first order

So you can provide no arguments and get the default (breadth-first I  
believe) or you can specify
-order => 'd'
or
-order => 'depth'

to get the nodes in depth-first order.
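
To make the difference between the two orders concrete, here is a small plain-Perl sketch on a toy tree. It does not use Bio::Tree::Tree; the hash-of-children layout, the node names and the two helper functions are assumptions for illustration only, and the exact order BioPerl returns may differ in detail.

```perl
use strict;
use warnings;

# Toy tree as a hash of child lists (hypothetical node names, not BioPerl objects).
my %children = (
    root => [ 'A',  'B' ],
    A    => [ 'A1', 'A2' ],
    B    => ['B1'],
);

# Breadth-first: visit nodes level by level using a queue.
sub breadth_first {
    my ( $tree, $start ) = @_;
    my @queue = ($start);
    my @order;
    while (@queue) {
        my $node = shift @queue;
        push @order, $node;
        push @queue, @{ $tree->{$node} || [] };
    }
    return @order;
}

# Depth-first (pre-order): follow each branch down to its tips before moving on.
sub depth_first {
    my ( $tree, $node ) = @_;
    return ( $node, map { depth_first( $tree, $_ ) } @{ $tree->{$node} || [] } );
}

print "breadth: @{[ breadth_first( \%children, 'root' ) ]}\n";
print "depth:   @{[ depth_first( \%children, 'root' ) ]}\n";
```

Both calls return every node exactly once; only the sequence differs, which is why the order rarely matters when you simply want to loop over all nodes.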

-jason
On Nov 26, 2009, at 7:19 AM, miglio83 at libero.it wrote:

> Hi Jason,
> I'm Emilio Siena, a PhD student of the University of Perugia.
> I have
> a question about the method "get_nodes" of the  "Bio::Tree::Tree"  
> class.
> In
> particular I didn't understand which type of arguments it accepts  
> and in which
> format an argument should be given.
>
> Thank you in advance!
>
> Emilio

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Thu Nov 26 17:40:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 26 Nov 2009 12:40:45 -0500
Subject: [Bioperl-l] Bio::Assembly::IO::sam is alpha
Message-ID: <599F8BABCD2848EFA98FB24A4419674E@NewLife>

in bioperl-live/trunk with plenty pod; bravehearts can (please!) test on .bam files
cheers, MAJ


From mauricio at open-bio.org  Thu Nov 26 21:45:43 2009
From: mauricio at open-bio.org (Mauricio Herrera Cuadra)
Date: Thu, 26 Nov 2009 15:45:43 -0600
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <4B0EF707.6080202@open-bio.org>

Hi Jonathan,

Any chance it can be webcast? I'm sure it would attract a lot of 
remote attendees ;)

Regards,
Mauricio.


Jonathan Warren wrote:
> We are considering running a Distributed Annotation System workshop here 
> at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If 
> you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last year's (1st 
> day for beginners, 2nd for both beginners and advanced users, 3rd day 
> for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what 
> you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk


From robert.bradbury at gmail.com  Fri Nov 27 02:06:40 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Thu, 26 Nov 2009 21:06:40 -0500
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
Message-ID: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>

I'm currently running near my process limit and running sequence fetches
from swissprot (I've also had this happen with getting gi's from NCBI) and
am running out of processes about halfway through the set I'm trying to
fetch [1].

Now, is there someplace in the BioPerl documentation that documents where
one is supposed to wait() for defunct processes after each sequence fetch?
I'm encountering the problem both when the sequence fetches succeed and
when they fail.
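
For reference, the generic Perl pattern for collecting defunct children is a non-blocking waitpid loop. The sketch below is plain Perl and says nothing about where (or whether) BioPerl itself forks; reap_children is a hypothetical helper name, and the forked children here are dummies that exit immediately.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Reap any children that have already exited, without blocking.
# Returns the number of children collected.
sub reap_children {
    my $reaped = 0;
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $reaped++;
    }
    return $reaped;
}

# Demo: fork three short-lived children, then collect them.
for ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;    # child: exit immediately
}
sleep 1;                             # give the children time to exit
my $n = reap_children();
print "reaped $n child process(es)\n";
```

Calling such a helper after each fetch (or installing it as a SIGCHLD handler) keeps exited children from counting against the process limit.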

Thanks in advance.
Robert

1. This is due to a bug in chromium's use of flash that involves it leaving
many defunct processes that are uncollected and therefore counting towards
one's "process limit".


From kanzure at gmail.com  Fri Nov 27 02:12:46 2009
From: kanzure at gmail.com (Bryan Bishop)
Date: Thu, 26 Nov 2009 20:12:46 -0600
Subject: [Bioperl-l] BioPerl "guts" question regarding forked processes
In-Reply-To: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
References: <deaa866a0911261806g12211028w23cd540ae3652ff4@mail.gmail.com>
Message-ID: <55ad6af70911261812q583277d5l71df0d66e756f617@mail.gmail.com>

On Thu, Nov 26, 2009 at 8:06 PM, Robert Bradbury wrote:
> I'm currently running near my process limit and running sequence fetches
> from swissprot (I've also had this happen with getting gi's from NCBI) and
> am running out of processes about halfway through the set I'm trying to
> fetch [1].

Hey Robert, sorry for the off-topic question, but I was wondering if
you're the same Robert Bradbury from the extropy-chat list. Hi?

- Bryan
http://heybryan.org/
1 512 203 0507


From paolo.pavan at gmail.com  Fri Nov 27 11:35:03 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Fri, 27 Nov 2009 12:35:03 +0100
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
	Bio::Tools::Run::Cap3 usage question)
Message-ID: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>

Dear Florent,
Thank you for your kind answer and for your efforts spent in this module.
Since you are working on these topics, I would like to take the opportunity
to ask you a few questions, if you don't mind :-)
Some time ago I tried to work with BioPerl, loading the data from an ACE
file produced by Newbler. I needed to extract part of a contig as an
alignment of reads, and I thought I could do it with a slice() method, since
I saw that Bio::Assembly::Contig implements the Bio::AlignI interface.
Unfortunately, I realized that this interface is inherited but not
implemented.
I tried to hack it by adding a slice method that would act on a
Bio::Alignment created from the array of LocatableSeqs representing the
reads.

This is the question:
If I'm not mistaken (please correct me if I am), the Bio::Assembly::Contig
class stores read information in:
Bio::Assembly::Contigs->{_elem}{READ_NAME}{_feat}{
     _align_clipping:READ_NAME}
     _aligned_coord:READ_NAME}
     _quality_clipping:READ_NAME}

Each of these three features (_align_clipping, _aligned_coord,
_quality_clipping) contains a Bio::SeqFeature::Generic. Which of them is
the most suitable for the purpose described above, the slice method?
A follow-up question, if you will forgive the length of this message: the
purpose of these three features per read is not entirely clear to me. Can
you explain it?

Thank you very much for your time.
Bye bye,
Paolo


2009/11/24 Florent Angly <florent.angly at gmail.com>

> Hi Paolo,
>
> It turns out that there is no standard for what is to be passed to the
> Bio::Tools::Run wrappers and returned by them. I noticed the inconsistency
> between the assembly wrappers recently while implementing support for a new
> wrapper. I implemented initial support for additional de novo assembly
> programs in BioPerl (454 Newbler and Minimo) a couple of weeks ago and Mark
> Jensen added support for Maq, a program that assembles reads against a
> reference. In the process, all the assembly wrappers were changed to take
> the same type of input data (a FASTA sequence or an array reference of
> sequence objects) and return one of the following:
>   * a Bio::Assembly::Scaffold object (the default), or
>   * a Bio::Assembly::IO object, or
>   * the name of a file for the output of the assembler
> Use the out_type method to set up which output you want, e.g.:
>   $factory->out_type('Bio::Assembly::IO');
> or
>   $factory->out_type('cap3_results.ace');
> You'll have to use the code in the bioperl-run subversion if you want to
> use these new features.
>
> Cheers,
>
> Florent
>
>
>
>
> Paolo Pavan wrote:
>
>> Dear,
>> I'm confused about the proper usage of the module Bio::Tools::Run::Cap3.
>> As documented in the pod, the run(@seqs) method returns the cap3 report
>> file
>> while I expect to return a Bio::Assembly object, consistently with other
>> Bio::Tools::Run classes.
>> However, I went around this by getting from the factory object the
>> location
>> and the names of the temp output files (actually accessing a private
>> property, although) and reading them via the Assembly::IO system.
>> I was just wondering what the properly designed way to do this job is.
>>
>> Thank you for lighting the way!
>> Paolo
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>
>


From jw12 at sanger.ac.uk  Thu Nov 26 14:57:35 2009
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 26 Nov 2009 14:57:35 +0000
Subject: [Bioperl-l] DAS workshop 7th-9th April 2010
Message-ID: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>

We are considering running a Distributed Annotation System workshop  
here at the Sanger/EBI in the UK subject to decent demand.
The workshop will be held from Wednesday 7th-Friday 9th April 2010. If  
you would be interested in attending either to present or just take part
then please email me jw12 at sanger.ac.uk

The format of the workshop is likely to be similar to last year's (1st  
day for beginners, 2nd for both beginners and advanced users, 3rd day  
for advanced), information for which can be found here:
http://www.dasregistry.org/course.jsp

If you would like to present then please send a short summary of what  
you would like to talk about.

Thanks

Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
jw12 at sanger.ac.uk


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From timbourine81 at googlemail.com  Thu Nov 26 16:02:30 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Thu, 26 Nov 2009 17:02:30 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <4B0EA44D.2050507@gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
Message-ID: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>

oops, sent too early...

Hey Mark,

thanks for the answer. But I am still struggling, especially with where to
put your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; without the comma, there was the error:
###   Scalar found where operator expected at
###   12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
###        (Missing operator before  $hit?)
###   Useless use of push with no values at
###   12_BLAST_two_sequence_each_query_one_file.PL line 7.
###   syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7,
###   near "} $hit"
###   BEGIN not safe after errors--compilation aborted at
###   12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
### here, this error appears every time: Name "main::result" used only once:
###   possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
### error: Can't call method "hits" on an undefined value at
###   12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Do I need to use a module? Are the commands here at the right
### position? Errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

### Where are the files stored? What is their name? Sorry, but I cannot
### figure that out :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999


From rtbio.2009 at gmail.com  Sat Nov 28 07:53:43 2009
From: rtbio.2009 at gmail.com (Roopa Raghuveer)
Date: Sat, 28 Nov 2009 08:53:43 +0100
Subject: [Bioperl-l] Linking of two cgi scripts
Message-ID: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>

hello everyone,

I have a small question.

I would like to link two cgi scripts i.e.,

I have an input sequence being entered in a text area

ex:->gi|at442323|...
ATGCCCCCTTGGAACCAAAAAAA....

So I would like to compare this with the query sequences. These query
sequences would come from a BLAST script in the module blast.pm.
So once I enter the input sequence and request a BLAST search using the
submit button, my request should go to a program which performs the BLAST
search. After this, the sequences obtained from BLAST have to be returned to
a program, Roopa.pm, which compares the input sequence with the sequences
obtained from BLAST.

But I am unable to provide this link between the CGI scripts (i.e., one
script to run BLAST, the other script to compare the sequences and send the
results to the browser).

Could any one help me in this regard?

Regards,
Roopa.


From s.denaxas at gmail.com  Sat Nov 28 10:56:15 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Sat, 28 Nov 2009 10:56:15 +0000
Subject: [Bioperl-l] Linking of two cgi scripts
In-Reply-To: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
References: <c7cac1600911272353u70019256rf394078759660266@mail.gmail.com>
Message-ID: <bba689ec0911280256u602b8f9dpffe9483189c56536@mail.gmail.com>

Hello,

Why do they both have to be CGI scripts? Can't all the processing
happen server side, i.e. both BLAST and comparison of returned
results?

If that is strictly a requirement, you could:

a) get input from user on script A, i.e. the input sequence
b) do an HTTP request from the CGI to the other script B using LWP::UserAgent
c) get results from script B, pass on to comparison module
d) return results to user

As I said, this will be clunky, so either do everything in one go or
consider AJAX.
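
A rough sketch of the "everything in one go" option: a single request handler that parses the text-area input, runs the search, and compares the results, all server side. Everything here is hypothetical: run_blast is a stand-in that returns canned, made-up sequences instead of calling BLAST, and percent_identity is a naive position-by-position comparison rather than whatever the real comparison module does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the BLAST step: returns a list of hit sequences.
sub run_blast {
    my ($query) = @_;
    return ( 'ATGCCCCCTTGG', 'ATGCACCCTTGA' );    # canned, made-up results
}

# Hypothetical comparison step: percent identity over the shared length.
sub percent_identity {
    my ( $x, $y ) = @_;
    my $len = length($x) < length($y) ? length($x) : length($y);
    return 0 unless $len;
    my $same = grep { substr( $x, $_, 1 ) eq substr( $y, $_, 1 ) } 0 .. $len - 1;
    return 100 * $same / $len;
}

# Strip the FASTA header line and any whitespace from the text-area input.
sub parse_fasta_input {
    my ($text) = @_;
    my $seq = join '', grep { !/^>/ } split /\r?\n/, $text;
    $seq =~ s/\s+//g;
    return uc $seq;
}

# One handler doing both steps on the server, in order.
sub handle_request {
    my ($text) = @_;
    my $query = parse_fasta_input($text);
    return map { sprintf( "%s\t%.1f", $_, percent_identity( $query, $_ ) ) }
      run_blast($query);
}

print "$_\n" for handle_request(">gi|example\nATGCCCCCTTGG\n");
```

In a real CGI script the input would come from the form parameter and the output would be wrapped in HTML, but the control flow stays the same: no second HTTP round trip is needed.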

hope this helps
Spiros



From maj at fortinbras.us  Sat Nov 28 16:23:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 11:23:53 -0500
Subject: [Bioperl-l] Run wrappers for BWA and Samtools
Message-ID: <7F56A6EEEB0E4EE291D5340F27DF7D3A@NewLife>

Hi All, 

Run wrappers for the bwa assembler and the samtools suite
are now available as beta in the bioperl-run/trunk. The bwa 
wrapper allows you to run a canned assembly pipeline, or 
to execute individual bwa components. The assembly pipeline
can return a Bio::Assembly::Scaffold object via the new 
Bio::Assembly::IO::sam module in bioperl-live/trunk
(this requires lstein's Bio::DB::Sam, from CPAN). Details at

http://www.bioperl.org/wiki/HOWTO:Short-read_assemblies_with_BWA

and, of course, in the pod. 

Cheers, 
MAJ


From maj at fortinbras.us  Sun Nov 29 02:55:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 21:55:42 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
Message-ID: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>

Hi Tim--
There's a bug in my code; should be
for my $hit ($result->hits) {
...
}
and you're right about the comma. My bad.

But I don't think you need this-- you're already looping over your
query sequences and doing blastn on each one. So in the middle of
your loop, you can simply write the blast result that you got:

my $blio = Bio::SearchIO->new( -file => 
">".$query->id.".bls", -format=>"blast" );
$blio->write_result($result);

and forget about the foreach my $qid loop entirely.

The files should show up in the directory from which you're
running the script.
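
A slightly fuller sketch of the per-query output step is below. It makes
two assumptions that are not in the snippet above: that the query ID may
contain characters (such as "|") that are better replaced in a file name,
and that the installed BioPerl wants an explicit result writer, for which
Bio::SearchIO::Writer::TextResultWriter is used. The sub names are made up
for illustration.

```perl
use strict;
use warnings;

# Build a file name from a query ID; characters that are awkward in file
# names (for example the '|' in NCBI-style IDs) are replaced with '_'.
sub result_filename {
    my ($query_id) = @_;
    (my $safe = $query_id) =~ s/[^\w.-]/_/g;
    return "$safe.bls";
}

# Write one parsed result object to its own file and return the file name.
# BioPerl is loaded lazily so result_filename() can be tried on its own.
sub write_result_file {
    my ($result, $query_id) = @_;
    require Bio::SearchIO;
    require Bio::SearchIO::Writer::TextResultWriter;
    my $file   = result_filename($query_id);
    my $writer = Bio::SearchIO::Writer::TextResultWriter->new();
    my $out    = Bio::SearchIO->new(
        -writer => $writer,
        -file   => ">$file",    # leading '>' opens the file for writing
    );
    $out->write_result($result);
    return $file;
}
```

Inside the loop over query sequences, this would be called once per result
object taken from the BLAST report, with $query->id as the second argument.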
cheers, MAJ


----- Original Message ----- 
From: "Tim Koehler" <timbourine81 at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, November 26, 2009 11:02 AM
Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of eachqueryinnew 
file


ups, sent too early...

Hey Mark,

thanks for the answer. But I am still struggling, especially with where to
put your code.

Here is the code I have so far:

#!/usr/bin/perl -w

### should I put your code here as push is a perl command?
my %hits_by_query;
for ($result->hits) {
### I inserted a comma after name}}; if there is no comma, there was the
error: Scalar found where operator expected at
12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
###        (Missing operator before  $hit?)
###Useless use of push with no values at
12_BLAST_two_sequence_each_query_one_file.PL line 7.
###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near
"} $hit"
###BEGIN not safe after errors--compilation aborted at
12_BLAST_two_sequence_each_query_one_file.PL line 13.
 push @{$hits_by_query{$hit->name}}, $hit;
###here, every time this error appears: Name "main::result" used only once:
possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
###error: Can't call method "hits" on an undefined value at
12_BLAST_two_sequence_each_query_one_file.PL line 5.
}


use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;
use Bio::Search::Result::BlastResult;

my $Seq_in = Bio::SeqIO->new (
-file =>
"/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
-format => 'fasta'
);
while (my $query = $Seq_in->next_seq()) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
'program' => 'blastn',
'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
_READMETHOD => "Blast"
);

my $blast_report = $factory->blastall($query);

### Should I need to use a module? are the commands here at the right
position? errors, e.g., Global symbol "$hit" requires explicit package name
#my %hits_by_query;
#for ($result->hits) {
### inserted comma after name}}
# push @{$hits_by_query{$hit->name}}, $hit;
#}

foreach my $qid ( keys %hits_by_query ) {
 my $result = Bio::Search::Result::BlastResult->new();
 $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
 my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
 $blio->write_result($result);
}

###where are the files stored? what is their name. Sorry, but I cannot get
behind that :(

while( my $result = $blast_report->next_result ) {
  ## $result is a Bio::Search::Result::ResultI compliant object
  while( my $hit = $result->next_hit ) {
   ## $hit is a Bio::Search::Hit::HitI compliant object
   while( my $hsp = $hit->next_hsp ) {
    ## $hsp is a Bio::Search::HSP::HSPI compliant object
    if( $hsp->length('total') > 50 ) {
     if ( $hsp->percent_identity >= 75 ) {
     print  "Query= ",        $result->query_name,
        "Hit= ",        $hit->name,
            "Length= ",     $hsp->length('total'),
            "Percent_id= ", $hsp->percent_identity,
        "Subject=",        $hsp->hit_string,"\n";
     }
    }
   }
  }
}
}

Again, a big thanks in advance :)

All the best,

Tim


On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

> Hey Mark,
>
> thanks for the answer
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > queryinnew file
> >
> >
> >> hey Tim--
> >>
> >> Sound like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >>> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast'
> );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query innew file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not
> found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> before).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round; first to make fasta files with
> only
> >>> single sequences inside and BLAST each file? But how can I automize
> that
> >>> using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >
>
> Tim Köhler
MPI for Terrestrial Microbiology
Karl-von-Frisch-Straße
D-35043 Marburg / Germany

Email: koehlerd at mpi-marburg.mpg.de
Phone: +49 6421 178-740
Fax:   +49 6421 178-999



From maj at fortinbras.us  Sun Nov 29 03:32:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 28 Nov 2009 22:32:42 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
Message-ID: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>

The HOWTOs appear to have a more restrictive copyright
than FDL-- in particular, the blurb at the bottom of the 
HOWTO page asks users to use the documents for personal 
use only. I'm for this; I think we should therefore have some 
explicit license for these that specifies this kind of restriction, 
and then express that on each howto and in BioPerl:Copyright.
Any thoughts on the right license and whether this is a good plan?
MAJ


From florent.angly at gmail.com  Sun Nov 29 03:47:45 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 28 Nov 2009 19:47:45 -0800
Subject: [Bioperl-l] More general Bio::Assembly::Contig question (was
 Bio::Tools::Run::Cap3 usage question)
In-Reply-To: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
References: <56be91b60911270335s3a50ab0cpb03aabb6660f81dc@mail.gmail.com>
Message-ID: <4B11EEE1.8070907@gmail.com>

Hi Paolo,

The aligned reads of a contig are stored in 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_seq}. To implement a slice() 
method, you could retrieve the reads using get_seq_ids(), 
get_seq_by_name() or get_seq_by_pos(). To retrieve the position of an 
aligned read in the contig, use get_seq_coord(), which returns a 
Bio::SeqFeature::Generic object (from 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_aligned_coord:READ_NAME}) 
on which you can call the start() and end() methods.
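
As a rough sketch of the first step of such a slice(), the sub below
collects the names of the reads whose aligned coordinates overlap a window
of the contig. It only relies on the three methods named above; the sub
name and the overlap rule are assumptions for illustration, and trimming
each read to the window is left out.

```perl
use strict;
use warnings;

# Return the names of reads whose aligned coordinates overlap the window
# [$from, $to], given in contig coordinates. $contig is assumed to behave
# like a Bio::Assembly::Contig: get_seq_ids(), get_seq_by_name() and
# get_seq_coord() are the methods mentioned in the text.
sub reads_in_window {
    my ($contig, $from, $to) = @_;
    my @names;
    for my $id ($contig->get_seq_ids) {
        my $read  = $contig->get_seq_by_name($id) or next;
        my $coord = $contig->get_seq_coord($read) or next;
        # Order the two ends so the test below does not depend on strand.
        my ($start, $end) = sort { $a <=> $b } ($coord->start, $coord->end);
        push @names, $id if $start <= $to && $end >= $from;
    }
    return @names;
}
```

For example, reads_in_window($contig, 100, 250) would list every read
touching contig positions 100 to 250.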

I'm not entirely sure what 
Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{_align_clipping:READ_NAME} 
and {_quality_clipping:READ_NAME} are. I believe that they represent the 
clear range of the read/contig.

Hope it helps,

Florent


Paolo Pavan wrote:
> Dear Florent,
> Thank you for your kind answer and for your efforts spent in this module.
> Since you are working on these topics, I would like to seize the day 
> and ask you some questions that I have in mind, if you 
> agree, of course :-)
> Some time ago I tried to work with BioPerl, loading the data from an 
> ACE file produced by Newbler; I needed to extract part of the 
> contig as an alignment of reads, and I thought to do it with a slice() 
> method, since I saw that Bio::Assembly::Contig implements the Bio::AlignI 
> interface. Unfortunately, I realized that this interface is inherited 
> but not implemented.
> I tried to hack it by adding a slice method which would act on a 
> Bio::Alignment created from the array of LocatableSeqs representing 
> the reads.
>
> This is the question:
> If I'm not wrong (please correct me if I am), the Bio::Assembly::Contig 
> class stores read information in:
> Bio::Assembly::Contig->{_elem}{READ_NAME}{_feat}{
>      _align_clipping:READ_NAME}
>      _aligned_coord:READ_NAME}
>      _quality_clipping:READ_NAME}
>
> Each of these 3 features (_align_clipping, _aligned_coord, 
> _quality_clipping) contains a Bio::SeqFeature::Generic. Which of them 
> is most suitable for the purpose described above, the slice method?
> One more question, if you will excuse me for going on so long, follows 
> from the previous one: the purpose of these 3 features per read is not 
> perfectly clear to me. Can you explain it?
>
> Thank you very much for the time you spend on this.
> Bye bye,
> Paolo


From bimber at wisc.edu  Sun Nov 29 05:31:25 2009
From: bimber at wisc.edu (Ben Bimber)
Date: Sat, 28 Nov 2009 23:31:25 -0600
Subject: [Bioperl-l] using bioperl to compare sequences
Message-ID: <9f985cdc0911282131l350bc525gd9ad4717c101ac63@mail.gmail.com>

Hello,

I have a couple of years of programming experience, but am reasonably new
to perl and extremely new to bioperl.  I have been reading through the
bioperl documentation and am trying to understand the best way to
approach a particular problem.  I'm hoping someone could offer some
tips and point me in the right direction.  If someone has solved this
sort of problem before, I'd prefer not to reinvent things.  Here's
what I'm trying to do:

- Our lab generates mRNA sequence data, consisting of alleles of a given
  gene or genes.
- I want to compare each of these sequences against a reference using
  BLAST or clustalw (will need the ability to choose at run time).
- Take the result of this alignment, then record positions of difference
  between the experimental sequence and reference sequence (SNPs).
- Translate the corresponding AA change(s) associated with each SNP.
  There can be overlapping ORFs.
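
To make the last two steps concrete, here is a minimal plain-Perl sketch
(not BioPerl's own API) that records substitutions between two sequences
and the residue change for each. It assumes the two sequences are already
aligned without gaps, have equal length, and start in reading frame 1;
overlapping ORFs would mean repeating the codon lookup for each frame. The
sub name and the record layout are made up for illustration.

```perl
use strict;
use warnings;

# Standard genetic code, with codons ordered T, C, A, G at each position.
my @BASES = qw(T C A G);
my $AMINO = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %CODON;
{
    my $i = 0;
    for my $b1 (@BASES) {
        for my $b2 (@BASES) {
            for my $b3 (@BASES) {
                $CODON{"$b1$b2$b3"} = substr($AMINO, $i++, 1);
            }
        }
    }
}

# Compare a reference and an experimental sequence and return one record
# per substitution: 1-based position, both bases and both residues.
# Codons that are incomplete or contain other symbols are reported as 'X'.
sub find_snps {
    my ($ref, $alt) = map { uc($_) } @_;
    die "sequences must have equal length\n"
        unless length($ref) == length($alt);
    my @snps;
    for my $i (0 .. length($ref) - 1) {
        my $ref_base = substr($ref, $i, 1);
        my $alt_base = substr($alt, $i, 1);
        next if $ref_base eq $alt_base;
        my $codon_start = $i - $i % 3;    # first base of the codon in frame 1
        push @snps, {
            pos    => $i + 1,
            ref    => $ref_base,
            alt    => $alt_base,
            ref_aa => $CODON{ substr($ref, $codon_start, 3) } // 'X',
            alt_aa => $CODON{ substr($alt, $codon_start, 3) } // 'X',
        };
    }
    return @snps;
}
```

With real alignments from BLAST or clustalw, gap columns would have to be
handled before a comparison like this, and the hand-written codon table
could probably be replaced by BioPerl's own translation methods.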

I see that bioperl has modules for BLAST and clustal.  I've also been
looking at the modules under variation.  I haven't fully wrapped my
head around them, but they look to be what I'd use for SNP detection.

Has anyone written code to perform similar things and, if so, would
you be willing to share specific examples?  Anything concrete to see
exactly how these modules operate would be extremely helpful.

Thanks in advance for any tips or help.


From jason at bioperl.org  Sun Nov 29 15:54:53 2009
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 29 Nov 2009 07:54:53 -0800
Subject: [Bioperl-l] How to parse BLAST output - all hits of
	eachqueryinnew file
In-Reply-To: <21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
References: <4B0D6C24.2080308@gmail.com><53DE480F205E42CE8D2B9421592AAF0E@NewLife><815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife><4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<21BFD947CEEF43CCAC8AFFDB7A064A49@NewLife>
Message-ID: <897A8DB4-AF29-4601-A1E5-9A04D9D8C151@bioperl.org>

or
while( my $hit = $result->next_hit ) {
}
On Nov 28, 2009, at 6:55 PM, Mark A. Jensen wrote:

> Hi Tim--
> There's a bug in my code; should be
> for my $hit ($result->hits) {
> ...
> }
> and you're right about the comma. My bad.
>
> But I don't think you need this-- you're already looping over your
> query sequences and doing blastn on each one. So in the middle of
> your loop, you can simply write the blast result that you got:
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls",
>                                -format=>"blast" );
> $blio->write_result($result);
>
> and forget about the foreach my $qid loop entirely.
>
> The files should show up in the directory from which you're
> running the script.
> cheers, MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From suzi at berkeleybop.org  Mon Nov 30 04:03:09 2009
From: suzi at berkeleybop.org (Suzanna Lewis)
Date: Sun, 29 Nov 2009 20:03:09 -0800
Subject: [Bioperl-l] [DAS] DAS workshop 7th-9th April 2010
In-Reply-To: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
References: <F30A9ED7-41E9-4833-A094-FDF0893E0F92@sanger.ac.uk>
Message-ID: <3AD3C819-4BAA-4D90-B141-9611F48C5CAD@berkeleybop.org>

I/we (Gregg) would be interested in attending. We'd present an update on the collaborative, web-based version of Apollo. We will be working with Ian Holmes and Mitch Skinner using JBrowse for basic display.

-S


On Nov 26, 2009, at 6:57 AM, Jonathan Warren wrote:

> We are considering running a Distributed Annotation System workshop here at the Sanger/EBI in the UK subject to decent demand.
> The workshop will be held from Wednesday 7th-Friday 9th April 2010. If you would be interested in attending either to present or just take part
> then please email me jw12 at sanger.ac.uk
> 
> The format of the workshop is likely to be similar to last years (1st day for beginners, 2nd for both beginners and advanced users, 3rd day for advanced), information for which can be found here:
> http://www.dasregistry.org/course.jsp
> 
> If you would like to present then please send a short summary of what you would like to talk about.
> 
> Thanks
> 
> Jonathan.
> 
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> 
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
> 


From maj at fortinbras.us  Mon Nov 30 14:31:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 09:31:27 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
Message-ID: <513F1C824EF84974993A76F0CC719CDF@NewLife>

Well, it has a history, Jason's point. So the question could
be: "is this still a valid issue?" A while back, a user on the wiki,
with natural and good intentions, removed the authorship and revision
info from a couple of the HOWTOs; that is more wiki-like,
after all. But Chris had some objections to that, which I
seconded, mainly on the basis of the special status that
seems implied by the copyright note on the HOWTO
page. I also think that the nature of a HOWTO is somewhat
different from other info on the site: developers themselves
put a lot of time into explaining how to use their modules, and
in this world where devs get paid by recognition, it is a reasonable
thing to allow this extra horn-tooting. Now, that is a policy
that could be completely separable from the issue of copyright.
However, devs may also get paid by using their materials in teaching
seminars. The dilemma is that people who like to use the
wiki are people who like to share, and so it feels unnatural to
withhold from the community the materials they develop, but
people who like to share also like to eat and wear shoes...
so I'm interested in everyone's thoughts about it.
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" 
<jason.stajich at ucr.edu>; "bioperl List" <bioperl-l at bioperl.org>
Sent: Monday, November 30, 2009 9:16 AM
Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki


> Mark,
>
> Let me ask you a question, and don't take this question as an implicit 
> criticism of your suggestion, it is not. Why would you want this more 
> restrictive copyright?
>
> Brian O.
>
> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>
>> The HOWTOs appear to have a more restrictive copyright
>> than FDL-- in particular, the blurb at the bottom of the
>> HOWTO page asks users to use the documents for personal
>> use only. I'm for this; I think we should therefore have some
>> explicit license for these that specifies this kind of restriction,
>> and then express that on each howto and in BioPerl:Copyright.
>> Any thoughts on the right license and whether this is a good plan?
>> MAJ
>
>
> 


From bosborne11 at verizon.net  Mon Nov 30 15:15:32 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 10:15:32 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <513F1C824EF84974993A76F0CC719CDF@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
	<81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>
	<513F1C824EF84974993A76F0CC719CDF@NewLife>
Message-ID: <54671455-A02C-4139-8C39-AC17B50D5CE6@verizon.net>

Mark,

I have no objection to a more restrictive copyright, and I also have  
no objection to using FDL, or things like it.

Brian O.

On Nov 30, 2009, at 9:31 AM, Mark A. Jensen wrote:

> Well, it has a history, Jason's point. So the question could
> be: "is this still a valid issue"? A while back, a user on the wiki,
> with natural and good intentions, removed the authorship and revision
> info from a couple of the HOWTOs; it is more wiki-like,
> after all. But Chris had some objections to that, which I
> seconded, mainly on the basis of the special status that
> seems implied by the copyright note on the HOWTO
> page. I also think that the nature of the howto is somewhat
> different from other info on the site -- that developers themselves
> put a lot of time in to explaining how to use their modules, and
> that in this world where devs get paid by recognition, it is a  
> reasonable
> thing to allow this extra horn-tooting. Now, that is a policy
> that could be completely separable from the issue of copyright.
> However, devs may also get paid by using their materials in teaching
> seminars. The dilemma would be that people who like to use the
> wiki are people who like to share, and so it feels unnatural to
> withhold from the community the materials they develop,  but
> people who like to share also like to eat and wear shoes...
> so I'm interested in everyone's thoughts about it.
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net 
> >
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Chris Fields" <cjfields at illinois.edu>; "Jason Stajich" <jason.stajich at ucr.edu 
> >; "bioperl List" <bioperl-l at bioperl.org>
> Sent: Monday, November 30, 2009 9:16 AM
> Subject: Re: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
>
>
>> Mark,
>>
>> Let me ask you a question, and don't take this question as an
>> implicit criticism of your suggestion; it is not. Why would you
>> want this more restrictive copyright?
>>
>> Brian O.
>>
>> On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:
>>
>>> The HOWTOs appear to have a more restrictive copyright
>>> than FDL-- in particular, the blurb at the bottom of the
>>> HOWTO page asks users to use the documents for personal
>>> use only. I'm for this; I think we should therefore have some
>>> explicit license for these that specifies this kind of restriction,
>>> and then express that on each howto and in BioPerl:Copyright.
>>> Any thoughts on the right license and whether this is a good plan?
>>> MAJ
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>


From bosborne11 at verizon.net  Mon Nov 30 14:16:07 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 30 Nov 2009 09:16:07 -0500
Subject: [Bioperl-l] HOWTO copyright policy vs FDL on wiki
In-Reply-To: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
References: <9EC73CA501BD45BA912F2D77954D6CD7@NewLife>
Message-ID: <81B3C4A1-9F14-4FF9-A4AF-F7E90817A2F1@verizon.net>

Mark,

Let me ask you a question, and don't take this question as an implicit
criticism of your suggestion; it is not. Why would you want this more
restrictive copyright?

Brian O.

On Nov 28, 2009, at 10:32 PM, Mark A. Jensen wrote:

> The HOWTOs appear to have a more restrictive copyright
> than FDL-- in particular, the blurb at the bottom of the
> HOWTO page asks users to use the documents for personal
> use only. I'm for this; I think we should therefore have some
> explicit license for these that specifies this kind of restriction,
> and then express that on each howto and in BioPerl:Copyright.
> Any thoughts on the right license and whether this is a good plan?
> MAJ


From maj at fortinbras.us  Mon Nov 30 17:41:44 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 30 Nov 2009 12:41:44 -0500
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
	<c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>
Message-ID: <8C288FEF9CEB4055B0CDD19267FBA26C@NewLife>

thanks Tim! corrected (I hope) in r16432... 
MAJ
  ----- Original Message ----- 
  From: Tim Koehler 
  To: Smithies, Russell 
  Cc: Mark A. Jensen ; bioperl-l at lists.open-bio.org 
  Sent: Monday, November 30, 2009 12:23 PM
  Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


  Hello everybody,

  thanks a lot for the overwhelming number of answers! These code versions are all different flavors, and all of them worked.

  For me the code added below works best. But I think I found a bug in ...Bio/SearchIO/blast.pm. 
  There the DEFAULT_BLAST_... variable is set to Bio::Search::Writer::HitTableWriter instead of Bio::SearchIO::Writer::HitTableWriter. I also changed this variable to HTMLResultWriter and others.

  So again: THANKS for the support!

  Cheers, 
  Tim

  #!/usr/bin/perl -w

  use strict;

  use Bio::Tools::Run::StandAloneBlast;

  use Bio::SeqIO;

  use Bio::SearchIO;

  ### add here the writer you want
  use Bio::SearchIO::Writer::HitTableWriter;

  use Bio::Search::Result::BlastResult;

   
  use Data::Dumper;

   
  my $Seq_in = Bio::SeqIO->new( -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",

                                -format => "fasta" );

   
  while ( my $query = $Seq_in->next_seq() ) {

         warn "Processing ",$query->id, "\n";

    my $factory =

      Bio::Tools::Run::StandAloneBlast->new(

                   program  => "blastn",

                   database => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",

                   _READMETHOD => "Blast"

      );

   
    my $blast_report = $factory->blastall($query);

    sleep 5;

   
    # just write the result we got for this query into a 

     #new blast-formatted file...named after the id of the query seq...  

    my $result = $blast_report->next_result;

    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;

    $blio->write_result($result);

   
    # below, just looking at the current blast result

  ###this does not appear in the output files

    while ( my $result = $blast_report->next_result ) {

      ## $result is a Bio::Search::Result::ResultI compliant object

      while ( my $hit = $result->next_hit ) {

        ## $hit is a Bio::Search::Hit::HitI compliant object

        while ( my $hsp = $hit->next_hsp ) {

          ## $hsp is a Bio::Search::HSP::HSPI compliant object

          if ( $hsp->length('total') > 50 ) {

            if ( $hsp->percent_identity >= 75 ) {

              print "Query= ", $result->query_name,

                "Hit= ",        $hit->name,

                "Length= ",     $hsp->length('total'),

                "Percent_id= ", $hsp->percent_identity,

                "Subject=",     $hsp->hit_string, "\n";

            }

          }

        }

      }

    }

  }

   
  On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:

    Changed it to a generic result and added a writer and it seems to work:


      foreach my $qid ( keys %hits_by_query ) {

        warn "qid = $qid\n";

        my $res = Bio::Search::Result::GenericResult->new(-algorithm => "blastn") or die $!;

       # print Dumper $res;

        foreach my $h ( @{ $hits_by_query{$qid} } ){

                         warn "adding hit ", $h->name, "\n";

                         $res->add_hit($h) if defined($h);

                               }

        my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();

        my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file => ">$qid\.bls\.html", -format => "blast" ) or die $!;

        $blio->write_result($res);

      }
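A note on the -file values used in this thread: BioPerl's I/O classes follow Perl's two-argument open convention, so a leading ">" in the filename string requests write mode. The sketch below illustrates only that convention, using core Perl and a temporary file; the query id and file contents are made-up placeholders and no BioPerl call is involved.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $qid  = 'query1';                 # hypothetical query id
my $path = "$dir/$qid.bls";

# Two-argument open: a leading ">" selects write mode.
open( my $out, ">$path" ) or die "cannot write $path: $!";
print {$out} "placeholder report\n";
close $out;

# Without the ">" the same string means "open for reading".
open( my $in, $path ) or die "cannot read $path: $!";
my $line = <$in>;
close $in;
print $line;
```

The backslashes in ">$qid\.bls\.html" are harmless but not needed: inside a double-quoted string, "\." is just ".".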


    From: Mark A. Jensen [mailto:maj at fortinbras.us] 
    Sent: Monday, 30 November 2009 10:19 a.m.
    To: Smithies, Russell; 'Tim Koehler'


    Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


    My thought here was that since Tim's already going one at a time thru

    his queries, my scrap was not really necessary: 


    use strict;

    use Bio::Tools::Run::StandAloneBlast;

    use Bio::SeqIO;

    use Bio::SearchIO;

    use Bio::Search::Result::BlastResult;


    use Data::Dumper;


    my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",

                                  -format => "fasta" );


    while ( my $query = $Seq_in->next_seq() ) {

           warn "Processing ",$query->id, "\n";

      my $factory =

        Bio::Tools::Run::StandAloneBlast->new(

                     program  => "blastn",

                     database => "/data/databases/flatfile/illuminati_blastdata/nt",

                     _READMETHOD => "Blast"

        );


      my $blast_report = $factory->blastall($query);

      sleep 5;


      # just write the result we got for this query into a 

       #new blast-formatted file...named after the id of the query seq...  

     my $result = $blast_report->next_result;

    my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;

      $blio->write_result($result);


      # below, just looking at the current blast result

      while ( my $result = $blast_report->next_result ) {

        ## $result is a Bio::Search::Result::ResultI compliant object

        while ( my $hit = $result->next_hit ) {

          ## $hit is a Bio::Search::Hit::HitI compliant object

          while ( my $hsp = $hit->next_hsp ) {

            ## $hsp is a Bio::Search::HSP::HSPI compliant object

            if ( $hsp->length('total') > 50 ) {

              if ( $hsp->percent_identity >= 75 ) {

                print "Query= ", $result->query_name,

                  "Hit= ",        $hit->name,

                  "Length= ",     $hsp->length('total'),

                  "Percent_id= ", $hsp->percent_identity,

                  "Subject=",     $hsp->hit_string, "\n";

              }

            }

          }

        }

      }

    }
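The nested length and identity checks in the script above amount to a single filter over HSPs. A minimal sketch of that filter, assuming plain hashes in place of real HSP objects (the records and field names are hypothetical), with tab-separated output so that the fields do not run together as they do with the bare print list:

```perl
use strict;
use warnings;

# Hypothetical HSP summaries; in the thread these values come from
# $hsp->length('total') and $hsp->percent_identity.
my @hsps = (
    { query => 'q1', hit => 'subjA', length => 120, percent_id => 98.3 },
    { query => 'q1', hit => 'subjB', length => 40,  percent_id => 100  },
    { query => 'q2', hit => 'subjC', length => 75,  percent_id => 60.0 },
);

# Keep only HSPs longer than 50 with at least 75% identity.
my @kept = grep { $_->{length} > 50 && $_->{percent_id} >= 75 } @hsps;

for my $h (@kept) {
    printf "Query=%s\tHit=%s\tLength=%d\tPercent_id=%.1f\n",
        @{$h}{qw(query hit length percent_id)};
}
```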

      ----- Original Message ----- 

      From: Smithies, Russell 

      To: 'Tim Koehler' ; 'maj at fortinbras.us' 

      Sent: Sunday, November 29, 2009 3:58 PM

      Subject: RE: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hi Tim

      With various people writing the "howtos" and other docs, the examples are bound to have differing names for the variables used but as long as you're consistent, it should all fit together.


      I think I've almost got your code working, just getting errors from Bio::Search::Result::BlastResult  which I'm not entirely sure how to use. Perhaps Mark can get this bit going?


      --Russell

      ===============================


      use strict;

      use Bio::Tools::Run::StandAloneBlast;

      use Bio::SeqIO;

      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;


      use Data::Dumper;


      my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",

                                    -format => "fasta" );


      while ( my $query = $Seq_in->next_seq() ) {

             warn "Processing ",$query->id, "\n";

        my $factory =

          Bio::Tools::Run::StandAloneBlast->new(

                       program  => "blastn",

                       database => "/data/databases/flatfile/illuminati_blastdata/nt",

                       _READMETHOD => "Blast"

          );


        my $blast_report = $factory->blastall($query);

        sleep 5;


        my %hits_by_query;


             while ( my $result = $blast_report->next_result ) {

               foreach my $hit ( $result->hits ) {

                           warn "Pushed a hit for ",$hit->name, "\n";

                 push( @{ $hits_by_query{ $hit->name } }, $hit );

               }

             }


        foreach my $qid ( keys %hits_by_query ) {

                    warn "qid = $qid\n";

          my $res = Bio::Search::Result::BlastResult->new() or die $!;

          print Dumper $res;

          foreach my $h ( @{ $hits_by_query{$qid} } ){

                           warn "adding hit ", $h->name, "\n";

                           $res->add_hit($h) if defined($h);

                                 }

          my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format => "blast" ) or die $!;

          $blio->write_result($res);

        }


        while ( my $result = $blast_report->next_result ) {

          ## $result is a Bio::Search::Result::ResultI compliant object

          while ( my $hit = $result->next_hit ) {

            ## $hit is a Bio::Search::Hit::HitI compliant object

            while ( my $hsp = $hit->next_hsp ) {

              ## $hsp is a Bio::Search::HSP::HSPI compliant object

              if ( $hsp->length('total') > 50 ) {

                if ( $hsp->percent_identity >= 75 ) {

                  print "Query= ", $result->query_name,

                    "Hit= ",        $hit->name,

                    "Length= ",     $hsp->length('total'),

                    "Percent_id= ", $hsp->percent_identity,

                    "Subject=",     $hsp->hit_string, "\n";

                }

              }

            }

          }

        }

      }

      ===============================


      From: Tim Koehler [mailto:timbourine81 at googlemail.com] 
      Sent: Friday, 27 November 2009 10:24 p.m.
      To: Smithies, Russell; maj at fortinbras.us
      Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each queryinnew file


      Hey guys,

      please do not get me wrong: I did not want to put the workload on you. So far I had only found the HOWTOs, but in them the naming somehow changed over time (e.g. $in to $Seq_in), and some things I simply could not find.
      Now I got a tip where else to search: the scrapbook and deobfuscator.

      I immediately will have a look at that.

      This is my first time touching Linux / Perl commands; that's why, after several days of trial and many errors ;), I thought of asking the mailing list.

      I was very happy about your fast answers!

      Cheers and a nice weekend,

      Tim

      On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com> wrote:

      oops, sent too early...

      Hey Mark,

      thanks for the answer. But I am still struggling, especially where to put in your code.

      Here is the code I have so far:

      #!/usr/bin/perl -w

      ### should I put your code here as push is a perl command?


      my %hits_by_query;
      for ($result->hits) {

      ### I inserted a comma after name}}; if there is no comma, there was the error: Scalar found where operator expected at 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
      ###        (Missing operator before  $hit?)
      ###Useless use of push with no values at 12_BLAST_two_sequence_each_query_one_file.PL line 7.
      ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7, near "} $hit"
      ###BEGIN not safe after errors--compilation aborted at 12_BLAST_two_sequence_each_query_one_file.PL line 13.


       push @{$hits_by_query{$hit->name}}, $hit;

      ###here, every time this error appears: Name "main::result" used only once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
      ###error: Can't call method "hits" on an undefined value at 12_BLAST_two_sequence_each_query_one_file.PL line 5.


      }


      use strict;
      use Bio::Tools::Run::StandAloneBlast;
      use Bio::SeqIO;
      use Bio::SearchIO;

      use Bio::Search::Result::BlastResult;

      my $Seq_in = Bio::SeqIO->new (
      -file => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
      -format => 'fasta'
      );
      while (my $query = $Seq_in->next_seq()) {


      my $factory = Bio::Tools::Run::StandAloneBlast->new(

      'program' => 'blastn',
      'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
      _READMETHOD => "Blast"
      );

      my $blast_report = $factory->blastall($query);

      ### Do I need to use a module? Are the commands here at the right position? Errors, e.g., Global symbol "$hit" requires explicit package name
      #my %hits_by_query;
      #for ($result->hits) {
      ### inserted comma after name}}
      # push @{$hits_by_query{$hit->name}}, $hit;
      #}


      foreach my $qid ( keys %hits_by_query ) {
       my $result = Bio::Search::Result::BlastResult->new();
       $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
       my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
       $blio->write_result($result);
      } 

      ###where are the files stored? what is their name. Sorry, but I cannot get behind that :(

      while( my $result = $blast_report->next_result ) {
        ## $result is a Bio::Search::Result::ResultI compliant object


        while( my $hit = $result->next_hit ) {

         ## $hit is a Bio::Search::Hit::HitI compliant object


         while( my $hsp = $hit->next_hsp ) {

          ## $hsp is a Bio::Search::HSP::HSPI compliant object
          if( $hsp->length('total') > 50 ) {
           if ( $hsp->percent_identity >= 75 ) {
           print  "Query= ",        $result->query_name,
              "Hit= ",        $hit->name,
                  "Length= ",     $hsp->length('total'),
                  "Percent_id= ", $hsp->percent_identity,
              "Subject=",        $hsp->hit_string,"\n";
           }
          }
         }
        }
      }
      }
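The errors quoted in the comments above appear to have two causes: the grouping loop runs before any $result exists, and "for ($result->hits)" places each hit in $_, so $hit is never declared. A minimal core-Perl sketch of the grouping idiom itself, assuming plain hashes in place of hit objects (the records and the "query" field are hypothetical; in the real script the loop belongs inside the "while ( my $result = ... )" block):

```perl
use strict;
use warnings;

# Hypothetical hit records; real code would get objects from $result->hits.
my @hits = (
    { name => 'subjA', query => 'q1' },
    { name => 'subjB', query => 'q1' },
    { name => 'subjA', query => 'q2' },
);

my %hits_by_query;
for my $hit (@hits) {                                  # name the loop variable
    push @{ $hits_by_query{ $hit->{query} } }, $hit;   # note the comma
}

for my $qid ( sort keys %hits_by_query ) {
    my @names = map { $_->{name} } @{ $hits_by_query{$qid} };
    print "$qid: @names\n";
}
```

Note that keying on the hit name, as the original scrap does, groups by subject rather than by query; the sketch keys on the query id instead.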

      Again, a big thanks in advance :)

      All the best,

      Tim

      On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:

      Hey Mark,

      thanks for the answer


      On 25.11.2009 20:21, Mark A. Jensen wrote:
      > whoops: change the following line:
      > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >
      > to
      >
      > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
      >
      > (I always forget that...)
      > MAJ
      >
      > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
      > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
      > Sent: Wednesday, November 25, 2009 1:20 PM
      > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
      > queryinnew file
      >
      >
      >> hey Tim--
      >>
      >> Sounds like you need to go about collecting your queries inside out:
      >>
      >> my %hits_by_query;
      >> for ($result->hits) {
      >>  push @{$hits_by_query{$hit->name}} $hit;
      >> }
      >>
      >> I believe now each hash element, keyed by the query name, will contain
      >> an arrayref to the set of hits assoc with that query.
      >> From here, I believe
      >>
      >> use Bio::Search::Result::BlastResult;
      >> use Bio::SearchIO;
      >>
      >> foreach my $qid ( keys %hits_by_query ) {
      >>  my $result = Bio::Search::Result::BlastResult->new();
      >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
      >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
      >>  $blio->write_result($result);
      >> }
      >>
      >> will do what you want.
      >>
      >> hope this helps -
      >> Mark
      >>
      >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
      >> To: <bioperl-l at lists.open-bio.org>
      >> Sent: Wednesday, November 25, 2009 12:40 PM
      >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
      >> query innew file
      >>
      >>
      >>> Dear bioperl users,
      >>>
      >>> I am a real newbie and have - maybe a very trivial - question.
      >>>
      >>> I searched the mailing list archive and many howtos but I have not found
      >>> a concrete answer to my problem. So hopefully you can help me :)
      >>>
      >>> Background: I use the latest Bioperl version (installed it two weeks
      >>> ago).
      >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
      >>> including different sequences, I get a BLAST output with many queries
      >>> each having several hits / sbjcts.
      >>>
      >>> My problem is how to parse *all* hits of *one* query into a single new
      >>> file. And this for all the queries I have in my BLAST output file.
      >>>
      >>> Or is it better the other way round; first to make fasta files with only
      >>> single sequences inside and BLAST each file? But how can I automate that
      >>> using Bioperl?
      >>>
      >>> I tried Bio::SearchIO but can only parse all queries and their
      >>> respective hits in only one file...
      >>> I think iteration is also necessary here, but I do not really know how
      >>> to include that into Bio::SearchIO.
      >>> Or do I have to use Module:Bio::Index::Blast?
      >>>
      >>> I can index a file (see below), but I have no idea what comes next...
      >>>
      >>> ###How I index a file...
      >>>
      >>> #!/usr/bin/perl -w
      >>>
      >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
      >>>
      >>> use Bio::Index::Fasta;
      >>>
      >>>
      >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
      >>> $id = "48882";
      >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
      >>> -write_flag => 1);
      >>> $inx->make_index($file_name);
      >>>
      >>>
      >>> Hopefully, you can give me at least hints what to look for.
      >>>
      >>> A big THANKS in advance!
      >>>
      >>> Cheers,
      >>>
      >>> Tim
      >>>
      >>>
      >>
      >>
      >>
      >

      Tim Köhler
      MPI for Terrestrial Microbiology
      Karl-von-Frisch-Straße
      D-35043 Marburg / Germany

      Email: koehlerd at mpi-marburg.mpg.de
      Phone: +49 6421 178-740
      Fax:   +49 6421 178-999




From timbourine81 at googlemail.com  Mon Nov 30 17:23:58 2009
From: timbourine81 at googlemail.com (Tim Koehler)
Date: Mon, 30 Nov 2009 18:23:58 +0100
Subject: [Bioperl-l] How to parse BLAST output - all hits of each
	queryinnew file
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
References: <4B0D6C24.2080308@gmail.com>
	<53DE480F205E42CE8D2B9421592AAF0E@NewLife>
	<815D2A47BC9C4D89B8DEF0B10DA9EAF8@NewLife>
	<4B0EA44D.2050507@gmail.com>
	<c3cc98c0911260802t2a66e7e8o7be3b79c03d86e6e@mail.gmail.com>
	<c3cc98c0911270123i6e4e83d3lfee0f5f32ca0cf46@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6C53@exchsth.agresearch.co.nz>
	<52D67F20A9CB4953B86FF794ADE0BE96@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B630E6D05@exchsth.agresearch.co.nz>
Message-ID: <c3cc98c0911300923p144ac04ax8a881150dca54835@mail.gmail.com>

Hello everybody,

thanks a lot for the overwhelming number of answers! These code versions are
all different flavors, and all of them worked.

For me the code added below works best. But I think I found a bug in
...Bio/SearchIO/blast.pm.
There the DEFAULT_BLAST_... variable is set to
Bio::Search::Writer::HitTableWriter instead of
Bio::SearchIO::Writer::HitTableWriter. I also changed this variable to
HTMLResultWriter and others.
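A wrong namespace like this only shows up when the class is actually loaded. A small core-Perl sketch for checking whether a class name resolves to an installed module; can_load is a hypothetical helper, not part of BioPerl, and the result for each name naturally depends on what is installed locally:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named class can be loaded with require.
sub can_load {
    my ($class) = @_;
    ( my $file = "$class.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $class ( 'Bio::Search::Writer::HitTableWriter',
                'Bio::SearchIO::Writer::HitTableWriter' ) {
    printf "%-40s %s\n", $class, can_load($class) ? 'loads' : 'does not load';
}
```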

So again: THANKS for the support!

Cheers,
Tim

#!/usr/bin/perl -w

use strict;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SeqIO;
use Bio::SearchIO;

### add here the writer you want
use Bio::SearchIO::Writer::HitTableWriter;
use Bio::Search::Result::BlastResult;

use Data::Dumper;

my $Seq_in = Bio::SeqIO->new(
    -file   => "/home/koehler/Programs/for_BLAST/1_to_BLAST_two_seq.fasta",
    -format => "fasta" );

while ( my $query = $Seq_in->next_seq() ) {
  warn "Processing ", $query->id, "\n";

  my $factory =
    Bio::Tools::Run::StandAloneBlast->new(
      program     => "blastn",
      database    => "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db",
      _READMETHOD => "Blast"
    );

  my $blast_report = $factory->blastall($query);
  sleep 5;

  # just write the result we got for this query into a
  # new blast-formatted file...named after the id of the query seq...
  my $result = $blast_report->next_result;
  my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format => "blast" ) or die $!;
  $blio->write_result($result);

  # below, just looking at the current blast result
  ###this does not appear in the output files
  while ( my $result = $blast_report->next_result ) {
    ## $result is a Bio::Search::Result::ResultI compliant object
    while ( my $hit = $result->next_hit ) {
      ## $hit is a Bio::Search::Hit::HitI compliant object
      while ( my $hsp = $hit->next_hsp ) {
        ## $hsp is a Bio::Search::HSP::HSPI compliant object
        if ( $hsp->length('total') > 50 ) {
          if ( $hsp->percent_identity >= 75 ) {
            print "Query= ", $result->query_name,
              "Hit= ",        $hit->name,
              "Length= ",     $hsp->length('total'),
              "Percent_id= ", $hsp->percent_identity,
              "Subject=",     $hsp->hit_string, "\n";
          }
        }
      }
    }
  }
}
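On the "this does not appear in the output files" remark: each report here holds the result for a single query, and the earlier call to next_result has presumably already consumed it, so the while loop has nothing left to iterate over (and its print goes to STDOUT rather than to the .bls file in any case). A minimal core-Perl sketch of that behaviour, using a hypothetical MockReport class in place of the real report object:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SearchIO-style report: next_result() hands out
# each stored result once and then returns undef.
package MockReport;
sub new { my ( $class, @results ) = @_; return bless { queue => [@results] }, $class }
sub next_result { my $self = shift; return shift @{ $self->{queue} } }

package main;

my $report = MockReport->new('result for query 1');

my $first = $report->next_result;            # consumes the only result
my $seen  = 0;
while ( my $r = $report->next_result ) {     # nothing left, body never runs
    $seen++;
}
print "first: $first\n";
print "loop iterations: $seen\n";
```

One way around this would be to walk the hits of the $result that was already fetched instead of calling next_result a second time.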


On Sun, Nov 29, 2009 at 11:29 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

>  Changed it to a generic result and added a writer and it seems to work:
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>     warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::GenericResult->new(-algorithm =>
> "blastn") or die $!;
>
>    # print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $writerhtml =  Bio::SearchIO::Writer::HTMLResultWriter->new();
>
>     my $blio = Bio::SearchIO->new(-writer => $writerhtml, -file =>
> ">$qid\.bls\.html", -format => "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>
>
> *From:* Mark A. Jensen [mailto:maj at fortinbras.us]
> *Sent:* Monday, 30 November 2009 10:19 a.m.
> *To:* Smithies, Russell; 'Tim Koehler'
>
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> My thought here was that since Tim's already going one at a time thru
>
> his queries, my scrap was not really necessary:
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>   # just write the result we got for this query into a
>
>    #new blast-formatted file...named after the id of the query seq...
>
>  my $result = $blast_report->next_result;
>
> my $blio = Bio::SearchIO->new( -file => ">".$query->id.".bls", -format =>
> "blast" ) or die $!;
>
>   $blio->write_result($result);
>
>
>
>   # below, just looking at the current blast result
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
>  ----- Original Message -----
>
> *From:* Smithies, Russell <Russell.Smithies at agresearch.co.nz>
>
> *To:* 'Tim Koehler' <timbourine81 at googlemail.com> ; 'maj at fortinbras.us'
>
> *Sent:* Sunday, November 29, 2009 3:58 PM
>
> *Subject:* RE: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hi Tim
>
> With various people writing the "howtos" and other docs, the examples are
> bound to have differing names for the variables used but as long as you're
> consistent, it should all fit together.
>
>
>
> I think I've almost got your code working, just getting errors from
> Bio::Search::Result::BlastResult  which I'm not entirely sure how to use.
> Perhaps Mark can get this bit going?
>
>
>
> --Russell
>
> ===============================
>
>
>
> use strict;
>
> use Bio::Tools::Run::StandAloneBlast;
>
> use Bio::SeqIO;
>
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
>
>
> use Data::Dumper;
>
>
>
> my $Seq_in = Bio::SeqIO->new( -file   => "sequences.fasta",
>
>                               -format => "fasta" );
>
>
>
> while ( my $query = $Seq_in->next_seq() ) {
>
>        warn "Processing ",$query->id, "\n";
>
>   my $factory =
>
>     Bio::Tools::Run::StandAloneBlast->new(
>
>                  program  => "blastn",
>
>                  database =>
> "/data/databases/flatfile/illuminati_blastdata/nt",
>
>                  _READMETHOD => "Blast"
>
>     );
>
>
>
>   my $blast_report = $factory->blastall($query);
>
>   sleep 5;
>
>
>
>
>
>   my %hits_by_query;
>
>
>
>        while ( my $result = $blast_report->next_result ) {
>
>          foreach my $hit ( $result->hits ) {
>
>                      warn "Pushed a hit for ",$hit->name, "\n";
>
>            push( @{ $hits_by_query{ $hit->name } }, $hit );
>
>          }
>
>        }
>
>
>
>   foreach my $qid ( keys %hits_by_query ) {
>
>               warn "qid = $qid\n";
>
>     my $res = Bio::Search::Result::BlastResult->new() or die $!;
>
>     print Dumper $res;
>
>     foreach my $h ( @{ $hits_by_query{$qid} } ){
>
>                      warn "adding hit ", $h->name, "\n";
>
>                      $res->add_hit($h) if defined($h);
>
>                            }
>
>     my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format =>
> "blast" ) or die $!;
>
>     $blio->write_result($res);
>
>   }
>
>
>
>   while ( my $result = $blast_report->next_result ) {
>
>     ## $result is a Bio::Search::Result::ResultI compliant object
>
>     while ( my $hit = $result->next_hit ) {
>
>       ## $hit is a Bio::Search::Hit::HitI compliant object
>
>       while ( my $hsp = $hit->next_hsp ) {
>
>         ## $hsp is a Bio::Search::HSP::HSPI compliant object
>
>         if ( $hsp->length('total') > 50 ) {
>
>           if ( $hsp->percent_identity >= 75 ) {
>
>             print "Query= ", $result->query_name,
>
>               "Hit= ",        $hit->name,
>
>               "Length= ",     $hsp->length('total'),
>
>               "Percent_id= ", $hsp->percent_identity,
>
>               "Subject=",     $hsp->hit_string, "\n";
>
>           }
>
>         }
>
>       }
>
>     }
>
>   }
>
> }
>
> ===============================
>
>
>
> *From:* Tim Koehler [mailto:timbourine81 at googlemail.com]
> *Sent:* Friday, 27 November 2009 10:24 p.m.
> *To:* Smithies, Russell; maj at fortinbras.us
> *Subject:* Re: [Bioperl-l] How to parse BLAST output - all hits of each
> queryinnew file
>
>
>
> Hey guys,
>
> please do not get me wrong: I did not want to put the workload on you. So
> far I had only found the HOWTOs, but in them the naming somehow changed
> over time (e.g. $in to $Seq_in), and some things I simply could not find.
> Now I got a tip where else to search: the scrapbook and deobfuscator.
>
> I immediately will have a look at that.
>
> This is the first time for me touching linux / perl commands; that's why I
> thought after several days of trial and many errors ;) asking the
> mailinglist.
>
> I was very happy about your fast answers!
>
> Cheers and a nice weekend,
>
> Tim
>
> On Thu, Nov 26, 2009 at 5:02 PM, Tim Koehler <timbourine81 at googlemail.com>
> wrote:
>
> oops, sent too early...
>
> Hey Mark,
>
> thanks for the answer. But I am still struggling, especially with where to
> put your code.
>
> Here is the code I have so far:
>
> #!/usr/bin/perl -w
>
> ### should I put your code here as push is a perl command?
>
>
> my %hits_by_query;
> for ($result->hits) {
>
> ### I inserted a comma after name}}; if there is no comma, there was the
> error: Scalar found where operator expected at
> 12_BLAST_two_sequence_each_query_one_file.PL line7, near "} $hit"
> ###        (Missing operator before  $hit?)
> ###Useless use of push with no values at
> 12_BLAST_two_sequence_each_query_one_file.PL line 7.
> ###syntax error at 12_BLAST_two_sequence_each_query_one_file.PL line 7,
> near "} $hit"
> ###BEGIN not safe after errors--compilation aborted at
> 12_BLAST_two_sequence_each_query_one_file.PL line 13.
>
>
>  push @{$hits_by_query{$hit->name}}, $hit;
>
> ###here, every time this error appears: Name "main::result" used only
> once: possible typo at 12_BLAST_two_sequence_each_query_one_file.PL line 5.
> ###error: Can't call method "hits" on an undefined value at
> 12_BLAST_two_sequence_each_query_one_file.PL line 5.
>
>
> }
>
>
> use strict;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::SeqIO;
> use Bio::SearchIO;
>
> use Bio::Search::Result::BlastResult;
>
> my $Seq_in = Bio::SeqIO->new (
> -file =>
> "/home/koehler/Programs/for_BLAST/BLAST_Pipeline/1_to_BLAST_two_seq.fasta",
> -format => 'fasta'
> );
> while (my $query = $Seq_in->next_seq()) {
>
>
> my $factory = Bio::Tools::Run::StandAloneBlast->new(
>
> 'program' => 'blastn',
> 'database' => '/home/koehler/Programs/for_BLAST/BLAST_Pipeline/3_BLAST_db',
> _READMETHOD => "Blast"
> );
>
> my $blast_report = $factory->blastall($query);
>
> ### Do I need to use a module? Are the commands here in the right
> position? errors, e.g., Global symbol "$hit" requires explicit package name
> #my %hits_by_query;
> #for ($result->hits) {
> ### inserted comma after name}}
> # push @{$hits_by_query{$hit->name}}, $hit;
> #}
>
>
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> ###Where are the files stored? What are their names? Sorry, but I cannot
> figure that out :(
>
> while( my $result = $blast_report->next_result ) {
>   ## $result is a Bio::Search::Result::ResultI compliant object
>
>
>   while( my $hit = $result->next_hit ) {
>
>    ## $hit is a Bio::Search::Hit::HitI compliant object
>
>
>    while( my $hsp = $hit->next_hsp ) {
>
>     ## $hsp is a Bio::Search::HSP::HSPI compliant object
>     if( $hsp->length('total') > 50 ) {
>      if ( $hsp->percent_identity >= 75 ) {
>      print  "Query= ",        $result->query_name,
>         "Hit= ",        $hit->name,
>             "Length= ",     $hsp->length('total'),
>             "Percent_id= ", $hsp->percent_identity,
>         "Subject=",        $hsp->hit_string,"\n";
>      }
>     }
>    }
>   }
> }
> }
>
> Again, a big thanks in advance :)
>
> All the best,
>
> Tim
>
> On Thu, Nov 26, 2009 at 4:52 PM, Tim <timbourine81 at gmail.com> wrote:
>
> Hey Mark,
>
> thanks for the answer
>
>
>
>
> On 25.11.2009 20:21, Mark A. Jensen wrote:
> > whoops: change the following line:
> > my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >
> > to
> >
> > my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );
> >
> > (I always forget that...)
> > MAJ
> >
> > ----- Original Message ----- From: "Mark A. Jensen" <maj at fortinbras.us>
> > To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
> > Sent: Wednesday, November 25, 2009 1:20 PM
> > Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each
> > query in new file
> >
> >
> >> hey Tim--
> >>
> >> Sounds like you need to go about collecting your queries inside out:
> >>
> >> my %hits_by_query;
> >> for ($result->hits) {
> >>  push @{$hits_by_query{$hit->name}} $hit;
> >> }
> >>
> >> I believe now each hash element, keyed by the query name, will contain
> >> an arrayref to the set of hits assoc with that query.
> >> From here, I believe
> >>
> >> use Bio::Search::Result::BlastResult;
> >> use Bio::SearchIO;
> >>
> >> foreach my $qid ( keys %hits_by_query ) {
> >>  my $result = Bio::Search::Result::BlastResult->new();
> >>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
> >>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
> >>  $blio->write_result($result);
> >> }
> >>
> >> will do what you want.
> >>
> >> hope this helps -
> >> Mark
> >>
> >> ----- Original Message ----- From: "Tim" <timbourine81 at gmail.com>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Wednesday, November 25, 2009 12:40 PM
> >> Subject: [Bioperl-l] How to parse BLAST output - all hits of each
> >> query in new file
> >>
> >>
> >>> Dear bioperl users,
> >>>
> >>> I am a real newbie and have - maybe a very trivial - question.
> >>>
> >>> I searched the mailing list archive and many howtos but I have not found
> >>> a concrete answer to my problem. So hopefully you can help me :)
> >>>
> >>> Background: I use the latest Bioperl version (installed it two weeks
> >>> ago).
> >>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
> >>> including different sequences, I get a BLAST output with many queries
> >>> each having several hits / sbjcts.
> >>>
> >>> My problem is how to parse *all* hits of *one* query into a single new
> >>> file. And this for all the queries I have in my BLAST output file.
> >>>
> >>> Or is it better the other way round: first make fasta files with only
> >>> single sequences inside and then BLAST each file? But how can I automate
> >>> that using Bioperl?
> >>>
> >>> I tried Bio::SearchIO but can only parse all queries and their
> >>> respective hits in only one file...
> >>> I think iteration is also necessary here, but I do not really know how
> >>> to include that into Bio::SearchIO.
> >>> Or do I have to use Module:Bio::Index::Blast?
> >>>
> >>> I can index a file (see below), but I have no idea what comes next...
> >>>
> >>> ###How I index a file...
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
> >>>
> >>> use Bio::Index::Fasta;
> >>>
> >>>
> >>> $file_name = "8_to_BLAST_two_seq_index.fasta";
> >>> $id = "48882";
> >>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
> >>> -write_flag => 1);
> >>> $inx->make_index($file_name);
> >>>
> >>>
> >>> Hopefully, you can give me at least hints what to look for.
> >>>
> >>> A big THANKS in advance!
> >>>
> >>> Cheers,
> >>>
> >>> Tim
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >>
> >
>
> Tim Köhler
> MPI for Terrestrial Microbiology
> Karl-von-Frisch-Straße
> D-35043 Marburg / Germany
>
> Email: koehlerd at mpi-marburg.mpg.de
> Phone: +49 6421 178-740
> Fax:   +49 6421 178-999
>
>
>
>
>
>
>
>