From paola_bisignano at yahoo.it  Tue Sep  1 08:20:25 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 1 Sep 2009 12:20:25 +0000 (GMT)
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
Message-ID: <154614.75143.qm@web25706.mail.ukl.yahoo.com>

Hi, 

I'm trying to parse fasta files that contain a couple of alignments. I need to identify my residues in the alignment. I have separate lists derived from parsing ligplot files, so I have to manipulate strings, but I don't know how to start; it seems complicated.
I used Bio::AlignIO to parse the fasta file, so I can have a parsed file in msf or clustalW format.

Here is an example:
CLUSTAL W(1.81) multiple sequence alignment


Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
                       *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:


Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
                       ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*

I choose two residues, for example. How can I extract them, starting from their position in the pdb file?
I need to walk along my sequence.

I don't know if it is clear because I cannot explain the question correctly in English. Are there any Italians?
Could anyone help me?


From scott at scottcain.net  Tue Sep  1 09:21:25 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 1 Sep 2009 09:21:25 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
Message-ID: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>

Hello all,

I just wanted to send out a general announcement about a change that
is coming for Perl modules that are distributed with the gmod/chado
package.  Some modules, notably the automatically generated Class::DBI
classes, are currently in the Chado namespace.  They are moving to the
Bio namespace, as requested by the CPAN maintainers.  So any
Chado::* modules will become Bio::Chado::*, except for the Class::DBI
classes, which will become Bio::Chado::CDBI::*.

This will probably affect relatively few users, though ModWare in its  
current incarnation will need to be updated.

Scott

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From biopython at maubp.freeserve.co.uk  Tue Sep  1 11:33:13 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Sep 2009 16:33:13 +0100
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
Message-ID: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>> The two conversions to solexa are still failing.  I'm not sure but I think
>> it's something fairly simple, but I can't work on it until Friday (got too
>> many other things on my plate ATM).  If I get stumped I'll post a message.
>
> ...
>
> This should narrow it down - the bug is in mapping PHRED
> scores (from either Sanger or Illumina 1.3+ files) to the
> Solexa encoding.
>
> Peter

Hi Chris,

I've just noticed BioPerl is treating invalid characters in the quality
string as a warning condition (not an error):
http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html

It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
(character "!" or "@" respectively) which is reasonable. For fastq-solexa
to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
used - a bug?
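
The encodings discussed above can be sketched in a few lines. This is an
illustrative Python sketch of the arithmetic only, not BioPerl's
implementation: the function names are made up for the example, and both
the mapping of PHRED 0 to the Solexa minimum of -5 and the rounding to the
nearest integer are assumptions about how a converter might behave.

```python
import math

SANGER_OFFSET = 33    # fastq-sanger: PHRED scores, "!" (ASCII 33) is PHRED 0
ILLUMINA_OFFSET = 64  # fastq-illumina 1.3+: PHRED scores, "@" (ASCII 64) is PHRED 0
SOLEXA_OFFSET = 64    # fastq-solexa: Solexa scores, ";" (ASCII 59) is Solexa -5
SOLEXA_MIN = -5       # lowest Solexa score representable in fastq-solexa


def phred_to_solexa(phred):
    """Map an integer PHRED score to the nearest integer Solexa score."""
    if phred <= 0:
        # Assumption: PHRED 0 (error probability 1) has no finite Solexa
        # equivalent, so it is mapped to the Solexa minimum.
        return SOLEXA_MIN
    solexa = 10 * math.log10(10 ** (phred / 10) - 1)
    return max(SOLEXA_MIN, round(solexa))


def solexa_to_phred(solexa):
    """Map an integer Solexa score to the nearest integer PHRED score."""
    return round(10 * math.log10(10 ** (solexa / 10) + 1))


def sanger_to_solexa(qual):
    """Re-encode a fastq-sanger quality string as a fastq-solexa string."""
    return "".join(
        chr(phred_to_solexa(ord(c) - SANGER_OFFSET) + SOLEXA_OFFSET)
        for c in qual
    )
```

Under these assumptions, the Sanger character "!" (PHRED 0) comes out as
";" (Solexa -5) in fastq-solexa output.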

Also, in all these cases there is currently a spurious "data loss" warning:

$ ./bioperl_sanger2sanger.pl < error_qual_null.fastq

--------------------- WARNING ---------------------
MSG: Unknown symbol with ASCII value 0 outside of quality range,
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Data loss for sanger: following values exceed max 93

---------------------------------------------------
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYY!YYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_397_389
GGTTTGAGAAAGAGAAATGAGATAA
+
YYYYYYYYYWYYYYWWYYYWYWYWW
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYYYYYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_362_549
GGAAACAAAGTTTTTCTCAACATAG
+
YYYYYYYYYYYYYYYYYYWWWWYWY
@SLXA-B3_649_FC8437_R1_1_1_183_714
GTATTATTTAATGGCATACACTCAA
+
YYYYYYYYYYWYYYYWYWWUWWWQQ

Regards,

Peter


From jason at bioperl.org  Tue Sep  1 11:49:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 1 Sep 2009 08:49:00 -0700
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
In-Reply-To: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
References: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
Message-ID: <90DACEE3-BC71-4D82-A8FF-6441A720BC76@bioperl.org>

I think you might want to use the column_from_residue_number method  
that is part of Bio::SimpleAlign - it lets you get the column from an  
alignment based on the sequence residue, doing some math along the way  
to deal with gaps. That is the residue -> alignment direction.  If you  
are starting at the alignment and want to get the residue's position  
you will use the location_from_column on a particular sequence so

     # select somehow a sequence from the alignment, e.g.
     my $seq = $aln->get_seq_by_pos(1);
     #$loc is undef or Bio::LocationI object
     my $loc = $seq->location_from_column(5);
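
To make the "math to deal with gaps" concrete, here is a plain-Python
sketch of both directions. It is not the BioPerl code: the function names
are hypothetical, and it assumes that the "/6-268" suffix gives the number
of the first residue and that residues are numbered consecutively (real
PDB files can have missing residues or insertion codes, which would break
that assumption).

```python
GAP_CHARS = "-."


def residue_to_column(gapped, start, resnum):
    """Return the 1-based alignment column that holds residue `resnum`."""
    current = start - 1
    for column, char in enumerate(gapped, start=1):
        if char not in GAP_CHARS:
            current += 1
            if current == resnum:
                return column
    raise ValueError("residue %d is not in this aligned sequence" % resnum)


def column_to_residue(gapped, start, column):
    """Return the residue number at a 1-based column, or None for a gap."""
    if not 1 <= column <= len(gapped):
        raise ValueError("column %d is outside the alignment" % column)
    if gapped[column - 1] in GAP_CHARS:
        return None
    residues_before = sum(1 for c in gapped[:column - 1] if c not in GAP_CHARS)
    return start + residues_before


# First block of the 2pl0:A/6-268 row from the alignment in the question.
row = "DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ"
column = residue_to_column(row, 6, 39)   # first residue after the gap
residue = column_to_residue(row, 6, column)
```

Once a residue of one sequence has been mapped to a column, the same
column can be looked up in the other row to find its aligned partner.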

-jason

On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:

> Hi,
>
> I'm trying to parse fasta files that contain a couple of alignments. I
> need to identify my residues in the alignment. I have separate lists
> derived from parsing ligplot files, so I have to manipulate strings,
> but I don't know how to start; it seems complicated.
> I used Bio::AlignIO to parse the fasta file, so I can have a parsed
> file in msf or clustalW format.
>
> Here is an example:
> CLUSTAL W(1.81) multiple sequence alignment
>
>
> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>
>
> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>
> I choose two residues, for example. How can I extract them, starting
> from their position in the pdb file?
> I need to walk along my sequence.
>
> I don't know if it is clear because I cannot explain the question
> correctly in English. Are there any Italians?
> Could anyone help me?
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  1 12:05:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 11:05:14 -0500
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
	<320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
Message-ID: <FB130819-94C6-419F-AD3D-BAEEDDE77737@illinois.edu>


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but I think
>>> it's something fairly simple, but I can't work on it until Friday (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq-solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
> used - a bug?
>
> Also, in all these cases there is currently a spurious "data loss" warning:
>
> $ ./bioperl_sanger2sanger.pl < error_qual_null.fastq
>
> --------------------- WARNING ---------------------
> MSG: Unknown symbol with ASCII value 0 outside of quality range,
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: Data loss for sanger: following values exceed max 93
>
> ---------------------------------------------------
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYY!YYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_397_389
> GGTTTGAGAAAGAGAAATGAGATAA
> +
> YYYYYYYYYWYYYYWWYYYWYWYWW
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYYYYYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_362_549
> GGAAACAAAGTTTTTCTCAACATAG
> +
> YYYYYYYYYYYYYYYYYYWWWWYWY
> @SLXA-B3_649_FC8437_R1_1_1_183_714
> GTATTATTTAATGGCATACACTCAA
> +
> YYYYYYYYYYWYYYYWYWWUWWWQQ
>
> Regards,
>
> Peter

Right, per off-list discussion this can be changed (I would rather it  
die there anyway).

chris


From marcelo011982 at gmail.com  Tue Sep  1 13:33:51 2009
From: marcelo011982 at gmail.com (Marcelo Iwata)
Date: Tue, 1 Sep 2009 14:33:51 -0300
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
Message-ID: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>

Hi

I ran blastn with the following arguments:

../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt -e 0.00001 -o Out2Blast.txt -a 8

and I want a script that removes overlapping sequences from the results.
For example, if unigene A has hit->start and hit->end of 1 and 4, and
B has 2 and 3, respectively, the script removes the second one.

I want to know whether such a script already exists and, if not, whether
there is a library that handles this.

I know that Bio::DB::GFF has overlapping_features. But something that
works directly with the blast format would be better for me.
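
The filtering described in the example (drop B at 2-3 because it lies
inside A at 1-4) can be sketched as follows. This is a plain-Python
illustration, not an existing BioPerl script or the Tiling code; the
function name and the (name, start, end) tuples are hypothetical, and it
assumes all hits are on the same subject sequence and strand. Hits that
only partially overlap are kept.

```python
def remove_contained_hits(hits):
    """Drop every hit whose range lies entirely inside another hit.

    `hits` is a list of (name, start, end) tuples with start <= end.
    Of two identical ranges, only the first one in sort order is kept.
    """
    # Sort by start ascending, then by end descending, so that any hit
    # able to contain the current one has already been seen.
    ordered = sorted(hits, key=lambda hit: (hit[1], -hit[2]))
    kept = []
    max_end = None
    for name, start, end in ordered:
        if max_end is not None and end <= max_end:
            continue  # lies inside the kept hit that reaches max_end
        kept.append((name, start, end))
        max_end = end
    return kept


filtered = remove_contained_hits([("A", 1, 4), ("B", 2, 3)])
```

The start and end values would come from the parsed blast report, for
example from the hit or HSP objects returned by a report parser.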

thanks in advance

From cjfields at illinois.edu  Tue Sep  1 14:10:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 13:10:30 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
Message-ID: <7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>

Marcelo,

Do you mean tiling?  See:

http://www.bioperl.org/wiki/HOWTO:Tiling

chris



From cain.cshl at gmail.com  Tue Sep  1 15:47:50 2009
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 1 Sep 2009 15:47:50 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
In-Reply-To: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
References: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
Message-ID: <0CA5287E-BE85-4E7F-8ED3-B453092FACB1@gmail.com>

Hi Don,

I just wanted to let you know that I also updated the code in  
GMODTools, but I don't have a simple way to test it; perhaps you  
should take a look at the cvs diff to make sure what I did makes sense.

Thanks,
Scott


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From maj at fortinbras.us  Wed Sep  2 00:19:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 00:19:30 -0400
Subject: [Bioperl-l] bioperl invades emacs
Message-ID: <56DB0DEEB22645DE94DE0E912A889409@NewLife>

Hi All, 

As part of the Documentation Project, I've written a full-
fledged minor mode for emacs, bioperl-mode. It allows 
the user to access BP pod while coding, using keyboard
shortcuts or menus. Pod pops up in a new view buffer,
which is itself active for quick pod searching. You can 
get the whole pod, pieces of pod, or even the pod headers
of individual methods. 

The best feature (IMHO) is the completion facility. This
not only saves typing, but allows browsing and follow-your-nose
programming (exactly the technique I used to make bioperl-mode,
thanks to the Extensible Self-Documenting Editor).

It's very easy to install, requires only one additional line 
in your .emacs file, and directly infects perl-mode 
(if you so choose) so it's available whenever you
open .pl or .pm files.

For details, screenshots, download and install info,
and soporific design details, see
http://www.bioperl.org/wiki/Emacs_bioperl-mode

Send me the bugs!
cheers, 
MAJ

From rmb32 at cornell.edu  Wed Sep  2 00:31:15 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 01 Sep 2009 21:31:15 -0700
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <4A9DF513.1020607@cornell.edu>

Wow.  Bravo!

Rob


From cjfields at illinois.edu  Wed Sep  2 00:31:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 23:31:46 -0500
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <2A49147F-17B4-42EB-A170-52DA009D7E1C@illinois.edu>

Very cool!  Thanks Mark!

chris



From Russell.Smithies at agresearch.co.nz  Wed Sep  2 01:01:34 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 2 Sep 2009 17:01:34 +1200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>

emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From maj at fortinbras.us  Wed Sep  2 08:28:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:28:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <4A9E2638.8020203@pasteur.fr>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<4A9E2638.8020203@pasteur.fr>
Message-ID: <AC0A7CC6F808466CB15D267CC86AEEE3@NewLife>

Hi Emmanuel-- I'll look into this and report back- thanks!
MAJ
----- Original Message ----- 
From: "Emmanuel Quevillon" <tuco at pasteur.fr>
To: "Mark A. Jensen" <maj at fortinbras.us>
Sent: Wednesday, September 02, 2009 4:00 AM
Subject: Re: [Bioperl-l] bioperl invades emacs


> Hi Mark,
> 
> Great great job.
> But I am using XEmacs, and no .emacs file is present in my home
> directory. So is there a trick to make your bioperl-mode work
> under XEmacs?
> 
> Thanks for your help
> 
> Regards
> 
> Emmanuel
> -- 
> -------------------------
> Emmanuel Quevillon
> Biological Software and Databases Group
> Institut Pasteur
> +33 1 44 38 95 98
> tuco at_ pasteur dot fr
> -------------------------
> 
>

From maj at fortinbras.us  Wed Sep  2 08:07:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:07:14 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
Message-ID: <B9B317F95CA44F0C9335450D3FDDEC73@NewLife>

I only know one command in vi --- :q
MAJ


From hlapp at gmx.net  Wed Sep  2 11:51:18 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 2 Sep 2009 11:51:18 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <73A8B147-7605-4E2E-98AF-F3B09AD6046F@gmx.net>

Very nice!! -hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Wed Sep  2 16:23:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 2 Sep 2009 15:23:01 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
	<7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>
	<1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
Message-ID: <E39D878B-A6F1-441A-A511-7CA0FF0D1319@illinois.edu>

Marcelo,

(Make sure to keep responses on the main list)

The new Tiling stuff is in bioperl-live (subversion code); it hasn't  
been released yet but should appear in BioPerl 1.6.1 (an alpha will be  
out this week).

chris

On Sep 2, 2009, at 3:20 PM, Marcelo Iwata wrote:

> Thanks, Chris.
> I searched CPAN to download Bio::Search::Tiling, and it pointed me to
> the BioPerl core distribution:
> BioPerl-1.6.0.tar.gz
> at http://search.cpan.org/~cjfields/BioPerl-1.6.0/Bio/Search/BlastStatistics.pm
>
> I downloaded it and upgraded my BioPerl version, but I still cannot
> find MapTiling.pm.
>
> Could this be the result of some kind of error during the upgrade?
> Thanks.
>
>


From maj at fortinbras.us  Wed Sep  2 21:04:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 21:04:06 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>

Thanks everyone for your comments so far, on and off-list. 
(You're a terrific audience. I also code for weddings and 
bar mitzvahs. Tip your servers.)
The howto page now has a "Known Issues" section, and
I will be working to eliminate those in the next couple of 
days. 

cheers Mark

From jessica.sun at gmail.com  Tue Sep  1 11:25:36 2009
From: jessica.sun at gmail.com (jsun529)
Date: Tue, 1 Sep 2009 08:25:36 -0700 (PDT)
Subject: [Bioperl-l]  covert CDS coordinates with Gene coordinates
Message-ID: <25242395.post@talk.nabble.com>


Dear all,
  I would like to know how to convert between CDS coordinates and gene
coordinates using Bio::Coordinate::GeneMapper.
The documentation is not very clear, and a working example would help a
lot in using the objects returned by the BioPerl functions and getting
the values out in a readable format.

Thanks,

-- 
View this message in context: http://www.nabble.com/covert-CDS-coordinates-with-Gene-coordinates-tp25242395p25242395.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From pg4 at sanger.ac.uk  Wed Sep  2 19:35:07 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 00:35:07 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
Message-ID: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>


Hello Mark,

It sounds fantastic.

Unfortunately, I was unable to use it:

It does not find pod2text on my Mac OS X machine, and it fails to find my BioPerl paths
on Linux (probably due to a bug in the PERL5LIB parsing, but I am a Lisp
novice, so I could be wrong).

==  macosX ==

On my MacBook (Mac OS X 10.5, Emacs 22.3) it does not find pod2text:
GNU Emacs 22.3.1 (i386-apple-darwin9.6.0, X toolkit)

   - I have installed your modules in my local-lisp directory and added the
require; now Emacs fails with the error:

   File error: Searching for program, invalid argument, pod2text

   - I have pod2text in /usr/bin, which is in my $PATH (I use the Fink
Emacs in non-window mode), but the same happens with Carbon Emacs.

==  debian etch with an old emacs 21 ==

GNU Emacs 21.4.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 
2007-06-19 on ninsei, modified by Debian

It loads OK, but when asking for the pods

[pod] Namespace: Bio::

it does not autocomplete from there, and if I have the cursor over a 'use 
Bio::xxx' and select [BP Docs] 'view methods' or 'view pod', it says 'no 
match':

# [pod mth] Namespace: Bio::PrimarySeq [No match]

Reading bioperl-mode.el and bioperl-init.el, I have seen that the variable
that stores the path to BioPerl has no paths added other than the
current directory:

# c-h v bioperl-module-path [ret] => bioperl-module-path's value is "."


== bug when parsing perl5lib? ==

Please correct me if I am wrong, but in bioperl-init.el the extraction of the
BioPerl paths from PERL5LIB is not working for me on Linux.

While debugging bioperl-init.el:
# (setq pth (getenv "PERL5LIB"))
#  "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
# (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
# nil

No file is found because it is looking for all the paths
concatenated together with a '/Bio' at the end:

   libpath1:libpath2:libpath3/Bio

'concat' adds /Bio to pth, which is a single string holding all the
PERL5LIB paths. Shouldn't this concat rather be applied to each entry of
PERL5LIB after splitting it on ':' (Unix) or ';' (Windows), with each
result then tested for existence?

For example, on Unix:

--- code --
(defun addbio (bio_path)
   "Append /Bio to BIO_PATH."
   (concat bio_path "/" "Bio"))

(mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
-- end code ---

This results in a list of t and nil values, one per PERL5LIB entry
(BioPerl and Ensembl paths):
(t t nil t t t t t t nil nil nil ...)
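
The same split-then-test idea can be sketched outside Emacs Lisp. The
Python below is only an illustration of the logic, not the code in
bioperl-init.el; the function name find_bioperl_roots and its is_dir
parameter are hypothetical names chosen for this example. It returns the
PERL5LIB entries themselves (the directories above Bio), which is what
bioperl-module-path is meant to hold.

```python
import os


def find_bioperl_roots(perl5lib, sep=os.pathsep, is_dir=os.path.isdir):
    """Return the PERL5LIB entries that contain a Bio/ subdirectory.

    perl5lib: raw value of the environment variable (may be empty).
    sep:      separator between entries (':' on Unix, ';' on Windows).
    is_dir:   directory test; injectable so the logic can be tried
              without touching the file system (hypothetical parameter).
    """
    roots = []
    for entry in perl5lib.split(sep):
        # Skip empty entries left by leading, trailing or doubled separators.
        if entry and is_dir(os.path.join(entry, "Bio")):
            roots.append(entry)
    return roots


if __name__ == "__main__":
    # Prints whichever PERL5LIB entries on this machine hold a Bio/ tree.
    print(find_bioperl_roots(os.environ.get("PERL5LIB", "")))
```

Testing each entry separately is the whole point: the unsplit string
"path1:path2:path3/Bio" is never an existing directory, so the original
test always yields nil when PERL5LIB has more than one entry.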


Regards, and thanks for the modules; they will be very useful.

    -Pablo

=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

From maj at fortinbras.us  Wed Sep  2 22:34:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 22:34:59 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2669F98293CC4473ADAB8B80F93351FF@NewLife>

Thanks for all this work, Pablo. I am working hard on Emacs 21
backward compatibility. I will attempt some Mac-friendly paths
and look at the PERL5LIB issue.

The "No match" messages seem to stem from a failure to
find the Bio tree; there's a workaround for this on
the wiki page as of right now. It will probably
not help with the Emacs 21 problems, but the next commit
(tomorrow) will likely solve these. I will post to this
thread when that happens.
cheers Mark


From maj at fortinbras.us  Thu Sep  3 00:21:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 00:21:14 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <203092FB050648AA9F256788068F0A16@NewLife>

Hi Pablo and all-
Try the latest revision (>=16081) with your debian/Emacs 21. Set
the variable bioperl-module-path to the directory above the
Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
again there. Tomorrow, MacOS
cheers,
Mark


From tuco at pasteur.fr  Thu Sep  3 05:56:45 2009
From: tuco at pasteur.fr (Emmanuel Quevillon)
Date: Thu, 03 Sep 2009 11:56:45 +0200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<5009BD4ADDC94A03866AC4D4813907EB@NewLife>
Message-ID: <4A9F92DD.2010701@pasteur.fr>

Mark A. Jensen wrote:
> Thanks everyone for your comments so far, on and off-list. (You're a
> terrific audience. I also code for weddings and bar mitzvahs. Tip your
> servers.)
> The howto page now has a "Known Issues" section, and
> I will be working to eliminate those in the next couple of days.
> cheers Mark

Hi Mark,

Thanks for your help. I decided to remove XEmacs :) and replace it
with Emacs. In fact, as I am running Ubuntu, it was a mess to figure out
where to put the .el files etc. and how to get it working.
So I removed everything (a bit rude) and reinstalled emacs-22.

Here is what I did after that:

$ cd /usr/share/emacs
$ cd 22.2
$ cp BIOPERL-MODE/etc/* etc/
$ cd site-lisp    # a symlink to /usr/share/emacs22/site-lisp
$ sudo mkdir bioperl-mode
$ cp BIOPERL-MODE/site-lisp/* bioperl-mode
$ cd ~
$ touch .emacs
$ cat .xemacs/init.el > .emacs       # init.el contains (require 'bioperl-mode)
$ cat .xemacs/custom.el >> .emacs    # my other Emacs stuff, e.g. Template Toolkit mode

And it is all done and working perfectly!!

Thanks for this great file Mark

Regards

Emmanuel

-- 
-------------------------
Emmanuel Quevillon
Biological Software and Databases Group
Institut Pasteur
+33 1 44 38 95 98
tuco at_ pasteur dot fr
-------------------------

From maj at fortinbras.us  Thu Sep  3 07:22:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 07:22:31 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2465B400494242AEAB5F578BD6BB5301@NewLife>

I get it now-- you're right. I'll take care of that-
cheers
MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works ok in linux 
> emacs 21.4 with latest revision.
>
> About the PERL5LIB issue, sorry for not reporting the platform: the report 
> was on Linux, not on Mac OS X. In the wiki you have a comment about the Mac OS X 
> separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the path separator for Linux multi-path 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. My message was probably misleading: I wrongly appended '/Bio' 
> to the path itself instead of to a temporary variable for testing with 
> file-exists-p, which probably gave you the impression that the point was to 
> have /Bio added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> Wouldn't this leave pth holding a *string* like "lib/path1:lib/path2:lib/path3" 
> on Linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' to the end of the whole string 
> 'lib/path1:lib/path2:lib/path3', and this path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, after first splitting the pth string on ':' on Linux/OS X (or the 
> equivalent on Windows)?
>
> Sorry about not being very clear in my first report.
>
>
>    -Pablo
>
>
>
> 


From maj at fortinbras.us  Thu Sep  3 08:34:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 08:34:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <736B3399B3754D4C9B1BB66414160D95@NewLife>

Hi All, 

The following bioperl-mode issues are resolved in r16020:

- compatibility with Emacs 21
- correct parsing of PERL5LIB
- Bio module search now includes PATH components 
  (after PERL5LIB search)
- An informative error is now given if completion is attempted
  without a valid bioperl-module-path

Thanks for your patience and your bug reports-
cheers
MAJ


From neetisomaiya at gmail.com  Fri Sep  4 02:49:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 12:19:58 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>

Hi,

I have an input list of gene names (can get gene ids from a local db
if required).
I need to fetch sequences of these genes. Can someone please guide me
as to how this can be done using perl/bioperl?

Any help will be deeply appreciated.

Thanks.

-Neeti
Even my blood says, B positive

From neetisomaiya at gmail.com  Fri Sep  4 05:17:17 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 14:47:17 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
Message-ID: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>

Thanks for the link.
So I need only the following lines of code to get the sequence?

use Bio::DB::GenBank;
$db_obj = Bio::DB::GenBank->new;
$seq_obj = $db_obj->get_Seq_by_id(2);

How do I print the sequence?
$seq_obj->seq ??

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
>
> Retrieving a sequence from a database : BioPerl HOWTO
> http://bit.ly/RWIot
>
> Trust this helps,
> Khader Shameer
> NCBS - TIFR

From neetisomaiya at gmail.com  Fri Sep  4 06:13:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 15:43:58 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
Message-ID: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>

Thanks for the replies.

Getting the sequence by accession/GI worked for me. Now, can anyone tell me
the easiest way to get the GI/accession of a gene from its gene
ID or gene name?

-Neeti
Even my blood says, B positive



From e.osimo at gmail.com  Fri Sep  4 08:05:48 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 4 Sep 2009 14:05:48 +0200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com> 
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com> 
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
Message-ID: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>

Try this:
http://david.abcc.ncifcrf.gov/conversion.jsp

Emanuele



From neetisomaiya at gmail.com  Fri Sep  4 08:21:19 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 17:51:19 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>

Thanks. It's an interesting tool.

But I want to do this programmatically.

I have gene IDs to start with. I can't find a method that gets the
sequence directly with a gene ID as input, so I am using the method that
gets the sequence with an accession as input, for which I first need to
know the accessions for my gene IDs. Is this the right approach? My
main aim is to get the nucleotide sequence of a gene from its Entrez
gene ID or gene name. Please guide me; I am confused.

-Neeti
Even my blood says, B positive



From paola_bisignano at yahoo.it  Fri Sep  4 08:32:02 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Fri, 4 Sep 2009 12:32:02 +0000 (GMT)
Subject: [Bioperl-l] problem parsing msf....:second part...I cannot solve
	sorry sorry
Message-ID: <330845.85818.qm@web25704.mail.ukl.yahoo.com>

I have a problem parsing an MSF file: I can't find the right
Bio::SimpleAlign method for my case.
I have to identify residues (from a list) in aligned sequences. I parse
the alignment from a FASTA file and save it as an MSF file, in which I
have to find each residue from my list (numbered as in the PDB file)
and the residue aligned to it in the other sequence.

This is a piece of the file:

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//


                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL


                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL


                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA


                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 of Sequence (counting manually, it is Ile 5).
Tyr 4 has no aligned residue, because the alignment starts from N23 of
Sequence.
How can I enter a residue of my sequence and extract the aligned
residue from the other one?

Best wishes to you all, dear friends; I'm really in trouble with this.
Thanks for any suggestions.

Paola


From neetisomaiya at gmail.com  Fri Sep  4 08:40:10 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 18:10:10 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
Message-ID: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>

Hi,

Thanks for your reply. I saw this before and wanted to try it, but I
am unable to install the EUtilities module. When I search on CPAN,
the download option for this module gives me the entire bioperl package.
Can I not get a tar.gz file of this module alone, which I can
gunzip, untar and then install with make and so on? I don't want
to install the entire bioperl again, as I am using an older version. Any
suggestions?

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Something like this?
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>
> chris

From cjfields at illinois.edu  Fri Sep  4 08:30:42 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:30:42 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>

Neeti,

Something like this?

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

chris

On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:

> Thanks. Its an interesting tool.
>
> But I want to do this programatically.
>
> I have gene ids to start with. Cant find a method to directly get
> sequence with gene id as input. So using the method of getting
> sequence with accession as input, for which I need to know accessions
> for my gene ids first. Is this a right approach? Please guide me. My
> main aim is to get the nucleotide sequence of a gene from ids entrez
> gene id/gene name. PLease guide me. I am confused.
>
> -Neeti
> Even my blood says, B positive


From cjfields at illinois.edu  Fri Sep  4 08:49:19 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:49:19 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
Message-ID: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>

Neeti,

Sorry, it's a package deal (and Bio::DB::EUtilities relies on several  
other modules).  I am planning on spinning it out at some point into  
its own package, but for now the easiest way to install is via 1.6  
off CPAN or downloading the nightly build:

http://www.bioperl.org/DIST/nightly_builds/

chris

On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:

> Hi,
>
> Thanks for your reply. I saw this before and wanted to try this, but I
> am unable to install this module of EUtilities. When I search on CPAN,
> it gives me the entire bioperl package in the download option of this
> module. Can I not get a tar.gz file of this module alone, which I can
> gzip, untar and then run the make and all to install it? I dont want
> to install entire bioperl again as I am using an older version. Any
> suggestions?
>
> -Neeti
> Even my blood says, B positive


From pg4 at sanger.ac.uk  Thu Sep  3 04:01:26 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 09:01:26 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <203092FB050648AA9F256788068F0A16@NewLife>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
Message-ID: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>

On Thu, 3 Sep 2009, Mark A. Jensen wrote:

> Hi Pablo and all-
> Try the latest revision (>=16081) with your debian/Emacs 21. Set
> the variable bioperl-module-path to the directory above the
> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
> again there. Tomorrow, MacOS
> cheers,
> Mark

Hello Mark,

after setting bioperl-module-path manually, your module works fine in
Linux Emacs 21.4 with the latest revision.

About the PERL5LIB issue, sorry for not reporting the platform: the
report was on Linux, not Mac OS X. In the wiki you have a comment about
the Mac OS X separator:

[wiki] The problem Pablo was running into is definitely the Mac OS X path 
[wiki] separator issue.

Here I was referring to ':' as the 'path separator' between the entries
of a multi-path environment variable on Linux, not the system's
directory separator [:/\].

Also from the wiki:

[wiki] I think this is ok as it is, since bioperl-module-path is meant to 
[wiki] point to the directory above Bio

This is right. My message was probably misleading. I wrongly appended
'/Bio' to the path itself instead of to a temp variable used only for
testing with file-exists-p, which probably gave you the impression that
the point was to have /Bio added to the path. Sorry about that.

My main point was instead about the line where you capture PERL5LIB:

[code] (if (setq pth (getenv "PERL5LIB"))

Wouldn't this leave pth holding a single *string* like
"lib/path1:lib/path2:lib/path3" on Linux?

Then, when you test:

[code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))

it would append '/Bio' to the end of the whole string
'lib/path1:lib/path2:lib/path3', and that path obviously does not
exist.

Am I missing something? Shouldn't the 'concat /Bio' be applied to *each*
lib/path, after first splitting the pth string on ':' on Linux/OS X or
the equivalent on Windows?

Sorry for not being very clear in my first report.


    -Pablo


>> == bug when parsing perl5lib? ==
>> 
>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>> Bioperl paths from PERL5LIB this is not working for me in linux.
>> 
>> While debugging bioperl-init.el:
>> # (setq pth (getenv "PERL5LIB"))
>> # 
>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>> # nil
>> 
>> No file is found because it is looking for all the paths concatenated 
>> together with a '/Bio' at the end:
>>
>>   libpath1:libpath2:libpath3/Bio
>> 
>> 'concat' adds /Bio to pth, which is a single string holding all the
>> PERL5LIB paths. Shouldn't this concat rather be applied to each element
>> of PERL5LIB after splitting it on ':' in Unix or ';' in Windows, and
>> each result then tested for existence?
>> 
>> For example, in Unix:
>> 
>> --- code --
>> (defun addbio (bio_path)
>>   "Append /Bio to BIO_PATH."
>>   (concat bio_path "/" "Bio"))
>> 
>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>> -- end code ---
>> 
>> This would give a list of t/nil values, one per path, showing which
>> PERL5LIB entries contain a Bio directory (BioPerl and Ensembl here):
>> (t t nil t t t t t t nil nil nil ...)
>> 
>> 
>> Regards and thanks for the modules they would be very useful.
>>
>>    -Pablo
>
>


=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

From paola.bisignano at gmail.com  Fri Sep  4 08:28:03 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Fri, 4 Sep 2009 14:28:03 +0200
Subject: [Bioperl-l] problem parsing msf file
Message-ID: <e9cf89740909040528j69e5f8e6ka9d550840a4e0f9a@mail.gmail.com>

I have a problem parsing an MSF file: I can't find the right
Bio::SimpleAlign method for my case.
I have to identify residues (from a list) in aligned sequences. I parse
the alignment from a FASTA file and save it as an MSF file, in which I
have to find each residue from my list (numbered as in the PDB file)
and the residue aligned to it in the other sequence.

this is a piece of the file...

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//


                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL


                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL


                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA


                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 of Sequence (counting manually, it is Ile 5).
Tyr 4 has no aligned residue, because the alignment starts from N23 of
Sequence.
How can I enter a residue of my sequence and extract the aligned
residue from the other one?


Best wishes to you all, dear friends; I'm really in trouble with this.
Thanks for any suggestions.

Paola

From jason at bioperl.org  Fri Sep  4 12:04:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 4 Sep 2009 09:04:05 -0700
Subject: [Bioperl-l] Fwd:  help parsing msf file or clustalW file reports
References: <369662.74237.qm@web25701.mail.ukl.yahoo.com>
Message-ID: <B5AEEBAD-22D3-40B6-AD06-17E268DFAFDD@bioperl.org>

Paola - it is important to continue to email the mailing list for your  
help.  I'm hoping another person on the list can help as I am swamped  
right now.
-jason

Begin forwarded message:

> From: Paola Bisignano <paola_bisignano at yahoo.it>
> Date: September 4, 2009 5:48:22 AM PDT
> To: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
>
> Hi Jason, thanks for your answer. For two days I have been
> re-studying the BioPerl synopses and object-oriented programming. I
> understand what you mean, but I still have some problems: I don't
> actually know how to start parsing this kind of file. I generated
> this MSF (or ClustalW) file by parsing a FASTA file of multiple
> paired sequences, writing out only the pairs I want, that is,
> homologous proteins that have the same ligand in their published PDB
> entries.
>
> I have a problem parsing an MSF file: I can't find the right
> Bio::SimpleAlign method for my case.
> I have to identify residues (from a list) in aligned sequences. I parse
> the alignment from a FASTA file and save it as an MSF file, in which I
> have to find each residue from my list (numbered as in the PDB file)
> and the residue aligned to it in the other sequence.
>
> This is a piece of the file:
>
> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>
>  Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>  Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>
> //
>
>                       1                                                   50
> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>
>                       51                                                 100
> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>
>                       101                                                150
> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>
>                       151                                                200
> Sequence/23-178       QQPDML
> 2zhz:A/1-148          GGADVL
>
> For example, here I have to identify the residue in 2zhz:A that is
> aligned with Val 28 of Sequence (counting manually, it is Ile 5).
> Tyr 4 has no aligned residue, because the alignment starts from N23 of
> Sequence.
> How can I enter a residue of my sequence and extract the aligned
> residue from the other one?
>
> Best wishes to you all, dear friends; I'm really in trouble with this.
> Thanks for any suggestions.
>
> --- On Tue 1/9/09, Jason Stajich <jason at bioperl.org> wrote:
>
> From: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
> To: "Paola Bisignano" <paola_bisignano at yahoo.it>
> Cc: bioperl-l at lists.open-bio.org
> Date: Tuesday 1 September 2009, 17:49
>
> I think you might want to use the column_from_residue_number method  
> that is part of Bio::SimpleAlign - it lets you get the column from  
> an alignment based on the sequence residue, doing some math along  
> the way to deal with gaps. That is the residue -> alignment  
> direction.  If you are starting at the alignment and want to get the  
> residue's position you will use the location_from_column on a  
> particular sequence so
>
>     # select somehow a sequence from the alignment, e.g.
>     my $seq = $aln->get_seq_by_pos(1);
>     #$loc is undef or Bio::LocationI object
>     my $loc = $seq->location_from_column(5);
>
> -jason
>
> On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:
>
>> Hi,
>>
>> I'm trying to parse FASTA files that contain pairs of aligned
>> sequences. I need to identify particular residues in each alignment.
>> I have separate lists of residues that come from parsing LIGPLOT
>> files, so I have to manipulate strings, but I don't know how to
>> start; it seems complicated.
>> I used Bio::AlignIO to parse the FASTA file, so I can write the
>> alignment out in MSF or ClustalW format.
>>
>> Here is an example:
>> CLUSTAL W(1.81) multiple sequence alignment
>>
>>
>> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
>> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>>
>>
>> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
>> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>>
>> Suppose I choose two residues: how can I extract them, starting
>> from their positions in the PDB file?
>> I need to walk along my sequence.
>>
>> I don't know whether this is clear, because I cannot explain the
>> question properly in English. Are there any Italians here?
>> Could anyone help me?
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
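
The residue lookup discussed in this thread can be sketched in plain
Perl, without BioPerl, just to show the arithmetic involved. This is
only an illustration under stated assumptions: the helper names
column_of_residue and residue_at_column are hypothetical, the start
numbers 23 and 1 are taken from the /23-178 and /1-148 labels in the
MSF file, and only the first 30 alignment columns are used. On a real
alignment object, the column_from_residue_number method of
Bio::SimpleAlign mentioned above covers the residue-to-column step.

```perl
use strict;
use warnings;

# Map a residue number to its 1-based alignment column.
# $gapped is one aligned row, $start the residue number of its first
# non-gap character. Returns undef if the residue is outside the row.
sub column_of_residue {
    my ($gapped, $start, $resnum) = @_;
    my $n   = $start - 1;
    my $col = 0;
    for my $ch (split //, $gapped) {
        $col++;
        next if $ch =~ /[-.]/;          # gaps do not consume a residue number
        $n++;
        return $col if $n == $resnum;
    }
    return;
}

# Return (letter, residue number) found at a column of an aligned row,
# or an empty list if that column is a gap or out of range.
sub residue_at_column {
    my ($gapped, $start, $col) = @_;
    return if $col < 1 || $col > length $gapped;
    my $ch = substr($gapped, $col - 1, 1);
    return if $ch =~ /[-.]/;
    my $prefix = substr($gapped, 0, $col);
    my $count  = () = $prefix =~ /[^-.]/g;   # residues up to this column
    return ($ch, $start + $count - 1);
}

# First 30 columns of the two rows from the MSF file in this thread.
my $query  = 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQ';   # Sequence/23-178
my $target = 'DDARIAAIGDVDELNSQIGVL--LAEPLPD';   # 2zhz:A/1-148

my $col = column_of_residue($query, 23, 27);
my ($aa, $num) = residue_at_column($target, 1, $col);
print "column $col: $aa$num\n";                  # prints "column 5: I5"
```

With numbering taken from the labels, the Val in column 5 is residue 27
of Sequence and pairs with Ile 5 of 2zhz:A; the message above calls it
Val 28, so the PDB numbering may be offset by one from the label. A
residue outside the aligned range, such as Tyr 4, makes
column_of_residue return undef, and a gap column makes
residue_at_column return an empty list.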


From robert.bradbury at gmail.com  Fri Sep  4 16:15:09 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 4 Sep 2009 16:15:09 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <deaa866a0909041315y4282d811g3047ab153812014d@mail.gmail.com>

On 9/4/09, Emanuele Osimo <e.osimo at gmail.com> wrote:
> Try this:
> http://david.abcc.ncifcrf.gov/conversion.jsp
>

It may be just me, but I've tried this in both Firefox and Opera and
it will not work without Javascript enabled.  Most "intelligent" sites
now tell you that Javascript must be enabled if they require it to
work properly.  More intelligent sites (such as Google's gmail) allow
you to toggle back and forth between Javascript & non-Javascript
implementations.

Note that, IMO, running with Javascript enabled for all sites all the
time is a bad idea (potentially for security reasons, but clearly for
sleep / suspend / power consumption reasons, and finally for the
reason of do you *really* trust that Javascript, your DNS provider,
and sites hosting the scripts are 100% secure?).  The only options
that seem generally available at this time are to run Firefox with
NoScript enabling of selective sites or to run two browser instances,
one with Javascript enabled, one with it disabled -- and to only use
the Javascript enabled browser on sites with a high probability of
being secure).

From lsbrath at gmail.com  Fri Sep  4 18:12:34 2009
From: lsbrath at gmail.com (Mgavi Brathwaite)
Date: Fri, 4 Sep 2009 18:12:34 -0400
Subject: [Bioperl-l] bio:graphics
Message-ID: <69367b8f0909041512l77b2431aqb89f57f82adae1@mail.gmail.com>

Hello,

I need to grab features (source, gene, CDS, primer_bind) from a GenBank file
and add features (5' and 3' UTR, misc_feature) to generate an image. The
image has two tracks, and each track has multiple features. How
do I display different colors for the different features on the same track?
In my case the 5'UTR, CDS, and 3'UTR are on the same track. I want the UTRs to
have one color and the CDS another.

I also need to grab the start and end info from the primer_bind feature
based on the /note tag values. In my case 'HUF' and 'HDF'. Code:

if( $feat->primary_tag eq 'primer_bind' ) {
            $feat->get_tag_values("note") if ($feat_object->has_tag("note")
&&
                tag_values("note") eq 'HDF');
            $pb_start = $feat->start;
            $pb_end = $feat->end;


I want to make sure that I am moving in the right direction.  Can someone
help me out?

M
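
A possible way to approach both parts of the question above, offered as a
rough sketch rather than tested Bio::Graphics code. get_tag_values returns
a list, so each /note value has to be checked individually, and the
start/end assignments belong inside that check. For the colors,
Bio::Graphics track options such as -bgcolor accept a code reference that
receives the feature, so a sub like colour_for below could be passed as
-bgcolor => \&colour_for in add_track. Mock::Feature, primer_coords and
colour_for are names invented here; Mock::Feature only stands in for a real
BioPerl feature object so that the sketch runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a BioPerl feature object; it mimics only the
# accessors used below (primary_tag, start, end, has_tag, get_tag_values).
package Mock::Feature;
sub new            { my ($class, %args) = @_; return bless { %args }, $class }
sub primary_tag    { return $_[0]{primary_tag} }
sub start          { return $_[0]{start} }
sub end            { return $_[0]{end} }
sub has_tag        { return exists $_[0]{tags}{ $_[1] } }
sub get_tag_values { return @{ $_[0]{tags}{ $_[1] } } }

package main;

# Return (start, end) if the feature is a primer_bind whose /note matches
# one of the wanted names; return an empty list otherwise.
sub primer_coords {
    my ($feat, @wanted) = @_;
    return unless $feat->primary_tag eq 'primer_bind';
    return unless $feat->has_tag('note');
    my %want = map { $_ => 1 } @wanted;
    # get_tag_values returns a list, so every value is tested
    return unless grep { $want{$_} } $feat->get_tag_values('note');
    return ($feat->start, $feat->end);
}

# Color callback: one color for the UTRs, another for the CDS.
sub colour_for {
    my ($feat) = @_;
    my $tag = $feat->primary_tag;
    return 'blue'   if $tag eq 'CDS';
    return 'orange' if $tag =~ /UTR/;
    return 'gray';
}

# Made-up features for the demo.
my @features = (
    Mock::Feature->new(primary_tag => "5'UTR",       start => 1,   end => 50),
    Mock::Feature->new(primary_tag => 'CDS',         start => 51,  end => 300),
    Mock::Feature->new(primary_tag => 'primer_bind', start => 10,  end => 30,
                       tags => { note => ['HUF'] }),
    Mock::Feature->new(primary_tag => 'primer_bind', start => 400, end => 420,
                       tags => { note => ['other'] }),
);

for my $feat (@features) {
    printf "%-12s colour=%s\n", $feat->primary_tag, colour_for($feat);
    if (my ($pb_start, $pb_end) = primer_coords($feat, 'HUF', 'HDF')) {
        print "  primer of interest at $pb_start..$pb_end\n";
    }
}
```

With real features from Bio::SeqIO the same two subs should apply
unchanged, since they only call methods that BioPerl features provide.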

From neetisomaiya at gmail.com  Sat Sep  5 00:52:11 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sat, 5 Sep 2009 10:22:11 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
Message-ID: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>

Ok, so I reinstalled bioperl and was able to run the EUtilities code
for my gene id.
But I am facing two issues :-

1) When I give multiple gene ids, it still returns data of only the
first gene id

2) The script returns the entire entry, and I am not able to figure
out how to just fetch the sequence, and if possible, in FASTA format.
I could not figure it out from the documentation.

Thanks.

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:19 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Sorry, it's a package deal (and Bio::DB::EUtilities relies on several other
> modules).  I am planning on spinning it out at some point into its own
> package, but for now the easiest way to install is via 1.6 off CPAN or
> downloading the nightly build:
>
> http://www.bioperl.org/DIST/nightly_builds/
>
> chris
>
> On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:
>
>> Hi,
>>
>> Thanks for your reply. I saw this before and wanted to try this, but I
>> am unable to install this module of EUtilities. When I search on CPAN,
>> it gives me the entire bioperl package in the download option of this
>> module. Can I not get a tar.gz file of this module alone, which I can
>> gzip, untar and then run the make and all to install it? I dont want
>> to install entire bioperl again as I am using an older version. Any
>> suggestions?
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Something like this?
>>>
>>>
>>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>>>
>>> chris
>>>
>>> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>>>
>>>> Thanks. Its an interesting tool.
>>>>
>>>> But I want to do this programatically.
>>>>
>>>> I have gene ids to start with. Cant find a method to directly get
>>>> sequence with gene id as input. So using the method of getting
>>>> sequence with accession as input, for which I need to know accessions
>>>> for my gene ids first. Is this a right approach? Please guide me. My
>>>> main aim is to get the nucleotide sequence of a gene from ids entrez
>>>> gene id/gene name. PLease guide me. I am confused.
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>>>>
>>>>> Try this:
>>>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>>>
>>>>> Emanuele
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Thanks for the replies.
>>>>>>
>>>>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>>>> id/gene name?
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for the link.
>>>>>>> So I need only the following lines of code to get the sequence?
>>>>>>>
>>>>>>> use Bio::DB::GenBank;
>>>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>>>
>>>>>>> How do I print the sequence?
>>>>>>> $seq_obj->seq ??
>>>>>>>
>>>>>>> -Neeti
>>>>>>> Even my blood says, B positive
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>>>> http://bit.ly/RWIot
>>>>>>>>
>>>>>>>> Trust this helps,
>>>>>>>> Khader Shameer
>>>>>>>> NCBS - TIFR
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an input list of gene names (can get gene ids from a local
>>>>>>>>> db
>>>>>>>>> if required).
>>>>>>>>> I need to fetch sequences of these genes. Can someone please guide
>>>>>>>>> me
>>>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>>>
>>>>>>>>> Any help will be deeply appreciated.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> -Neeti
>>>>>>>>> Even my blood says, B positive
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioperl-l mailing list
>>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From ybolo001 at student.ucr.edu  Sat Sep  5 03:37:58 2009
From: ybolo001 at student.ucr.edu (Eugene Bolotin)
Date: Sat, 5 Sep 2009 00:37:58 -0700
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <941fcc750909050037n3c0f4fc5u89fcf4f5c3e5f34d@mail.gmail.com>

Ok,
this is what I would do.
Download the database of gene names and sequences in FASTA format.
Then loop through it with bioperl.
Store your gene names in a hash and regex-match them against
$seq->display_id, which should match them up with the gene ids;
$seq->seq should give you the sequence
in bioperl.
Print out the ones that match.
Good luck.
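
The steps above can be sketched roughly as follows. This version is
deliberately BioPerl-free so it runs anywhere; with BioPerl, Bio::SeqIO's
next_seq together with display_id and seq would replace the hand-rolled
parser. It also uses an exact hash lookup on the first word of each header
instead of a regex, which is an assumption about how the names appear in
the file. filter_fasta, the gene names and the records are all made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical wanted gene names; in practice, read them from your list.
my %wanted = map { $_ => 1 } qw(GENE_A GENE_C);

# Read FASTA from a filehandle and return [id, sequence] pairs for the
# records whose ID (first word of the header line) is a key of %$want.
sub filter_fasta {
    my ($fh, $want) = @_;
    my (@hits, $id, $seq);
    my $flush = sub {
        push @hits, [$id, $seq] if defined $id && $want->{$id};
    };
    while (my $line = <$fh>) {
        chomp $line;
        if (my ($new_id) = $line =~ /^>(\S+)/) {
            $flush->();                 # store the previous record
            ($id, $seq) = ($new_id, '');
        }
        elsif (defined $id) {
            $line =~ s/\s+//g;          # drop stray whitespace
            $seq .= $line;
        }
    }
    $flush->();                         # store the last record
    return @hits;
}

# Demo on an in-memory FASTA "file" with made-up records.
my $fasta = ">GENE_A some description\nACGT\nTTGA\n>GENE_B\nGGGG\n>GENE_C\nCCAT\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory FASTA: $!";
my @hits = filter_fasta($fh, \%wanted);
close $fh;

print ">$_->[0]\n$_->[1]\n" for @hits;
```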

On Thu, Sep 3, 2009 at 11:49 PM, Neeti Somaiya<neetisomaiya at gmail.com> wrote:
> Hi,
>
> I have an input list of gene names (can get gene ids from a local db
> if required).
> I need to fetch sequences of these genes. Can someone please guide me
> as to how this can be done using perl/bioperl?
>
> Any help will be deeply appreciated.
>
> Thanks.
>
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
Eugene Bolotin
Ph.D. candidate
Genetics Genomics and Bioinformatics
University of California Riverside
ybolo001 at student.ucr.edu
Dr. Frances Sladek Lab


From maj at fortinbras.us  Sat Sep  5 08:53:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 5 Sep 2009 08:53:12 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org><alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk><203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <E63F6D209AF1432C9B9CAFF6F6182F9C@NewLife>

Hi Pablo-- You're right about the PERL5LIB issue; I had
not set up the module path to handle multiple paths as you
describe. I am working hard on an implementation that can
handle multiple paths; I hope to have it out next week --cheers MAJ
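
For reference, the multi-path check under discussion (split PERL5LIB on
the platform's path separator, then test each entry for a Bio
subdirectory) can be expressed in Perl as below. This is only a sketch of
the idea, not the bioperl-init.el implementation, which is Emacs Lisp;
bioperl_dirs is a name invented here, and the demo uses temporary
directories rather than a real PERL5LIB.

```perl
use strict;
use warnings;
use Config;                      # provides $Config{path_sep}
use File::Spec;
use File::Temp qw(tempdir);

# Return every directory in a PERL5LIB-style string that contains a "Bio"
# subdirectory. The separator defaults to the platform's own
# (':' on Unix-like systems, ';' on Windows).
sub bioperl_dirs {
    my ($path_string, $sep) = @_;
    $sep = $Config{path_sep} unless defined $sep;
    return grep { -d File::Spec->catdir($_, 'Bio') }
           grep { length }
           split /\Q$sep\E/, $path_string;
}

# Demo with two temporary directories, only one of which holds Bio/.
my $with    = tempdir(CLEANUP => 1);
my $without = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($with, 'Bio') or die "mkdir failed: $!";

my $fake_perl5lib = join $Config{path_sep}, $without, $with;
my @found = bioperl_dirs($fake_perl5lib);
print "Bio found under: $_\n" for @found;
```

Testing the whole unsplit string with "/Bio" appended, as the original
code did, can only succeed when PERL5LIB holds a single path.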
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works ok in linux 
> emacs 21.4 with latest revision.
>
> About the perl5lib issue, sorry about not reporting the platform: the report 
> was on linux not in mac os X. In the wiki you have a comment about mac OS X 
> separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for linux multipath 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. Probably my message was misleading. I wrongly appended '/Bio' 
> to the path itself instead of to a temp variable for testing with 
> file-exists-p, which probably gave you the impression that the point was to 
> have '/Bio' added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> wouldn't this leave pth with a *string* like "lib/path1:lib/path2:lib/path3" 
> in linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' at the end of the whole string 
> 'lib/path1:lib/path2:lib/path3'. and this string path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, first splitting the pth string by ':' in linux/osX or the 
> equivalent in windows?
>
> Sorry about not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpath1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to pth, which is a single string holding all the 
>>> PERL5LIB paths. Shouldn't this concat rather be applied to each element of 
>>> PERL5LIB split by ':' in unix or ';' in windows, and then tested for existence?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "append /Bio to each path"
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in the list of T and F bioperl (and ensembl) paths
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>>
>>> =====================================================================
>>>                      Pablo Marin-Garcia, PhD
>>>
>>>                     \\//          (Argiope bruennichi
>>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>>                     //\\
>>>
>>> Sanger Institute                |  PostDoc / Computer Biologist
>>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>>> Hinxton, Cambridge CB10 1HH     |  room : N333
>>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>>> ====================================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>>> a charity registered in England with number 1021457 and a company registered 
>>> in England with number 2742969, whose registered office is 215 Euston Road, 
>>> London, NW1 2BE. _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>
>
> 


From cjfields at illinois.edu  Sat Sep  5 09:44:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 5 Sep 2009 08:44:54 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>

On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:

> Ok, so I reinstalled bioperl and was able to run the EUtilities code
> for my gene id.
> But I am facing two issues :-
>
> 1) When I give multiple gene ids, it still returns data of only the
> first gene id

This sounds like it's not iterating correctly.  You'll need to post  
your version of the script.

> 2) The script returns the entire entry, and I am not able to figure
> out how to just fetch the sequence, and if possible, in FASTA format.
> I could not figure it out from the documentation.

I recall this working last time I used it (I think June or July).   
Could you post the script you are using?

(realize this is a holiday weekend in the states, so you might have a  
delayed response from me or others)

> Thanks.
>
> -Neeti
> Even my blood says, B positive

chris

From neetisomaiya at gmail.com  Sun Sep  6 12:15:09 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sun, 6 Sep 2009 21:45:09 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
	<218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
Message-ID: <764978cf0909060915t7a2e6e45v4bb194b9cad18e18@mail.gmail.com>

Hi,

Thanks for the reply.

I am using the script exactly as it is given here :

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

-Neeti
Even my blood says, B positive


On Sat, Sep 5, 2009 at 7:14 PM, Chris Fields<cjfields at illinois.edu> wrote:
> On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:
>
>> Ok, so I reinstalled bioperl and was able to run the EUtilities code
>> for my gene id.
>> But I am facing two issues :-
>>
>> 1) When I give multiple gene ids, it still returns data of only the
>> first gene id
>
> This sounds like it's not iterating correctly.  You'll need to post your
> version of the script.
>
>> 2) The script returns the entire entry, and I am not able to figure
>> out how to just fetch the sequence, and if possible, in FASTA format.
>> I could not figure it out from the documentation.
>
> I recall this working last time I used it (I think June or July).  Could you
> post the script you are using?
>
> (realize this is a holiday weekend in the states, so you might have a
> delayed response from me or others)
>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>
> chris
>

From Russell.Smithies at agresearch.co.nz  Sun Sep  6 19:00:24 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 7 Sep 2009 11:00:24 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C50D0@exchsth.agresearch.co.nz>

Grab the gene2accession list from here and do lookups.
Probably the fastest and easiest way.
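
A lookup along these lines could be sketched as follows, once the file has
been downloaded. The column layout is an assumption that should be checked
against the header line of the actual file: tab-separated, GeneID in
column 2, RNA nucleotide accession.version in column 4, and '-' where
there is no value. load_gene2accession is a name invented here, and the
rows and accessions in the demo are fake.

```perl
use strict;
use warnings;

# Build GeneID => [sorted unique RNA accessions] from gene2accession-style
# lines. ASSUMED layout: tab-separated, GeneID in column 2, RNA nucleotide
# accession.version in column 4, '-' meaning "no value".
sub load_gene2accession {
    my ($fh, $wanted) = @_;
    my %seen;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;              # header / comment lines
        chomp $line;
        my @col = split /\t/, $line;
        next unless @col >= 4;
        my ($gene_id, $rna_acc) = @col[1, 3];
        next unless $wanted->{$gene_id};
        next if $rna_acc eq '-';
        $seen{$gene_id}{$rna_acc} = 1;      # de-duplicate
    }
    my %result;
    $result{$_} = [ sort keys %{ $seen{$_} } ] for keys %seen;
    return %result;
}

# Demo on made-up rows; real use would open the downloaded file instead.
my $rows = join "\n",
    "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version",
    "9606\t2\tFAKE\tNM_FAKE2.1",
    "9606\t2\tFAKE\tNM_FAKE1.1",
    "9606\t2\tFAKE\t-",
    "9606\t9999\tFAKE\tNM_FAKE9.1",
    "9606\t4693\tFAKE\tNM_FAKE3.1",
    '';
open my $fh, '<', \$rows or die "Cannot open in-memory table: $!";
my %accessions = load_gene2accession($fh, { 2 => 1, 4693 => 1 });
close $fh;

for my $gene_id (sort { $a <=> $b } keys %accessions) {
    print "$gene_id\t@{ $accessions{$gene_id} }\n";
}
```

The resulting accessions could then be passed to the accession-based
retrieval that already worked earlier in this thread.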


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E russell.smithies at agresearch.co.nz 

Invermay Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T +64 3 489 3809 
F +64 3 489 9174 
www.agresearch.co.nz 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Saturday, 5 September 2009 12:21 a.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> Thanks. Its an interesting tool.
> 
> But I want to do this programatically.
> 
> I have gene ids to start with. Cant find a method to directly get
> sequence with gene id as input. So using the method of getting
> sequence with accession as input, for which I need to know accessions
> for my gene ids first. Is this a right approach? Please guide me. My
> main aim is to get the nucleotide sequence of a gene from ids entrez
> gene id/gene name. PLease guide me. I am confused.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> > Try this:
> > http://david.abcc.ncifcrf.gov/conversion.jsp
> >
> > Emanuele
> >
> >
> > On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >>
> >> Thanks for the replies.
> >>
> >> So the get seq by accession/GI worked for me. Now can anyone tell me
> >> the easiest way to get the GI /Accession of a gene from the gene
> >> id/gene name?
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
> >> wrote:
> >> > Thanks for the link.
> >> > So I need only the following lines of code to get the sequence?
> >> >
> >> > use Bio::DB::GenBank;
> >> > $db_obj = Bio::DB::GenBank->new;
> >> > $seq_obj = $db_obj->get_Seq_by_id(2);
> >> >
> >> > How do I print the sequence?
> >> > $seq_obj->seq ??
> >> >
> >> > -Neeti
> >> > Even my blood says, B positive
> >> >
> >> >
> >> >
> >> > On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
> >> >>
> >> >> Retrieving a sequence from a database : BioPerl HOWTO
> >> >> http://bit.ly/RWIot
> >> >>
> >> >> Trust this helps,
> >> >> Khader Shameer
> >> >> NCBS - TIFR
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I have an input list of gene names (can get gene ids from a local db
> >> >>> if required).
> >> >>> I need to fetch sequences of these genes. Can someone please guide me
> >> >>> as to how this can be done using perl/bioperl?
> >> >>>
> >> >>> Any help will be deeply appreciated.
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> -Neeti
> >> >>> Even my blood says, B positive
> >> >>> _______________________________________________
> >> >>> Bioperl-l mailing list
> >> >>> Bioperl-l at lists.open-bio.org
> >> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bnbowman at gmail.com  Mon Sep  7 04:17:25 2009
From: bnbowman at gmail.com (Brett Bowman)
Date: Mon, 7 Sep 2009 01:17:25 -0700
Subject: [Bioperl-l] Protein Sequence QSARs
Message-ID: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>

I've been working on a script, for my personal edification, that annotates
protein sequences for QSARs as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially, it converts a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculates the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full-blown module, but I wanted to ask A) whether it had been done before and I
had just missed it, and B) whether anyone other than me would find such a
module useful.

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics
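
For readers unfamiliar with the approach, the two steps described above
(substitution into a descriptor matrix, then lagged auto- and cross-terms)
might look roughly like this. This is not the script from the message.
The descriptor values are PLACEHOLDERS invented for a three-letter
alphabet, not published values, and the normalisation by (N - lag) is an
assumption that may differ from the paper's. encode and acc are
hypothetical names.

```perl
use strict;
use warnings;

# PLACEHOLDER descriptors: three made-up numbers per residue, covering only
# three residues. Real use would substitute a published descriptor table.
my %descriptor = (
    A => [  1.0, 0.0, -1.0 ],
    G => [  0.0, 1.0,  0.5 ],
    K => [ -1.0, 0.5,  1.0 ],
);

# Substitute each residue by its descriptor triple: N rows of 3 values.
sub encode {
    my ($seq) = @_;
    my @matrix;
    for my $res (split //, uc $seq) {
        my $z = $descriptor{$res}
            or die "No descriptors for residue '$res'\n";
        push @matrix, $z;
    }
    return \@matrix;
}

# Lagged term between descriptor $j and descriptor $k (0-based columns):
#   sum over i of z[i][j] * z[i+lag][k], divided by (N - lag).
# $j == $k gives an auto term, $j != $k a cross term.
sub acc {
    my ($matrix, $j, $k, $lag) = @_;
    my $n = @$matrix;
    die "lag must be between 1 and N-1\n" if $lag < 1 || $lag >= $n;
    my $sum = 0;
    $sum += $matrix->[$_][$j] * $matrix->[ $_ + $lag ][$k]
        for 0 .. $n - $lag - 1;
    return $sum / ($n - $lag);
}

my $matrix = encode('AGKA');
for my $lag (1 .. 2) {
    for my $j (0 .. 2) {
        for my $k (0 .. 2) {
            printf "lag=%d z%d,z%d: %.4f\n",
                $lag, $j + 1, $k + 1, acc($matrix, $j, $k, $lag);
        }
    }
}
```

For L lags this yields 9 * L numbers per sequence, so sequences of
different lengths end up as vectors of the same size.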


From neetisomaiya at gmail.com  Mon Sep  7 06:04:06 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 7 Sep 2009 15:34:06 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
Message-ID: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>

I tried using EntrezGene instead of GenBank, as is given in the link
that you sent :

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html

use Bio::DB::EntrezGene;

    my $db = Bio::DB::EntrezGene->new;

    my $seq = $db->get_Seq_by_id(2); # Gene id

    # or ...

    my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
    while ( my $seq = $seqio->next_seq ) {
	    print "id is ", $seq->display_id, "\n";
    }

This doesn't seem to work.


-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> Hello,
> have you tried this?
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>
> Emanuele
>
> On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> I have an input list of gene names (can get gene ids from a local db
>> if required).
>> I need to fetch sequences of these genes. Can someone please guide me
>> as to how this can be done using perl/bioperl?
>>
>> Any help will be deeply appreciated.
>>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From Russell.Smithies at agresearch.co.nz  Mon Sep  7 16:26:04 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 08:26:04 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>

This example code from the wiki _definitely_ works:
http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
=========================================

use strict;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
$db->verbose(1); # print debugging messages

my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig, $from, $to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$contig\t$from\t$to\n";
    }
}

======================================
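
The regex in the loop above can be tried on its own, without any network access. This is only a minimal sketch: the helper name parse_ev_url and the example URL are made up for illustration, and the only assumption is that the URL carries contig, from and to parameters in that order.

```perl
use strict;
use warnings;

# Hypothetical helper: pull contig, start and stop out of an
# Evidence Viewer style URL, using the same pattern as the script above.
sub parse_ev_url {
    my ($url) = @_;
    my ($contig, $from, $to) =
        $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/;
    return unless defined $contig;    # nothing returned if the URL does not match
    return ($contig, $from, $to);
}

# Made-up example URL, for illustration only.
my $url = 'http://example.org/ev.cgi?contig=NT_000001.1&gene=XYZ&from=1000&to=2500';
my @hit = parse_ev_url($url);
print join("\t", @hit), "\n" if @hit;
```

If nothing is printed for a real gene id, the record probably has no "Evidence Viewer" dblink, or the URL layout differs from what the pattern expects.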

So if it doesn't work for you, there are a few things you need to check:
* what version of BioPerl are you using?
* are you behind a firewall?
* are you using a proxy?
* do you need to submit a username/password for either of the two above?
* turn on 'verbose' messages; it may help you debug


If you're still having problems, get back to me and I'll see if I can help.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Monday, 7 September 2009 10:04 p.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> I tried using EntrezGene instead of GenBank, as is given in the link
> that you sent :
> 
> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_datab
> ase
> 
> http://doc.bioperl.org/releases/bioperl-current/bioperl-
> live/Bio/DB/EntrezGene.html
> 
> use Bio::DB::EntrezGene;
> 
>     my $db = Bio::DB::EntrezGene->new;
> 
>     my $seq = $db->get_Seq_by_id(2); # Gene id
> 
>     # or ...
> 
>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>     while ( my $seq = $seqio->next_seq ) {
> 	    print "id is ", $seq->display_id, "\n";
>     }
> 
> This doesnt seem to work.
> 
> 
> -Neeti
> Even my blood says, B positive
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From cjfields at illinois.edu  Mon Sep  7 16:56:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 15:56:03 -0500
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
Message-ID: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>

All,

I have updated the Changes file in bioperl-live in preparation for  
1.6.1.  The initial release will be an alpha, 1.6.0_1 (probably  
landing about mid-week), and based on CPAN tests, etc the final 1.6.1  
release next week.  I'll start merging changes over from trunk  
tonight, fixing last-minute bugs, etc.  I'm running my work using perl  
5.10.1 (64-bit) on Mac and will likely run these remotely on our local  
linux cluster.  Win tests are gladly welcome (this should work on  
Strawberry Perl now).

I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase,  
Robert Buels, Jay Hannah, Heikki, Sendu come to mind) look over the  
file to update it.  There are a few weak spots in there where I didn't  
make the code change or additions, or where a particular bug was fixed  
but not mentioned.  In particular:

1) Google Summer of Code work from Chase (Mark, Chase)
2) GMOD-related fixes (Lincoln, Scott)
3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
4) Tiling, Restriction refactors (Mark)

Also, please make changes to AUTHORS, etc as needed.

Thanks!

chris

From maj at fortinbras.us  Mon Sep  7 17:21:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 7 Sep 2009 17:21:04 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <29B3F9DC91A1422A89629790DD8CC313@NewLife>

aye-aye skipper--- 
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 07, 2009 4:56 PM
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)


> All,
> 
> I have updated the Changes file in bioperl-live in preparation for  
> 1.6.1.  The initial release will be an alpha, 1.6.0_1 (probably  
> landing about mid-week), and based on CPAN tests, etc the final 1.6.1  
> release next week.  I'll start merging changes over from trunk  
> tonight, fixing last-minute bugs, etc.  I'm running my work using perl  
> 5.10.1 (64-bit) on Mac and will likely run these remotely on our local  
> linux cluster.  Win tests are gladly welcome (this should work on  
> Strawberry Perl now).
> 
> I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase,  
> Robert Buels, Jay Hannah, Heikki, Sendu come to mind) look over the  
> file to update it.  There are a few weak spots in there where I didn't  
> make the code change or additions, or where a particular bug was fixed  
> but not mentioned.  In particular:
> 
> 1) Google Summer of Code work from Chase (Mark, Chase)
> 2) GMOD-related fixes (Lincoln, Scott)
> 3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
> 4) Tiling, Restriction refactors (Mark)
> 
> Also, please make changes to AUTHORS, etc as needed.
> 
> Thanks!
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>

From cjfields at illinois.edu  Tue Sep  8 00:23:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:23:26 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
Message-ID: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>

All,

I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
installed by default), and the latest from Bio::Phylo svn:

https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl

At this moment only the Bio::Phylo code from svn is working with  
BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
1.7_RC9 has some kind of versioning issue (again, all tests fail).   
The problem: CPAN will always install 1.6 (the others are RC, so they  
won't be installed unless the full path is used).  Even so, nothing on  
CPAN even works; one must use the latest Bio::Phylo SVN code.

ATM I'm just not seeing how this can be released with 1.6.1 right now,  
unless one of the following occurs:

1) Rutger V. drops a quick non-RC release to CPAN,
2) check for the minimal working Bio::Phylo version and safely skip  
any Nexml-related tests unless proper version is present (not easy  
with a $VERSION like '1.7_RC9'),
3) push Nexml into its own distribution (something we were planning  
on anyway with a number of modules)

As for #3 above, I think it probably belongs in a larger bioperl-phylo  
as Mark had previously proposed.  I'm open to just about any solution.
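
For option 2, one way to cope with a $VERSION like '1.7_RC9' is to split it into numeric parts before comparing. The following is only a sketch under stated assumptions: the helper names are hypothetical, the version strings are assumed to look like '1.6' or '1.7_RC9', a release candidate is assumed to sort before the final release with the same number, and the minimum version used in the loop is a placeholder, not a tested requirement.

```perl
use strict;
use warnings;

# Hypothetical helper: turn '1.6' or '1.7_RC9' into [major, minor, rc].
# A final release gets a large rc value so it sorts after any candidate.
sub parse_version {
    my ($v) = @_;
    return unless $v =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/;
    my ($major, $minor, $rc) = ($1, $2, $3);
    return [ $major, $minor, defined $rc ? $rc : 1e9 ];
}

# Returns -1, 0 or 1, like <=>, comparing part by part.
sub version_cmp {
    my $x = parse_version($_[0]) or die "unparseable version: $_[0]\n";
    my $y = parse_version($_[1]) or die "unparseable version: $_[1]\n";
    return $x->[0] <=> $y->[0]
        || $x->[1] <=> $y->[1]
        || $x->[2] <=> $y->[2];
}

# Placeholder minimum; the real cut-off would have to come from testing.
my $min = '1.7';
for my $installed ('1.6', '1.7_RC9', '1.7') {
    my $ok = version_cmp($installed, $min) >= 0;
    printf "%-8s %s\n", $installed, $ok ? 'run Nexml tests' : 'skip Nexml tests';
}
```

In a test file the same comparison could decide whether to skip the Nexml tests, much as nexml.t would skip when Bio::Phylo is missing altogether.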

chris

From neetisomaiya at gmail.com  Tue Sep  8 00:27:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 8 Sep 2009 09:57:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>

I actually want the nucleotide sequence of the gene. I thought
Bio::DB::EntrezGene would give me a seq_obj for an Entrez Gene id, and
that calling $seq_obj->seq() on it would return the actual
genomic nucleotide sequence of the gene. But this doesn't happen. I am
able to print the gene symbol using $seq_obj->display_id and do
other things, but I wanted the gene nucleotide sequence.

-Neeti
Even my blood says, B positive


On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
Russell<Russell.Smithies at agresearch.co.nz> wrote:
> This example code from the wiki _definitely_ works:
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
> =========================================
>
> use strict;
> use Bio::DB::EntrezGene;
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> $db->verbose(1); ###
>
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>        if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
>                my ($contig,$from,$to) = $ann->url =~
>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>                print "$contig\t$from\t$to\n";
>        }
> }
>
> ======================================
>
> So if it doesn't work for you, there are a few things you need to check:
> * what version of BioPerl are you using?
> * are you behind a firewall?
> * are you using a proxy?
> * do you need to submit username/password for either of the 2 above
> * turn on 'verbose' messages, it may help you debug
>
>
> If you're still having problems, get back to me and I'll see if I can help.
>
> --Russell
>

From Russell.Smithies at agresearch.co.nz  Tue Sep  8 00:41:47 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 16:41:47 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>

That bit of code gave you the accession, start and end for the sequence, so you just need to download it.
Bio::DB::EUtilities can do that for you.

Did you take a look at this? http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences


--Russell

==================
#!perl -w

use strict;
use Bio::DB::EntrezGene;
use Bio::DB::EUtilities;

no warnings 'deprecated';

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
#$db->verbose(1);
my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($acc, $from, $to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$acc\t$from\t$to\n";

        # retrieve the sequence for that region as FASTA text
        my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                               -db      => 'nucleotide',
                                               -rettype => 'fasta');
        $fetcher->set_parameters(-id        => $acc,
                                 -seq_start => $from,
                                 -seq_stop  => $to,
                                 -strand    => 1);
        # named $fasta so it does not shadow the outer $seq object
        my $fasta = $fetcher->get_Response->content;
        print $fasta;
    }
}

======================
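
For anyone wondering what the EUtilities step boils down to: it is an efetch request carrying the accession, the coordinates, the strand and the return type. The sketch below only builds such a URL and does not contact NCBI. The helper name efetch_region_url, the accession and the coordinates are made up, and the base URL is assumed to be the current public E-utilities endpoint.

```perl
use strict;
use warnings;

# Hypothetical helper: build an efetch URL for a sub-region of a
# nucleotide record. Nothing is downloaded here.
sub efetch_region_url {
    my (%arg) = @_;
    # Assumed endpoint: NCBI's public E-utilities efetch address.
    my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
    my @params = (
        [ db        => 'nucleotide' ],
        [ id        => $arg{acc} ],
        [ seq_start => $arg{from} ],
        [ seq_stop  => $arg{to} ],
        [ strand    => defined $arg{strand} ? $arg{strand} : 1 ],  # 1 = plus, 2 = minus
        [ rettype   => 'fasta' ],
    );
    return $base . '?' . join('&', map { join '=', @$_ } @params);
}

# Made-up accession and coordinates, for illustration only.
print efetch_region_url(acc => 'NT_000001.1', from => 1000, to => 2500), "\n";
```

Pasting the printed URL into a browser is a quick way to check whether a firewall or proxy is getting in the way before blaming the script.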

> -----Original Message-----
> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
> Sent: Tuesday, 8 September 2009 4:28 p.m.
> To: Smithies, Russell
> Cc: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> I actually want the nucleotide sequence of the gene. I thought the
> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
> then the seq method on that $seq_obj->seq() will give me the actual
> genomic nucleotide sequence of the gene. But this doesnt happen. I am
> able to print gene symbol using $seq_obj->display_id and able to do
> other things, but I wanted the gene nucleotide sequence.
> 
> -Neeti
> Even my blood says, B positive
> 
> 


From cjfields at illinois.edu  Tue Sep  8 00:50:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:50:01 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <76A4757A-80C5-400E-8D3B-C68E968FF581@illinois.edu>

Russell,

Any reason you're using "no warnings 'deprecated'" there?  The  
pseudohash warnings should no longer be showing up with EntrezGene  
stuff.  Or is it something else?

chris

On Sep 7, 2009, at 11:41 PM, Smithies, Russell wrote:

> That bit of code gave you the accession, start and end for the  
> sequence so you just needed to download it.
> Bio::DB::Eutilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
> 	if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
> 		my ($acc,$from,$to) = $ann->url =~
> 		  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
> 		print "$acc\t$from\t$to\n";
>
> 		# retrieve the sequence
> 		my $fetcher = Bio::DB::EUtilities->new(-eutil => 'efetch',
> 					   -db    => 'nucleotide',
> 					   -rettype => 'fasta');
>            $fetcher->set_parameters(-id => $acc,
> 			     			-seq_start => $from,
> 			     			-seq_stop  => $to,
> 			     			-strand    => 1);
>            my $seq = $fetcher->get_Response->content;
>            print $seq;
>
> 	}
> }
>
> ======================
>


From paola_bisignano at yahoo.it  Tue Sep  8 04:55:21 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 8 Sep 2009 08:55:21 +0000 (GMT)
Subject: [Bioperl-l] problem parsing pdb
Message-ID: <741671.67508.qm@web25705.mail.ukl.yahoo.com>

Hi,

I'm in a little trouble because I need to parse PDB files exactly, to extract the chain id and residue id, but I found that in some PDB files the residue number is followed by a letter, probably because the residue was added by the crystallographers and they didn't want to change the residue numbering of the sequence. An example is 1PXX.pdb, which I parsed with my script below. I didn't find any useful suggestion about this in the BioPerl tutorial or the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb = "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc = $structio->next_structure;
# open the output file once, rather than once per residue
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains) {
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain)) {
        my $resid = $res->id;
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but it gives me a file with an error at ILE 105A and ILE 2105C, because they have a letter following the residue number. Can I solve this problem without writing intermediate files?
I need the residue id as 105A, not 105.A,
so
A          ILE-105A
without the point between the number and the letter.
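
A possible in-memory fix, sketched under the assumption that the ids come back in the form described above ("ILE-105.A", with a point before the insertion letter): strip that point before printing. The helper name compact_resid is hypothetical and the ids in the loop are just the ones mentioned in this message.

```perl
use strict;
use warnings;

# Hypothetical helper: "ILE-105.A" -> "ILE-105A".
# Ids without a trailing ".<letter>" are returned unchanged.
sub compact_resid {
    my ($resid) = @_;
    (my $compact = $resid) =~ s/\.([A-Za-z])$/$1/;
    return $compact;
}

# Example ids; chain "A" is used only for illustration.
for my $resid ('ILE-105', 'ILE-105.A', 'ILE-2105.C') {
    print "A\t", compact_resid($resid), "\n";
}
```

Inside the residue loop of the script this would mean printing compact_resid($res->id) instead of the raw id, so no intermediate file is needed.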


Thank you all,

Paola


From lengjingmao at gmail.com  Tue Sep  8 06:13:05 2009
From: lengjingmao at gmail.com (shaohua.fan)
Date: Tue, 8 Sep 2009 12:13:05 +0200
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
Message-ID: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>

Dear all ,

After reading the documentation and source code of Bio::Tools::RepeatMasker in
BioPerl 1.6.0, I have a question about updating this module.

The current RepeatMasker output (.out) provides more information
than the module exposes, for example query (left), repeat
(left), perc div, perc del and perc ins. These may be useful for some users.

I think it would be good to update this module in the latest BioPerl version.
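
To make the request concrete, here is a rough sketch of how those extra columns could be read from one data line of a .out file. It is plain Perl, not the Bio::Tools::RepeatMasker API: the helper name parse_rm_line and the field names are made up, and the column order (including the swapped repeat columns on the complement strand) is assumed from the usual .out layout.

```perl
use strict;
use warnings;

# Hypothetical helper: split one data line of a RepeatMasker .out file
# into named fields. Returns a hash reference, or nothing for header
# and blank lines.
sub parse_rm_line {
    my ($line) = @_;
    $line =~ s/^\s+//;
    my @f = split /\s+/, $line;
    return unless @f >= 15 && $f[0] =~ /^\d+$/;
    my %hit;
    @hit{qw(score perc_div perc_del perc_ins query q_begin q_end q_left
            strand repeat class)} = @f[0 .. 10];
    # Assumed layout: on the complement strand the three repeat columns
    # appear as (left), end, begin instead of begin, end, (left).
    if ($hit{strand} eq 'C') {
        @hit{qw(r_left r_end r_begin)} = @f[11 .. 13];
    }
    else {
        @hit{qw(r_begin r_end r_left)} = @f[11 .. 13];
    }
    $hit{id} = $f[14];
    s/[()]//g for @hit{qw(q_left r_left)};    # "(22462)" -> "22462"
    return \%hit;
}

# Example data line, for illustration only.
my $hit = parse_rm_line(
    ' 1320 15.6  6.2  0.0  HSU08988  6563  6781  (22462) C  MER7A    DNA/MER2_type    (0)   336   103   1'
);
printf "%s: div=%s del=%s ins=%s query_left=%s repeat_left=%s\n",
    @{$hit}{qw(repeat perc_div perc_del perc_ins q_left r_left)};
```

Something along these lines could presumably be attached to the features the module already returns, for example as tags.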

shaohua

From maj at fortinbras.us  Tue Sep  8 07:00:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 07:00:31 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <AD2517BD451A403D9FF258B9A07569F2@NewLife>

Chris - 
I would like to vote for option #1, since working on Bio::Nexml with
Chase gave me the opp'y to patch Bio::Phylo some (including fixing
an old "fix" of mine), so (IMO) the CPAN version of Bio::Phylo 
would benefit too. Option #2 is ok, since Bio::Nexml has to be
essentially optional for the user anyway, dependent on whether
the user is willing to install Bio::Phylo, a fairly major commitment
 (nexml.t already skips if Bio::Phylo is unavailable); I think it's 
no problem if we make that dependency more stringent. We could
have nexml.t check the svn revision directly, rather than $VERSION,
as a kludge.
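
For the $VERSION comparison itself, here is a rough sketch of how a string like '1.7_RC9' could be ordered against plain releases. It assumes versions look like MAJOR.MINOR with an optional _RCn suffix, which is only inferred from the two values mentioned in this thread. The helper names and the minimum version used are placeholders, not anything Bio::Phylo or nexml.t defines.

```perl
use strict;
use warnings;

# Hypothetical helper: split '1.6' or '1.7_RC9' into (major, minor, rc).
# A plain release has rc undefined. Returns an empty list if unparseable.
sub parse_version {
    my ($v) = @_;
    my ($major, $minor, $rc) = $v =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/
        or return;
    return ($major, $minor, $rc);
}

# Compare two version strings; returns -1, 0 or 1 like <=>.
# A release candidate sorts before the final release of the same number.
sub version_cmp {
    my ($left, $right) = @_;
    my @x = parse_version($left)  or die "unparseable version '$left'\n";
    my @y = parse_version($right) or die "unparseable version '$right'\n";
    my $final = 9**9;    # sentinel rank for "not a release candidate"
    return $x[0] <=> $y[0]
        || $x[1] <=> $y[1]
        || (defined $x[2] ? $x[2] : $final) <=> (defined $y[2] ? $y[2] : $final);
}

my $minimum = '1.7';    # placeholder, not a confirmed working version
for my $installed ('1.6', '1.7_RC9', '1.7') {
    my $ok = version_cmp($installed, $minimum) >= 0;
    printf "%-8s %s\n", $installed, $ok ? 'run Nexml tests' : 'skip Nexml tests';
}
```

A check against the svn revision, as suggested above, would sidestep the string parsing entirely.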
cheers MAJ 
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 08, 2009 12:23 AM
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> All,
> 
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
> the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
> installed by default), and the latest from Bio::Phylo svn:
> 
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
> 
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so they  
> won't be installed unless the full path is used).  Even so, nothing on  
> CPAN even works; one must use the latest Bio::Phylo SVN code.
> 
> ATM I'm just not seeing how this can be released with 1.6.1 right now,  
> unless one of the following occurs:
> 
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
> 
> As for #3 above, I think it probably belongs in a larger bioperl-phylo  
> as Mark had previously proposed.  I'm open to just about any solution.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>

From hlapp at gmx.net  Tue Sep  8 08:16:12 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 8 Sep 2009 08:16:12 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>

I'd suspect that the latest Bio::Phylo changes have been due for CPAN  
release anyway, so unless those are unstable that seems like the  
easiest fix to me.

If the Nexml code works against not yet stable updates to Bio::Phylo,  
it shouldn't be in a BioPerl stable release, right?

	-hilmar

On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:

> All,
>
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of  
> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
> svn:
>
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so  
> they won't be installed unless the full path is used).  Even so,  
> nothing on CPAN even works; one must use the latest Bio::Phylo SVN  
> code.
>
> ATM I'm just not seeing how this can be released with 1.6.1 right  
> now, unless one of the following occurs:
>
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
>
> As for #3 above, I think it probably belongs in a larger bioperl- 
> phylo as Mark had previously proposed.  I'm open to just about any  
> solution.
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep  8 08:02:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 07:02:53 -0500
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
In-Reply-To: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
References: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
Message-ID: <74B85419-6A37-46CE-AAF3-F33013F4A058@illinois.edu>

Patches are welcome for this (or you can submit an enhancement request  
via bugzilla):

http://bugzilla.open-bio.org/

This won't be in the next point release, sorry.

chris

On Sep 8, 2009, at 5:13 AM, shaohua.fan wrote:

> Dear all ,
>
> After reading the document and original code of  
> Bio::Tools::RepeatMasker on
> bioperl document 1.6.0, I have a question about this module's update.
>
> The current repeatmasker's output(  .out) provide more information
> than which have not listed in the module, for example, query(left) ,  
> repeat
> (left), perc div, perc del, perc ins. these maybe useful for some  
> users.
>
> I think it is better to update this module in the lastest Bioperl  
> version.
>
> shaohua


From cjfields at illinois.edu  Tue Sep  8 09:15:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 08:15:31 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
	<CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
Message-ID: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>

On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:

> I'd suspect that the latest Bio::Phylo changes have been due for  
> CPAN release anyway, so unless those are unstable that seems like  
> the easiest fix to me.

My thought as well, just not sure how stable that code is right now.   
Bio::Phylo has been in RC for a while now, correct?

> If the Nexml code works against not yet stable updates to  
> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?

Right.  That should be sorted out first.

I can wait a bit longer for Rutger to respond; there are a few other
odds and ends that can be worked on in the meantime.  I would like
to get the alpha out soon and 1.6.1 in the next week or so though.

chris

> 	-hilmar
>
> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>
>> All,
>>
>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
>> Nexml code.  In particular, I have tried three versions of  
>> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
>> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
>> svn:
>>
>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>
>> At this moment only the Bio::Phylo code from svn is working with  
>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6  
>> appears to be missing Bio::Phylo::Factory (all Nexml tests fail),  
>> whereas 1.7_RC9 has some kind of versioning issue (again, all tests  
>> fail).  The problem: CPAN will always install 1.6 (the others are  
>> RC, so they won't be installed unless the full path is used).  Even  
>> so, nothing on CPAN even works; one must use the latest Bio::Phylo  
>> SVN code.
>>
>> ATM I'm just not seeing how this can be released with 1.6.1 right  
>> now, unless one of the following occurs:
>>
>> 1) Rutger V. drops a quick non-RC release to CPAN,
>> 2) check for the minimal working Bio::Phylo version and safely skip  
>> any Nexml-related tests unless proper version is present (not easy  
>> with a $VERSION like '1.7_RC9'),
>> 3) push Nexml into it's own distribution (something we were  
>> planning on anyway with a number of modules)
>>
>> As for #3 above, I think it probably belongs in a larger bioperl- 
>> phylo as Mark had previously proposed.  I'm open to just about any  
>> solution.
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>


From maj at fortinbras.us  Tue Sep  8 10:39:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 10:39:07 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
Message-ID: <1CF993D6D3AC435CA77127466D6C072A@NewLife>

I agree with Hilmar-- I have no problem keeping it in the trunk for a while
longer, as I have an addition for dealing with arbitrary non-seq
data using the Population API sitting in bioperl-dev that's nearly
ready, but prob. not before cjf wants to get the release out.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Hilmar Lapp" <hlapp at gmx.net>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" 
<rutgeraldo at gmail.com>
Sent: Tuesday, September 08, 2009 9:15 AM
Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>
>> I'd suspect that the latest Bio::Phylo changes have been due for  CPAN 
>> release anyway, so unless those are unstable that seems like  the easiest fix 
>> to me.
>
> My thought as well, just not sure how stable that code is right now. 
> Bio::Phylo has been in RC for a while now, correct?
>
>> If the Nexml code works against not yet stable updates to  Bio::Phylo, it 
>> shouldn't be in a BioPerl stable release, right?
>
> Right.  That should be sorted out first.
>
> I can wait a bit longer for Rutger to respond; there are a few other  odds and 
> ends that can been worked on in the meantime.  I would like  to get the alpha 
> out soon and 1.6.1 in the next week or so though.
>
> chris
>
>> -hilmar
>>
>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>
>>> All,
>>>
>>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  Nexml 
>>> code.  In particular, I have tried three versions of  Bio::Phylo; the 
>>> default CPAN installation (1.6), the latest CPAN RC  (1.7_RC9, not installed 
>>> by default), and the latest from Bio::Phylo  svn:
>>>
>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>
>>> At this moment only the Bio::Phylo code from svn is working with  BioPerl's 
>>> Nexml modules.  From my local tests Bio::Phylo 1.6  appears to be missing 
>>> Bio::Phylo::Factory (all Nexml tests fail),  whereas 1.7_RC9 has some kind 
>>> of versioning issue (again, all tests  fail).  The problem: CPAN will always 
>>> install 1.6 (the others are  RC, so they won't be installed unless the full 
>>> path is used).  Even  so, nothing on CPAN even works; one must use the 
>>> latest Bio::Phylo  SVN code.
>>>
>>> ATM I'm just not seeing how this can be released with 1.6.1 right  now, 
>>> unless one of the following occurs:
>>>
>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>> 2) check for the minimal working Bio::Phylo version and safely skip  any 
>>> Nexml-related tests unless proper version is present (not easy  with a 
>>> $VERSION like '1.7_RC9'),
>>> 3) push Nexml into it's own distribution (something we were  planning on 
>>> anyway with a number of modules)
>>>
>>> As for #3 above, I think it probably belongs in a larger bioperl- phylo as 
>>> Mark had previously proposed.  I'm open to just about any  solution.
>>>
>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>
>
> 


From lincoln.stein at gmail.com  Tue Sep  8 10:58:25 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 8 Sep 2009 10:58:25 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <6dce9a0b0909080758q7334a7b2yc69bc86b96118927@mail.gmail.com>

Will do.

Lincoln

On Mon, Sep 7, 2009 at 4:56 PM, Chris Fields <cjfields at illinois.edu> wrote:

> All,
>
> I have updated the Changes file in bioperl-live in preparation for 1.6.1.
>  The initial release will be an alpha, 1.6.0_1 (probably landing about
> mid-week), and based on CPAN tests, etc the final 1.6.1 release next week.
>  I'll start merging changes over from trunk tonight, fixing last-minute
> bugs, etc.  I'm running my work using perl 5.10.1 (64-bit) on Mac and will
> likely run these remotely on our local linux cluster.  Win tests are gladly
> welcome (this should work on Strawberry Perl now).
>
> I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase, Robert
> Buels, Jay Hannah, Heikki, Sendu come to mind) look over the file to update
> it.  There are a few weak spots in there where I didn't make the code change
> or additions, or where a particular bug was fixed but not mentioned.  In
> particular:
>
> 1) Google Summer of Code work from Chase (Mark, Chase)
> 2) GMOD-related fixes (Lincoln, Scott)
> 3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
> 4) Tiling, Restriction refactors (Mark)
>
> Also, please make changes to AUTHORS, etc as needed.
>
> Thanks!
>
> chris
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>

From cjfields at illinois.edu  Tue Sep  8 11:43:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 10:43:29 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <1CF993D6D3AC435CA77127466D6C072A@NewLife>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
Message-ID: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>

Mark

We can hold it in trunk until the next point release or we start  
splitting things off (whichever is first).

I have a little more time, though, and I'm thinking it would be a good  
idea to get the Nexml code into the wild (sooner than later) for users  
to test out.  Let's see if Rutger responds.

chris

On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:

> I agree with Hilmar-- I have no problem keeping it in the trunk for  
> a while
> longer, as I have an addition for dealing with arbitrary non-seq
> data using the Population API sitting in bioperl-dev that's nearly
> ready, but prob. not before cjf wants to get the release out.
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "Hilmar Lapp" <hlapp at gmx.net>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com 
> >
> Sent: Tuesday, September 08, 2009 9:15 AM
> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>
>
>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>
>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>> CPAN release anyway, so unless those are unstable that seems like   
>>> the easiest fix to me.
>>
>> My thought as well, just not sure how stable that code is right  
>> now. Bio::Phylo has been in RC for a while now, correct?
>>
>>> If the Nexml code works against not yet stable updates to   
>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>
>> Right.  That should be sorted out first.
>>
>> I can wait a bit longer for Rutger to respond; there are a few  
>> other  odds and ends that can been worked on in the meantime.  I  
>> would like  to get the alpha out soon and 1.6.1 in the next week or  
>> so though.
>>
>> chris
>>
>>> -hilmar
>>>
>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>
>>>> All,
>>>>
>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest from  
>>>> Bio::Phylo  svn:
>>>>
>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>
>>>> At this moment only the Bio::Phylo code from svn is working with   
>>>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6   
>>>> appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue (again,  
>>>> all tests  fail).  The problem: CPAN will always install 1.6 (the  
>>>> others are  RC, so they won't be installed unless the full path  
>>>> is used).  Even  so, nothing on CPAN even works; one must use the  
>>>> latest Bio::Phylo  SVN code.
>>>>
>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>> right  now, unless one of the following occurs:
>>>>
>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>> skip  any Nexml-related tests unless proper version is present  
>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>> 3) push Nexml into it's own distribution (something we were   
>>>> planning on anyway with a number of modules)
>>>>
>>>> As for #3 above, I think it probably belongs in a larger bioperl-  
>>>> phylo as Mark had previously proposed.  I'm open to just about  
>>>> any  solution.
>>>>
>>>> chris
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>
>>
>


From jason at bioperl.org  Tue Sep  8 15:43:39 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 8 Sep 2009 12:43:39 -0700
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
Message-ID: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>

Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
perfectly fine to write with Bio::SeqIO::fasta but not for any of the  
rich-seq writers.
Do we think this is a bug or a feature?  The solution is to write the
PrimarySeq wrapped in a Bio::Seq object.

See this gist -- I would imagine this as additional test lines in
t/LocalDB/DBFasta.t but I don't know what we really expect?
http://gist.github.com/183169

I also notice that $seq->description & $seq->display_id don't allow  
'set' option - which probably makes sense since this is a read-only  
object that came from the DB, but it basically silently ignores set.   
I often do this if I pull seqs from a DB::Fasta db and re-format the  
IDs or description line.  So I end up making a new object and copying  
the data over.  I *think* this is really a feature not a bug, just  
wanted to bring it up.

-jason
--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  8 16:20:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 15:20:32 -0500
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
In-Reply-To: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
References: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
Message-ID: <AEE10370-F2B3-4723-9B79-23A5EBF86A51@illinois.edu>

On Sep 8, 2009, at 2:43 PM, Jason Stajich wrote:

> Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
> perfectly fine to write with Bio::SeqIO::fasta but not for any of  
> the rich-seq writers.
> Do we think this is a bug or feature.  The solution is to write the  
> PrimarySeq wrapped in a Bio::Seq object.

I think SeqIO requires any SeqI but doesn't specify anything for a  
simpler PrimarySeqI.  We could add some kind of general convenience  
wrapper in Bio::SeqIO to convert any PrimarySeqI to a requested SeqI  
class and just delegate to write_seq():

   # get a PrimarySeq somehow $seq, $out is Bio::SeqIO
   $out->write_PrimarySeq($seq); # or somesuch

> See this gist -- I would imagine this as additional test lines in t/ 
> LocalDB/DBFasta.t but I don't know what we really expect?
> http://gist.github.com/183169
>
> I also notice that $seq->description & $seq->display_id don't allow  
> 'set' option - which probably makes sense since this is a read-only  
> object that came from the DB, but it basically silently ignores  
> set.  I often do this if I pull seqs from a DB::Fasta db and re- 
> format the IDs or description line.  So I end up making a new object  
> and copying the data over.  I *think* this is really a feature not a  
> bug, just wanted to bring it up.
>
> -jason
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org

One can already cheat and do a few things.  For instance:

$seq->{id} = 'Foo';
print $seq->display_id; # should be 'Foo'

Won't work for all of them, though, such as description().   
Personally, if one made clear that such changes aren't retained in the  
database but must be redirected as output to another file then I don't  
see a problem (other PrimarySeqI are mutable, so why not these?).

Would there be any real performance hit from making those get/set  
accessors instead of ro getters?  The class is fairly small.
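
To make the question concrete, here is a small sketch of the two accessor styles side by side. The class My::ReadOnlySeq and its method display_id_ro are hypothetical stand-ins written for illustration and do not reproduce the Bio::DB::Fasta internals.

```perl
use strict;
use warnings;

package My::ReadOnlySeq;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id} }, $class;
}

# Read-only getter: any argument passed in is silently ignored.
sub display_id_ro {
    my ($self) = @_;
    return $self->{id};
}

# Get/set accessor: stores a new value when one is passed.
sub display_id {
    my ($self, $value) = @_;
    $self->{id} = $value if defined $value;
    return $self->{id};
}

package main;

my $seq = My::ReadOnlySeq->new(id => 'seq1');

$seq->display_id_ro('Foo');
print $seq->display_id_ro, "\n";    # prints seq1, the set was ignored

$seq->display_id('Foo');
print $seq->display_id, "\n";       # prints Foo
```

The extra cost in the get/set version is one defined test per call. In both cases the change lives only in the in-memory object, so nothing would be written back to the indexed file.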

chris


From lelbourn at science.mq.edu.au  Mon Sep  7 03:52:04 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Mon, 7 Sep 2009 17:52:04 +1000
Subject: [Bioperl-l] subsection of genbank file
Message-ID: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>

Hi All,

Is there a method or methodology that will produce a fully fledged Seq  
object with all the associated metadata given a start and end  
position? To clarify, I create a sequence object from a genbank file:


****
my $io  = Bio::SeqIO->new(as per usual);

my $seqobj = $io->next_seq();
****
I now want:

my $sub_seqobj = $seqobj between 300 and 2000

where $sub_seqobj is a Seq object (which I appreciate is an  
'aggregate' of objects) too. The "trunc" method only returns a  
PrimarySeq object which lacks all the annotation etc. I've previously  
done this task by iterating through feature by feature and parsing out  
what I needed, but thought there might be a more elegant approach...
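
For reference, the feature-by-feature approach mentioned above can be sketched in plain Perl as below. The features are simple hash records with made-up coordinates rather than BioPerl feature objects, and features_in_range is a hypothetical helper. Bio::SeqUtils also has a trunc_with_features method that may cover this case directly; the sketch does not rely on it.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the features that lie entirely inside
# [$from, $to] (1-based, inclusive) and shift their coordinates so that
# position $from becomes position 1 of the sub-sequence.
sub features_in_range {
    my ($features, $from, $to) = @_;
    my @kept;
    for my $f (@$features) {
        next if $f->{start} < $from || $f->{end} > $to;
        push @kept, {
            %$f,
            start => $f->{start} - $from + 1,
            end   => $f->{end}   - $from + 1,
        };
    }
    return \@kept;
}

# Made-up example records.
my @features = (
    { tag => 'gene', start => 100,  end => 250  },    # outside the window
    { tag => 'CDS',  start => 350,  end => 900  },
    { tag => 'tRNA', start => 1950, end => 2000 },
);

my $sub = features_in_range(\@features, 300, 2000);
printf "%s %d..%d\n", @{$_}{qw(tag start end)} for @$sub;
```

Features that straddle the window boundary are dropped here; clipping them instead would be a different design choice.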


Regards,
Liam Elbourne.


From alpapan at googlemail.com  Thu Sep 10 17:14:11 2009
From: alpapan at googlemail.com (Alexie Papanicolaou)
Date: Thu, 10 Sep 2009 22:14:11 +0100
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable
 end is float
Message-ID: <1252617251.6680.16.camel@alexie-desktop>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090910/f222c627/attachment.pl>

From maj at fortinbras.us  Thu Sep 10 23:52:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 10 Sep 2009 23:52:27 -0400
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln ->
	Bio::Locatable end is float
In-Reply-To: <1252617251.6680.16.camel@alexie-desktop>
References: <1252617251.6680.16.camel@alexie-desktop>
Message-ID: <D2C2357D7A81478B965996CF6DDD4AF2@NewLife>

Hi Alexie--
I am either responsible for this weirdness, or have fixed it in
an unreleased version. Anyway,  can you please make a bug
report at http://bugzilla.bioperl.org, and include some relevant
code and real data, and I will have a look.
Thanks a lot- Mark
----- Original Message ----- 
From: "Alexie Papanicolaou" <alpapan at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 10, 2009 5:14 PM
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable end 
is float


> Hello all,
>
> I get the following warning when parsing a fasty34 HSP using Bio::Search
> and then trying to get the alignment using get_aln
>
> MSG: In sequence CONTIG residue count gives end value
> 565.333333333333.
> Overriding value [565] with value 565.333333333333 for
> Bio::LocatableSeq::end().
> MAEMFKIGDLVWAKMKGFSPWPGLVSNPTKDLKRPTSKKSAQQ/C/VFFLGTNNYAWIEEANIKPYFEYRDRLVKSNKSGAFKDALDAIEEYIKNNGAKFDDPDAEFNRLRESLAEKKESKPKQRKEKRPAHDDNSAKSPKKVRTNSVEADKESVRADSPILSNHSPRKGPASTLLERPTTIVRPLDDSQD
> STACK
> Bio::LocatableSeq::end /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:196
> STACK
> Bio::LocatableSeq::new /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:140
> STACK
> Bio::Search::HSP::FastaHSP::get_aln 
> /usr/local/share/perl/5.8.8/Bio/Search/HSP/FastaHSP.pm:174
>
> The frameshifts (/ and \ ) are causing this recalculation of length to a
> float (which is a bit weird) but is not fatal for my program. Is this
> intentional?
>
> My immediate problem is actually the warning message itself - which is
> quite annoying if you have hundreds of such sequences... any way to turn
> them off short of commenting out the line at LocatableSeq.pm ?
> (redirecting STDERR wouldn't be desirable for a production script).
>
> many thanks
> alexie
>
>
> -- 
> Alexie Papanicolaou
> Richard ffrench-Constant group
> CEC-Biology
> Univ. Exeter in Cornwall
> Penryn
> TR10 9EZ
> United Kingdom
>
>
>
> 
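
On the question of silencing the message without editing LocatableSeq.pm or redirecting all of STDERR, one generic Perl option is a scoped $SIG{__WARN__} handler. The sketch below assumes the message is emitted through Perl's own warn; if a library prints to STDERR directly, this would not catch it. The helper name is hypothetical. BioPerl objects also carry a verbose setting, which may be a cleaner switch where it applies.

```perl
use strict;
use warnings;

# Run a code block while dropping warnings that match a pattern.
# Anything else goes to the previously installed handler, or to STDERR.
sub without_warnings_matching {
    my ($pattern, $code) = @_;
    my $prev = $SIG{__WARN__};
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        return if $msg =~ $pattern;
        ref $prev eq 'CODE' ? $prev->($msg) : warn $msg;
    };
    return $code->();
}

my $result = without_warnings_matching(
    qr/Overriding value/,
    sub {
        # Stand-in for the call that triggers the warning.
        warn "Overriding value [565] with value 565.333333333333\n";
        return 'done';
    },
);
print "$result\n";
```

Because the handler is installed with local, it is removed again as soon as the wrapped call returns.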


From gmodhelp at googlemail.com  Fri Sep 11 12:40:43 2009
From: gmodhelp at googlemail.com (Dave Clements, GMOD Help Desk)
Date: Fri, 11 Sep 2009 09:40:43 -0700
Subject: [Bioperl-l] CVS to SVN Conversion, 2009/09/15
In-Reply-To: <71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
References: <71ee57c70908241615w6f82abb6p25b0744e8f5fb006@mail.gmail.com>
	<71ee57c70909110935w2147628cq6e6984feb544e6b9@mail.gmail.com>
	<71ee57c70909110936g34612cf0g5a9d83aeee4e0efd@mail.gmail.com>
	<71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
Message-ID: <71ee57c70909110940y921b1dxfec278422d31be7f@mail.gmail.com>

Hello all,

This is a heads up that GMOD (in the form of Rob Buels) will be moving
its SourceForge source code repository from CVS to SVN on September
15, 2009.

If you have checked out and modified any code from that repository,
please commit your updates before 3am, Eastern US, on September 15.

Some important bits:
* All projects will be frozen in CVS and will remain available from CVS.
* No new updates will be allowed in CVS.
* All projects will be moved to Subversion.
* Inactive projects will be moved to a separate archival directory.

See http://gmod.org/wiki/CVS_to_Subversion_Conversion for full details
and a list of active and inactive projects.

Thanks,

Dave C
--
* Please keep responses on the list!
* Was this helpful? Let us know at http://gmod.org/wiki/Help_Desk_Feedback


From jayoung at fhcrc.org  Fri Sep 11 21:11:00 2009
From: jayoung at fhcrc.org (Janet Young)
Date: Fri, 11 Sep 2009 18:11:00 -0700
Subject: [Bioperl-l] tree splice remove nodes
Message-ID: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>

Hi,

I'm having a problem in a script that I'm hoping someone can help me  
figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
object, and it looks like it worked fine.

However, I'm also trying to keep a separate copy of the original  
(unpruned) tree in a different object but that second object seems to  
get pruned as well.

Here's my tree, stored in a file called testtree2.nwk:

(((A,(B,b)),C),D,E);

---------------------------------------
Here's my script:

#!/usr/bin/perl

use warnings;
use strict;
use Bio::TreeIO;

my $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
while (my $tree = $treeIO->next_tree() ) {

       print "\nfound a tree\n\n";
       my @originalleaves = $tree -> get_leaf_nodes();
       foreach my $originalleaf (@originalleaves) {print "original tree has node with id " . $originalleaf->id() . "\n";}

       my $tree2 = $tree;

       my @remove = ("D","E");
       print "\nremoving nodes @remove\n\n";

       $tree2 -> splice(-remove_id => \@remove);
       my @leaves2 = $tree2 -> get_leaf_nodes();
       foreach my $leaf2 (@leaves2) {print "after removing tree2 has node with id " . $leaf2->id() . "\n";}

       print "\n";

       my @originalleavesafter = $tree -> get_leaf_nodes();
       foreach my $leaf3 (@originalleavesafter) {print "after removing original tree has node with id " . $leaf3->id() . "\n";}

}

---------------------------------------


And here's my output:

found a tree

original tree has node with id A
original tree has node with id B
original tree has node with id b
original tree has node with id C
original tree has node with id D
original tree has node with id E

removing nodes D E

after removing tree2 has node with id A
after removing tree2 has node with id B
after removing tree2 has node with id b
after removing tree2 has node with id C

after removing original tree has node with id A
after removing original tree has node with id B
after removing original tree has node with id b
after removing original tree has node with id C


-------------------------

I want to splice the specified nodes out of $tree2 and leave $tree  
untouched, but both $tree and $tree2 seem to be affected by the splice  
operation. Am I failing to understand something about references/ 
dereferencing?   I'm not sure if I just haven't figured this out right  
or if it's a bug.  If it looks like a bug let me know and I'll post it  
to bugzilla.

thanks in advance for any advice,

Janet

-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From maj at fortinbras.us  Fri Sep 11 22:00:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 11 Sep 2009 22:00:53 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
Message-ID: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>

Hi Janet-
The trouble here is that 
$tree2 = $tree
doesn't create an independent copy of the entire
tree data structure. So, $tree2 and $tree
essentially point to the same thing. 
The easiest way to get two independent copies 
is probably to read the file twice:

$treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
$tree = $treeIO->next_tree;
$treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
$tree2 = $treeIO->next_tree;

which will create two copies. This is a little kludgy, but 
unfortunately, there doesn't seem to be any easy way to 
rewind the TreeIO object. 

When you want a copy of a complex object, generally 
you need to "clone" it, and there are a variety of modules
you can use to create clones. [It's probably worth adding 
a clone() method to TreeFunctionsI--maybe I'll do that.]
Get the module Clone from CPAN and do

use Clone qw(clone);
....
$tree2 = clone($tree);
...

hope this helps- cheers 
MAJ
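
The aliasing described above can be seen without BioPerl at all. The following is a minimal sketch on a plain nested hash (made-up data, not a real Bio::Tree::Tree object), using dclone from the core Storable module for the deep copy. Whether dclone or Clone copes with an actual tree object is a separate question and is not tested here.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A nested structure standing in for a tree object (hypothetical data,
# not a real Bio::Tree::Tree).
my $tree = { id => 'root', children => [ { id => 'A' }, { id => 'D' } ] };

# Plain assignment copies the reference, so both names share one structure.
my $alias = $tree;

# dclone builds an independent deep copy of plain data.
my $copy = dclone($tree);

# Remove leaf 'D' through the alias only.
@{ $alias->{children} } = grep { $_->{id} ne 'D' } @{ $alias->{children} };

printf "original has %d children\n",  scalar @{ $tree->{children} };
printf "deep copy has %d children\n", scalar @{ $copy->{children} };
```

The edit made through $alias shows up in $tree as well, while $copy keeps both children.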

From cjfields at illinois.edu  Sat Sep 12 00:12:06 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 11 Sep 2009 23:12:06 -0500
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <5BE22FC3-06F3-4D31-BB73-8F2C49D46A03@illinois.edu>

On Sep 11, 2009, at 9:00 PM, Mark A. Jensen wrote:

> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two  
> independent copies is probably to read the file twice:
>
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
>
> which will create two copies. This is a little kludgy, but  
> unfortunately, there doesn't seem to be any easy way to rewind the  
> TreeIO object.

You can rewind the filehandle if it's seekable:

my $fh = $treeIO->_fh;
seek($fh,0,0); # or something like that...

Don't use sysseek (doesn't work with buffered IO).
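
As a rough illustration of the rewind idea, here is a sketch using an in-memory filehandle in place of the tree file. It only shows that seek() on a seekable handle lets the same data be read twice; whether a TreeIO object keeps extra state of its own that also needs resetting is an assumption left untested here.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for testtree2.nwk (tree string from the thread).
my $newick = "(((A,(B,b)),C),D,E);\n";
open(my $fh, '<', \$newick) or die "cannot open in-memory file: $!";

my $first  = <$fh>;    # first pass reads the only line
my $at_eof = <$fh>;    # nothing left, so this is undef

# Rewind: offset 0, whence 0 (start of file).
seek($fh, 0, 0) or die "seek failed: $!";
my $second = <$fh>;    # the same line again after rewinding

print "rewound ok\n" if !defined $at_eof && $first eq $second;
```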

>  When you want a copy of a complex object, generally you need to  
> "clone" it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone()  
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do
>
> use Clone qw(clone);
> ....
> $tree2 = clone($tree);
> ...
>
> hope this helps- cheers MAJ

This normally works with bioperl objects, just not sure about Tree  
(might be worth testing out).

chris

From bix at sendu.me.uk  Sat Sep 12 04:33:22 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 12 Sep 2009 09:33:22 +0100
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <4AAB5CD2.1040903@sendu.me.uk>

Mark A. Jensen wrote:
> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two 
> independent copies is probably to read the file twice:
> 
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
> 
> which will create two copies. This is a little kludgy, but 
> unfortunately, there doesn't seem to be any easy way to rewind the 
> TreeIO object.
> When you want a copy of a complex object, generally you need to "clone" 
> it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone() 
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do

From my comments in Bio/Tree/TreeFunctionsI.pm:

Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
faults at end of script.

TreeFunctionsI.pm already has the _clone() method. I suppose you could 
add some POD for it, rename it clone() and update the methods that call 
the private method to call the public version instead, Mark.

Janet: just clone your tree object with:
my $tree2 = $tree->_clone();

From maj at fortinbras.us  Sat Sep 12 07:37:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 07:37:37 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <4AAB5CD2.1040903@sendu.me.uk>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
	<4AAB5CD2.1040903@sendu.me.uk>
Message-ID: <1A0B867B64B347A3B23A2F19EAA2A720@NewLife>

Done-- thanks Sendu. I made _clone alias clone, to keep 
from rocking anyone's boat. 
Janet- definitely do  $tree2 = $tree->_clone.


From adlai at refenestration.com  Sat Sep 12 11:18:02 2009
From: adlai at refenestration.com (adlai burman)
Date: Sat, 12 Sep 2009 17:18:02 +0200
Subject: [Bioperl-l] Servers
Message-ID: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>

Can anyone suggest a hosting or server provider that actually has  
Bioperl installed?

Thanks,

Adlai

From maj at fortinbras.us  Sat Sep 12 12:45:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 12:45:35 -0400
Subject: [Bioperl-l] Servers
In-Reply-To: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
References: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
Message-ID: <127343EFA5EF4F7CB756586A1B0B210E@NewLife>

I have a public Amazon machine; see http://fortinbras.us/bioperl-max
cheers MAJ

From hartzell at alerce.com  Sat Sep 12 21:35:44 2009
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 12 Sep 2009 18:35:44 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
Message-ID: <19116.19568.26115.542911@already.dhcp.gene.com>


It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
functionally identical.  They seem to trickle down to the same place
and walking through these two requests yields almost identical http
requests: 

  $db->get_Seq_by_version('J00522.1')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n

  $db->get_Seq_by_acc('J00522')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n

The only difference that I can see is that they index into different
sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
sections contain the same information.

I'd like a general purpose tool that does The Right Thing whether
there's a .1 on the end of an identifier or not, and am just trying to
make sure I'm not doing something troublesome.
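
For the "does The Right Thing" tool, one possible shape is to pick the method from the form of the identifier. This is only a sketch: the helper name lookup_method_for is made up, and treating a trailing ".<digits>" as the sign of a versioned identifier is an assumption, not something taken from Bio::DB::GenBank.

```perl
use strict;
use warnings;

# Choose a Bio::DB::GenBank method name from the identifier's shape.
# Assumption: a trailing ".<digits>" means the identifier carries a version.
sub lookup_method_for {
    my ($id) = @_;
    return $id =~ /\.\d+\z/ ? 'get_Seq_by_version' : 'get_Seq_by_acc';
}

for my $id (qw(J00522.1 J00522)) {
    printf "%-9s -> %s\n", $id, lookup_method_for($id);
}

# With a database handle one could then dispatch like this (not run here):
#   my $method = lookup_method_for($id);
#   my $seq    = $db->$method($id);
```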

Am I correct about the above?

While I'm at it, I think that the comment

  # note that get_Stream_by_version is not implemented

in Bio::DB::GenBank was made obsolete by whoever commented out the

  $self->throw(...)

in get_Stream_by_version in Bio::DB::WebDBSeqI.

I'll happily commit the trivial doc fix if no one shoots down the
idea. (can't help big, might as well help small...).

Thanks,

g.

From maj at fortinbras.us  Sat Sep 12 23:14:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 23:14:06 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
Message-ID: <DBDF390336FB4D8D935E2395E48207D7@NewLife>

Hi All--

[Future announcements/updates will be made on the wiki-
 http://bioperl.org/wiki/Emacs_bioperl-mode --
 put it on your watchlist...see the page for features and install
 info ]

Bioperl-mode (tar r16070) is improved:
- fancy syntax and header highlighting for pod views
- jump to .pm source from pod view (just press 'f')
- full support for multiple paths
  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
  the completion flattens the paths; if you wind up having to 
  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
  and mytweaks/Bio/Seq.pm), completion will let you choose
  the path at the prompt.
- BPMODE_PATH convenience environment 
  variable is read for the search paths
- other stuff I can't remember
- there is a unit test suite, built on Wang Liang's test.el,
  in the dev path

To do this stuff, I've backed off Emacs 21 compatibility; 
it'll bork (nicely) if you have 21. If there are "enough" complaints,
I will relent, but 22 is cool for people like me with the 
elisp disease.

Other technical issues remain; let me know and 
I'll do my best. My goal is to make this something
you can't live without. (And if you're not using
Emacs, are you really living?)

 M-x thanks

Mark


From bill at genenformics.com  Sun Sep 13 11:47:57 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Sun, 13 Sep 2009 08:47:57 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <19116.19568.26115.542911@already.dhcp.gene.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
Message-ID: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>


I would like to make a few comments about get_Seq_by_version and
get_Seq_by_acc. Although both functions use the same NCBI eUtils API, the
request is interpreted differently depending on whether the Seq_id carries
a version.

1. If the Seq_id has a version, the GenBank ID server locates the
corresponding GI and returns the correct sequence.
2. If the Seq_id does not have a version, GBDataLoader tries to find
the latest version number for that Seq_id, which is relatively slow, and
the version number the ID server finds may NOT always be the latest.

IMHO, for both efficiency and consistency,
get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc

Bill




From maj at fortinbras.us  Sun Sep 13 21:26:57 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 13 Sep 2009 21:26:57 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
In-Reply-To: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
References: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
Message-ID: <CCFD820881654749B1EA479B45A7EA28@NewLife>

Sorry-- just one more tweak--
the latest tar (r16073) eliminates the dependency on pod2text
entirely; source is now parsed for pod directly by an elisp function.
cheers MAJ 

From neetisomaiya at gmail.com  Mon Sep 14 04:22:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 14 Sep 2009 13:52:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>

Thanks a lot. This works for me.

One more question: can you point me to where exactly, in the gene's
actual entry in Entrez Gene on the NCBI website
(http://www.ncbi.nlm.nih.gov/sites/entrez), we can find the link to the
FASTA sequence that we are retrieving here through the code?

-Neeti
Even my blood says, B positive


On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> That bit of code gave you the accession, start and end for the sequence so you just needed to download it.
> Bio::DB::Eutilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>        if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
>                my ($acc,$from,$to) = $ann->url =~
>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>                print "$acc\t$from\t$to\n";
>
>                # retrieve the sequence
>                my $fetcher = Bio::DB::EUtilities->new(-eutil => 'efetch',
>                                           -db    => 'nucleotide',
>                                           -rettype => 'fasta');
>            $fetcher->set_parameters(-id => $acc,
>                                                -seq_start => $from,
>                                                -seq_stop  => $to,
>                                                -strand    => 1);
>            my $seq = $fetcher->get_Response->content;
>            print $seq;
>
>        }
> }
>
> ======================
>
>> -----Original Message-----
>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>> To: Smithies, Russell
>> Cc: Emanuele Osimo; bioperl-l
>> Subject: Re: [Bioperl-l] need help urgently
>>
>> I actually want the nucleotide sequence of the gene. I thought the
>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>> then the seq method on that $seq_obj->seq() will give me the actual
>> genomic nucleotide sequence of the gene. But this doesnt happen. I am
>> able to print gene symbol using $seq_obj->display_id and able to do
>> other things, but I wanted the gene nucleotide sequence.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>> > This example code from the wiki _definitely_ works:
>> >
>> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::Entrez
>> Gene_to_get_genomic_coordinates
>> > =========================================
>> >
>> > use strict;
>> > use Bio::DB::EntrezGene;
>> >
>> > my $id = shift or die "Id?\n"; # use a Gene id
>> >
>> > my $db = new Bio::DB::EntrezGene;
>> > $db->verbose(1); ###
>> >
>> > my $seq = $db->get_Seq_by_id($id);
>> >
>> > my $ac = $seq->annotation;
>> >
>> > for my $ann ($ac->get_Annotations('dblink')) {
>> >        if ($ann->database eq "Evidence Viewer") {
>> >                # get the sequence identifier, the start, and the stop
>> >                my ($contig,$from,$to) = $ann->url =~
>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>> >                print "$contig\t$from\t$to\n";
>> >        }
>> > }
>> >
>> > ======================================
>> >
>> > So if it doesn't work for you, there are a few things you need to check:
>> > * what version of BioPerl are you using?
>> > * are you behind a firewall?
>> > * are you using a proxy?
>> > * do you need to submit username/password for either of the 2 above
>> > * turn on 'verbose' messages, it may help you debug
>> >
>> >
>> > If you're still having problems, get back to me and I'll see if I can help.
>> >
>> > --Russell
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>> >> To: Emanuele Osimo; bioperl-l
>> >> Subject: Re: [Bioperl-l] need help urgently
>> >>
>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>> >> that you sent :
>> >>
>> >>
>> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_datab
>> >> ase
>> >>
>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-
>> >> live/Bio/DB/EntrezGene.html
>> >>
>> >> use Bio::DB::EntrezGene;
>> >>
>> >>     my $db = Bio::DB::EntrezGene->new;
>> >>
>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>> >>
>> >>     # or ...
>> >>
>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>> >>     while ( my $seq = $seqio->next_seq ) {
>> >>           print "id is ", $seq->display_id, "\n";
>> >>     }
>> >>
>> >> This doesnt seem to work.
>> >>
>> >>
>> >> -Neeti
>> >> Even my blood says, B positive
>> >>
>> >>
>> >>
>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>> >> > Hello,
>> >> > have you tried this?
>> >> >
>> >>
>> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBan
>> >> k_when_you_have_genomic_coordinates
>> >> >
>> >> > Emanuele
>> >> >
>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com>
>> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I have an input list of gene names (can get gene ids from a local db
>> >> >> if required).
>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>> >> >> as to how this can be done using perl/bioperl?
>> >> >>
>> >> >> Any help will be deeply appreciated.
>> >> >>
>> >> >> Thanks.
>> >> >>
>> >> >> -Neeti
>> >> >> Even my blood says, B positive
>> >> >> _______________________________________________
>> >> >> Bioperl-l mailing list
>> >> >> Bioperl-l at lists.open-bio.org
>> >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >> >
>> >> >
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>

From cavin.wardcaviness at gmail.com  Sun Sep 13 22:25:51 2009
From: cavin.wardcaviness at gmail.com (Cavin Ward-Caviness)
Date: Sun, 13 Sep 2009 22:25:51 -0400
Subject: [Bioperl-l] Beginner Script Error
Message-ID: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>

I am very new to perl and bioperl and figured I'd start learning by trying
to run a simple script to get BLAST data.  Here is the code I am trying to
run

use Bio::Perl;

$seq = get_sequence('swiss',"ROA1_HUMAN");

# uses the default database - nr in this case
$blast_result = blast_sequence($seq);

write_blast(">roa1.blast",$blast_result);

Instead of creating a file of the blast results I get the following error
message
Bio::SeqIO: swiss cannot be found.
Exception
Msg: Failed to load module Bio::SeqIO:swiss

It seems as though I may simply be missing the proper module.  I am running
bioperl 1.5.9_4 installed using the Perl Package Manager from the
instructions on the bioperl wiki page.  If I am simply missing a module
please let me know which one it is - and any other helpful modules that
someone in the bioinformatics field is likely to use.

Thanks,
Cavin

From joseguillin at hotmail.com  Mon Sep 14 08:48:28 2009
From: joseguillin at hotmail.com (Jose .)
Date: Mon, 14 Sep 2009 13:48:28 +0100
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print
	$jcmatrix->print_matrix; 
Message-ID: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign;


My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, and is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                            -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                  -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

>A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

>B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

>C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

>D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
          print $jcmatrix;
instead of
          print $jcmatrix->print_matrix;
I get the memory reference, as expected: ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.


From maj at fortinbras.us  Mon Sep 14 13:00:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 14 Sep 2009 13:00:24 -0400
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
Message-ID: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ
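
A minimal sketch of how one might check what $jcmatrix actually holds before calling a method on it. It uses only the core Scalar::Util module; the helper name describe_value and the class name My::Matrix are made up for illustration and are not part of BioPerl. Perl reports "Can't call method ... on an undefined value" for undef and "... on unblessed reference" for a plain reference, so the ARRAY(0x...) output in the question suggests a plain array reference rather than undef.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Describe what kind of value a variable holds before calling a method on it.
sub describe_value {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return 'unblessed ' . ref($value) . ' reference' if ref($value);
    return 'plain scalar';
}

# Stand-in values; in the script from the question this would be $jcmatrix.
my $plain_array = [ [ 0, 0.1 ], [ 0.1, 0 ] ];
my $object      = bless { values => $plain_array }, 'My::Matrix';

print describe_value($plain_array), "\n";
print describe_value($object), "\n";
print describe_value(undef), "\n";
```

Only a value reported as an object can safely receive a method call such as print_matrix.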


From jason at bioperl.org  Mon Sep 14 13:54:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 14 Sep 2009 10:54:55 -0700
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
Message-ID: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>

Yeah, it seems like more of a BioPerl problem -- it's possible that the
older code didn't recognize 'jukes-cantor', but you can try the
abbreviation 'jc' -- better to just upgrade, though!
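
For reference, a rough pure-Perl sketch of what a Jukes-Cantor distance is, using the standard formula d = -3/4 ln(1 - 4p/3), where p is the proportion of differing sites. This is not the Bio::Align::DNAStatistics implementation; the function name jukes_cantor_distance and the choice to skip any column containing a gap are assumptions of this sketch.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned sequences of equal length.
# Columns where either sequence has a gap ('-') are skipped; this gap
# handling is an assumption of this sketch.
sub jukes_cantor_distance {
    my ($seq_a, $seq_b) = @_;
    die "sequences must have equal length\n"
        unless length($seq_a) == length($seq_b);

    my ($compared, $mismatches) = (0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $x = uc substr($seq_a, $i, 1);
        my $y = uc substr($seq_b, $i, 1);
        next if $x eq '-' || $y eq '-';
        $compared++;
        $mismatches++ if $x ne $y;
    }
    die "no comparable columns\n" unless $compared;

    my $p = $mismatches / $compared;       # proportion of differing sites
    die "distance undefined for p >= 0.75\n" if $p >= 0.75;
    return 0 if $p == 0;
    return -0.75 * log(1 - (4 / 3) * $p);  # d = -3/4 ln(1 - 4p/3)
}

# First 20 columns of sequences A and B from the original post.
my $d = jukes_cantor_distance('GGTTATCTCAACAACTGTCA', 'GGATATCTCGACAACTTTTA');
printf "Jukes-Cantor distance: %.4f\n", $d;
```

A hand calculation like this can serve as a sanity check on the values in the matrix once the script runs, bearing in mind that the library may treat gaps differently.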

This isn't the cause of the problem, but I would also encourage using
Bio::Matrix::IO to print the matrix (use the 'write_matrix'
function) rather than calling print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

> Hi Jose--
> I don't get any problem with your script as written. You should  
> upgrade to
> BioPerl 1.6 and try again.
> The "unblessed reference" is $jcmatrix. It may be undef for some  
> reason.
> MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Mon Sep 14 15:34:52 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 14 Sep 2009 15:34:52 -0400
Subject: [Bioperl-l] Beginner Script Error
In-Reply-To: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
References: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
Message-ID: <deaa866a0909141234p55341bcbhd4f713551180fed4@mail.gmail.com>

On 9/13/09, Cavin Ward-Caviness <cavin.wardcaviness at gmail.com> wrote:

> $seq = get_sequence('swiss',"ROA1_HUMAN");

Well, I haven't looked at the documentation or the source, but the
code I have that works and does something similar is:
             # database options include: Swissprot, EMBL, GenBank and RefSeq
            $seq_object = get_sequence('swissprot', $seqname);

I think the names have to match a specific string but may not need to be
case-sensitive.  The seqnames also tend to be specific to each database's
format, so my "general" fetch function catches exceptions and then tries
other databases if, for example, the name looks like a PDB identifier.
I'm not sure whether there is a library function that fetches a
"general" sequence based on the sequence name format.  Presumably one
could do something like this with some kind of "prioritized" list of
databases to go through, e.g. GenBank, EMBL, SwissProt, RefSeq, PDB,
JDB, JGI, Broad, NCBI, C. elegans, Drosophila, Yeast, and other
organism-specific databases.

It might be nice if there were a "general" BioPerl function that would
do this based on sequence name format, locality (fetch from the nearest
database), and up-to-dateness.  Ultimately one might like to have a kind
of sequence "rsync" function of the form UpdateSequence(SeqName, prefDb,
last-update-date, update-size, update-md5sum, ...) which would perform
inexpensive network-based updates for gene sets of interest.  I presume
that many sequence entries in active databases undergo periodic updates,
so one might be interested in weekly or monthly "local" db updates.
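
The "catch exceptions, then try the next database" idea could be sketched roughly as follows. This is an illustration only: fetch_with_fallback and the two stand-in fetchers are hypothetical, and in practice each code reference might wrap a call such as get_sequence() for one database.

```perl
use strict;
use warnings;

# Try each source in priority order; return the first successful result.
# Each source is [ name, code reference ]; a fetcher signals failure by
# dying or by returning undef.
sub fetch_with_fallback {
    my ($id, @sources) = @_;
    for my $source (@sources) {
        my ($name, $fetcher) = @$source;
        my $result = eval { $fetcher->($id) };
        return ($name, $result) if defined $result;
        warn "lookup of $id in $name failed, trying next source\n";
    }
    return;    # empty list: nothing found anywhere
}

# Hypothetical stand-in fetchers; real ones would query actual databases.
my @sources = (
    [ source_a => sub { die "not found\n" } ],
    [ source_b => sub { return "sequence for $_[0]" } ],
);

my ($db, $seq) = fetch_with_fallback('ROA1_HUMAN', @sources);
print "found in $db: $seq\n" if defined $db;
```

Keeping the priority list as data makes it easy to reorder the databases or add organism-specific ones without touching the loop.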

Robert

From robert.bradbury at gmail.com  Tue Sep 15 04:05:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Tue, 15 Sep 2009 04:05:22 -0400
Subject: [Bioperl-l] Genome scanning questions/strategies
Message-ID: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>

I have several applications which require scanning multiple genomes.  In some
cases I can get away with scanning the protein sequences; in other cases I
need to scan the mRNA or, in the worst case, the DNA sequences themselves.  I
have most of the available genomes on my hard drive, but where they
are incomplete or undergo frequent revisions, I may need to interface
with the GenBank | Ensembl | JGI (or other?) databases.

Some of the applications are basic counting statistics:
1) How many proteins?
2) How many amino acids in the proteins?
3) What are the species-specific codon frequencies?
4) What fraction of the genome is ncRNA, junk DNA, etc.?
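
As an illustration of the codon-frequency statistic in item 3, here is a small pure-Perl sketch. The function names and the toy sequence are made up; it assumes the input is a coding sequence already in frame, which a real genome scan would first have to extract from annotations.

```perl
use strict;
use warnings;

# Count codons in a coding sequence, reading in frame from the first base.
# Trailing bases that do not fill a complete codon are ignored.
sub codon_counts {
    my ($cds) = @_;
    my %count;
    $cds = uc $cds;
    for (my $i = 0; $i + 3 <= length($cds); $i += 3) {
        $count{ substr($cds, $i, 3) }++;
    }
    return \%count;
}

# Convert raw counts to relative frequencies.
sub codon_frequencies {
    my ($count) = @_;
    my $total = 0;
    $total += $_ for values %$count;
    return {} unless $total;
    return { map { $_ => $count->{$_} / $total } keys %$count };
}

my $counts = codon_counts('ATGGCCGCCTAA');    # made-up toy sequence
my $freq   = codon_frequencies($counts);
printf "%s\t%d\t%.2f\n", $_, $counts->{$_}, $freq->{$_} for sort keys %$counts;
```

Summing the count hashes over every coding sequence of a genome before converting to frequencies would give the species-level table.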

Other applications involve some functional analysis, e.g. find all specified
protein domains of interest (presumably some HMM matching or equivalent),
find all signal sequences (nuclear targeting, mitochondrial targeting, ER
targeting, etc.), find all mRNA restriction enzyme cut sites, etc.

Questions are:
1) Are there "remote" functions that use genome center "supercomputers"
(other than say Remote Blast) that can be used for some of these purposes
and are interfaced in some way to BioPerl?
2) Will I incur genome center wrath by running all my queries "remotely"
(i.e. I do the computing, but they handle the database retrieval & network
distribution)?  If not, what is a good "max query frequency"? [I'm on a DSL
line, so I can't push most servers very hard from an I/O standpoint.]

Finally, is there any "archive of experience" documenting the limits that
various information systems place on bioinformatics applications?
I.e., in terms of I/O and/or CPU requirements, is it the case that BLAST <
HMM-domain-searching < inter-genome-signal-scanning/matching?  This relates to
the question of when home-based bioinformaticians need to consider
switching from DSL to cable to FiOS, and whether 1/3/4/6/8-core machines or
clusters can handle the workload.

Thank you,
Robert Bradbury

From neetisomaiya at gmail.com  Tue Sep 15 04:29:02 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 15 Sep 2009 13:59:02 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
	<764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
Message-ID: <764978cf0909150129s69817921j82a9ca112aefe7ae@mail.gmail.com>

When I use Bio::DB::EntrezGene and EUtilities, the accession and
sequence returned for a gene are those of the second accession
listed under "Genome Reference Consortium Human Build 37 Primary
Assembly". For example, for Entrez Gene ID 3630, the code returns
accession NT_009237.18, but I actually want the sequence of
the first accession, i.e. NC_000011.9.

Please let me know how I could get that. Any help would be great.

-Neeti
Even my blood says, B positive


On Mon, Sep 14, 2009 at 1:52 PM, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> Thanks a lot. This works for me.
>
> I need help with one more thing: can you point me to where exactly we can find
> the link to this FASTA sequence, that we are retrieving here through
> the code, in its actual entry in Entrez Gene in the NCBI website
> (http://www.ncbi.nlm.nih.gov/sites/entrez)
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
>> That bit of code gave you the accession, start and end for the sequence so you just needed to download it.
>> Bio::DB::Eutilities can do that for you.
>>
>> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>>
>>
>>
>> --Russell
>>
>> ==================
>> #!perl -w
>>
>> use strict;
>> use Bio::DB::EntrezGene;
>> use Bio::DB::EUtilities;
>>
>> no warnings 'deprecated';
>>
>> my $id = shift or die "Id?\n"; # use a Gene id
>>
>> my $db = new Bio::DB::EntrezGene;
>> #$db->verbose(1);
>> my $seq = $db->get_Seq_by_id($id);
>>
>> my $ac = $seq->annotation;
>>
>> for my $ann ($ac->get_Annotations('dblink')) {
>>     if ($ann->database eq "Evidence Viewer") {
>>         # get the sequence identifier, the start, and the stop
>>         my ($acc, $from, $to) = $ann->url =~
>>             /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>         print "$acc\t$from\t$to\n";
>>
>>         # retrieve the sequence
>>         my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>>                                                -db      => 'nucleotide',
>>                                                -rettype => 'fasta');
>>         $fetcher->set_parameters(-id        => $acc,
>>                                  -seq_start => $from,
>>                                  -seq_stop  => $to,
>>                                  -strand    => 1);
>>         my $seq = $fetcher->get_Response->content;
>>         print $seq;
>>     }
>> }
>>
>> ======================
>>
>>> -----Original Message-----
>>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>>> To: Smithies, Russell
>>> Cc: Emanuele Osimo; bioperl-l
>>> Subject: Re: [Bioperl-l] need help urgently
>>>
>>> I actually want the nucleotide sequence of the gene. I thought the
>>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>>> then the seq method on that $seq_obj->seq() will give me the actual
>>> genomic nucleotide sequence of the gene. But this doesnt happen. I am
>>> able to print gene symbol using $seq_obj->display_id and able to do
>>> other things, but I wanted the gene nucleotide sequence.
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>>> > This example code from the wiki _definitely_ works:
>>> >
>>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> > =========================================
>>> >
>>> > use strict;
>>> > use Bio::DB::EntrezGene;
>>> >
>>> > my $id = shift or die "Id?\n"; # use a Gene id
>>> >
>>> > my $db = new Bio::DB::EntrezGene;
>>> > $db->verbose(1); ###
>>> >
>>> > my $seq = $db->get_Seq_by_id($id);
>>> >
>>> > my $ac = $seq->annotation;
>>> >
>>> > for my $ann ($ac->get_Annotations('dblink')) {
>>> >        if ($ann->database eq "Evidence Viewer") {
>>> >                # get the sequence identifier, the start, and the stop
>>> >                my ($contig,$from,$to) = $ann->url =~
>>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>> >                print "$contig\t$from\t$to\n";
>>> >        }
>>> > }
>>> >
>>> > ======================================
>>> >
>>> > So if it doesn't work for you, there are a few things you need to check:
>>> > * what version of BioPerl are you using?
>>> > * are you behind a firewall?
>>> > * are you using a proxy?
>>> > * do you need to submit username/password for either of the 2 above
>>> > * turn on 'verbose' messages, it may help you debug
>>> >
>>> >
>>> > If you're still having problems, get back to me and I'll see if I can help.
>>> >
>>> > --Russell
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>>> >> To: Emanuele Osimo; bioperl-l
>>> >> Subject: Re: [Bioperl-l] need help urgently
>>> >>
>>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>>> >> that you sent :
>>> >>
>>> >>
>>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>>> >>
>>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>>> >>
>>> >> use Bio::DB::EntrezGene;
>>> >>
>>> >>     my $db = Bio::DB::EntrezGene->new;
>>> >>
>>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>>> >>
>>> >>     # or ...
>>> >>
>>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>>> >>     while ( my $seq = $seqio->next_seq ) {
>>> >>           print "id is ", $seq->display_id, "\n";
>>> >>     }
>>> >>
>>> >> This doesnt seem to work.
>>> >>
>>> >>
>>> >> -Neeti
>>> >> Even my blood says, B positive
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>> >> > Hello,
>>> >> > have you tried this?
>>> >> >
>>> >>
>>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>>> >> >
>>> >> > Emanuele
>>> >> >
>>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com>
>>> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I have an input list of gene names (can get gene ids from a local db
>>> >> >> if required).
>>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>>> >> >> as to how this can be done using perl/bioperl?
>>> >> >>
>>> >> >> Any help will be deeply appreciated.
>>> >> >>
>>> >> >> Thanks.
>>> >> >>
>>> >> >> -Neeti
>>> >> >> Even my blood says, B positive
>>> >> >
>>> >> >
>>> >
>>
>

From cjfields at illinois.edu  Tue Sep 15 15:07:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 15 Sep 2009 14:07:40 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
	<4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
Message-ID: <DE7BC2E3-F983-447F-86AD-34BFEA3B232A@illinois.edu>

I don't see an update to Bio::Phylo on CPAN yet, so I'm assuming we  
will leave Nexml off the 1.6.1 alpha for now.  I'll likely be  
releasing it later today or tomorrow to CPAN.

chris

On Sep 8, 2009, at 10:43 AM, Chris Fields wrote:

> Mark
>
> We can hold it in trunk until the next point release or we start  
> splitting things off (whichever is first).
>
> I have a little more time, though, and I'm thinking it would be a  
> good idea to get the Nexml code into the wild (sooner than later)  
> for users to test out.  Let's see if Rutger responds.
>
> chris
>
> On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:
>
>> I agree with Hilmar-- I have no problem keeping it in the trunk for  
>> a while
>> longer, as I have an addition for dealing with arbitrary non-seq
>> data using the Population API sitting in bioperl-dev that's nearly
>> ready, but prob. not before cjf wants to get the release out.
>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
>> >
>> To: "Hilmar Lapp" <hlapp at gmx.net>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com 
>> >
>> Sent: Tuesday, September 08, 2009 9:15 AM
>> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>>
>>
>>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>>
>>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>>> CPAN release anyway, so unless those are unstable that seems  
>>>> like  the easiest fix to me.
>>>
>>> My thought as well, just not sure how stable that code is right  
>>> now. Bio::Phylo has been in RC for a while now, correct?
>>>
>>>> If the Nexml code works against not yet stable updates to   
>>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>>
>>> Right.  That should be sorted out first.
>>>
>>> I can wait a bit longer for Rutger to respond; there are a few  
>>> other odds and ends that can be worked on in the meantime.  I
>>> would like  to get the alpha out soon and 1.6.1 in the next week  
>>> or so though.
>>>
>>> chris
>>>
>>>> -hilmar
>>>>
>>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest  
>>>>> from Bio::Phylo  svn:
>>>>>
>>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>>
>>>>> At this moment only the Bio::Phylo code from svn is working  
>>>>> with  BioPerl's Nexml modules.  From my local tests Bio::Phylo  
>>>>> 1.6  appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue  
>>>>> (again, all tests  fail).  The problem: CPAN will always install  
>>>>> 1.6 (the others are  RC, so they won't be installed unless the  
>>>>> full path is used).  Even  so, nothing on CPAN even works; one  
>>>>> must use the latest Bio::Phylo  SVN code.
>>>>>
>>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>>> right  now, unless one of the following occurs:
>>>>>
>>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>>> skip  any Nexml-related tests unless proper version is present  
>>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>>> 3) push Nexml into its own distribution (something we were
>>>>> planning on anyway with a number of modules)
>>>>>
>>>>> As for #3 above, I think it probably belongs in a larger  
>>>>> bioperl- phylo as Mark had previously proposed.  I'm open to  
>>>>> just about any  solution.
>>>>>
>>>>> chris
>>>>
>>>> -- 
>>>> ===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>>
>>>>
>>>>
>>>
>>>
>>
>


From cjfields at illinois.edu  Wed Sep 16 08:55:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 07:55:56 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
Message-ID: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>

Bill, George,

It's worth clarifying the docs on these and adding a TODO for them  
(and test cases!), but I tend to agree.  I believe, re: version, we  
can possibly use Bio::DB::SeqVersion to grab the right one, but it'll  
need further investigation.

As for generic accession w/o version, efetch does support it but it  
does have problems (pulling up more than one sequence in rare cases,  
for instance).
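
One way to approach a general-purpose lookup that copes with identifiers with or without a version suffix is to inspect the identifier first. The sketch below only decides which Bio::DB::GenBank method name applies; the helper names split_accession and method_for are hypothetical, and the pattern assumes the version is a trailing ".N".

```perl
use strict;
use warnings;

# Split an identifier such as 'J00522.1' into accession and version.
# The version is undef when no '.N' suffix is present.
sub split_accession {
    my ($id) = @_;
    my ($acc, $version) = $id =~ /^([^.\s]+)(?:\.(\d+))?$/
        or die "unrecognised identifier '$id'\n";
    return ($acc, $version);
}

# Pick the Bio::DB::GenBank method name that matches the identifier.
sub method_for {
    my ($id) = @_;
    my (undef, $version) = split_accession($id);
    return defined $version ? 'get_Seq_by_version' : 'get_Seq_by_acc';
}

for my $id ('J00522.1', 'J00522') {
    printf "%-10s -> %s\n", $id, method_for($id);
}
```

With a database handle in $db, the chosen name could then be invoked dynamically as $db->$method($id), since Perl allows a method name held in a scalar.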

chris

On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:

> I would like to make a few comments about get_Seq_by_version and
> get_Seq_by_acc. Although both functions use the same NCBI eUtils  
> API, they
> are interpreted differently for a Seq_id with version or without  
> version.
>
> 1. If the Seq_id has a version, GenBank ID server will locate
> corresponding GI and emit the correct sequence.
> 2. If the Seq_id does not have a version, GBDataLoader  will try to  
> find
> the latest version number for that Seq_id, which is relatively  
> slower and
> the version number the ID server find out may NOT always be the  
> latest.
>
> IMHO, for both efficiency and consistency,
> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>
> Bill
>
>
>>
>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>> functionally identical.  They seem to trickle down to the same place
>> and walking through these two requests yields almost identical http
>> requests:
>>
>>  $db->get_Seq_by_version('J00522.1')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>
>>  $db->get_Seq_by_acc('J00522')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>
>> The only difference that I can see is that they index into different
>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>> sections contain the same information.
>>
>> I'd like a general purpose tool that does The Right Thing whether
>> there's a .1 on the end of an identifier or not, and am just trying  
>> to
>> make sure I'm not doing something troublesome.
>>
>> Am I correct about the above?
>>
>> While I'm at it, I think that the comment
>>
>>  # note that get_Stream_by_version is not implemented
>>
>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>
>>  $self->throw(...)
>>
>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>
>> I'll happily commit the trivial doc fix if no one shoots down the
>> idea. (can't help big, might as well help small...).
>>
>> Thanks,
>>
>> g.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 16 09:22:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 08:22:00 -0500
Subject: [Bioperl-l] Genome scanning questions/strategies
In-Reply-To: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
References: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
Message-ID: <8674BA8B-ACCC-4C7D-989E-3532C0659A3F@illinois.edu>

On Sep 15, 2009, at 3:05 AM, Robert Bradbury wrote:

> I have several applications which require scanning multiple genomes,  
> in some
> cases I can get away with scanning the protein sequences, in other  
> cases I
> need to scan the mRNA, or in the worst case the DNA sequences  
> themselves.  I
> have most of the available genomes on my hard drive but in cases  
> where they
> are not complete or undergo frequent revisions, I may need to  
> interface
> through the Genbank | Ensembl | JGI (or other?) databases.
>
> Some of the applications are basic counting statistics:
> 1) How many proteins?
> 2) How many amino acids in the proteins?
> 3) What are the species specific codon frequencies in the codons?
> 4) What fraction of the genome is ncRNA, junk DNA, etc.?
>
> Other applications involve some functional analysis, e.g. find all  
> specified
> protein domains of interest (presumably some HMM matching or  
> equivalent),
> find all signal sequences (nuclear targeting, mitochondrial  
> targeting, ER
> targeting, etc.), find all mRNA restriction enzyme cut sites, etc..
>
> Questions are:
> 1) Are there "remote" functions that use genome center  
> "supercomputers"
> (other than say Remote Blast) that can be used for some of these  
> purposes
> and are interfaced in some way to BioPerl?

Re: remote tasks, there are a few tools for that.  See  
Bio::Tools::Analysis modules for ones that access remote servers, or  
the HOWTO:

http://www.bioperl.org/wiki/HOWTO:Simple_web_analysis

Setting up modules for these services can be risky, though, as we have  
no control over the continued evolution of the remote servers in  
question.  For instance, we had a set of Pise modules (around 100 I  
think) for remotely accessing services at any Pise server; however,  
these are now obsolete in favor of Mobyle.  I have long thought of  
setting something up to interface with either that service or Galaxy  
(which may be a more stable alternative), just haven't had the time.

Re databases: we have access to NCBI, EMBL, UniProt, and many others.   
NCBI eutils are available via Bio::DB::EUtilities.  You can use the  
Ensembl perl API for accessing Ensembl (including Compara and others),  
and Mark Jensen added Bio::DB::HIV for accessing HIV database  
information at LANL HIV Sequence Database.  These were all working  
with bioperl 1.6 last I tried (ensembl's API is separate and available  
from their website).

We don't have much beyond that, primarily b/c most other centers are  
very particular when queried remotely and will block IPs that spam  
their servers w/o an adequate timeout.  That's completely  
understandable from a webadmin perspective (think: possible denial of  
service attack).

> 2) Will I incur genome center wrath by running all my queries  
> "remotely"
> (i.e. I do the computing, but they handle the database retrieval &  
> network
> distribution)?  If not, what is a good "max query frequency"? [I'm  
> on a DSL
> line, so I can't push most servers very hard from an I/O standpoint.]

You may, if you query faster than a site's requested rate.  UCSC and  
NCBI have both been known to block IPs, but the limits differ widely  
(NCBI just reduced theirs to three queries per second, whereas I last  
heard UCSC's was one query per 30 seconds).

The best thing to do is check the documentation for the site in  
question or contact the webadmin to see if there is a requested  
timeout period.
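
A minimal sketch of client-side throttling, in plain Perl with core
modules only. The make_throttle helper, the placeholder identifiers, and
the one-third-second interval are illustrative assumptions, not part of
BioPerl or of any site's published policy; substitute whatever delay the
site in question documents.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # core module; allows fractional seconds

# Return a closure that blocks until at least $min_interval seconds have
# passed since its previous call. (Hypothetical helper, not a BioPerl API.)
sub make_throttle {
    my ($min_interval) = @_;
    my $last = 0;
    return sub {
        my $wait = $last + $min_interval - time();
        sleep($wait) if $wait > 0;
        $last = time();
        return;
    };
}

# Assumed limit of roughly three requests per second.
my $throttle = make_throttle(1 / 3);
for my $id (qw(id1 id2 id3)) {    # placeholder identifiers
    $throttle->();
    # ... issue one remote query for $id here ...
}
```

Calling the closure before every request keeps the pacing logic in one
place, so the delay can be changed without touching the query code.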

> Finally, is there any "archive of experience" documenting the various
> information systems limitations on various bioinformatics  
> applications?
> I.e. for I/O requirements and/or CPU requirements, is: BLAST <
> HMM-domain-searching < Inter-genome-signal-scanning/matching?   
> Relates to
> the question of when home based bioinformaticians need to begin  
> considering
> switching from DSL to Cable to FIOS and/or 1/3/4/6/8 core machines/ 
> clusters
> can handle the workload.
>
> Thank you,
> Robert Bradbury

On that I'm not sure, but I would tend to think they don't want you  
taxing their local servers so there probably is some prioritization of  
tasks.

 From my perspective, if I were a home-based bioinformatician I would  
look seriously at cloud computing for most high-end tasks (Mark has  
even set up one for bioperl, bioperl-max).  It has a cost but it's  
very reasonable considering the cost of setting up a local cluster,  
maintenance and repairs, etc.  In fact, we have been putting serious  
thought into testing that direction instead of putting money into  
another high-cost local cluster, which is obsolete in, say, 3-4 years,  
or when we're getting Blue Waters in a couple years.

chris


From jajams at utu.fi  Wed Sep 16 06:04:18 2009
From: jajams at utu.fi (Joonas Jämsen)
Date: Wed, 16 Sep 2009 13:04:18 +0300
Subject: [Bioperl-l] problem with a script
Message-ID: <fb44a91e1ccd0.4ab0e252@utu.fi>

Hi,

I'm trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."


#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use Bio::SearchIO
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $in->next_result ) {
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp-evalue<=10 ) {
             while ( my $hsp = $hit->next_hsp ) {
                 print $hit->accession(), "\n";
         }
     }
 }

Could someone tell me what is wrong?

Thanks.


From maj at fortinbras.us  Wed Sep 16 11:18:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 11:18:26 -0400
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <A9C32C43FB5C46FD9DC320A4DD325104@NewLife>

Hi Joonas-- 

Put a semicolon after "use Bio::SearchIO" in line 2.
If that doesn't work, then the error suggests that $searchio is undefined 
because the parser failed for some reason.
You could try
 my $searchio = Bio::SearchIO->new(-format  => 'hmmer',
                   -file    => '/wrk/xxxx/hmm/hmmsearch_nr.out',
                   -verbose => 1);
to get more detailed error messages; they may direct you to the issue.

cheers MAJ



From Kevin.M.Brown at asu.edu  Wed Sep 16 11:16:51 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 16 Sep 2009 08:16:51 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <1A4207F8295607498283FE9E93B775B4063D4CB6@EX02.asurite.ad.asu.edu>

That's because the variable $in isn't defined, just like the error says. You are setting $searchio to be your input object, but not using it.

#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use strict; #<-- this helps to find those pesky undeclared variables
use Bio::SearchIO;
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $searchio->next_result ) { # <-- changed this line
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp = $hit->next_hsp ) {
             # the e-value test filters each HSP, so it is an "if", not a "while"
             print $hit->accession(), "\n" if $hsp->evalue <= 10;
         }
     }
 }


Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University  



From rmb32 at cornell.edu  Wed Sep 16 11:05:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 08:05:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <4AB0FEAC.50104@cornell.edu>

1.) You need to use strict.  Always have use strict at the top of your 
code.  That would have caught this error.
2.) The proximate problem here is that your searchio object is called 
$searchio, while you are calling $in->next_result.  You want 
$searchio->next_result instead.

Rob



From bill at genenformics.com  Wed Sep 16 13:22:56 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Wed, 16 Sep 2009 10:22:56 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
Message-ID: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>


>
> As for generic accession w/o version, efetch does support it but it
> does have problems (pulling up more than one sequence in rare cases,
> for instance).
>

This is probably because NCBI ID servers are not completely synchronized
or are in the process of synchronization. get_Seq_by_acc is not as safe as
other functions.

Bill



From cjfields at illinois.edu  Wed Sep 16 13:29:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 12:29:40 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
	<6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
Message-ID: <B293F929-5714-4840-8FAD-7366F7C36137@illinois.edu>


On Sep 16, 2009, at 12:22 PM, bill at genenformics.com wrote:

>
>>
>> As for generic accession w/o version, efetch does support it but it
>> does have problems (pulling up more than one sequence in rare cases,
>> for instance).
>>
>
> This is probably because NCBI ID servers are not completely  
> synchronized
> or are in the process of synchronization. get_Seq_by_acc is not as  
> safe as
> other functions.
>
> Bill

Right, but unfortunately it's necessary as the default in most cases  
is to grab/display the accession, not the UID.  For instance, BLAST  
output must be specifically flagged to display the GI.

This is an instance where documentation would be a good idea to  
indicate the problem.  I think I have done that but I'll double-check.
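
The preference order discussed in this thread (GI, then versioned
accession, then bare accession) can be sketched as a small dispatcher.
This is only an illustration: lookup_method_for is a hypothetical helper,
and the two patterns are simplifying assumptions about what identifiers
look like, not rules taken from NCBI or from Bio::DB::GenBank.

```perl
use strict;
use warnings;

# Pick the name of the Bio::DB::GenBank lookup that suits an identifier.
# Returns a method name only; no network access happens here.
sub lookup_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /^\d+$/;    # all digits: treat as a GI
    return 'get_Seq_by_version' if $id =~ /\.\d+$/;   # trailing ".N": versioned
    return 'get_Seq_by_acc';                          # otherwise: bare accession
}

for my $id (qw(J00522.1 J00522)) {
    printf "%-10s -> %s\n", $id, lookup_method_for($id);
}
```

With a real database handle this could be used as
"my $method = lookup_method_for($id); my $seq = $db->$method($id);",
where $db is assumed to be a Bio::DB::GenBank object.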

chris

From rmb32 at cornell.edu  Wed Sep 16 15:04:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:04:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB1356D.4050307@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi>
Message-ID: <4AB136B0.6050304@cornell.edu>

You should also 'use warnings' at the top of all code.  That would have 
caught THIS error.

You are missing a comma after ....nr.out'
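
A short sketch of why the missing comma produces "Could not open 0". It
uses plain Perl lists rather than Bio::SearchIO, so the variable names
and the warning handler are illustrative only.

```perl
use strict;
use warnings;

my $path = '/wrk/xxxx/hmm/hmmsearch_nr.out';

# Collect warnings instead of printing them, so they can be counted below.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# Missing comma: Perl parses "$path -verbose" as the subtraction
# $path - 'verbose'. Both strings are 0 in numeric context, so the
# value paired with -file becomes 0.
my @broken = (-file => $path -verbose => 1);

# With the comma, -verbose starts a new key/value pair.
my @fixed = (-file => $path, -verbose => 1);

print "broken: @broken\n";
print "fixed:  @fixed\n";
printf "warnings raised: %d\n", scalar @warnings;
```

The "isn't numeric" warnings collected here are what 'use warnings' would
have printed for the original script.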

Rob

Joonas Jämsen wrote:
> Thanks. I'm still getting errors. I have no idea what the error means. It 
> says:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Could not open 0: No such file or directory
> STACK: Error::throw
> STACK: Bio::Root::Root::throw 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/Root.pm:357 
> 
> STACK: Bio::Root::IO::_initialize_io 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:310 
> 
> STACK: Bio::Root::IO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:223 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:145 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:177 
> 
> STACK: parser.pl:7
> -----------------------------------------------------------
> 
> And the code I'm using seems ok now:
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> 
> use strict;
> use Bio::SearchIO;
> 
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out' -verbose=>1);
> while ( my $result = $searchio->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp = $hit->evalue<=10 ) {
>                 while ( my $hsp = $hit->next_hsp ) {
>                         print $hit->accession(), "\n";
>             }
>         }
>     }
> }
> 
> -J.
> 


-- 
Robert Buels
Bioinformatics Analyst, Sol Genomics Network
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca, NY  14853
Tel: 503-889-8539
rmb32 at cornell.edu
http://www.sgn.cornell.edu

From rmb32 at cornell.edu  Wed Sep 16 15:23:27 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:23:27 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB13864.6070707@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi> <4AB136B0.6050304@cornell.edu>
	<4AB13864.6070707@utu.fi>
Message-ID: <4AB13B2F.5060502@cornell.edu>

Your report may not have accessions, try using name() instead of 
accession().


From abhishek.vit at gmail.com  Wed Sep 16 16:13:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 16:13:33 -0400
Subject: [Bioperl-l] About FASTQ parser
Message-ID: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>

Hi Chris

I remember seeing a recent email about the new bioperl fastq parser. Is it
part of the bioperl 1.6 dist? I installed it, and based on the doc
here (http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
I am a bit lost.

I see two approaches there: using Bio::SeqIO::fastq and using
Bio::Seq::Quality. Do both return the same data, with the latter
just being faster?

No offense to any developer, but small examples in the HOWTOs
help a lot.

The current example (copied below) is not working. I guess it is based
on a previous version of the code.

# grabs the FASTQ parser, specifies the Illumina variant
  my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
                           -file      => 'mydata.fq');


My basic requirement is to read each record in a fastq file and split it
into header, read, and quality.
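
For the header/read/quality split itself, here is a minimal plain-Perl
sketch. It deliberately does not use the Bio::SeqIO fastq parser (whose
options are not shown in this message), and it assumes unwrapped
four-line records; next_fastq_record and the sample data are
hypothetical.

```perl
use strict;
use warnings;

# Read one four-line FASTQ record from a filehandle.
# Returns (header, read, quality), or an empty list at end of input.
sub next_fastq_record {
    my ($fh) = @_;
    my $header = <$fh>;
    return unless defined $header;
    my $read = <$fh>;
    my $plus = <$fh>;
    my $qual = <$fh>;
    die "truncated FASTQ record\n" unless defined $qual;
    chomp($header, $read, $plus, $qual);
    die "bad header line: $header\n"     unless $header =~ s/^\@//;
    die "bad separator line: $plus\n"    unless $plus =~ /^\+/;
    die "read/quality length mismatch\n" unless length($read) == length($qual);
    return ($header, $read, $qual);
}

# In-memory demo data; for a real file use open(my $fh, '<', 'mydata.fq').
my $example = "\@read1\nACGTN\n+\nIIII#\n\@read2\nGGCA\n+\nHHHH\n";
open(my $fh, '<', \$example) or die "cannot open in-memory file: $!";
while (my ($header, $read, $qual) = next_fastq_record($fh)) {
    print join("\t", $header, $read, $qual), "\n";
}
close $fh;
```

The quality string is returned as raw characters; converting it to
numeric scores depends on the FASTQ variant (Sanger, Illumina, ...) and
is left out here.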


Thanks,
-Abhi

From abhishek.vit at gmail.com  Wed Sep 16 17:41:50 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 17:41:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>

Hi All

I am not able to think of a smart way to do sequence matching that
allows a user-defined number of mismatches.

For example:

Given the sequence AGCT, a read is considered a match to the reference
if at most one position (1, 2, 3, or 4) mismatches, i.e. is any of
[ACGTN]. So the possible matches would be

This is for position 1.
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each position.

Is there a nice regular expression for this? One way I could think of
was to generate all the possible tags for a given sequence and then do
the matching, but that will be computationally expensive for a long
dataset. Any neat method?

Thanks,
-Abhi

From maj at fortinbras.us  Wed Sep 16 18:33:00 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 22:33:00 +0000
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <W403682148491321253140380@webmail21>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ




From Russell.Smithies at agresearch.co.nz  Wed Sep 16 19:06:45 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 11:06:45 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>

How about chunking it into overlapping words, skipping any word with two or more Ns, then applying a regex?

$seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";

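$motif = "ACGT";
# build a pattern that accepts the motif base or N at each position
foreach (split //, $motif) {$w .= "[${_}N]"}

# the lookahead yields every overlapping 4-letter word in $seq
foreach ($seq =~ /(?=(\w{4}))/g){
  next if tr/N/N/ >= 2;            # skip words with two or more Ns
  print "$_\n" if  eval "/$w/" ;   # match the word against the pattern
}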




From maj at fortinbras.us  Wed Sep 16 18:30:50 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 18:30:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <1B8182A0898B452D80EA6035A178B7CE@NewLife>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ


From abhishek.vit at gmail.com  Wed Sep 16 21:39:13 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 21:39:13 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>

Hi Russell

Thanks for the quick reply. However, I do not follow the code or the
reasoning behind it.

Will this work for matching AGCT to ACCT | ANCT | AACT? It didn't give
me the expected output when I ran it. I am more interested in
understanding the logic.

It would be great if you could expand on it a bit more.


Also, if I do it the brute-force way, as suggested to me by a friend,
how will that scale?

@dna1=split(//,$a);
@dna2=split(//,$b);
$x=0;
for($i=0;$i<@dna1;$i++){
        if ($dna1[$i] ne $dna2[$i]){
                        $x++;
        }
}

if($x<=1){
        print "RESULT: your sequence is true\n";
}

else { print " RESULT: your sequence is false\n";}
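
On the scalability question, one commonly used Perl idiom is to XOR the
two strings and count the non-NUL bytes, which avoids splitting into
arrays. The sketch below is an illustration only; mismatches and
fuzzy_positions are hypothetical helper names, and the test sequence is
made up.

```perl
use strict;
use warnings;

# Count mismatching positions between two equal-length strings.
# String XOR gives a NUL byte wherever the characters agree;
# tr/\0//c counts the bytes that are not NUL.
sub mismatches {
    my ($x, $y) = @_;
    die "length mismatch\n" unless length($x) == length($y);
    my $xor = $x ^ $y;
    return ($xor =~ tr/\0//c);
}

# Return every 1-based offset in $seq where $motif occurs with at most
# $max mismatches.
sub fuzzy_positions {
    my ($seq, $motif, $max) = @_;
    my $len = length $motif;
    my @hits;
    for my $i (0 .. length($seq) - $len) {
        push @hits, $i + 1
            if mismatches(substr($seq, $i, $len), $motif) <= $max;
    }
    return @hits;
}

print join(',', fuzzy_positions('TTAGCTTTACCTTT', 'AGCT', 1)), "\n";
```

The scan still does work proportional to sequence length times motif
length, but the per-window comparison runs inside Perl's string
operators rather than in a Perl-level loop over characters.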

Thanks,
-Abhi




From Russell.Smithies at agresearch.co.nz  Wed Sep 16 21:46:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 13:46:54 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>

I misread your question; my example will match NGCT, ANCT, AGNT, or AGCN with 1 mismatch (or NGNT, NGCN, ANNT, ANCN etc. with 2 mismatches).
The eval is just doing a regex on the match string created by the loop: "[AN][GN][CN][TN]".
If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.
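
As a rough sketch of the brute-force idea (plain Perl, nothing BioPerl-specific; the sub names one_mismatch_regex and find_hits are made up for illustration), one alternative per position can be joined into a single compiled regex that allows any base, not just N, at exactly one position:

```perl
use strict;
use warnings;

# Compile a regex that matches $motif with at most one substituted base.
# For "AGCT" the pattern is (?:[ACGTN]GCT|A[ACGTN]CT|AG[ACGTN]T|AGC[ACGTN]).
sub one_mismatch_regex {
    my ($motif) = @_;
    my @alternatives;
    for my $i (0 .. length($motif) - 1) {
        push @alternatives,
            quotemeta(substr($motif, 0, $i))
            . '[ACGTN]'
            . quotemeta(substr($motif, $i + 1));
    }
    my $pattern = join '|', @alternatives;
    return qr/(?:$pattern)/;
}

# Return every overlapping window of $seq that matches $re.
# The lookahead keeps each match zero-width, so overlapping hits are found.
sub find_hits {
    my ($seq, $re) = @_;
    my @hits = $seq =~ /(?=($re))/g;
    return @hits;
}

my $re = one_mismatch_regex('AGCT');
print "$_\n" for find_hits('TTAGCTAACTGG', $re);   # prints AGCT, then AACT
```

The number of alternatives grows quickly once more than one mismatch is allowed, so this is only reasonable for short words and one (perhaps two) mismatches.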


> -----Original Message-----
> From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
> Sent: Thursday, 17 September 2009 1:39 p.m.
> To: Smithies, Russell
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Allowing One error in Sequence matching
> 
> Hi Russell
> 
> Thanks for a quick reply. However I am not following the code clearly
> and the reason behind it.
> 
> Will this work for matching AGCT to ACCT | ANCT | AACT? It didn't give
> me the expected output when I ran it. I am more interested in
> understanding the logic.
> 
> It would be great if you could expand a bit more.
> 
> 
> Also, if I do it the brute-force way, as suggested to me by a friend, how
> will that work in terms of scalability?
> 
> @dna1=split(//,$a);
> @dna2=split(//,$b);
> $x=0;
> for($i=0;$i<@dna1;$i++){
>         if ($dna1[$i] ne $dna2[$i]){
>                         $x++;
>         }
> }
> 
> if($x<=1){
>         print "RESULT: your sequence is true\n";
> }
> 
> else { print " RESULT: your sequence is false\n";}
> 
> Thanks,
> -Abhi
> 
> 
> On Wed, Sep 16, 2009 at 7:06 PM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
> > How about chunk it into overlapping words, skip if >2 N, then regex?
> >
> > $seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";
> >
> > $motif = "ACGT";
> > foreach (split //, $motif) {$w .= "[${_}N]"}
> >
> > foreach ($seq =~ /(?=(\w{4}))/g){
> >   next if tr/N/N/ >= 2;
> >   print "$_\n" if eval "/$w/";
> > }


From abhishek.vit at gmail.com  Wed Sep 16 23:12:20 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 23:12:20 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909162012m5b18bc78u477e15957c88a45d@mail.gmail.com>

Thanks Russell.

I think having an "approx matching" method in bioperl will help,
especially with NGS data where read matching with 1/2/3/4 errors is
sometimes needed.
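
For what it's worth, a minimal sketch of such a check in plain Perl (this is not an existing BioPerl method; the sub names mismatches and approx_match are hypothetical, and only substitutions are counted, not insertions or deletions):

```perl
use strict;
use warnings;

# Count mismatching positions between two equal-length strings.
# XOR-ing the strings yields "\0" wherever the characters agree,
# so counting the non-"\0" characters gives the Hamming distance.
sub mismatches {
    my ($read, $ref) = @_;
    die "Sequences differ in length\n" unless length($read) == length($ref);
    my $xor = $read ^ $ref;
    return ($xor =~ tr/\0//c);
}

# True if $read matches $ref with at most $max_mismatches substitutions.
sub approx_match {
    my ($read, $ref, $max_mismatches) = @_;
    return mismatches($read, $ref) <= $max_mismatches;
}

print approx_match('ACCT', 'AGCT', 1) ? "match\n" : "no match\n";
```

Note that an N in the read is treated like any other mismatch here; whether N should count as an error is a design decision for the caller.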

Cheers,
-Abhi




From cjfields at illinois.edu  Thu Sep 17 00:39:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 23:39:03 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
Message-ID: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>

Abhi,

The FASTQ parser hasn't been released to CPAN yet.  It is available  
via bioperl-live.  We haven't added any code yet to the HOWTO's, but  
the SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get  
you started.

Bio::Seq::Quality is the object returned via next_seq(); it can be  
queried for PHRED qual scores and other bits.  If you want to split  
things up you should call next_seq(), then generate a FASTQ output  
stream in the variant you want:

my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>fasta.file');
my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>qual.file');

while (my $seq = $in->next_seq) {
    $outfasta->write_fasta($seq);
    $outqual->write_qual($seq);
}

Note I haven't tested that yet, but it should work.  Let me know if it  
doesn't.

chris

On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I remember seeing a recent email about new bioperl fastq parser. Is it
> part of bioperl 1.6 dist. I installed one and based on the doc
> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html 
> )
> I am a bit lost.
>
> I see two methods there : using Bio::SeqIO::fastq and
> Bio::Seq::Quality. Are both same in terms of data returned and latter
> giving a scale up in speed ?
>
> This is not to offend any developer but small example/s on the HOWTO's
> helps a lot.
>
> The current example (copied below) is not working. I guess it is based
> on a previous version of code.
>
> # grabs the FASTQ parser, specifies the Illumina variant
> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>                          -file      => 'mydata.fq');
>
>
> My basic requirement is to read each read in fastq record and split it
> into header: read: quality.
>
>
> Thanks,
> -Abhi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Thu Sep 17 00:44:28 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 00:44:28 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909162144g3177f718nf239327e98bd30c2@mail.gmail.com>

Thanks for the quick info Chris.

Cheers,
-Abhi



From amackey at virginia.edu  Thu Sep 17 06:52:31 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 17 Sep 2009 06:52:31 -0400
Subject: [Bioperl-l] Question concerning IUPAC.pm
In-Reply-To: <4AB203EF.6030107@agrar.hu-berlin.de>
References: <4AB203EF.6030107@agrar.hu-berlin.de>
Message-ID: <24c96eca0909170352h34b6a20t8648d4e097d57e1e@mail.gmail.com>

Dear Armin,

Please ask such questions on the BioPerl mailing list.

The Bio::Tools::IUPAC module does the opposite of what you want -- it takes
a sequence containing ambiguous codes (e.g. "Y") and generates all possible
combinations of unambiguous sequences (thus one sequence containing a "C"
instead of "Y", and a second sequence containing a "T" instead of "Y").

However, you can do this:

  my %lookup = Bio::Tools::IUPAC->iupac_rev_iub();

%lookup will now contain the following Perl hash:

A    => 'A',
T    => 'T',
C    => 'C',
G    => 'G',
AC   => 'M',
AG   => 'R',
AT   => 'W',
CG   => 'S',
CT   => 'Y',
GT   => 'K',
ACG  => 'V',
ACT  => 'H',
AGT  => 'D',
CGT  => 'B',
ACGT => 'N',
N    => 'N'
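
A small sketch of how such a table can be used to go from a set of bases to an ambiguity symbol. To keep it runnable without BioPerl installed, the hash is written out by hand from the listing above rather than fetched from Bio::Tools::IUPAC, and the sub name ambiguity_code is made up for illustration:

```perl
use strict;
use warnings;

# Reverse IUPAC table, copied from the listing above: the key is the
# sorted set of bases, the value is the ambiguity symbol.
my %iupac_rev = (
    A   => 'A', T   => 'T', C   => 'C', G   => 'G',
    AC  => 'M', AG  => 'R', AT  => 'W',
    CG  => 'S', CT  => 'Y', GT  => 'K',
    ACG => 'V', ACT => 'H', AGT => 'D', CGT => 'B',
    ACGT => 'N', N  => 'N',
);

# Return the ambiguity symbol for a list of bases, or undef if the
# combination is not in the table.
sub ambiguity_code {
    my @bases = @_;
    my %seen;
    # Upper-case, drop duplicates, and sort so that (T, C) and (C, T)
    # both produce the key "CT".
    my $key = join '', sort grep { !$seen{$_}++ } map { uc } @bases;
    return $iupac_rev{$key};
}

print ambiguity_code('C', 'T'), "\n";   # prints Y
```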

-Aaron


On Thu, Sep 17, 2009 at 5:39 AM, Armin Schmitt <
armin.schmitt at agrar.hu-berlin.de> wrote:
>
> Dear Aaron,
>
> can I use your module IUPAC.pm to create
> ambiguity symbols?
>
> I.e. Input C,T -> output Y
>
> If yes, how can I do this? A little piece
> of code would be helpful. Otherwise,
> is there another perl module for this
> purpose?
>
> Thank you very much
>
> Armin Schmitt
>
>
> --
> Dr. Armin Schmitt
> Humboldt-Universität zu Berlin
> Department for Crop and Animal Sciences
> Invalidenstraße 42
> 10115 Berlin
> Tel.:   +49-30-2093-9074
> Fax:    +49-30-2093-6397
> E-mail: armin.schmitt at agrar.hu-berlin.de
>
>


From abhishek.vit at gmail.com  Thu Sep 17 14:16:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 14:16:33 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>

Hi Chris

I am just wondering whether the following is intentionally excluded from a
FASTQ record or is a bug.

After reading in each record from a FASTQ file, the output of the
same record  (  $out->write_seq($seq)  )  has the line/text missing after
the + sign.


Eg:

@HWI-EAS397:1:1:11:252#NNNTNN/1
NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
+
DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB


PS: In our case we need the exact record to be printed out as we need
to split the fastq file into multiple fastq files based on the read
index in the @ Line. So exact output is needed to avoid conflicts with
downstream processing pipelines.
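
To illustrate the kind of split I mean, here is a minimal plain-Perl sketch that bypasses the BioPerl parser entirely and copies records through unchanged. It assumes unwrapped four-line records and that the read index is the [ACGTN]+ run after the '#' in the @ line; the sub name split_by_index is made up for illustration:

```perl
use strict;
use warnings;

# Group 4-line FASTQ records by the index tag in the header line,
# e.g. "NNNTNN" in "@HWI-EAS397:1:1:11:252#NNNTNN/1".
# Assumes unwrapped records (exactly four lines each).
sub split_by_index {
    my ($fh) = @_;
    my %records_for;
    while (defined(my $header = <$fh>)) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        die "Truncated FASTQ record near: $header" unless defined $qual;
        my ($index) = $header =~ /#([ACGTN]+)/;
        $index = 'no_index' unless defined $index;
        # Concatenate the raw lines so each record is written back unchanged.
        $records_for{$index} .= $header . $seq . $plus . $qual;
    }
    return \%records_for;
}
```

Each value of the returned hash can then be printed to a file named after its key. This keeps everything in memory, so for a full lane one would more likely keep one open output filehandle per index and print each record as it is read.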

Thanks,
-Abhi




From cjfields at illinois.edu  Thu Sep 17 16:54:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 15:54:20 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 1 released
Message-ID: <358B9E70-84C7-42DC-A473-C2AACC18A211@illinois.edu>

All,

Just a quick note that I have released the first alpha for the 1.6.1  
point release.  I uploaded it to CPAN, so it should be migrating to  
the various servers in the next few hours or so.  In the meantime, the  
alpha can be directly downloaded using the following links (pick your  
format):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.bz2
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.gz
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.zip

If everything goes well, I'll have a more formalized release ready for  
the weekend.  I will also be attempting (hopefully with some success)  
getting a Windows PPM for the latest ActiveState Perl going over the  
next few days.  Feedback from users trying to install BioPerl using  
the latest Strawberry Perl would also be greatly appreciated.

Thanks!

chris

From cjfields at illinois.edu  Thu Sep 17 17:38:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 16:38:31 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
Message-ID: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>

After uploading the latest bioperl alpha to CPAN I noticed the size of  
the distribution archive has jumped up from ~7 MB to just over 10 MB.   
It looks like a majority of this is attributable to three data files  
for testing in t/data added after the 1.6.0 release:

gmap_f9-multiple_results.txt  (3 MB)
withrefm.906                  (2.5 MB)
1ZZ19XR301R-Alignment.tblastn (2 MB)

I'm not sure there is an easy way around the problem.  We could  
attempt to reduce the file size down, but I'm not convinced that's a  
long-term solution (the test data will only get larger as more test  
cases come up).

Any ideas?  Should we try to have a common biodata repo again?

chris

From rmb32 at cornell.edu  Thu Sep 17 18:04:47 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 15:04:47 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <4AB2B27F.8050800@cornell.edu>

Chris Fields wrote:
 > Any ideas?  Should we try to have a common biodata repo again?

Beyond encouraging people to keep the test data smaller (I would think 
that multiple MB in a test data file is quite excessive!), I don't think 
it's worth worrying about that much.  The stuff in bioperl needs a 
significant amount of test data, and I think that's fine.

This problem is also addressed by the ongoing effort to break things up 
into more distros, I think that will help a lot.

Rob


From hlapp at gmx.net  Thu Sep 17 18:33:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Sep 2009 18:33:34 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2B27F.8050800@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
Message-ID: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>


On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:

> I don't think it's worth worrying about that much.  The stuff in  
> bioperl needs a significant amount of test data, and I think that's  
> fine.


I'd agree with that. Storage is cheap these days. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Thu Sep 17 19:26:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:25 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>

On Sep 17, 2009, at 5:33 PM, Hilmar Lapp wrote:

> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  
>> bioperl needs a significant amount of test data, and I think that's  
>> fine.
>
> I'd agree with that. Storage is cheap these days. -hilmar

Kind of my thought as well, just a bit of a shock to see the dist.  
increase by 65% between point releases for just three test data  
files.  I may try paring those down a tad.

chris

From cjfields at illinois.edu  Thu Sep 17 19:26:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:52 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
	<be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
Message-ID: <06B0378C-312F-4F43-A99A-6F6CC1C88F61@illinois.edu>

The default format for most FASTQ parsers is to leave the extra header  
off (it increases the file size substantially).  You can add that back  
by setting quality_header():

my $out = Bio::SeqIO->new(-format         => 'fastq',
                          -file           => $file,
                          -quality_header => 1);

Again, let me know if that works okay.

chris



From rmb32 at cornell.edu  Thu Sep 17 19:30:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 16:30:16 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
Message-ID: <4AB2C688.2030602@cornell.edu>

Chris Fields wrote:
> Kind of my thought as well, just a bit of a shock to see the dist. 
> increase by 65% between point releases for just three test data files.  
> I may try paring those down a tad.

Yes, those individual files are certainly excessive.


From maj at fortinbras.us  Thu Sep 17 19:36:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 19:36:09 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu><4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <EC73E6B6BD3D468E8C29D138AF64D483@NewLife>

Two of those files are my bad. The withrefm file is probably best kept in its
entirety, since it contains all the weird extra-site restrictions that the
Bio::Restriction refactor was meant to handle. The other is a tiling test file
that I could probably replace (or at least edit down).


From maj at fortinbras.us  Thu Sep 17 22:13:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 22:13:37 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>

t/data compresses from 21M to 9M. We could ship with 

$ tar -czf data.tar.gz data
$ rm -rf data

and do the following in Bio::Root::Test, if we're willing to expect 
Archive::Tar and IO::Zlib :

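use vars qw( $ARCHIVE );
use Archive::Tar;
use Cwd qw( getcwd );
use File::Spec;
$ARCHIVE = "data.tar.gz";
...

sub test_input_file {
    # if it's there, fine
    my $fn = File::Spec->catfile('t', 'data', @_);
    return $fn if -e $fn;
    # if it's not, expand the archive
    my $arch = File::Spec->catfile('t', $ARCHIVE);
    Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
    my $tar = Archive::Tar->new($arch);
    Bio::Root::Root->throw("Can't read test data archive") unless $tar;
    # the archive was built inside t/, so its entries are relative to t/;
    # extract there rather than in the current directory
    my $cwd = getcwd();
    chdir 't' or Bio::Root::Root->throw("Can't chdir to t: $!");
    $tar->extract;
    chdir $cwd or Bio::Root::Root->throw("Can't chdir back to $cwd: $!");
    return $fn if -e $fn;
    return;
}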


----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 17, 2009 5:38 PM
Subject: [Bioperl-l] Size of BioPerl distribution


> After uploading the latest bioperl alpha to CPAN I noticed the size of  
> the distribution archive has jumped up from ~7 MB to just over 10 MB.   
> It looks like a majority of this is attributable to three data files  
> for testing in t/data added after the 1.6.0 release:
> 
> gmap_f9-multiple_results.txt  (3 MB)
> withrefm.906                  (2.5 MB)
> 1ZZ19XR301R-Alignment.tblastn (2 MB)
> 
> I'm not sure there is an easy way around the problem.  We could  
> attempt to reduce the file size down, but I'm not convinced that's a  
> long-term solution (the test data will only get larger as more test  
> cases come up).
> 
> Any ideas?  Should we try to have a common biodata repo again?
> 
> chris

From cjfields at illinois.edu  Thu Sep 17 22:53:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 21:53:09 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
Message-ID: <04BEE5E7-79C6-45DE-9EC7-D72AE9E881E5@illinois.edu>

Maybe attempt trimming them down a bit first, if that's possible.  If  
not, no worries (breaking up the distribution will help as Robert  
said).  Archive::Tar and IO::Zlib were added in core after 5.8  
(5.009003 to be exact), so I would rather not have to worry about any  
test-specific dependencies.

Anyway, we've got a little more time.  I'm getting a META.yml popping  
up (though everything appears to pass here).  Will look into it; may  
be related to a previously reported bug, but I would like to see some  
CPANPLUS tests coming in first.  That's what an alpha is for!

chris



From cjfields at illinois.edu  Thu Sep 17 23:48:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 22:48:13 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <19123.504.682683.996798@already.dhcp.gene.com>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
	<19123.504.682683.996798@already.dhcp.gene.com>
Message-ID: <B1B941EE-8F1E-426C-82DC-D89B3A13AD3D@illinois.edu>

On Sep 17, 2009, at 10:43 PM, George Hartzell wrote:

> Robert Buels writes:
>> Chris Fields wrote:
>>> Kind of my thought as well, just a bit of a shock to see the dist.
>>> increase by 65% between point releases for just three test data  
>>> files.
>>> I may try paring those down a tad.
>>
>> Yes, those individual files are certainly excessive.
>
> Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
> side of embarrassment.  Rats.
>
> I'll see about making a smaller test for the gmap_f9 parser, while
> still using real data.
>
> Is there existing support in the searchio infrastructure for reading
> [gb]zip'ed files?
>
> Can it wait a day or three?
>
> g.

Yes, certainly.  I'll be working on a separate issue this weekend  
dealing with the META.yml that CPAN/CPANPLUS appear to be choking on,  
so I'll push back the release until early next week.

chris

From hartzell at alerce.com  Thu Sep 17 23:43:52 2009
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 17 Sep 2009 20:43:52 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2C688.2030602@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
Message-ID: <19123.504.682683.996798@already.dhcp.gene.com>

Robert Buels writes:
 > Chris Fields wrote:
 > > Kind of my thought as well, just a bit of a shock to see the dist. 
 > > increase by 65% between point releases for just three test data files.  
 > > I may try paring those down a tad.
 > 
 > Yes, those individual files are certainly excessive.

Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
side of embarrassment.  Rats.

I'll see about making a smaller test for the gmap_f9 parser, while
still using real data.

Is there existing support in the searchio infrastructure for reading
[gb]zip'ed files?
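
For what it's worth, independent of whatever SearchIO itself supports,
a gzip'ed report can be read through a filehandle using only core
modules. A minimal sketch follows; open_maybe_gzipped is a made-up
helper name, and choosing by file extension is an assumption of the
sketch:

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Open a plain or gzip-compressed file for reading, choosing by
# file extension, and return a readable filehandle.
sub open_maybe_gzipped {
    my ($path) = @_;
    if ($path =~ /\.gz$/i) {
        my $fh = IO::Uncompress::Gunzip->new($path)
            or die "Cannot gunzip $path: $GunzipError\n";
        return $fh;
    }
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    return $fh;
}
```

The returned handle could then be passed to a parser with the usual
-fh argument. Whether a given parser copes with a handle that cannot
seek is untested here. IO::Uncompress::Bunzip2 offers the same
interface for bzip2 files.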

Can it wait a day or three?

g.

From roy.chaudhuri at gmail.com  Fri Sep 18 06:43:29 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 18 Sep 2009 11:43:29 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
Message-ID: <4AB36451.3030207@gmail.com>

Hi Liam,

I just discovered your message, which has not yet been replied to. What 
you require has been discussed in a recent thread:
http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html

Try using trunc_with_features from Bio::SeqUtils:

my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);

Cheers.
Roy.

Liam Elbourne wrote:
> Hi All,
> 
> Is there a method or methodology that will produce a fully fledged Seq  
> object with all the associated metadata given a start and end  
> position? To clarify, I create a sequence object from a genbank file:
> 
> 
> ****
> my $io  = Bio::SeqIO->new(as per usual);
> 
> my $seqobj = $io->next_seq();
> ****
> I now want:
> 
> my $sub_seqobj = $seqobj between 300 and 2000
> 
> where $sub_seqobj is a Seq object (which I appreciate is an  
> 'aggregate' of objects) too. The "trunc" method only returns a  
> PrimarySeq object which lacks all the annotation etc. I've previously  
> done this task by iterating through feature by feature and parsing out  
> what I needed, but thought there might be a more elegant approach...
> 
> 
> Regards,
> Liam Elbourne.

-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

From maj at fortinbras.us  Fri Sep 18 08:11:11 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 08:11:11 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>

Hi Paola--
I will look at this. Stay tuned-
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a bit of trouble because I need to parse a PDB file exactly, to extract 
the chain ID and residue ID, but I found that in some PDB files the residue 
number is followed by a letter, probably because the residue was added by the 
crystallographers and they didn't want to change the residue numbering of the 
sequence. For example, I parsed 1PXX.pdb with my script below. I didn't find 
any useful suggestion about this in the BioPerl tutorial or the online BioPerl 
documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb =
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc = $structio->next_structure;
# open the output file once rather than once per residue
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains)
{
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain))
    {
        my $resid = $res->id;
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but the output has a problem at ILE 105A and ILE 2105C, because a letter 
follows the residue number. Can I solve that problem without writing 
intermediate files? I need the residue ID as 105A, not 105.A, like this:

A ILE-105A

with no dot between the number and the letter.
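
Until the parser itself is looked at, the dot can be removed after the
fact with a substitution on the residue ID, so no intermediate file is
needed. A minimal sketch, assuming the IDs have the form
NAME-NUMBER.LETTER as in the output described above
(format_residue_id is a made-up helper name):

```perl
use strict;
use warnings;

# Turn "ILE-105.A" into "ILE-105A". IDs without an insertion
# letter (e.g. "GLY-12") are returned unchanged.
sub format_residue_id {
    my ($id) = @_;
    (my $clean = $id) =~ s/^([^-]+--?\d+)\.(\w)$/$1$2/;
    return $clean;
}

print format_residue_id('ILE-105.A'), "\n";
```

In the script above this would be applied to $resid just before
printing. The optional second "-" in the pattern allows for negative
residue numbers.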


Thank you all,

Paola




From scott at scottcain.net  Fri Sep 18 10:11:23 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 18 Sep 2009 10:11:23 -0400
Subject: [Bioperl-l] test failures in main trunk
Message-ID: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>

With Chris trying to get a release out, I wanted to report these test
failures from a fairly fresh Ubuntu Server 8.04 system.

Scott


t/SeqIO/raw.t ................................ 1/24 Can't locate Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
# Looks like you planned 24 tests but ran 1.
# Looks like your test exited with 2 just after 1.
t/SeqIO/raw.t ................................ Dubious, test returned 2 (wstat 512, 0x200)

t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .) at t/SeqTools/Backtranslate.t line 9.
BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
# Looks like your test exited with 2 before it could output anything.
t/SeqTools/Backtranslate.t ................... Dubious, test returned 2 (wstat 512, 0x200)
Failed 8/8 subtests

t/SeqTools/SeqPattern.t ...................... 1/28
#   Failed test 'use Bio::Tools::SeqPattern;'
#   at t/SeqTools/SeqPattern.t line 12.
#     Tried to use 'Bio::Tools::SeqPattern'.
#     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at Bio/Tools/SeqPattern/Backtranslate.pm line 22.
# BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/Backtranslate.pm line 22.
# Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
# Compilation failed in require at (eval 17) line 2.
# BEGIN failed--compilation aborted at (eval 17) line 2.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 25.
#          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
#     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 31.
#          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
#     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 371.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/SeqPattern.pm line 372.

#   Failed test at t/SeqTools/SeqPattern.t line 38.
#          got: 'A[][]H'
#     expected: 'A[EQ][DN]H'
"_reverse_translate_motif" is not exported by the Bio::Tools::SeqPattern::Backtranslate module
Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
# Looks like you planned 28 tests but ran 9.
# Looks like you failed 4 tests of 9 run.
# Looks like your test exited with 255 just after 9.
t/SeqTools/SeqPattern.t ...................... Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 23/28 subtests


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From dan.bolser at gmail.com  Fri Sep 18 10:11:30 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:11:30 +0100
Subject: [Bioperl-l] construct chromosome sequences from bac sequences
In-Reply-To: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
References: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
Message-ID: <2c8757af0909180711t7212f5aak9bc3c7f4e8d16120@mail.gmail.com>

Did you try loading the sequences into an alignment or an assembly object?

As far as I know BioPerl won't call a consensus for you, but you can
post process the alignment or assembly to do that.
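
If the coordinates can be trusted, plain string handling may already be
enough for the gap-filling part. A rough sketch, under loud
assumptions: the FPC start positions are treated as exact 1-based base
positions (real FPC coordinates may only be approximate), and where
BACs overlap the one that starts later simply overwrites the earlier
one, which is an arbitrary rule, not a consensus call.
build_chromosome is a made-up name:

```perl
use strict;
use warnings;

# Lay BAC sequences onto a chromosome pre-filled with 'N'.
# Each placement is [start, sequence], start being 1-based.
# Later-starting BACs overwrite earlier ones where they overlap.
sub build_chromosome {
    my ($length, @placements) = @_;
    my $chr = 'N' x $length;
    for my $p (sort { $a->[0] <=> $b->[0] } @placements) {
        my ($start, $seq) = @$p;
        substr($chr, $start - 1, length $seq) = $seq;
    }
    return $chr;
}

print build_chromosome(12, [1, 'AAAA'], [3, 'CCCC'], [10, 'GG']), "\n";
```

Positions not covered by any BAC stay as N. A BAC that runs past the
stated length makes the string longer rather than being truncated.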

Can an alignment hold sequences with qualities?


Sorry for the late reply, I'm just trawling the list for potential
answers to the question I'm about to post ;-)

Dan.


2008/12/30 Alper Yilmaz <alperyilmaz at gmail.com>:
> Hi,
>
> I have FPC report and BAC sequences in hand. I was wondering what is the
> most practical way to build chromosomes from these available information.
>
> I HAVE:
> FPC file:
> accession   chr  chr_start  chr_end  contig  contig_start  contig_end
> aaaaaaaaaa  1    14700      215600   ctg1    14700         215600
> bbbbbbbbbb  1    196000     362600   ctg1    196000        362600
> cccccccccc  1    352800     524300   ctg1    352800        524300
> .
> .
>
> BAC fasta file:
> >aaaaaaaaaa
> GATCGATCAGCATCGACTACGACT...
> >bbbbbbbbbb
> AGTAGCAGTAGCTAGCACTACGAC...
> >cccccccccc
> ACGATCAGCATCAGCATCGACTAC...
> .
> .
> .
>
> I WANT:
> >chr1
> GACGACTAGCTACGACTAC...
> >chr2
> AGCTGATCACGATCACGAC...
>
> In theory a sequence object called "Chr1" can be created and then according
> to start and end locations of each BAC in FPC file, subsequences of Chr1 can
> be retrieved. However, there are two facts which might prevent using
> standard sequence objects.
> 1) There will be gaps in chromosomes. Is there a function to convert
> unassigned locations to N?
> 2) There are overlaps between BAC sequences. If the overlapping sequences
> are exactly same, it won't be problem, but if there are discrepancies
> between them, a decision has to be made as to which sequence to use in final
> Chr1 sequence.
>
> thanks,
>
> Alper Yilmaz


From dan.bolser at gmail.com  Fri Sep 18 10:27:27 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:27:27 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
Message-ID: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>

2009/1/6 Chris Fields <cjfields at illinois.edu>:
> Could you archive the files and attach them to a bug report (you can mark it
> as an enhancement request).  We can take a look.
>
> http://bugzilla.open-bio.org/

Out of interest, has this been added? Where is it documented?

Cheers,
Dan.


> chris
>
> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>
>> Chris et al. -
>>
>> A student and I have written code to do this - write ace files as well as
>> parse them one entry at a time.  In trying to use the Assembly::IO as it was
>> in 1.5, we ran into problems with large ace files containing many entries
>> because of file handle limit issues with the inherited implementation
>> DB_File.  Our implementation simply reads one contig at a time instead of
>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to someone,
>> could they help me get it into bioperl?  It may not be perfect either, but
>> it should be a good start.
>>
>> Josh


From bosborne11 at verizon.net  Fri Sep 18 09:48:55 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 18 Sep 2009 09:48:55 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
	<DBEE748776B74A7988A942A7BBE13AA3@NewLife>
Message-ID: <AC62DAB3-3334-44A6-8172-753519B083FF@verizon.net>

Mark,

There was an interesting exchange about StructureIO::pdb a few years  
ago:

http://portal.open-bio.org/pipermail/bioperl-l/2006-September/022990.html

I don't think anyone has actually worked on this code since then and I  
also don't know if Paola's question relates to the content of the  
thread, but it's a good overview.

Brian O.




From dan.bolser at gmail.com  Fri Sep 18 10:55:57 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:55:57 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
Message-ID: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>

Dear Perl Monkeys,

I wrote a little demo script for Bio::Assembly::IO here:

http://www.bioperl.org/wiki/Module:Bio::Assembly::IO


I would very much appreciate comments, criticisms and corrections on
that script (please just edit the wiki). For a newbie it's always the
same question, am I doing it right?

In particular, I read about the 4 possible coordinates of a read in an
assembly. My script only retrieves two (?) of the possible four. How
should it be adjusted to print all four coordinates for each read?

Additionally, I'm not sure how to distinguish between the trimmed read
vs. the full length read and/or the aligned portion of the read vs.
the full length read.

What I *really* want is the coordinates of the aligned portion of the
read in gapped read and gapped consensus space, along with the quality
trimmed range of the read.

The ACE file in question is produced by the gsMapper program, which is
part of Newbler from Roche (454), so it has some small
'peculiarities', but I don't think they are critical for the task at
hand.


Thanks very much for any help you can provide on any of the above issues.

Sincerely,
Dan.

From maj at fortinbras.us  Fri Sep 18 11:11:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:11:05 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
Message-ID: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>

Dan -- I don't know much about Assembly, so can't help there. But can I  
encourage you and perhaps one or two others (steganographic content: fangly) 
to create a HOWTO stub out of this? Would be excellent-
cheers MAJ

From anupam.contact at gmail.com  Fri Sep 18 11:20:03 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Fri, 18 Sep 2009 20:50:03 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
Message-ID: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>

Dear all,
I have installed BioPerl-1.6.0.tar.gz and Bioperl-run-1.6.0.tar.gz on a
Fedora 7 system. I am trying to run the /usr/bin/bp_pairwise_kaks.pl
script but keep getting this error:

Must have bioperl-run pkg installed to run this script at
/usr/bin/bp_pairwise_kaks.pl line 69.

I have installed the run package from BioPerl, though. Can anyone help
me out? Thanks in advance.


Regards,


Anupam Sinha

From cjfields at illinois.edu  Fri Sep 18 11:59:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 10:59:11 -0500
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
Message-ID: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>

Interesting, will look into those.  The first one is troubling (that's  
set up to skip for Algorithm::Diff), the others should be a bit more  
straightforward.
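
On the skip logic: the "BEGIN failed" in the raw.t output suggests the
module is being loaded at compile time, before any skip check gets a
chance to run. A generic way to probe for an optional module at run
time, as a sketch only (have_module is a made-up helper, not something
in Bio::Root::Test):

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# The require happens inside eval at run time, so a missing
# module does not abort compilation of the calling script.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# In a test script one might then write, with Test::More loaded:
#   plan skip_all => 'Algorithm::Diff not installed'
#       unless have_module('Algorithm::Diff');
```

The same check would work inside a SKIP block when only some of the
tests in a file depend on the optional module.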

Will have to see why List::MoreUtils is being used, but if it's  
necessary it's an additional dep.

chris



From cjfields at illinois.edu  Fri Sep 18 12:09:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 11:09:26 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
Message-ID: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>

Dan,

No, it hasn't made it in.  Currently, the problem is it doesn't have  
any tests attached, but that could be easily fixed if anyone wanted to  
donate a little time to getting them running.  My hands are a bit full  
with other stuff for the release.

We should have some ace files ready to go in t/data somewhere if one  
were so inclined to do that, BTW  ;>

chris



From maj at fortinbras.us  Fri Sep 18 12:20:22 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 12:20:22 -0400
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
Message-ID: <E019D53941DD48E4B3294E113771B711@NewLife>


> Will have to see why List::MoreUtils is being used, but if it's  
> necessary it's an additional dep.

I didn't do it, officer....

> 
> chris
> 
> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
> 
>> With Chris trying to get a release out, I wanted to report these  
>> test failures from a fairly virgin system Ubuntu server 8.04.
>>
>> Scott
>>
>>
>>
>> t/SeqIO/raw.t ................................ 1/24 Can't locate  
>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
>> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
>> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
>> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>> # Looks like you planned 24 tests but ran 1.
>> # Looks like your test exited with 2 just after 1.
>> t/SeqIO/raw.t ................................ Dubious, test  
>> returned 2 (wstat 512, 0x200)
>>
>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
>> site_perl .) at t/SeqTools/Backtranslate.t line 9.
>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
>> 9.
>> # Looks like your test exited with 2 before it could output anything.
>> t/SeqTools/Backtranslate.t ................... Dubious, test  
>> returned 2 (wstat 512, 0x200)
>> Failed 8/8 subtests
>>
>> t/SeqTools/SeqPattern.t ...................... 1/28
>> #   Failed test 'use Bio::Tools::SeqPattern;'
>> #   at t/SeqTools/SeqPattern.t line 12.
>> #     Tried to use 'Bio::Tools::SeqPattern'.
>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
>> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
>> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
>> Backtranslate.pm line 22.
>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>> # Compilation failed in require at (eval 17) line 2.
>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 371.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 372.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>> #          got: 'A[][]H'
>> #     expected: 'A[EQ][DN]H'
>> "_reverse_translate_motif" is not exported by the  
>> Bio::Tools::SeqPattern::Backtranslate module
>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>> # Looks like you planned 28 tests but ran 9.
>> # Looks like you failed 4 tests of 9 run.
>> # Looks like your test exited with 255 just after 9.
>> t/SeqTools/SeqPattern.t ...................... Dubious, test  
>> returned 255 (wstat 65280, 0xff00)
>> Failed 23/28 subtests
>>
>>
>> -----------------------------------------------------------------------
>> Scott Cain, Ph. D. scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>> Ontario Institute for Cancer Research
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From maj at fortinbras.us  Fri Sep 18 11:55:47 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:55:47 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <72DA6CA1499D4F67909197901218A9FF@NewLife>

Hi Paola--
My research reveals that this is a "standard kludge" in the PDB format. A letter 
following a residue number is called an "insertion code" or "icode", and my 
understanding is that it allows residues to be inserted without 
upsetting the rest of the numbering. (This is a feature, and not laziness, 
since people very quickly begin to refer to amino acid positions based on a 
reference sequence in an interesting region, and you can't easily say to the 
community, "hey, that's 22 now, not 20...")
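
For what it's worth, the icode has its own fixed column in the coordinate 
records. As far as I know, the PDB format puts the residue name in columns 
18-20, the chain id in column 22, the residue number in columns 23-26 and the 
insertion code in column 27 of each ATOM/HETATM line. A minimal sketch of 
pulling those fields out by hand (the sub name residue_fields and the sample 
coordinates are made up for illustration; this is not what Bio::Structure does 
internally):

```perl
use strict;
use warnings;

# Extract residue name, chain id, residue number and insertion code from
# one ATOM/HETATM line, using the fixed PDB columns (substr is 0-based).
sub residue_fields {
    my ($line) = @_;
    return unless $line =~ /^(?:ATOM  |HETATM)/;
    my $resname = substr( $line, 17, 3 );    # columns 18-20
    my $chain   = substr( $line, 21, 1 );    # column  22
    my $resseq  = substr( $line, 22, 4 );    # columns 23-26
    my $icode   = substr( $line, 26, 1 );    # column  27
    $resseq =~ s/\s+//g;                     # number is right-justified
    $icode = '' if $icode eq ' ';            # blank means "no icode"
    return ( $resname, $chain, $resseq, $icode );
}

# Build a sample line with sprintf so the columns are right by construction.
# The coordinates are invented, not taken from 1PXX.
my $line = sprintf "%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f",
    'ATOM', 1, ' N', ' ', 'ILE', 'A', 105, 'A', 11.104, 13.207, 2.100;

print join( ' ', residue_fields($line) ), "\n";    # prints "ILE A 105 A"
```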

Since it's standard, you should expect it. Bio::Structure handles the icode by 
creating the residue id as follows:

   #my $res_name_num = $resname."-".$resseq;
   my $res_name_num = $resname."-".$resseq;
   $res_name_num .= '.'.$icode if $icode;

so you can get back the residue's 3-letter name, its numerical position, and 
insertion code by doing

 my ($name, $number, $icode) = $res->id =~ /(.*?)-([0-9]+)\.?([A-Z]?)/;

In this case, if the icode is not present, then $icode eq '' (not undef).
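
To get the "ILE-105A" form you asked for, the pieces can simply be glued back 
together without the dot. A small sketch along those lines (compact_res_id is a 
hypothetical helper name, not part of BioPerl; the anchors and the optional 
minus sign for negative residue numbers are additions to the regex above):

```perl
use strict;
use warnings;

# Turn a Bio::Structure residue id such as "ILE-105.A" into "ILE-105A".
# Ids without an insertion code ("ILE-105") come back unchanged.
sub compact_res_id {
    my ($id) = @_;
    my ( $name, $number, $icode ) = $id =~ /^(.*?)-(-?[0-9]+)\.?([A-Z]?)$/
        or return $id;    # not in the expected form: leave it alone
    return "$name-$number$icode";
}

print compact_res_id($_), "\n" for qw(ILE-105.A ILE-105 ILE-2105.C);
```

In the loop of your script that would be something like 
print $f "$chainid\t", compact_res_id($res->id), "\n"; so no intermediate file 
is needed.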
Hope this helps-
Mark

----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to extract 
the chain id and residue id, but I found that in some PDB files the residue 
number is followed by a letter, probably because the residue was added by the 
crystallographers and they didn't want to change the residue numbering in the 
sequence. For example, I parsed 1PXX.pdb with my script below. I didn't find any 
useful suggestion about this in the BioPerl tutorial or the online BioPerl 
documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
# get() returns undef on failure, so check before writing the file
my $content = get($urlpdb);
die "could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die "cannot write $pdb_file: $!";
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
my $struc=$structio->next_structure;
# open the output once, rather than once per residue
open my $out, '>>', '1pxx.parsed' or die "cannot write 1pxx.parsed: $!";
for my $chain ($struc->get_chains)
{
    my $chainid = $chain->id ;
    for my $res ($struc->get_residues($chain))
    {
        # the id looks like ILE-105.A when an insertion code is present
        my $resid=$res-> id;
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but it gives me a file with a problem at ILE 105A and ILE 2105C, because they 
have a letter following the residue number. Can I solve that problem without 
writing intermediate files?
I need the residue id as 105A, not 105.A,
so
A ILE-105A
without a point between the number and the letter.


Thank you all,

Paola


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Fri Sep 18 12:31:00 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Fri, 18 Sep 2009 12:31:00 -0400
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <be9b52410909180931w2951318eqfa01c109a032bf9d@mail.gmail.com>

I have negligible experience with ace but will be happy to do some
testing. Please let me know what code and functionality need
to be checked.

Cheers,
-Abhi

On Fri, Sep 18, 2009 at 12:09 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From vecchi.b at gmail.com  Fri Sep 18 12:44:37 2009
From: vecchi.b at gmail.com (Bruno Vecchi)
Date: Fri, 18 Sep 2009 09:44:37 -0700
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <E019D53941DD48E4B3294E113771B711@NewLife>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
	<E019D53941DD48E4B3294E113771B711@NewLife>
Message-ID: <1a0c1b750909180944p55b226cbi18e3c608f401d951@mail.gmail.com>

The second test ("Can't locate ok.pm in @INC...") can be fixed by
using use_ok('My::Module') instead of use ok 'My::Module' in the test
files.

I've had a few of those in the past, and that fix did the trick.
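
For anyone who hasn't run into this: "use ok" needs the separate ok.pm
module, which is what the error says is missing from @INC, while
use_ok() is exported by Test::More itself. A minimal sketch of the
substitution, using List::Util purely as a stand-in module so that it
runs anywhere (the real test file would name the BioPerl module it is
checking):

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Was:  use ok 'List::Util';     # needs ok.pm to be installed
BEGIN { use_ok('List::Util') }   # use_ok() comes with Test::More

# Later tests can rely on the module having been loaded.
is( List::Util::sum( 1, 2, 3 ), 6, 'List::Util is usable' );
```

Wrapping use_ok() in a BEGIN block keeps the load at compile time, as
it was with the original "use ok" line.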

Cheers,

Bruno.


2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>
>> Will have to see why List::MoreUtils is being used, but if it's  necessary
>> it's an additional dep.
>
> I didn't do it, officer....
>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
>>
>>> With Chris trying to get a release out, I wanted to report these  test
>>> failures from a fairly virgin system Ubuntu server 8.04.
>>>
>>> Scott
>>>
>>>
>>>
>>> t/SeqIO/raw.t ................................ 1/24 Can't locate
>>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-
>>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- live
>>> /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 5.8.8
>>> /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ perl/5.8
>>> /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>>> # Looks like you planned 24 tests but ran 1.
>>> # Looks like your test exited with 2 just after 1.
>>> t/SeqIO/raw.t ................................ Dubious, test  returned 2
>>> (wstat 512, 0x200)
>>>
>>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in
>>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/
>>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/
>>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/
>>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ site_perl
>>> .) at t/SeqTools/Backtranslate.t line 9.
>>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
>>> # Looks like your test exited with 2 before it could output anything.
>>> t/SeqTools/Backtranslate.t ................... Dubious, test  returned 2
>>> (wstat 512, 0x200)
>>> Failed 8/8 subtests
>>>
>>> t/SeqTools/SeqPattern.t ...................... 1/28
>>> #   Failed test 'use Bio::Tools::SeqPattern;'
>>> #   at t/SeqTools/SeqPattern.t line 12.
>>> #     Tried to use 'Bio::Tools::SeqPattern'.
>>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:
>>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ blib/arch
>>> /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 5.8.8
>>> /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /
>>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at
>>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/
>>> Backtranslate.pm line 22.
>>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>>> # Compilation failed in require at (eval 17) line 2.
>>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 371.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 372.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>>> #          got: 'A[][]H'
>>> #     expected: 'A[EQ][DN]H'
>>> "_reverse_translate_motif" is not exported by the
>>> Bio::Tools::SeqPattern::Backtranslate module
>>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>>> # Looks like you planned 28 tests but ran 9.
>>> # Looks like you failed 4 tests of 9 run.
>>> # Looks like your test exited with 255 just after 9.
>>> t/SeqTools/SeqPattern.t ...................... Dubious, test  returned
>>> 255 (wstat 65280, 0xff00)
>>> Failed 23/28 subtests
>>>
>>>
>>> -----------------------------------------------------------------------
>>> Scott Cain, Ph. D. scott at scottcain dot net
>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>> Ontario Institute for Cancer Research
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From dan.bolser at gmail.com  Fri Sep 18 12:54:36 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 17:54:36 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>

Please can you link to the bug that includes the code?


2009/9/18 Chris Fields <cjfields at illinois.edu>:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From dan.bolser at gmail.com  Fri Sep 18 13:09:09 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 18:09:09 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>

2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
> Dan -- I don't know much about Assembly, so can't help there. But can I
> encourage you and perhaps one or two others (steganographic content:
> fangly) to create a HOWTO stub out of this? Would be excellent-

I'd love to. ACE is pretty ubiquitous, so any additional info on how
to work with them using BioPerl should help a lot of people.

The problem is that I'm one of those people ;-)


I'm working on an 'ace2tab.plx' script that should encompass this
info. I'm finding that some 'read ids' have the .range format, i.e.
"read123455.23-239". However, some do not, i.e. "read123456". Not sure
where this ID comes from, but I think it's telling me something about
partially aligned reads. The problem is that the coordinates I'm
seeing don't reflect that (they are just the start and the end point
of the full read).
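
A sketch of how such a script could split the two kinds of id apart, so
that the suffix can at least be printed next to the coordinates BioPerl
reports. It assumes the suffix is always ".start-end" with plain
integers, going only by the two example ids above; what the range
actually means is still an open question, and split_read_id is just a
made-up name:

```perl
use strict;
use warnings;

# Split "read123455.23-239" into (name, start, end).
# Ids without a range suffix come back with empty start and end fields.
sub split_read_id {
    my ($id) = @_;
    if ( $id =~ /^(.+)\.(\d+)-(\d+)$/ ) {
        return ( $1, $2, $3 );
    }
    return ( $id, '', '' );
}

# One tab-separated row per read id.
for my $id (qw(read123455.23-239 read123456)) {
    print join( "\t", split_read_id($id) ), "\n";
}
```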

A 'proper' ace2tab script would be very nice.


> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE file?
>
>
>> Dear Perl Monkeys,
>>
>> I wrote a little demo script for Bio::Assembly::IO here:
>>
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>>
>>
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>>
>> In particular, I read about the 4 possible coordinates of a read in an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>>
>> Additionally, I'm not sure how to distinguish between the trimmed read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>>
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the quality
>> trimmed range of the read.
>>
>> The ACE file in question is produced by the gsMapper program, which is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>>
>>
>> Thanks very much for any help you can provide on any of the above issues.
>>
>> Sincerely,
>> Dan.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>


From cjfields at illinois.edu  Fri Sep 18 14:00:17 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:00:17 -0500
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <DCEC55AD-5B4E-42E6-9A7E-FB52E19EADA5@illinois.edu>

Agreed, and it may spur others to get involved, fix bugs, donate code,  
etc.

chris

On Sep 18, 2009, at 10:11 AM, Mark A. Jensen wrote:

> Dan -- I don't know much about Assembly, so can't help there. But  
> can I  encourage you and perhaps one or two others (steganographic  
> content: fangly) to create a HOWTO stub out of this? Would be  
> excellent-
> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE  
> file?
>
>
>> Dear Perl Monkeys,
>> I wrote a little demo script for Bio::Assembly::IO here:
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>> In particular, I read about the 4 possible coordinates of a read in  
>> an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>> Additionally, I'm not sure how to distinguish between the trimmed  
>> read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the  
>> quality
>> trimmed range of the read.
>> The ACE file in question is produced by the gsMapper program, which  
>> is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>> Thanks very much for any help you can provide on any of the above  
>> issues.
>> Sincerely,
>> Dan.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Fri Sep 18 14:03:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:03:13 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
Message-ID: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>

Bug 2726

http://bugzilla.open-bio.org/show_bug.cgi?id=2726

chris

On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:

> Please can you link to the bug that includes the code?
>
>
> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>> Dan,
>>
>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>> have any
>> tests attached, but that could be easily fixed if anyone wanted to  
>> donate a
>> little time to getting them running.  My hands are a bit full with  
>> other
>> stuff for the release.
>>
>> We should have some ace files already to go in t/data somewhere if  
>> one were
>> so inclined to do that, BTW  ;>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>
>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Could you archive the files and attach them to a bug report (you  
>>>> can mark
>>>> it
>>>> as an enhancement request).  We can take a look.
>>>>
>>>> http://bugzilla.open-bio.org/
>>>
>>> Out of interest, has this been added? Where is it documented?
>>>
>>> Cheers,
>>> Dan.
>>>
>>>
>>>> chris
>>>>
>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>
>>>>> Chris et al. -
>>>>>
>>>>> A student and I have written code to do this - write ace files  
>>>>> as well
>>>>> as
>>>>> parse them one entry at a time.  In trying to use the  
>>>>> Assembly::IO as it
>>>>> was
>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>> entries
>>>>> because of file handle limit issues with the inherited  
>>>>> implementation
>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>> instead
>>>>> of
>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>> add it to
>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>> someone,
>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>> either,
>>>>> but
>>>>> it should be a good start.
>>>>>
>>>>> Josh
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>


From e.osimo at gmail.com  Fri Sep 18 18:33:22 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sat, 19 Sep 2009 00:33:22 +0200
Subject: [Bioperl-l] Getting all annotations
Message-ID: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>

Hello,
I was trying to figure out how to get from the Entrez database all the
reference annotation for a given genomic zone.
For example: I want to know which genes, transcripts, microRNAs etc are
present in chr 6 from 100kbp to 200kbp.
Is there a database that is arranged as a continuum (by sequence) instead of
by feature (gene, transcript etc)?

Thanks
Emanuele

From florent.angly at gmail.com  Sat Sep 19 22:20:31 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 19 Sep 2009 19:20:31 -0700
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
Message-ID: <4AB5916F.1090104@gmail.com>

I suppose it is a good idea to wait until bioperl-live 1.6.1 is out 
before doing any significant work on the sequence assembly module.
Also, remember the assembly-related todo list: 
http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
Florent


Chris Fields wrote:
> Bug 2726
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>
> chris
>
> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>
>> Please can you link to the bug that includes the code?
>>
>>
>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>> Dan,
>>>
>>> No, it hasn't made it in.  Currently, the problem is it doesn't have 
>>> any
>>> tests attached, but that could be easily fixed if anyone wanted to 
>>> donate a
>>> little time to getting them running.  My hands are a bit full with 
>>> other
>>> stuff for the release.
>>>
>>> We should have some ace files already to go in t/data somewhere if 
>>> one were
>>> so inclined to do that, BTW  ;>
>>>
>>> chris
>>>
>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>
>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>
>>>>> Could you archive the files and attach them to a bug report (you 
>>>>> can mark
>>>>> it
>>>>> as an enhancement request).  We can take a look.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>
>>>> Out of interest, has this been added? Where is it documented?
>>>>
>>>> Cheers,
>>>> Dan.
>>>>
>>>>
>>>>> chris
>>>>>
>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>
>>>>>> Chris et al. -
>>>>>>
>>>>>> A student and I have written code to do this - write ace files as 
>>>>>> well
>>>>>> as
>>>>>> parse them one entry at a time.  In trying to use the 
>>>>>> Assembly::IO as it
>>>>>> was
>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>> entries
>>>>>> because of file handle limit issues with the inherited 
>>>>>> implementation
>>>>>> DB_File.  Our implementation simply reads one contig at a time 
>>>>>> instead
>>>>>> of
>>>>>> first trying to slurp the whole ace into memory.  I'm happy to 
>>>>>> add it to
>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>> someone,
>>>>>> could they help me get it into bioperl?  It may not be perfect 
>>>>>> either,
>>>>>> but
>>>>>> it should be a good start.
>>>>>>
>>>>>> Josh
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>
>


From dan.bolser at gmail.com  Sun Sep 20 08:26:06 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 13:26:06 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <2c8757af0909200526u3bb1766eo5d316dc5d7a2e1a5@mail.gmail.com>

2009/9/20 Florent Angly <florent.angly at gmail.com>:

...

> Also, remember the assembly-related todo list:
> http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related

Thanks for that link Florent. It's great to see the wiki being put to
such good use in the context of OSS development! I need to make a
mental note - before posting, check the mailing list archives _and_
the wiki!

Cheers,
Dan.


> Florent
>
>
> Chris Fields wrote:
>>
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't have any
>>>> tests attached, but that could be easily fixed if anyone wanted to
>>>> donate a
>>>> little time to getting them running.  My hands are a bit full with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere if one
>>>> were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report (you can
>>>>>> mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files as
>>>>>>> well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the Assembly::IO as
>>>>>>> it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to add it
>>>>>>> to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>
>>>>
>>
>>
>
>


From cjfields at illinois.edu  Sun Sep 20 10:34:08 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 20 Sep 2009 09:34:08 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <F25C4EA4-1DB4-44F3-AB66-F58E6A90E302@illinois.edu>

Never hurts to get started, just make sure that there is a note  
indicating the status of Bio::Assembly.  In fact, the discussion page  
for it might make a good spot for Bio::Assembly design.

chris
On Sep 19, 2009, at 9:20 PM, Florent Angly wrote:

> I suppose it is a good idea to wait until bioperl-live 1.6.1 is out  
> before doing any significant work on the sequence assembly module.
> Also, remember the assembly-related todo list: http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
> Florent
>
>
> Chris Fields wrote:
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>>>> have any
>>>> tests attached, but that could be easily fixed if anyone wanted  
>>>> to donate a
>>>> little time to getting them running.  My hands are a bit full  
>>>> with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere  
>>>> if one were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report  
>>>>>> (you can mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files  
>>>>>>> as well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the  
>>>>>>> Assembly::IO as it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing  
>>>>>>> many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited  
>>>>>>> implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>>>> add it to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files  
>>>>>>> to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>
>>>>
>>
>>
>


From dan.bolser at gmail.com  Sun Sep 20 11:09:19 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 16:09:19 +0100
Subject: [Bioperl-l] Getting all annotations
In-Reply-To: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
References: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
Message-ID: <2c8757af0909200809g1f6c41eeyabfc8bdaac1fc19f@mail.gmail.com>

Hi Emanuele,

I guess you were Emos in irc://irc.freenode.net/#bioperl ?


I think the answer to your question can be found here:

http://www.biodas.org
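
For a region such as chr 6 from 100kbp to 200kbp, a DAS request might look
roughly like the sketch below. It only builds the request URL and assumes the
"features" command with a "segment=REF:start,stop" argument as described in
the DAS 1.5 specification. The server name and data source name are
placeholders, not real services, so check the docs of whichever DAS server
you end up using.

```python
from urllib.parse import quote


def das_features_url(server, source, ref, start, stop):
    """Build a DAS 'features' request URL for one segment.

    server and source are hypothetical placeholders here; the
    segment=REF:start,stop layout is assumed from the DAS 1.5 spec.
    """
    segment = f"{ref}:{start},{stop}"
    return f"{server.rstrip('/')}/das/{quote(source)}/features?segment={segment}"


# Placeholder server and source names, for illustration only.
print(das_features_url("http://das.example.org", "my_genome", "6", 100000, 200000))
```

The server should answer with an XML document listing every annotated
feature that overlaps the segment, whatever its type.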


All the best,
Dan.

2009/9/18 Emanuele Osimo <e.osimo at gmail.com>:
> Hello,
> I was trying to figure out how to get from the Entrez database all the
> reference annotation for a given genomic zone.
> For example: I want to know which genes, transcripts, microRNAs etc are
> present in chr 6 from 100kbp to 200kbp.
> Is there a database that is arranged as a continuum (by sequence) instead of
> by feature (gene, transcript etc)?
>
> Thanks
> Emanuele
>

From maj at fortinbras.us  Mon Sep 21 00:22:54 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 00:22:54 -0400
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>

Hello all,

As Brian articulated so well for many of us, 
the wiki main page is, well, butt-ugly.
Please check out the Main Page Beta at
http://www.bioperl.org/wiki/Main_Page_Beta
and respond to this thread or on the discussion 
page. 

cheers and thanks, 
MAJ

From bix at sendu.me.uk  Mon Sep 21 02:25:04 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 21 Sep 2009 07:25:04 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <4AB71C40.10902@sendu.me.uk>

Mark A. Jensen wrote:
> Hello all,
> 
> As Brian articulated so well for many of us, 
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion 
> page. 

I never thought the main page was 'butt-ugly' (rather, what I expect 
from a wiki), but, to put it bluntly, the graphical flourishes in your 
proposal are cringe-worthy. I couldn't do any better. I think for 
graphical things you'd need a professional graphics designer or similar.

The actual content and organisation of your version is probably an 
improvement though.

From rmb32 at cornell.edu  Mon Sep 21 03:40:31 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Mon, 21 Sep 2009 00:40:31 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB71C40.10902@sendu.me.uk>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk>
Message-ID: <4AB72DEF.2010008@cornell.edu>

Sendu Bala wrote:
> from a wiki), but, to put it bluntly, the graphical flourishes in your 
> proposal are cringe-worthy. I couldn't do any better. I think for 

I think what Sendu was trying to say is that he didn't like the gradient 
section heads?  There are only two graphical things on that page, and 
the other one is an enlargement of the existing logo, so I suppose 
that's what he means.

They're not my absolute favorite either, but I certainly wouldn't 
describe them as cringe-worthy!  :-P

Rob


From biopython at maubp.freeserve.co.uk  Mon Sep 21 05:45:48 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 10:45:48 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
Message-ID: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>

On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads?  There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter


From maj at fortinbras.us  Mon Sep 21 07:20:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:20:31 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
Message-ID: <22244A89D06E4F9B8D5F70A833E1C0DE@NewLife>

Hey, if Sendu cringed, he cringed. If I had one, I'd keep my 
day job. In the meantime, the graphics are removed. 
MAJ
----- Original Message ----- 
From: "Robert Buels" <rmb32 at cornell.edu>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 3:40 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> Sendu Bala wrote:
>> from a wiki), but, to put it bluntly, the graphical flourishes in your 
>> proposal are cringe-worthy. I couldn't do any better. I think for 
> 
> I think what Sendu was trying to say is that he didn't like the gradient 
> section heads?  There are only two graphical things on that page, and 
> the other one is an enlargement of the existing logo, so I suppose 
> that's what he means.
> 
> They're not my absolute favorite either, but I certainly wouldn't 
> describe them as cringe-worthy!  :-P
> 
> Rob
> 
> 
> 
>

From e.osimo at gmail.com  Mon Sep 21 07:35:00 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 21 Sep 2009 13:35:00 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <2ac05d0f0909210435k66bd0ed3x9fd13d9f4ec44634@mail.gmail.com>

I can say that, for a neophyte, the contents are a great improvement.
It is much easier to find what you are looking for.

Emanuele

On Mon, Sep 21, 2009 at 06:22, Mark A. Jensen <maj at fortinbras.us> wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ
>

From maj at fortinbras.us  Mon Sep 21 07:32:08 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:32:08 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
Message-ID: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>

I'd appreciate those links, Peter- thanks
MAJ
----- Original Message ----- 
From: "Peter" <biopython at maubp.freeserve.co.uk>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 5:45 AM
Subject: Re: [Bioperl-l] a Main Page proposal


On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads? There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter



From pmiguel at purdue.edu  Mon Sep 21 08:01:03 2009
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 21 Sep 2009 08:01:03 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
	<2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
Message-ID: <4AB76AFF.7050902@purdue.edu>

Dan Bolser wrote:
> 2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>   
>> Dan -- I don't know much about Assembly, so can't help there. But can I
>>  encourage you and perhaps one or two others (steganographic content:
>> fangly) to create a HOWTO stub out of this? Would be excellent-
>>     
>
> I'd love to. ACE is pretty ubiquitous, so any additional info on how
> to work with them using BioPerl should help a lot of people.
>   
> The problem is that I'm one of those people ;-)
>
>
> I'm working on an 'ace2tab.plx' script that should encompass this
> info. I'm finding that some 'read ids' have the .range format. i.e.
> "read123455.23-239". However, some do not. i.e. "read123456". Not sure
> where this ID comes from, but I think it's telling me something about
> partially aligned reads. 

I think you are right. I have heard that Newbler (the 454 assembler) 
does this insane thing, where it will rip reads apart into segments and 
cluster parts of reads in different contigs.
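
If the ".range" suffix really does mark the piece of the read that was
placed, it can be split off the read ID before anything else is done. A
minimal sketch follows. The reading of the suffix as "start-end within the
read" is an assumption taken from this thread rather than from Newbler
documentation, and the function name is made up for illustration.

```python
import re

# Matches IDs like "read123455.23-239": a name, a dot, then start-end digits.
RANGE_SUFFIX = re.compile(r"^(?P<name>.+)\.(?P<start>\d+)-(?P<end>\d+)$")


def split_read_id(read_id):
    """Return (name, start, end); start and end are None without a suffix."""
    m = RANGE_SUFFIX.match(read_id)
    if m is None:
        return read_id, None, None
    return m.group("name"), int(m.group("start")), int(m.group("end"))


print(split_read_id("read123455.23-239"))  # ('read123455', 23, 239)
print(split_read_id("read123456"))         # ('read123456', None, None)
```

Grouping on the bare name would then show whether the same read turns up in
more than one contig.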

> The problem is that the coordinates I'm
> seeing don't reflect that (they are just the start and the end point
> of the full read).
>   

That sounds similar to how phrap/consed handle "chimeric" reads. But my 
experience is that phrap is pretty parsimonious with numbers of 
chimerics it will allow.  (That isn't entirely fair to Newbler -- I've 
never been able to get phrap to consistently assemble ESTs. Phrap seems 
tuned to assemble BAC shotgun reads. ESTs seem to drive it a little 
crazy. It will create contigs from a set of reads that have essentially 
no similarity to each other, nor to the consensus sequence phrap creates 
for them.)

-- 
Phillip

From hlapp at gmx.net  Mon Sep 21 08:22:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 21 Sep 2009 08:22:34 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <03B93F96-E28D-45CF-BD94-AD33634476AA@gmx.net>

What's probably worth looking at as an example is the gmod.org home  
page. Stylistically, one thing you want to get out of the way is the  
auto-generated TOC.

	-hilmar

On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Mon Sep 21 08:28:28 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 13:28:28 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
Message-ID: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>

Peter wrote:
>> We had some similar discussions about the Biopython wiki
>> based homepage - although our old one was nowhere near
>> as busy as the current BioPerl main page, it was still not as
>> welcoming as our current version *tries* to be.
>> ...
>> I can dig out links to our mailing list archive if anyone is
>> interested in the discussion.

On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>
> I'd appreciate those links, Peter- thanks
> MAJ

OK, here you are - this was most of it, I'd have to dig through
my old emails to see what else I can find:
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html

Remember Biopython went from a very minimal home page, to
something aiming to be more newcomer friendly. BioPerl on the
other hand seems to want to move away from the current very
text heavy information rich page to something more focused and
newcomer friendly. To me at least the current page is too dense,
intimidating, and the important bits get lost in all the content.

[My apologies if any of this feedback comes across as too blunt.]

If you haven't already looked at them, you should check out the
other OBF project pages for ideas. The BioJava homepage is
also using the wiki - in my opinion it is a bit cluttered, but is
still more accessible than the current BioPerl page. Also,
the BioRuby page is very nice - although not wiki based.

Regards,

Peter

From mwachholtz at unomaha.edu  Thu Sep 17 20:31:13 2009
From: mwachholtz at unomaha.edu (Michael UNO)
Date: Thu, 17 Sep 2009 17:31:13 -0700 (PDT)
Subject: [Bioperl-l]  Genome Scanning Question
Message-ID: <25497856.post@talk.nabble.com>


What objects & methods could be used to determine whether a gene is
located at a specific position within a genome in the Ensembl database? For
example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
method that will simply tell me "yes, there is a gene at this location". And
can it tell what gene(s) are located at this coordinate?
-- 
View this message in context: http://www.nabble.com/Genome-Scanning-Question-tp25497856p25497856.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From sdavis2 at mail.nih.gov  Mon Sep 21 09:04:36 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 21 Sep 2009 09:04:36 -0400
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <264855a00909210604o826871dr7121e3f26c0e34aa@mail.gmail.com>

On Thu, Sep 17, 2009 at 8:31 PM, Michael UNO <mwachholtz at unomaha.edu> wrote:

>
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location".
> And
> can it tell what gene(s) are located at this coordinate?
>

There are a number of ways to go about this.

If you want to go with perl, object-oriented, and ensembl, check out:

http://www.ensembl.org/info/docs/api/core/core_tutorial.html

If you want to start with tab-delimited text files, check out downloading
the text files from the UCSC genome browser.
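
For the text-file route, the lookup itself is a simple interval test. A
minimal sketch, assuming a BED-like layout (chrom, 0-based start, exclusive
end, name). Real UCSC tables have more columns in a different order, and the
gene names and coordinates below are invented sample rows, not real
annotation.

```python
import csv
import io

# Hypothetical sample rows: chrom, start (0-based), end (exclusive), name.
SAMPLE = (
    "chr15\t66499000\t66501000\tgeneA\n"
    "chr15\t66600000\t66650000\tgeneB\n"
    "chr6\t100000\t200000\tgeneC\n"
)


def genes_at(handle, chrom, pos):
    """Return names of features covering the 1-based position pos on chrom."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        c, start, end, name = row[0], int(row[1]), int(row[2]), row[3]
        # 0-based half-open interval [start, end) contains 1-based pos
        # exactly when start < pos <= end.
        if c == chrom and start < pos <= end:
            hits.append(name)
    return hits


print(genes_at(io.StringIO(SAMPLE), "chr15", 66500123))  # ['geneA']
```

An empty list means "no gene at this location". For a whole genome an
interval index would be faster than a linear scan, but the test is the same.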

Sean



From cjfields at illinois.edu  Mon Sep 21 09:05:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 08:05:25 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
Message-ID: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>


On Sep 21, 2009, at 7:28 AM, Peter wrote:

> Peter wrote:
>>> We had some similar discussions about the Biopython wiki
>>> based homepage - although our old one was nowhere near
>>> as busy as the current BioPerl main page, it was still not as
>>> welcoming as our current version *tries* to be.
>>> ...
>>> I can dig out links to our mailing list archive if anyone is
>>> interested in the discussion.
>
> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>
>> I'd appreciate those links, Peter- thanks
>> MAJ
>
> OK, here you are - this was most of it, I'd have to dig through
> my old emails to see what else I can find:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>
> Remember Biopython went from a very minimal home page, to
> something aiming to be more newcomer friendly. BioPerl on the
> other hand seems to want to move away from the current very
> text heavy information rich page to something more focused and
> newcomer friendly. To me at least the current page is too dense,
> intimidating, and the important bits get lost in all the content.
>
> [My apologies if any of this feedback comes across as too blunt.]

Not at all; I'm thinking the same thing.

> If you haven't already looked at them, you should checkout the
> other OBF project pages for ideas. The BioJava homepage is
> also using the wiki - in my opinion it is a bit cluttered, but is
> still more accessible than the current BioPerl page. Also,
> the BioRuby page is very nice - although not wiki based.
>
> Regards,
>
> Peter

I think the Biopython layout is very nice and focused.  Maybe a bit  
too minimal, but then again I don't like scrolling up and down the  
page to find the relevant bits, so less may be better.

Reminds me of the simplified design on the perl6 main page (just don't  
stare at the hallucinogenic butterfly too long):

http://www.perl6.org/

So, maybe a structured layout with the most important links, and  
additional links on a separate page.

chris


From maj at fortinbras.us  Mon Sep 21 09:22:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 09:22:35 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <0F980234804C4B3EA08E810E043F2537@NewLife>

Ah! I don't need a degree in design, just a dose of whatever Madame Butterfly 
was taking!
(Erdos had it right...)

----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across as too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should checkout the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit too
> minimal, but then again I don't like scrolling up and down the page to find
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't stare
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and additional
> links on a separate page.
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From biopython at maubp.freeserve.co.uk  Mon Sep 21 09:58:21 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 14:58:21 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <320fb6e00909210658n70f96727g1eb190579a746cfa@mail.gmail.com>

On Mon, Sep 21, 2009 at 2:05 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I think the Biopython layout is very nice and focused. Maybe
> a bit too minimal, but then again I don't like scrolling up and
> down the page to find the relevant bits, so less may be better.

Yes, trying to get everything on one screen was deliberate
(and works for most screen sizes).

> Reminds me of the simplified design on the perl6 main page
> (just don't stare at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links,
> and additional links on a separate page.

Butterflies aside, yes - that is what we tried to do on the
Biopython page - just provide an "abstract", and links to
get people to the main content.

Peter


From ak at ebi.ac.uk  Mon Sep 21 10:06:44 2009
From: ak at ebi.ac.uk (Andreas =?iso-8859-1?B?S+Ro5HJp?=)
Date: Mon, 21 Sep 2009 15:06:44 +0100
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <20090921140644.GB12734@qux.windows.ebi.ac.uk>

On Thu, Sep 17, 2009 at 05:31:13PM -0700, Michael UNO wrote:
> 
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location". And
> can it tell what gene(s) are located at this coordinate?

Here's a basic script to do something like what you want to do, for a
specific species, chromosome, and region:

#!/usr/bin/perl -w

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  '-host' => 'ensembldb.ensembl.org',
  '-user' => 'anonymous'
);

my $species = 'Dog';

my ( $chrname, $chrstart, $chrend ) = ( '13', 40_500_000, 41_000_000 );

my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );

my $slice =
  $slice_adaptor->fetch_by_region( 'Chromosome', $chrname, $chrstart,
  $chrend );

my @genes = @{ $slice->get_all_Genes() };

if ( !@genes ) {
  print("No genes on that interval\n");
} else {
  printf( "%d genes on the interval:\n", scalar(@genes) );
  foreach my $gene (@genes) {
    printf(
      "%s (%s) [%s,%s,%s]\n",
      $gene->stable_id(), $gene->external_name() || 'No external name',
      $gene->start(), $gene->end(), $gene->strand() );
  }
}
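
The question gave a single coordinate written as "Chr15:66,500,123", while
the script above takes a chromosome name plus a start and end. A minimal
sketch of bridging the two follows; parse_coordinate() is a hypothetical
helper name, not part of the Ensembl API, and the one-base slice in the
closing comment is an assumption about how you might want to use it.

```perl
use strict;
use warnings;

# Turn a browser-style coordinate such as "Chr15:66,500,123" into
# ( chromosome name, position ). Returns nothing if the text does not parse.
sub parse_coordinate {
    my ($text) = @_;
    my ( $chr, $pos ) = $text =~ /^\s*(?:chr)?(\w+)\s*:\s*(\d[\d,]*)\s*$/i
        or return;
    $pos =~ tr/,//d;    # drop the thousands separators
    return ( $chr, $pos );
}

my ( $chrname, $position ) = parse_coordinate('Chr15:66,500,123');
print "chromosome $chrname, position $position\n";

# With the $slice_adaptor from the script above, a one-base slice would be:
#   my $slice = $slice_adaptor->fetch_by_region( 'Chromosome', $chrname,
#                                                $position, $position );
#   my @genes = @{ $slice->get_all_Genes() };
```

Passing the same value as start and end keeps the rest of the script
unchanged; the "No genes on that interval" branch then answers the
yes/no part of the question.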


Are you aware of the ensembl-dev mailing list and of the ensembl
helpdesk at helpdesk at ensembl.org (or via the "he!p" button in the genome
browser itself)?


Regards,
Andreas


-- 
Andreas Kähäri, Ensembl Software Developer            ()[]()[]
European Bioinformatics Institute (EMBL-EBI)          []()[]()
Wellcome Trust Genome Campus, Hinxton                 ()[]()[]
Cambridge CB10 1SD, United Kingdom                    []()[]()

From bosborne11 at verizon.net  Mon Sep 21 09:15:03 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 21 Sep 2009 09:15:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7E8EC05A-ED60-4F70-850D-16DD7E037281@verizon.net>

Mark,

That's nice! I wonder if we can move some content up-top, on the  
right, for less scrolling. I will play with this later today...

Brian O.


On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ


From anupam.contact at gmail.com  Mon Sep 21 10:18:52 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Mon, 21 Sep 2009 19:48:52 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
In-Reply-To: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
References: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
Message-ID: <82ec54570909210718v180f604btc835d88f2a9ec2fd@mail.gmail.com>

On Fri, Sep 18, 2009 at 8:50 PM, anupam sinha <anupam.contact at gmail.com>wrote:

> Dear all,
>                  I have installed the BioPerl-1.6.0.tar.gz and
> Bioperl-run-1.6.0.tar.gz on a Fedora 7 system. I am trying to run *
> /usr/bin/bp_pairwise_kaks.pl* script but keep on getting this error :
>
> *Must have bioperl-run pkg installed to run this script at
> /usr/bin/bp_pairwise_kaks.pl line 69*.
>
> Though I have installed the run package from Bioperl. Can anyone help me out
> ? Thanks in advance.
>
>
>
> Regards,
>
>
> Anupam Sinha
>

From maj at fortinbras.us  Mon Sep 21 10:49:25 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 10:49:25 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>

Please view the latest 
http://www.bioperl.org/wiki/Main_Page_Beta
No graphics. I incline towards more text, but you
already knew that.
MAJ

From David.Messina at sbc.su.se  Mon Sep 21 13:03:56 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 21 Sep 2009 19:03:56 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
Message-ID: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>

Hi Mark,
Thanks for taking on this (much needed) refresh.

I think your current version is substantially better than what we have now.
Still, I'd argue that something much more concise like the Biopython page
would make a bigger impact on visitors' ability to find what they're looking
for.

It's not that the details you have under each section shouldn't be
available, but rather that they could be clicked through to instead of being
on the front page.

The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, and that section has the most valuable real estate on the page.
Those who don't know and are curious will probably be able to find it (the
word About on the front page of a website has become an idiom for "click here
to read the details about this").


Dave

From cjfields at illinois.edu  Mon Sep 21 13:42:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 12:42:10 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <5C240DA6-6B3D-4E64-A8BC-1FBC90FFA471@illinois.edu>

On Sep 21, 2009, at 12:03 PM, Dave Messina wrote:

> Hi Mark,
> Thanks for taking on this (much needed) refresh.
>
> I think your current version is substantially better than what we  
> have now.
> Still, I'd argue that something much more concise like the Biopython  
> page
> would make a bigger impact on visitors' ability to find what they're  
> looking
> for.
>
> It's not that the details you have under each section shouldn't be
> available, but rather that they could be clicked through to instead  
> of being
> on the front page.
>
> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, and that section has the most valuable real estate on  
> the page.
> Those who don't know and are curious will probably be able to find  
> it (the
> word About on the front page of a website has become an idiom for  
> "click here
> to read the details about this").
>
>
>
> Dave

How about this version (it's on my talk page):

http://www.bioperl.org/wiki/User_talk:Cjfields

chris

From maj at fortinbras.us  Mon Sep 21 13:45:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 13:45:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <42FBB964C0EA44FABCB50364C567A009@NewLife>

A nearly completely minimal solution is at Main Page Beta

From armendarez77 at hotmail.com  Mon Sep 21 17:01:12 2009
From: armendarez77 at hotmail.com (armendarez77 at hotmail.com)
Date: Mon, 21 Sep 2009 14:01:12 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
Message-ID: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>


Hello,

Is there a function to blast one query sequence against multiple blast databases?  For example, I want to blast a sequence against all Microbial Genomes.  Currently, I can do it by placing multiple Microbial databases (eg. Microbial/100226, Microbial/101510, etc) into an array and iterate through them using a foreach loop.  Each individual database is placed in the '-data' parameter and the blast is performed.

Example Code:

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;

my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690 Microbial/1063);
my $prog  = 'blastn';    # assumed here; the program was not set in the snippet as posted
my $e_val = '1e-3';

foreach my $db (@microbDbs) {
  my @params = ( '-prog'       => $prog,
                 '-data'       => $db,
                 '-expect'     => $e_val,
                 '-readmethod' => 'xml' );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
  my $v = 1;
  my $str = Bio::SeqIO->new( -file => 'test.fa', '-format' => 'fasta' );
  while ( my $input = $str->next_seq() ) {
    my $r = $factory->submit_blast($input);

    #Code continues...

  }
}

Is there a more efficient way to accomplish this?

If this topic has been discussed please point the way.

Thank you,

Veronica

 		 	   		  

From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:10:56 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:10:56 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>

You may need to set up BLAST locally (not a big job), as I don't think you can blast against multiple databases with B:T:R:RemoteBlast. 
Or you could do it manually on NCBI's site, where you can filter results by Entrez query (e.g. 1239[taxid] for Firmicutes): http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query 

--Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bill at genenformics.com  Mon Sep 21 18:21:26 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Mon, 21 Sep 2009 15:21:26 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
Message-ID: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>

BLAST DBs can be concatenated into a single target (.nal or .pal) file.

Check this out:

http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
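
For a local BLAST installation, an alias file is just a short text file
naming the databases to be searched together. A rough sketch of writing
one from Perl is below; the TITLE and DBLIST keywords are as I remember
them from the alias file format, so check them against the article above,
and the database names and alias_file_text() are made up for illustration.

```perl
use strict;
use warnings;

# Compose the text of a BLAST alias file (.nal for nucleotide databases,
# .pal for protein ones). The database names must match local formatted
# databases; the ones used below are placeholders.
sub alias_file_text {
    my ( $title, @dbs ) = @_;
    return join "\n",
        '#',
        '# Alias file written by a hypothetical helper script',
        '#',
        "TITLE $title",
        'DBLIST ' . join( ' ', @dbs ),
        '';
}

print alias_file_text( 'Selected microbial genomes',
    qw(db_100226 db_101510 db_103690) );
```

Redirect the output to something like microbial.nal next to the databases
and point the local blast at "microbial".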

Bill



From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:48:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:48:26 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>

That doesn't work with remote databases though.
B:T:R:RemoteBlast uses the QBlast API (I think) so you're limited to the prebuilt databases NCBI offers.
http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html 

Another thing to try is space-separating your db list - I know it works with local blasts.
You could also bypass RemoteBlast and do it yourself by POSTing via URL.

This seems to work with multiple databases but you'd need to experiment:

http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microbial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&EXPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides&CMD=Put
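
If you would rather build that URL in code than by hand, something along
these lines should reproduce it. This is only a sketch: encode_value() and
build_put_url() are names I made up, the parameter values are simply the
ones from the URL above, and whether the server honours several databases
in one request still needs testing.

```perl
use strict;
use warnings;

# Percent-encode a value for a query string, leaving '/' readable
# so the database names look the same as in the URL above.
sub encode_value {
    my ($value) = @_;
    $value =~ s{([^A-Za-z0-9_.~/-])}{sprintf '%%%02X', ord $1}ge;
    return $value;
}

# Build a "Put" request URL for one query against several databases.
# The database list is space-separated and wrapped in double quotes.
sub build_put_url {
    my ( $query, @dbs ) = @_;
    my @params = (
        QUERY        => $query,
        DATABASE     => '"' . join( ' ', @dbs ) . '"',
        HITLIST_SIZE => 10,
        FILTER       => 'L',
        EXPECT       => 10,
        FORMAT_TYPE  => 'HTML',
        PROGRAM      => 'blastn',
        CLIENT       => 'web',
        SERVICE      => 'plain',
        NCBI_GI      => 'on',
        PAGE         => 'Nucleotides',
        CMD          => 'Put',
    );
    my @pairs;
    while ( my ( $key, $value ) = splice @params, 0, 2 ) {
        push @pairs, "$key=" . encode_value($value);
    }
    return 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?' . join '&', @pairs;
}

print build_put_url( '257700677',
    qw(Microbial/100226 Microbial/101510 Microbial/103690) ), "\n";
```

Keeping the parameters in an ordered list rather than a hash means the
generated URL has the same parameter order as the hand-written one.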


--Russell




From Russell.Smithies at agresearch.co.nz  Mon Sep 21 19:04:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 11:04:54 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA19@exchsth.agresearch.co.nz>

If you want to "manually" use Perl and QBlast, here's some example code.
I don't remember where it came from but it works well  :-)

**Ignore the UserAgent stuff, our firewall is fairly well tied down.

--Russell

============================

#!perl -w
use strict;
$| = 1;

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

# Shared user agent, used by the subs below
my $ua = LWP::UserAgent->new;
push @{ $ua->requests_redirectable }, 'POST';   # LWP doesn't follow redirects for POST by default
$ua->agent('Mozilla/5.0');

#$ua->proxy( [ 'http', 'ftp' ] => 'http://username:password@your.proxy.if.required:8080' );

my $verbose = 1;
my $seq     = getSequence();
my ( $blast, $taxonomy ) = queryQBlast($seq);
$verbose && print "saving result\n";
saveToFile( $blast,    "blast.txt" );
saveToFile( $taxonomy, "taxonomy.html" );
$verbose && print "Done.\n";

sub queryQBlast {
  my ($seq) = @_;
  $seq =~ s/[\d\n\W]//g;
  my $sleepTime          = 0;
  my $sleepTimeIncrement = 5;
  my $totalSleepTime     = 0;
  my $maxSleepTime       = 600;    # 10 min
  my ( $rid, $rtoe ) = startQBlast($seq);
  my ( $blast, $taxonomy );

  while ( !$blast ) {
    $verbose && printf "wait %3d seconds\n", $sleepTime;
    sleep $sleepTime;
    $totalSleepTime += $sleepTime;    # count the time actually slept
    ( $blast, $taxonomy ) = retrieveQBlastResult($rid);
    $sleepTime += $sleepTimeIncrement unless ( $sleepTime > 100 );
    last if ( $totalSleepTime > $maxSleepTime );
  }
  return ( $blast, $taxonomy );
}

sub startQBlast {
  my ($sequence) = @_;
  my ( $expect, $wsize, $filter, $mega );
  my $hitList = 100;
  if ( length($sequence) <= 20 ) {
    $expect = 1000;
    $wsize  = 7;
    $mega   = "on";
    $filter = "";
  }
  else {
    $expect = 10;
    $wsize  = 28;
    $mega   = "on";
    $filter = "L";    # Low complexity
  }
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL . "QUERY=$sequence";
  $url .=
"&DATABASE=nr&HITLIST_SIZE=${hitList}&FILTER=${filter}&EXPECT=${expect}&FORMAT_TYPE=Text";
  $url .=
    "&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides";
  $url .= "&SHOW_OVERVIEW=&WORD_SIZE=${wsize}&MEGABLAST=${mega}&CMD=Put";
  my $req = HTTP::Request->new( GET => $url );
  my $content = $ua->request($req)->content;
  $content =~ s/\s+/ /g;
  my ( $rid, $rtoe ) = $content =~
    /QBlastInfoBegin RID = ([\d\-\.\w]+) RTOE = (\d+) QBlastInfoEnd/;
  if ( !$rid ) { print qq{\nERROR missing RID:\n}; exit; }
  $verbose && print "RID $rid\n";
  return ( $rid, $rtoe );
}

sub retrieveQBlastResult {
  my ($rid)     = @_;
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL
    . "RID=$rid&CMD=Get&SHOW_OVERVIEW=&SHOW_LINKOUT=&FORMAT_TYPE=Text";
  my ( $blast, $taxonomy, $req );
  $req = HTTP::Request->new( GET => $url );
  $blast = $ua->request($req)->content;
  if ( $blast =~ /\s+Status=WAITING/ ) {
    $blast = "";
  }
  elsif ( $blast =~ /\s+Status=UNKNOWN/ ) {
    print "Error in processing\nRID $rid\n";
    exit;
  }
  else {
    $verbose && print "got blast result\n";
    $verbose && print "retrieving taxonomy data\n";
    $url = $qblastURL . "CMD=Get&RID=$rid&FORMAT_OBJECT=TaxBlast&NCBI_GI=on";
    $req = HTTP::Request->new( GET => $url );
    $taxonomy = $ua->request($req)->content;
    $taxonomy = "" if ( $taxonomy =~ /No valid taxids found in the alignment/ );
  }
  return ( $blast, $taxonomy );
}

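sub saveToFile {
  my ( $data, $file ) = @_;
  # Lexical filehandle, 3-argument open, and a check that the open worked
  open( my $out, '>', $file ) or die "Cannot write $file: $!";
  print {$out} $data;
  close $out;
}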

sub getSequence {
  return qq{
AAAGGATTTATTGACGATGCGAACTACTCCGTTGGCCTGTTGGATGAAGGAACAAA
CCTTGGAAATGTTATTGATAACTATGTTTATGAACATACCCTGACAGGAAAAAATGCAT
TTTTTGTGGGGGATCTTGGGAAGATCGTGAAGAAGCACAGTCAGTGGCAGACCGTGGTG
GCTCAGATAAAGCCGTTTTACACGGTGAAGTGCAACTCCACTCCAGCCGTGCTTGAGAT
CTTGGCAGCTCTTGGAACTGGGTTTGCTTGTTCCAGCAAAAATGAAATGGCTTTAGTGC
AAGAATTGGGTGTATCTCCAGAAAACATCATTTTCACAAGTCCTTGTAAGCAAGTGTCT
CAGATAAAGTATGCAGCAAAAGTTGGAGTAAATATTATGACATGTGACAATGAGATTGA
ATTAAAGAAAATTGCAAGGAATCACCCAAATGCCAAGGTCTTACTACATATTGCAACAG
AAGATAATATTGGAGGTGAAGATGGTAACATGAAGTTTGGCACTACACTGAAGAATTGT
AGGCATCTTTTGGAATGTGCCAAGGAACTTGATGTCCAAATAATTGGGGTTAAATTTCA
TGTTTCAAGTGCTTGCAAAGAATATCAAGTATATGTACATGCCCTGTCTGATGCTCGAT
GTGTGTTTGACATGGCTGGAGAGTTTGGCTTTACAATGAACATGTTAGACATCGGTGGA
GGCTTCACAGGAACTGAAATTCAGTTGGAAGAGGTTAATCATGTTATCAGTCCTCTGTT
GGATATTTACTTCCCTGAAGGATCTGGCATTCAGATAATTTCAGAACCTGGAAGCTACT
ATGTATCTTCTGCGTTTACACTTGCAGTCAATATTATTGCTAAGAAAGTTGTTGAAAAT
GATAAATTTTCCTCTGGAGTAGAAAAAAATGGGAGTGATGAGCCAGCCTTCGTGTATTA
CATGAATGATGGTGTTTATGGTTCTTTTGCGAGTAAGCTTTCTGAGGACTTAAATACCA
TTCCAGAGGTTCACAAGAAATACAAGGAAGATGAGCCTCTGTTTACAAGCAGCCTTTGG
GGTCCATCCTGTGATGAGCTTGATCAAATTGTGGAAAGCTGTCTTCTTCCTGAGCTGAA
TGTGGGAGATTGGCTTATCTTTGATAACATGGGAGCAGATTCTTTCCACGAACCATCTG
CTTTTAATGATTTTCAGAGGCCAGCTATTTATTTCATGATGTCATTCAGTGATTGGTAT
GAGATGCAAGATGCTGGAATTACTTCAGATGCAATGATGAAAAACTTCTTCTTTGCACC
CTCTTGTATTCAGCTGAGCCAAGAAGACAGCTTTTCCACTGAAGCT};
}

================================
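
The polling in retrieveQBlastResult() comes down to looking for a Status=
line in whatever comes back. Pulled out on its own it looks roughly like
the sketch below; qblast_status() is a name I made up, the sample responses
are illustrative rather than captured from NCBI, and anything without a
WAITING or UNKNOWN status is treated as a finished result, as in the
script above.

```perl
use strict;
use warnings;

# Classify the body of a QBlast "Get" response by its Status= line.
sub qblast_status {
    my ($content) = @_;
    return 'WAITING' if $content =~ /\s+Status=WAITING/;    # still running
    return 'UNKNOWN' if $content =~ /\s+Status=UNKNOWN/;    # bad or expired RID
    return 'READY';                                         # assume a result
}

for my $sample (
    "QBlastInfoBegin\n\tStatus=WAITING\nQBlastInfoEnd\n",
    "QBlastInfoBegin\n\tStatus=UNKNOWN\nQBlastInfoEnd\n",
    "BLASTN report text would be here\n",
  )
{
    print qblast_status($sample), "\n";
}
```

Having the check in one place makes it easier to add other cases later
without touching the retrieval code.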

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Smithies, Russell
> Sent: Tuesday, 22 September 2009 10:48 a.m.
> To: 'bill at genenformics.com'
> Cc: 'bioperl-l at lists.open-bio.org'; 'armendarez77 at hotmail.com'
> Subject: Re: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> 
> That doesn't work with remote databases though.
> B:T:R:RemoteBlast uses the QBlast API (I think) so you're limited to the
> prebuilt databases NCBI offers.
> http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
> 
> Another thing to try is space-separating your db list - I know it works with
> local blasts.
> You could also bypass RemoteBlast and do it yourself by POSTing via URL.
> 
> This seems to work with multiple databases but you'd need to experiment:
> 
> http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microbial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&EXPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides&CMD=Put
> 
> 
> --Russell
> 
> 
> > -----Original Message-----
> > From: bill at genenformics.com [mailto:bill at genenformics.com]
> > Sent: Tuesday, 22 September 2009 10:21 a.m.
> > To: Smithies, Russell
> > Cc: 'armendarez77 at hotmail.com'; 'bioperl-l at lists.open-bio.org'
> > Subject: Re: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> >
> > BLAST DBs can be concatenated into a single target (.nal or .pal) file.
> >
> > Check this out:
> >
> > http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
> >
> > Bill
> >
> > > You may need to setup blast locally (not a big job) as I don't think you
> > > can blast against multiple databases with B:T:R:RemoteBlast.
> > > Or you could do it manually on NCBI's site where you can filter results by
> > > entrez query (e.g. 1239[taxid] for Firmicutes)
> > > http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
> > >
> > > --Russell
> > >
> > >
> > >> -----Original Message-----
> > >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > >> bounces at lists.open-bio.org] On Behalf Of armendarez77 at hotmail.com
> > >> Sent: Tuesday, 22 September 2009 9:01 a.m.
> > >> To: bioperl-l at lists.open-bio.org
> > >> Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> Hello,
> > >>
> > >> Is there a function to blast one query sequence against multiple blast
> > >> databases?  For example, I want to blast a sequence against all
> > >> Microbial
> > >> Genomes.  Currently, I can do it by placing multiple Microbial databases
> > >> (eg.
> > >> Microbial/100226, Microbial/101510, etc) into an array and iterate
> > >> through
> > >> them using a foreach loop.  Each individual database is placed in the
> > >> '-data'
> > >> parameter and the blast is performed.
> > >>
> > >> Example Code:
> > >>
> > >> use strict;
> > >> use warnings;
> > >> use Bio::SeqIO;
> > >> use Bio::Tools::Run::RemoteBlast;
> > >>
> > >> my $prog = 'blastn';    # assumed value; $prog was not defined in the original
> > >> my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690
> > >> Microbial/1063);
> > >> my $e_val= '1e-3';
> > >>
> > >> foreach my $db(@microbDbs){
> > >>   # One factory per database: '-data' takes a single database name here.
> > >>   my @params = ( '-prog' => $prog,
> > >>                          '-data' => $db,
> > >>                          '-expect' => $e_val,
> > >>                          '-readmethod' => 'xml' );
> > >>
> > >>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > >>   my $v = 1;
> > >>   my $str = Bio::SeqIO->new(-file=>'test.fa' , '-format' => 'fasta' );
> > >>   while (my $input = $str->next_seq()){
> > >>     my $r = $factory->submit_blast($input);
> > >>
> > >>     #Code continues...
> > >>
> > >>   }   # end while (closing brace was missing in the original)
> > >> }
> > >>
> > >> Is there a more efficient way to accomplish this?
> > >>
> > >> If this topic has been discussed please point the way.
> > >>
> > >> Thank you,
> > >>
> > >> Veronica
> > >>
> > >>
> > >> _______________________________________________
> > >> Bioperl-l mailing list
> > >> Bioperl-l at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > > =======================================================================
> > > Attention: The information contained in this message and/or attachments
> > > from AgResearch Limited is intended only for the persons or entities
> > > to which it is addressed and may contain confidential and/or privileged
> > > material. Any review, retransmission, dissemination or other use of, or
> > > taking of any action in reliance upon, this information by persons or
> > > entities other than the intended recipients is prohibited by AgResearch
> > > Limited. If you have received this message in error, please notify the
> > > sender immediately.
> > > =======================================================================
> > >
> > >
> >
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
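
For anyone who wants to try the "POST it yourself" route mentioned in the quoted message, here is a minimal sketch of how the Put URL with several databases could be assembled. The parameter names (QUERY, DATABASE, PROGRAM, EXPECT, CMD) are copied from the URL quoted above; the helper names percent_encode and build_put_url are made up for illustration, and whether NCBI accepts a given database list still needs experimenting. Nothing is sent over the network here.

```perl
use strict;
use warnings;

# Percent-encode every character outside the RFC 3986 unreserved set.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Build a CMD=Put URL (hypothetical helper). The database names are joined
# with spaces and wrapped in double quotes, as in the URL quoted above.
sub build_put_url {
    my ( $query, @dbs ) = @_;
    my @params = (
        [ QUERY    => $query ],
        [ DATABASE => '"' . join( ' ', @dbs ) . '"' ],
        [ PROGRAM  => 'blastn' ],
        [ EXPECT   => 10 ],
        [ CMD      => 'Put' ],
    );
    my $base = 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi';
    return $base . '?'
        . join( '&', map { $_->[0] . '=' . percent_encode( $_->[1] ) } @params );
}

print build_put_url( '257700677', 'Microbial/100226', 'Microbial/101510' ), "\n";
```

The slash in each database name comes out as %2F, which is the same value once decoded as the literal slash in the original URL. The resulting string could then be handed to LWP::UserAgent or pasted into a browser.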


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 16:51:51 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 08:51:51 +1200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>

Here's a few comments to ignore at will :-)

How about using a different default skin so it doesn't look like all the other installations of MediaWiki?
I've attached a screenshot of one of my wikis using the "Daddio" skin but a bit of crafty CSS can do wonders.
Also, there's a lot of duplication with most of the links on Mediawiki:Sidebar also appearing on the main page content.
The "Treeview" is a nice extension as well for tidying up complex menus http://semeb.com/dpldemo/index.php?title=Treeview_extension 

I've got a bit of experience with wikis and extensions (we use LOTS of extensions) so let me know if there's anything you need.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Monday, 21 September 2009 4:23 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] a Main Page proposal
> 
> Hello all,
> 
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
> 
> cheers and thanks,
> MAJ

-------------- next part --------------
A non-text attachment was scrubbed...
Name: daddio.png
Type: image/png
Size: 51263 bytes
Desc: daddio.png
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090922/643d7f79/attachment-0001.png>

From cjfields at illinois.edu  Mon Sep 21 23:38:18 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:38:18 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
Message-ID: <B9C1E8A4-BDE0-45E7-858B-8BFABA1D2480@illinois.edu>

Russell, Mark,

It would be nice to change the background, just don't want it to be  
too distracting.

Also (I mentioned this to Mark off-list), I think the sidebar would be  
cleaned up considerably, but not until this becomes the default.  I  
also like the use of the TreeView extension, very nice!  Anyone have  
privs for the wiki to test it out?

chris

On Sep 21, 2009, at 3:51 PM, Smithies, Russell wrote:

> Here's a few comments to ignore at will :-)
>
> How about using a different default skin so it doesn't look like all  
> the other installations of MediaWiki?
> I've attached a screenshot of one of my wikis using the "Daddio"  
> skin but a bit of crafty CSS can do wonders.
> Also, there's a lot of duplication with most of the links on  
> Mediawiki:Sidebar also appearing on the main page content.
> The "Treeview" is a nice extension as well for tidying up complex  
> menus http://semeb.com/dpldemo/index.php?title=Treeview_extension
>
> I've got a bit of experience with wikis and extensions (we use LOTS  
> of extensions) so let me know if there's anything you need.
>
> --Russell
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
>> Sent: Monday, 21 September 2009 4:23 p.m.
>> To: BioPerl List
>> Subject: [Bioperl-l] a Main Page proposal
>>
>> Hello all,
>>
>> As Brian articulated so well for many of us,
>> the wiki main page is, well, butt-ugly.
>> Please check out the Main Page Beta at
>> http://www.bioperl.org/wiki/Main_Page_Beta
>> and respond to this thread or on the discussion
>> page.
>>
>> cheers and thanks,
>> MAJ
>


From cjfields at illinois.edu  Mon Sep 21 23:56:58 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:56:58 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
Message-ID: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>

Just a note that the second alpha is out and propagating its way  
around the intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This should address the bugs reported by Scott from the last release.   
Just a note, but I am seeing a warning popping up with 64-bit perl  
5.10.1 on Mac with PopGen tests (I think it's a floating point  
addition issue).  Let me know if this is popping up elsewhere.

Enjoy!

chris

From jcline at ieee.org  Mon Sep 21 23:59:09 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 21 Sep 2009 22:59:09 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
Message-ID: <4AB84B8D.5080005@ieee.org>

Throwing this out there:

- there should be a screenshot section (whatever that means for bioperl)

- the grammar of the beta page should be more correct.

"Welcome to BioPerl, a community effort to produce Perl code which is
useful in biology. "
==> "Welcome to BioPerl, a community effort to produce Perl code serving
as a useful tool in the field of Biology."

>>The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, ...  Dave<<


Most good software front pages say, in a couple sentences, "what it is
and what it's for", including pictures (as screenshots).

I would bet a ton of visitors don't know what bioperl is, or what it is
used for, or how it can benefit.  There is likely a metric for this (web
stats) as the ratio of new page visits that bounce away vs. new
clickthrus from the front page to the download or docs section.   i.e. a
visitor found the page and didn't continue reading.  I don't really know
all the things bioperl is good for and I've been reading about it here &
there for a while.

I like the following from the About and I believe it fits well on a
front page, expanding "toolkit" to "software library":

"What is Bioperl? It is an open source bioinformatics software library
used by researchers all over the world. If you're looking for a script
built to fit your exact needs you probably won't find it in Bioperl.
What you will find is a diverse set of Perl modules that will enable you
to write your own script, and a community of people who are willing to
help you. "

The old school definition of software library is something like: "useful
routines which can be used by an application (& not itself an
application)" which is basically the description above.

I also like the intro from wikipedia, which I found more informative
about bioperl, and would be good for a front page:

"BioPerl [1] is a collection of Perl modules that facilitate the
development of Perl scripts for bioinformatics applications. It has
played an integral role in the Human Genome Project.[2]  It is an active
open source software project supported by the Open Bioinformatics
Foundation.  In order to take advantage of BioPerl, the user needs a
basic understanding of the Perl programming language including an
understanding of how to use Perl references, modules, objects and methods."

The screenshots could also include pics of books on bioperl or perl+bio,
that would be neat.  (Tisdall's book comes to mind here)


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From lelbourn at science.mq.edu.au  Tue Sep 22 01:05:28 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 15:05:28 +1000
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB36451.3030207@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
Message-ID: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>

Hi Roy,

Thanks for that, works well, but there are no _gsf_tag_hash values?  
I'm particularly interested in the locus id, obviously the translation  
could be problematic if the whole gene is not included after  
truncation, but things like the note, product, protein_id would be  
good. I had a look at the code for the method and couldn't see any  
obvious reason why those values didn't make it across. Should I submit this  
as a bug, or is there something I'm missing?


Regards,
Liam.


On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:

> Hi Liam,
>
> I just discovered your message, which has not yet been replied to.  
> What you require has been discussed in a recent thread:
> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>
> Try using trunc_with_features from Bio::SeqUtils:
>
> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
> Cheers.
> Roy.
>
> Liam Elbourne wrote:
>> Hi All,
>> Is there a method or methodology that will produce a fully fledged  
>> Seq  object with all the associated metadata given a start and end   
>> position? To clarify, I create a sequence object from a genbank file:
>> ****
>> my $io  = Bio::SeqIO->new(as per usual);
>> my $seqobj = $io->next_seq();
>> ****
>> I now want:
>> my $sub_seqobj = $seqobj between 300 and 2000
>> where $sub_seqobj is a Seq object (which I appreciate is an   
>> 'aggregate' of objects) too. The "trunc" method only returns a   
>> PrimarySeq object which lacks all the annotation etc. I've  
>> previously  done this task by iterating through feature by feature  
>> and parsing out  what I needed, but thought there might be a more  
>> elegant approach...
>> Regards,
>> Liam Elbourne.
>
> -- 
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.

______________________________

Dr Liam Elbourne
Research Fellow (Bioinformatics)
Paulsen Laboratory
Macquarie University
Sydney
Australia.

http://www2.oxfam.org.au/trailwalker/Sydney/team/228


From roy.chaudhuri at gmail.com  Tue Sep 22 03:17:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 08:17:26 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
Message-ID: <4AB87A06.4000209@gmail.com>

Hi Liam,

Yes, that is a bug - I think it is to do with the Feature Annotation 
rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
wrote don't check for the presence of tags, just the coordinates of the 
feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
take a look when I get a chance.

Cheers.
Roy.

Liam Elbourne wrote:
> Hi Roy,
> 
> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
> particularly interested in the locus id, obviously the translation could 
> be problematic if the whole gene is not included after truncation, but 
> things like the note, product, protein_id would be good. I had a look at 
> the code for the method and couldn't see any obvious reason why those values 
> didn't make it across. Should I submit this as a bug, or is there 
> something I'm missing?
> 
> 
> Regards,
> Liam.
> 
> 
> 
> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
> 
>> Hi Liam,
>>
>> I just discovered your message, which has not yet been replied to. 
>> What you require has been discussed in a recent thread:
>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>
>> Try using trunc_with_features from Bio::SeqUtils:
>>
>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>> Cheers.
>> Roy.
>>
>> Liam Elbourne wrote:
>>> Hi All,
>>> Is there a method or methodology that will produce a fully fledged 
>>> Seq  object with all the associated metadata given a start and end 
>>>  position? To clarify, I create a sequence object from a genbank file:
>>> ****
>>> my $io  = Bio::SeqIO->new(as per usual);
>>> my $seqobj = $io->next_seq();
>>> ****
>>> I now want:
>>> my $sub_seqobj = $seqobj between 300 and 2000
>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>> previously  done this task by iterating through feature by feature 
>>> and parsing out  what I needed, but thought there might be a more 
>>> elegant approach...
>>> Regards,
>>> Liam Elbourne.
>>
>> -- 
>> Dr. Roy Chaudhuri
>> Department of Veterinary Medicine
>> University of Cambridge, U.K.
> 
> ______________________________
> 
> Dr Liam Elbourne
> Research Fellow (Bioinformatics)
> Paulsen Laboratory
> Macquarie University
> Sydney
> Australia.
> 
> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
> 
> 
> 


From lelbourn at science.mq.edu.au  Tue Sep 22 03:14:44 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 17:14:44 +1000
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
Message-ID: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>

So I also had no problem running the code as written by Jose (Bioperl  
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.  
Work is underway to correct them, but do not expect this code to give  
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it  
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it  
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change  
your life....
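
A minimal sketch of what I mean, using only core modules. It shows how to tell a plain array reference (the kind that prints as ARRAY(0x...)) from a blessed object before calling a method on it, and how to dump its contents. The sub name describe and the class name My::Matrix are made up for illustration and have nothing to do with BioPerl's own matrix classes.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(blessed reftype);

# Say what a value is before trying to call methods on it.
sub describe {
    my ($thing) = @_;
    return 'undef'        unless defined $thing;
    return 'plain scalar' unless ref $thing;
    return blessed($thing)
        ? 'object of class ' . blessed($thing)
        : 'unblessed ' . reftype($thing) . ' reference';
}

# A plain array ref of rows, and the same data wrapped in an object.
my $rows   = [ [ 0, 0.1 ], [ 0.1, 0 ] ];
my $object = bless( { values => $rows }, 'My::Matrix' );    # hypothetical class

print describe($rows),   "\n";
print describe($object), "\n";

# Dump the nested structure in a compact, readable layout.
$Data::Dumper::Indent = 1;
print Dumper($rows);
```

Calling a method on the first kind of value is what produces "Can't call method ... on unblessed reference"; the Dumper output shows what the value actually holds.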

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  
> older code didn't recognize 'jukes-cantor' but you can try the  
> abbreviation 'jc' -- better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  
> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  
> function) rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  
>> upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  
>> reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix- 
>> >print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  
>> following message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html 
>> , as it is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>

______________________________


From maj at fortinbras.us  Tue Sep 22 07:12:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 07:12:38 -0400
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>

Thanks Liam-- I think the discrepancy between dnadist and the
module is worth making a bug report for- can you do that and
include the data (or part of it) you were using?
Jason, is that work really underway, or should someone pick up
that ball?
----- Original Message ----- 
From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>; "Jose ." 
<joseguillin at hotmail.com>
Sent: Tuesday, September 22, 2009 3:14 AM
Subject: [Bioperl-l] dnastatistics


So I also had no problem running the code as written by Jose (Bioperl
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.
Work is underway to correct them, but do not expect this code to give
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change
your life....

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  older code 
> didn't recognize 'jukes-cantor' but you can try the  abbreviation 'jc' --  
> better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  of 
> Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  function) 
> rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix-
>> >print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  following 
>> message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found in 
>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html , 
>> as it is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>



From dan.bolser at gmail.com  Tue Sep 22 09:09:50 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 14:09:50 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
Message-ID: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>

Hi all,

I'm reading in a blasttable format blast result, filtering, and hoping
to write out a similarly formatted result. Based on experience with
SeqIO, I expected to do something like the following:

use Bio::SearchIO;

## Open the sequence search report
my $seqI = Bio::SearchIO->
  new( -file   => $file,
       -format => $format,
     );

## Open the output report
my $seqO = Bio::SearchIO->
  new( -file   => ">OUTPUT",
       -format => $format,
     );

while( my $result = $seqI->next_result ) {
  ## Do some filtering...

  $seqO->write_result( $result );
}


However, the above method does not work here. Is this for some deep
reason, or could the above method (based on the way SeqIO works) be
made to work? I'm guessing that the SearchIO object conversion is
simply harder to do than with SeqIO?

So now I'm trying to use the correct method, via
Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
1 to 1 correspondence between the fields in the blasttable and the
columns provided by the writer. So far I have something like this:

blasttable ->             HSPTableWriter

(result) query_name       query_name
(hit) name                hit_name
(hsp) frac_identical      frac_identical_query?
                          frac_identical_hit?
(hsp) hsp_length          length_aln_query?
                          length_aln_hit?
(?) mismatches            ?
(hsp) gaps                ?
                          gaps_query?
                          gaps_hit?
                          gaps_total?
(hsp) start('query')      start_query
(hsp) end('query')        end_query
(hsp) start('hit')        start_hit
(hsp) end('hit')          end_hit
(hsp) significance        expect
(hsp) bits                bits


For (hsp) frac_identical, it seems as if the (undocumented)
frac_identical_total column is giving the right value. However, it's
hard to be certain because the format of the value is different
(the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
change the output format of HSPTableWriter?

Is there any improvement on the above mapping? It seems strange that I
can read in a blasttable, but I can't write one out (using a generic
object interface). For example, where do I get the hsp length from
(which column)?

I'm sure this has come up before, so apologies for not being able to
track down the appropriate docs.


Thanks for any help,
Dan.

P.S. when dumping a blasttable from a blasttable using HSP methods,
how should I calculate the number of mismatches? Currently I'm trying:

      my $len = $hsp->length;
      my $match = $len * $hsp->frac_identical;
      my $mismatch = $len - $match;

but the resulting values differ from those in the original blasttable.
I have the feeling this is a FAQ ...

From cjfields at illinois.edu  Tue Sep 22 10:00:44 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:00:44 -0500
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
Message-ID: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>

On Sep 22, 2009, at 8:09 AM, Dan Bolser wrote:

> Hi all,
>
> I'm reading in a blasttable format blast result, filtering, and hoping
> to write out a similarly formatted result. Based on experience with
> SeqIO, I expected to do something like the following:
>
> use Bio::SearchIO;
>
> ## Open the sequence search report
> my $seqI = Bio::SearchIO->
> new( -file   => $file,
>      -format => $format,
>    );
>
> ## Open the output report
> my $seqO = Bio::SearchIO->
> new( -file   => ">OUTPUT",
>      -format => $format,
>    );
>
> while( my $result = $seqI->next_result ) {
> ## Do some filtering...
>
> $seqO->write_result( $result );
> }
>
>
> However, the above method does not work here. Is this for some deep
> reason, or could the above method (based on the way SeqIO works) be
> made to work? I'm guessing that the SearchIO object conversion is
> simply harder to do than with SeqIO?

This is something Jason could probably speak up on, but from my
perspective it comes down to 'why?'.  This opens up a very
hard-to-implement door (converting between, for instance, BLAST and
HMMER formats), which doesn't make sense from the end-user perspective.
What most users want from those formats is easy access to the data, so
they can process it further (filter, convert to GFF, etc.) or have it
summarized.  The Writer classes take care of the latter.

There is a very generic, all-purpose write_result in Bio::SearchIO
that just calls a ResultWriter object (and dies if one isn't
present).  Note that this expects a ResultWriter, not a Hit/HSPWriter;
it is write_result() after all. I think this kind of goes against the
well-established API of the other write_foo implementations in the IO
classes, where the input and output formats should match, but there
you have it.

> So now I'm trying to use the correct method, via
> Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
> 1 to 1 correspondence between the fields in the blasttable and the
> columns provided by the writer. So far I have something like this:
>
> blasttable ->		HSPTableWriter
>
> (result) query_name	query_name
> (hit) name		hit_name
> (hsp) frac_identical	frac_identical_query?
> 			frac_identical_hit?
> (hsp) hsp_length	length_aln_query?
> 			length_aln_hit?
> (?) mismatches		?
> (hsp) gaps		?
> 			gaps_query?
> 			gaps_hit?
> 			gaps_total?
> (hsp) start('query')	start_query
> (hsp) end('query')	end_query
> (hsp) start('hit')	start_hit
> (hsp) end('hit')	end_hit
> (hsp) significance	expect
> (hsp) bits		bits
>
>
> For (hsp) frac_identical, it seems as if the (undocumented)
> frac_identical_total column is giving the right value. However, it's
> hard to be certain because the format of the value is different
> (the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
> change the output format of HSPTableWriter?

Not sure but it appears hard-coded.  This could probably be rewritten  
to spit out certain data attributes by name (e.g. you could ask for  
percent_identity), but I'm not sure.
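
In the meantime, one possible workaround is to post-process the
tab-delimited output. The following is only a sketch in plain Perl, not
part of HSPTableWriter: the function name and the column index are
assumptions made up for illustration. Note that it cannot recover
precision the writer has already dropped (0.94 becomes 94.00, not
93.51).

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): convert one column of a
# tab-delimited line from a fraction (0.94) to a percentage string (94.00).
# $col is the 0-based index of the column to convert.
sub fraction_col_to_percent {
    my ($line, $col) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    $fields[$col] = sprintf '%.2f', 100 * $fields[$col];
    return join "\t", @fields;
}

# Made-up example row: query, hit, fraction identical, alignment length
print fraction_col_to_percent("query1\thit1\t0.94\t77", 2), "\n";
```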

> Is there any improvement on the above mapping? It seems strange that I
> can read in a blasttable, but I can't write one out (using a generic
> object interface). For example, where do I get the hsp length from
> (which column)?
>
> I'm sure this has come up before, so apologies for not being able to
> track down the appropriate docs.

 From the POD:

'Here are the columns that can be specified in the -columns parameter
when creating a HSPTableWriter object.  If a -columns parameter is not
specified, this list, in this order, will be used as the default.'

In other words, you keep track of the columns (which appear 1-based).

> Thanks for any help,
> Dan.
> P.S. when dumping a blasttable from a blasttable using HSP methods,
> how should I calculate the number of mismatches? Currently I'm trying:
>
>     my $len = $hsp->length;
>     my $match = $len * $hsp->frac_identical;
>     my $mismatch = $len - $match;
>
> but the resulting values differ from those in the original blasttable.
> I have the feeling this is a FAQ ...

Maybe use seq_inds instead?

BTW, HSP length() defaults to the 'total' length (which includes gaps).
The above calculation doesn't account for that.

With seq_inds, 'mismatch' is residue-only (no gaps); 'no_match' is
mismatched residues + gaps (you also have to indicate whether this is
based on the query or the hit).

Also note that seq_inds deals with (1) mapping differences, e.g. any  
query that requires translation, and (2) frameshifts, such as from  
FASTX/Y output (again translated sequence output).  If you are dealing  
with a translated sequence you will want to account for those bits as  
well.
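
To illustrate the gap issue, here is a rough sketch in plain Perl. It
assumes the blasttable mismatch column counts aligned positions that
are neither identical nor gaps; the function name and the numbers are
made up for illustration, and in practice the three inputs would have
to come from the HSP object.

```perl
use strict;
use warnings;

# Hypothetical helper: mismatches are the aligned positions that are
# neither identical nor gaps, so gap positions must be subtracted as well.
sub count_mismatches {
    my ($aln_length, $num_identical, $gap_positions) = @_;
    return $aln_length - $num_identical - $gap_positions;
}

# Made-up numbers: a 77-column alignment, 72 identities, 2 gap positions
my ($len, $ident, $gaps) = (77, 72, 2);

printf "gap-aware mismatches: %d\n", count_mismatches($len, $ident, $gaps);
printf "length - identical:   %d\n", $len - $ident;    # also counts the gaps
```

With these numbers the first line reports 3 and the second 5; the
difference is exactly the two gap positions.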

chris

From cjfields at illinois.edu  Tue Sep 22 10:20:47 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:20:47 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB84B8D.5080005@ieee.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
Message-ID: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>

On Sep 21, 2009, at 10:59 PM, Jonathan Cline wrote:

> Throwing this out there:
>
> - there should be a screenshot section (whatever that means for bioperl)

The only area that would apply is for Gbrowse/Bio::Graphics.  For much  
of the rest that's a bit trickier, but it's possible.

> - the grammar of the beta page should be more correct.
>
> "Welcome to BioPerl, a community effort to produce Perl code which is
> useful in biology. "
> ==> "Welcome to BioPerl, a community effort to produce Perl code
> serving as a useful tool in the field of Biology."
>
>>> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already know
> what BioPerl is, ...  Dave<<
>
> Most good software front pages say, in a couple sentences, "what it is
> and what it's for", including pictures (as screenshots).

Right.

> I would bet a ton of visitors don't know what bioperl is, what it is
> used for, or how it can benefit them.  There is likely a metric for this
> (web stats): the ratio of new page visits that bounce away vs. new
> click-throughs from the front page to the download or docs section,
> i.e. a visitor found the page and didn't continue reading.  I don't
> really know all the things bioperl is good for, and I've been reading
> about it here & there for a while.
>
> I like the following from the About and I believe it fits well on a
> front page, expanding "toolkit" to "software library":
>
> "What is Bioperl? It is an open source bioinformatics software library
> used by researchers all over the world. If you're looking for a script
> built to fit your exact needs you probably won't find it in Bioperl.
> What you will find is a diverse set of Perl modules that will enable
> you to write your own script, and a community of people who are willing
> to help you."
>
> The old school definition of software library is something like:
> "useful routines which can be used by an application (& not itself an
> application)" which is basically the description above.
>
> I also like the intro from wikipedia, which I found more informative
> about bioperl, and would be good for a front page:
>
> "BioPerl [1] is a collection of Perl modules that facilitate the
> development of Perl scripts for bioinformatics applications. It has
> played an integral role in the Human Genome Project.[2]  It is an
> active open source software project supported by the Open
> Bioinformatics Foundation.  In order to take advantage of BioPerl, the
> user needs a basic understanding of the Perl programming language
> including an understanding of how to use Perl references, modules,
> objects and methods."
>
> The screenshots could also include pics of books on bioperl or
> perl+bio, that would be neat.  (Tisdall's book comes to mind here)

I tend to agree here, but Tisdall only discusses BioPerl in detail in
the second book (Mastering Perl for Bioinformatics).  I think we're
safe as long as we indicate that; I just don't want to run into a
situation like the recent issue that some users had with Gentleman's
'R for Bioinformatics' book released last year.

I don't think it was intentional, but a lot of users purchased it  
thinking it would be a BioConductor book, mainly b/c it was advertised  
on the BioConductor website.  Unfortunately it had very little to do  
with BioC (or bioinformatics, really), and the reviews of the book  
reflect that.  It's unfortunate, as I found it to be a pretty good  
book on R.

-c

> ## Jonathan Cline
> ## jcline at ieee.org
> ## Mobile: +1-805-617-0223
> ########################


From cjfields at illinois.edu  Tue Sep 22 11:53:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 10:53:13 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
In-Reply-To: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
References: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
Message-ID: <2ED641E3-F69E-4513-B261-0949FDE35EBB@illinois.edu>

And just as quickly, I'm getting back lots of reports from CPAN Testers
indicating more problems.  Some can be ignored (they appear to be caused
by the tester's local perl environment).  The following are the most
significant; it appears a hard-coded SeqFeature_SQLite.t got bundled in
somehow, so I'll drop an alpha 3 shortly.

chris

#   Failed test 'use Bio::SeqFeature::Annotated;'
#   at t/Annotation/Annotation.t line 23.
#     Tried to use 'Bio::SeqFeature::Annotated'.
#     Error:  Can't locate URI/Escape.pm in @INC (@INC contains: t/ 
lib . /Users/david/cpantesting/perl-5.10.1/.cpan/build/ 
BioPerl-1.6.0._2-QVXU9n/blib/lib /Users/david/cpantesting/ 
perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n/blib/arch /Users/david/ 
cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n /sw/lib/ 
perl5 /sw/lib/perl5/darwin /Users/david/cpantesting/perl-5.10.1/lib/ 
5.10.1/darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/ 
lib/5.10.1 /Users/david/cpantesting/perl-5.10.1/lib/site_perl/5.10.1/ 
darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/lib/ 
site_perl/5.10.1) at Bio/SeqFeature/Annotated.pm line 100.
# BEGIN failed--compilation aborted at Bio/SeqFeature/Annotated.pm  
line 100.
# Compilation failed in require at (eval 60) line 2.
# BEGIN failed--compilation aborted at (eval 60) line 2.
# Looks like you failed 1 test of 159.
t/Annotation/Annotation.t ....................
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/159 subtests
	(less 12 skipped subtests: 146 okay)


t/LocalDB/SeqFeature.t ....................... ok
DBD::SQLite::db prepare_cached failed: near "INDEXED": syntax error(1)  
at dbdimp.c line 271 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1678.

-------------------- EXCEPTION --------------------
MSG: near "INDEXED": syntax error(1) at dbdimp.c line 271
STACK Bio::DB::SeqFeature::Store::DBI::mysql::_prepare Bio/DB/ 
SeqFeature/Store/DBI/mysql.pm:1678
STACK Bio::DB::SeqFeature::Store::DBI::SQLite::_features Bio/DB/ 
SeqFeature/Store/DBI/SQLite.pm:665
STACK Bio::DB::SeqFeature::Store::get_features_by_attribute Bio/DB/ 
SeqFeature/Store.pm:961
STACK toplevel t/LocalDB/SeqFeature.t:135
-------------------------------------------
# Looks like you planned 69 tests but only ran 40.
# Looks like your test died just after 40.
t/LocalDB/SeqFeature_SQLite.t ................
Failed 29/69 subtests


On Sep 21, 2009, at 10:56 PM, Chris Fields wrote:

> Just a note that the second alpha is out and propagating its way
> around the intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This should address the bugs reported by Scott from the last  
> release.  Just a note, but I am seeing a warning popping up with 64- 
> bit perl 5.10.1 on Mac with PopGen tests (I think it's a floating  
> point addition issue).  Let me know if this is popping up elsewhere.
>
> Enjoy!
>
> chris


From jason at bioperl.org  Tue Sep 22 12:01:51 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:01:51 -0700
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
	<39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
Message-ID: <1027EFFB-18B5-446B-A5B0-9DA628EEEF08@bioperl.org>

Someone should pick up the ball.

On Sep 22, 2009, at 4:12 AM, Mark A. Jensen wrote:

> Thanks Liam-- I think the discrepancy between dnadist and the
> module is worth making a bug report for- can you do that and
> include the data (or part of it) you were using?
> Jason, is that work really underway, or should someone pick up
> that ball?
> ----- Original Message ----- From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
> To: "Jason Stajich" <jason at bioperl.org>
> Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>;  
> "Jose ." <joseguillin at hotmail.com>
> Sent: Tuesday, September 22, 2009 3:14 AM
> Subject: [Bioperl-l] dnastatistics
>
>
> So I also had no problem running the code as written by Jose (Bioperl
> 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
>
> "The routines are not well tested and do contain errors at this point.
> Work is underway to correct them, but do not expect this code to give
> you the right answer currently!"!
>
> So I'm using dnadist (as I think the documentation recommends), and it
> does produce different numbers to $stats->distance(-).
>
> I tried write_matrix from Bio::Matrix::IO - got a message saying it
> hasn't been implemented yet?
>
> And if Jose hasn't already found it, try Data::Dumper; it will change
> your life....
>
> Regards,
> Liam.
>
> On 15/09/2009, at 3:54 AM, Jason Stajich wrote:
>
>> Yeah, it seems like more of a bioperl problem -- it's possible that the
>> older code didn't recognize 'jukes-cantor', but you can try the
>> abbreviation 'jc' -- better to just upgrade tho!
>>
>> This isn't the cause of the problem, but I would also encourage use
>> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'
>> function) rather than print_matrix on the matrix itself.
>>
>> -jason
>> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>>
>>> Hi Jose--
>>> I don't get any problem with your script as written. You should   
>>> upgrade to
>>> BioPerl 1.6 and try again.
>>> The "unblessed reference" is $jcmatrix. It may be undef for some   
>>> reason.
>>> MAJ
>>> ----- Original Message ----- From: "Jose ."  
>>> <joseguillin at hotmail.com>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Monday, September 14, 2009 8:48 AM
>>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix-
>>> >print_matrix;
>>>
>>>
>>>
>>>
>>>
>>> Hello,
>>>
>>> I'm trying to use Bio::Align::DNAStatistics, but I get the   
>>> following message:
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Other modules do work, such as Bio::SimpleAlign;
>>>
>>>
>>>
>>>
>>> My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html 
>>>  , as it is as follows:
>>>
>>> use strict;
>>> use Bio::AlignIO;
>>> use Bio::Align::DNAStatistics;
>>>
>>>
>>> my $stats = Bio::Align::DNAStatistics->new();
>>>
>>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>>                         -format => 'fasta');
>>> my $aln = $alignin->next_aln;
>>>
>>> my $jcmatrix = $stats-> distance (-align => $aln,
>>>               -method => 'Jukes-Cantor');
>>>
>>> print $jcmatrix->print_matrix;
>>>
>>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>>
>>>> A
>>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>>> GTCAGCGTAGGCCTAGACGGCT
>>>
>>>> B
>>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>>> ATAAGAGTAGGTCGGGATGGCA
>>>
>>>> C
>>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>>> GTATGTGCAGGTCGAAACGAGT
>>>
>>>> D
>>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>>> CTAAGAGTAGCTCGACACGGCT
>>>
>>>
>>>
>>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>>
>>> Moreover, if I write
>>>       print $jcmatrix;
>>> instead of
>>>       print $jcmatrix->print_matrix;
>>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>>
>>> So my question is:
>>>
>>> Why do I have an unblessed reference?
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Thank you very much in advance.
>>>
>>> Jose G.
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
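
For anyone else hitting the "unblessed reference" error quoted above:
Perl raises it when a method is called on a plain reference (such as
the ARRAY(0x859f08) Jose printed) rather than on an object. A minimal
sketch, in plain Perl, of checking what you actually got back before
calling a method on it; the function name is made up for illustration
and nothing here is BioPerl-specific:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: describe what kind of value was returned, so you
# know whether calling a method on it can work at all.
sub describe_ref {
    my ($thing) = @_;
    return 'undef' unless defined $thing;
    return 'object of class ' . blessed($thing) if blessed($thing);
    return 'unblessed ' . ref($thing) . ' reference' if ref($thing);
    return 'plain scalar';
}

print describe_ref([ [ 0, 0.1 ], [ 0.1, 0 ] ]), "\n";    # a bare array ref
print describe_ref(bless({}, 'My::Matrix')), "\n";       # a made-up class
```

Only the second kind of value can have methods called on it; the first
has to be handled as a plain data structure (Data::Dumper, as Liam
suggests, will show its contents).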


From jason at bioperl.org  Tue Sep 22 12:07:14 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:07:14 -0700
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
Message-ID: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>

>>
>>
>> However, the above method does not work here. Is this for some deep
>> reason, or could the above method (based on the way SeqIO works) be
>> made to work? I'm guessing that the SearchIO object conversion is
>> simply harder to do than with SeqIO?
>
> This is something Jason could probably speak up on, but from my  
> perspective it comes down to 'why?'.  This opens up a very hard-to- 
> implement door (converting to and from, for instance, BLAST to  
> HMMER), which doesn't make sense from the end-user perspective.   
> What most users want out of those formats is getting at the data in  
> an easily accessible way, to further process them (filter, to GFF,  
> etc), or to have them summarized.  the Writer classes take care of  
> the latter.
>


> There is a very generic, all-purpose write_result in Bio::SearchIO  
> that just calls the a ResultWriter object (and dies if it isn't  
> present).  Note that this expects a ResultWriter, not a Hit/ 
> HSPWriter; it is write_result() after all. I think this kind of goes  
> against the well-established API that exists with the other  
> write_foo implementations for the IO classes, where the input/output  
> format should match, but there you have it.
>

Dan -
I'm confused about what you are trying to do or what is broken - are
you just annoyed that the API isn't the same style as Bio::SeqIO?


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From shalabh.sharma7 at gmail.com  Tue Sep 22 12:48:39 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 12:48:39 -0400
Subject: [Bioperl-l] Stockholm to fasta
Message-ID: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>

Hi All,

I am trying to convert Stockholm to FASTA format, using "sreformat".
I get a FASTA file, but I also want the header information from the
Stockholm file in my FASTA file.
Like:
# STOCKHOLM 1.0

#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
- - - - - - - - - -  - - - -
- - - - - - - - - - - -- -
- - - - - - -- - - - - -
#=GF RL   J Biol Chem 2001;276:21476-21481.
#=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
#=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
#=GF CC   base pairs with the 5' splice junction, thus defining the 5'
#=GF CC   donor site of an intron.
#=GF CC   There are significant differences in sequence and secondary
#=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
#=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
#=GF CC   human). Nevertheless, secondary structure predictions suggest
#=GF CC   that all U1 snRNAs share a 'common core' consisting of helices I,
#=GF CC   II, the proximal region of III, and IV [1].
#=GF CC   This family does not contain the larger yeast sequences.
#=GF SQ   100


X63783.1/2024-2186  UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/1394-1556  UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161      ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/596-756    UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
M29062.1/238-387    UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.

As output I just get a FASTA file with headers like
"X63783.1/2024-2186", but I want the headers to include some
information such as U1 or U1 spliceosomal RNA from the Stockholm headers.

I would really appreciate it if anyone could help me out.

Thanks
Shalabh
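
One possible approach, sketched here in plain Perl without sreformat or
BioPerl: read the #=GF ID and #=GF DE lines and append them to every
FASTA header. This is only a rough sketch under some assumptions: each
sequence line has the form "name sequence", both '.' and '-' are gap
characters to be removed, and the function name is made up for
illustration. (Bio::AlignIO also has a 'stockholm' format parser, which
may be the better route for real data.)

```perl
use strict;
use warnings;

# Hypothetical helper: convert Stockholm text to FASTA, appending the
# #=GF ID and #=GF DE values to each header and stripping gap characters.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, %seq, @order);
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/ or $line =~ m{^//};    # blank line or terminator
        if ($line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/) {
            # Join repeated tags (e.g. multi-line CC) with a space
            $gf{$1} = defined $gf{$1} ? "$gf{$1} $2" : $2;
        }
        elsif ($line =~ /^#/) {
            next;    # format line and other markup (#=GC, #=GS, #=GR)
        }
        elsif ($line =~ /^(\S+)\s+(\S+)\s*$/) {
            push @order, $1 unless exists $seq{$1};
            $seq{$1} .= $2;    # interleaved blocks are concatenated by name
        }
    }
    my $extra = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        (my $residues = $seq{$name}) =~ tr/.-//d;    # remove gap characters
        $fasta .= ">$name $extra\n$residues\n";
    }
    return $fasta;
}

# Made-up miniature input for illustration
my $stockholm = join "\n",
    '# STOCKHOLM 1.0',
    '#=GF ID   U1',
    '#=GF DE   U1 spliceosomal RNA',
    '',
    'seqA/1-4  AC.GU',
    'seqB/1-4  A-CGU',
    '//';

print stockholm_to_fasta($stockholm);
```

Each header then looks like ">seqA/1-4 U1 U1 spliceosomal RNA"; change
the qw(ID DE) list to pick other #=GF tags.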

From roy.chaudhuri at gmail.com  Tue Sep 22 12:44:47 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 17:44:47 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB87A06.4000209@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com>
Message-ID: <4AB8FEFF.6060408@gmail.com>

Hi Liam,

My mistake, it looks like the bug had already been reported and fixed, 
which means I get to go home earlier. I've marked your bug as a 
duplicate of bug 2810.

You can get the patched version by installing bioperl-live (just 
downloading the bioperl-live SeqUtils.pm and putting it in the correct 
place on your system would probably also work).

Cheers.
Roy.

Roy Chaudhuri wrote:
> Hi Liam,
> 
> Yes, that is a bug - I think it is to do with the Feature Annotation 
> rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
> wrote don't check for the presence of tags, just the coordinates of the 
> feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
> take a look when I get a chance.
> 
> Cheers.
> Roy.
> 
> Liam Elbourne wrote:
>> Hi Roy,
>>
>> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm
>> particularly interested in the locus id. Obviously the translation could
>> be problematic if the whole gene is not included after truncation, but
>> things like the note, product, protein_id would be good. I had a look at
>> the code for the method and couldn't see any obvious reason why those
>> values didn't make it across. Should I submit this as a bug, or is there
>> something I'm missing?
>>
>>
>> Regards,
>> Liam.
>>
>>
>>
>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>
>>> Hi Liam,
>>>
>>> I just discovered your message, which has not yet been replied to. 
>>> What you require has been discussed in a recent thread:
>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>
>>> Try using trunc_with_features from Bio::SeqUtils:
>>>
>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>>> Cheers.
>>> Roy.
>>>
>>> Liam Elbourne wrote:
>>>> Hi All,
>>>> Is there a method or methodology that will produce a fully fledged 
>>>> Seq  object with all the associated metadata given a start and end 
>>>>  position? To clarify, I create a sequence object from a genbank file:
>>>> ****
>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>> my $seqobj = $io->next_seq();
>>>> ****
>>>> I now want:
>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>>> previously  done this task by iterating through feature by feature 
>>>> and parsing out  what I needed, but thought there might be a more 
>>>> elegant approach...
>>>> Regards,
>>>> Liam Elbourne.
>>> -- 
>>> Dr. Roy Chaudhuri
>>> Department of Veterinary Medicine
>>> University of Cambridge, U.K.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> ______________________________
>>
>> Dr Liam Elbourne
>> Research Fellow (Bioinformatics)
>> Paulsen Laboratory
>> Macquarie University
>> Sydney
>> Australia.
>>
>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>
>>
>>
> 


From cjfields at illinois.edu  Tue Sep 22 13:12:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:12:10 -0500
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB8FEFF.6060408@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com> <4AB8FEFF.6060408@gmail.com>
Message-ID: <1F043B63-3DD1-49DD-86F3-B2FB9AD34725@illinois.edu>

That should be out in the latest alpha on CPAN as well (the final  
1.6.1 should be out this week).

chris

On Sep 22, 2009, at 11:44 AM, Roy Chaudhuri wrote:

> Hi Liam,
>
> My mistake, it looks like the bug had already been reported and  
> fixed, which means I get to go home earlier. I've marked your bug as  
> a duplicate of bug 2810.
>
> You can get the patched version by installing bioperl-live (just  
> downloading the bioperl-live SeqUtils.pm and putting it in the  
> correct place on your system would probably also work).
>
> Cheers.
> Roy.
>
> Roy Chaudhuri wrote:
>> Hi Liam,
>> Yes, that is a bug - I think it is to do with the Feature  
>> Annotation rollback from 1.6, it works fine with 1.5.2. Looks like  
>> the tests I wrote don't check for the presence of tags, just the  
>> coordinates of the feature, so this hasn't been picked up. Submit  
>> it to Bugzilla, and I'll take a look when I get a chance.
>> Cheers.
>> Roy.
>> Liam Elbourne wrote:
>>> Hi Roy,
>>>
>>> Thanks for that, works well, but there are no _gsf_tag_hash  
>>> values? I'm particularly interested in the locus id, obviously the  
>>> translation could be problematic if the whole gene is not included  
>>> after truncation, but things like the note, product, protein_id  
>>> would be good. I had a look at the code for the method and  
>>> couldn't see any obvious why those values didn't make it across.  
>>> Should I submit this as a bug, or is there something I'm missing?
>>>
>>>
>>> Regards,
>>> Liam.
>>>
>>>
>>>
>>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>>
>>>> Hi Liam,
>>>>
>>>> I just discovered your message, which has not yet been replied  
>>>> to. What you require has been discussed in a recent thread:
>>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>>
>>>> Try using trunc_with_features from Bio::SeqUtils:
>>>>
>>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300,  
>>>> 2000);
>>>> Cheers.
>>>> Roy.
>>>>
>>>> Liam Elbourne wrote:
>>>>> Hi All,
>>>>> Is there a method or methodology that will produce a fully  
>>>>> fledged Seq  object with all the associated metadata given a  
>>>>> start and end  position? To clarify, I create a sequence object  
>>>>> from a genbank file:
>>>>> ****
>>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>>> my $seqobj = $io->next_seq();
>>>>> ****
>>>>> I now want:
>>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>>> where $sub_seqobj is a Seq object (which I appreciate is an   
>>>>> 'aggregate' of objects) too. The "trunc" method only returns a   
>>>>> PrimarySeq object which lacks all the annotation etc. I've  
>>>>> previously  done this task by iterating through feature by  
>>>>> feature and parsing out  what I needed, but thought there might  
>>>>> be a more elegant approach...
>>>>> Regards,
>>>>> Liam Elbourne.
>>>> -- 
>>>> Dr. Roy Chaudhuri
>>>> Department of Veterinary Medicine
>>>> University of Cambridge, U.K.
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> ______________________________
>>>
>>> Dr Liam Elbourne
>>> Research Fellow (Bioinformatics)
>>> Paulsen Laboratory
>>> Macquarie University
>>> Sydney
>>> Australia.
>>>
>>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>>
>>>
>>>


From cjfields at illinois.edu  Tue Sep 22 13:13:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:13:53 -0500
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
Message-ID: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>

The POD for Bio::AlignIO::stockholm indicates where the various bits  
of information are stored.  Everything from the header should be in  
there in the latest bioperl; in many cases it's not ideally stored,  
but it's accessible.

You'll need to preprocess your seqs in the SimpleAlign returned  
(iterate through them and change the relevant bits like desc(),  
displayname(), seq_id, etc) and may need to do other modifications,  
but it should work.
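
For anyone who only needs the family ID and description copied onto each
FASTA header, here is a minimal plain-Perl sketch of that relabelling. It
does not go through Bio::AlignIO at all; it assumes the usual Stockholm
layout with each sequence on one "name  sequence" line, and the function
name stockholm_to_fasta is made up for the example. Gap characters are
stripped on the way out, which may or may not be what you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert Stockholm text to FASTA, copying the family-level #=GF ID and
# #=GF DE values into every FASTA description line.
# Assumes each sequence sits on one "name  sequence" line; if the file is
# interleaved, the pieces for a name are concatenated in order.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, %seq, @order);
    for my $line (split /\n/, $text) {
        if ($line =~ /^#=GF\s+(\S+)\s+(.*\S)/) {
            # Keep the first value seen for each tag (ID, DE, AC, ...).
            $gf{$1} = $2 unless exists $gf{$1};
        }
        elsif ($line =~ /^#/ or $line =~ m{^//} or $line !~ /\S/) {
            next;    # other markup, end-of-alignment marker, blank line
        }
        elsif ($line =~ /^(\S+)\s+(\S+)\s*$/) {
            push @order, $1 unless exists $seq{$1};
            $seq{$1} .= $2;
        }
    }
    my $label = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        (my $residues = $seq{$name}) =~ tr/.-//d;    # drop gap characters
        $fasta .= ">$name" . (length $label ? " $label" : '') . "\n$residues\n";
    }
    return $fasta;
}

# Usage: perl sto2fasta.pl family.sto > family.fasta
if (@ARGV) {
    local $/;    # slurp the whole input
    print stockholm_to_fasta(scalar <>);
}
```

With the U1 example this would give headers such as
">X63783.1/2024-2186 U1 U1 spliceosomal RNA". Swap the tags in the
qw(ID DE) list if you would rather have the accession in there.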

chris

On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:

> Hi All,      I am trying to convert stockholm to fasta format. I am  
> using
> "sreformat" for this purpose. I am getting a fasta file but the  
> problem is i
> want header information from stockholm in my fasta file.
> Like:
> # STOCKHOLM 1.0
>
> #=GF AC   RF00003
> #=GF ID   U1
> #=GF DE   U1 spliceosomal RNA
> - - - - - - - - - -  - - - -
> - - - - - - - - - - - -- -
> - - - - - - -- - - - - -
> #=GF RL   J Biol Chem 2001;276:21476-21481.
> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the  
> spliceosome
> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms  
> complementary
> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
> #=GF CC   donor site of an intron.
> #=GF CC   There are significant differences in sequence and secondary
> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter  
> being
> #=GF CC   much longer (568 nucleotides as compared to 164  
> nucleotides in
> #=GF CC   human). Nevertheless, secondary structure predictions  
> suggest
> #=GF CC   that all U1 snRNAs share a 'common core' consisting of  
> helices I,
> #=GF CC   II, the proximal region of III, and IV [1].
> #=GF CC   This family does not contain the larger yeast sequences.
> #=GF SQ   100
>
>
> X63783.1/2024-2186
> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/1394-1556
> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
> X58845.1/1-161
> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/596-756
> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
> M29062.1/238-387
> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>
> As a output i am just getting a fasta file with the headers like
> "X63783.1/2024-2186" but what i want is that it should include some
> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>
> I would really appreciate if anyone can help me out.
>
> Thanks
> Shalabh


From shalabh.sharma7 at gmail.com  Tue Sep 22 16:17:11 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 16:17:11 -0400
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
	<EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
Message-ID: <9fcc48c70909221317i509a45cbm19783c1210f7c69b@mail.gmail.com>

Hi Chris,           Thanks a lot it was really helpful.

Thanks
Shalabh


On Tue, Sep 22, 2009 at 1:13 PM, Chris Fields <cjfields at illinois.edu> wrote:

> The POD for Bio::AlignIO::stockholm indicates where the various bits of
> information are stored.  Everything from the header should be in there in
> the latest bioperl; in many cases it's not ideally stored, but it's
> accessible.
>
> You'll need to preprocess your seqs in the SimpleAlign returned (iterate
> through them and change the relevant bits like desc(), displayname(),
> seq_id, etc) and may need to do other modifications, but it should work.
>
> chris
>
>
> On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:
>
>  Hi All,      I am trying to convert stockholm to fasta format. I am using
>> "sreformat" for this purpose. I am getting a fasta file but the problem is
>> i
>> want header information from stockholm in my fasta file.
>> Like:
>> # STOCKHOLM 1.0
>>
>> #=GF AC   RF00003
>> #=GF ID   U1
>> #=GF DE   U1 spliceosomal RNA
>> - - - - - - - - - -  - - - -
>> - - - - - - - - - - - -- -
>> - - - - - - -- - - - - -
>> #=GF RL   J Biol Chem 2001;276:21476-21481.
>> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
>> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
>> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
>> #=GF CC   donor site of an intron.
>> #=GF CC   There are significant differences in sequence and secondary
>> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
>> #=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
>> #=GF CC   human). Nevertheless, secondary structure predictions suggest
>> #=GF CC   that all U1 snRNAs share a 'common core' consisting of helices
>> I,
>> #=GF CC   II, the proximal region of III, and IV [1].
>> #=GF CC   This family does not contain the larger yeast sequences.
>> #=GF SQ   100
>>
>>
>> X63783.1/2024-2186
>> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/1394-1556
>> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X58845.1/1-161
>> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/596-756
>> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
>> M29062.1/238-387
>> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>>
>> As a output i am just getting a fasta file with the headers like
>> "X63783.1/2024-2186" but what i want is that it should include some
>> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>>
>> I would really appreciate if anyone can help me out.
>>
>> Thanks
>> Shalabh
>>
>
>

From cjfields at illinois.edu  Tue Sep 22 16:29:28 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 15:29:28 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
Message-ID: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>

The third alpha is now out and propagating its way around the
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file
autogenerated with Build.PL tests that somehow made its way into the
last alpha release.  This is now properly cleaned up along with its
test database using './Build clean'.  BTW, very nice SQLite
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris

From dan.bolser at gmail.com  Tue Sep 22 17:33:13 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 22:33:13 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
	<CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
Message-ID: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>

2009/9/22 Jason Stajich <jason at bioperl.org>

>
>
> However, the above method does not work here. Is this for some deep
>
> reason, or could the above method (based on the way SeqIO works) be
>
> made to work? I'm guessing that the SearchIO object conversion is
>
> simply harder to do than with SeqIO?
>
>
> This is something Jason could probably speak up on, but from my perspective
> it comes down to 'why?'.  This opens up a very hard-to-implement door
> (converting to and from, for instance, BLAST to HMMER), which doesn't make
> sense from the end-user perspective.  What most users want out of those
> formats is getting at the data in an easily accessible way, to further
> process them (filter, to GFF, etc), or to have them summarized.  the Writer
> classes take care of the latter.
>
>
> There is a very generic, all-purpose write_result in Bio::SearchIO that
> just calls a ResultWriter object (and dies if it isn't present).  Note
> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
> after all. I think this kind of goes against the well-established API that
> exists with the other write_foo implementations for the IO classes, where
> the input/output format should match, but there you have it.
>
> Dan -
> I'm confused about what you are trying to do or what is broken - are you
> just annoyed that the API isn't the same style as Bio::SeqIO?
>

No, I'm not annoyed. I was just confused initially because it didn't work as
'expected', and then I was wondering why (I was just curious). I take Chris's
point that this could be a lot of work to implement for a very marginal use
case.

Very simply, what I am trying to do is this: a) read in a blasttable, b)
filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
in blasttable format.

I was stuck at step c, but I'm not saying anything is broken (just my
understanding of how to use SearchIO::Writer::HSPTableWriter).

I'll look again at Chris's suggestions to see if I can get code to just
'round trip' the blasttable format. From there I think I should be able to
do what I want.
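
In case the writer route turns out to be awkward, there is a fallback
that sidesteps SearchIO altogether: blasttable is one tab-separated line
per HSP, so filtering the lines and printing the survivors untouched
gives the round trip for free. A minimal sketch, assuming the standard
12-column layout; the e-value cutoff, the "best N by bit score" rule and
the function name filter_blasttable are just examples of mine, not
anything from BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Filter BLAST tabular output (12 tab-separated columns) and return the
# surviving lines unchanged, so the output is still valid blasttable.
# Column order assumed: query, subject, %identity, alignment length,
# mismatches, gap opens, q.start, q.end, s.start, s.end, e-value, bit score.
sub filter_blasttable {
    my ($lines, %opt) = @_;
    my $max_evalue = exists $opt{max_evalue} ? $opt{max_evalue} : 1e-5;  # example cutoff
    my $max_hsps   = exists $opt{max_hsps}   ? $opt{max_hsps}   : 1;     # kept per query

    # Group HSP lines by query, remembering the order queries first appear.
    my (%by_query, @queries);
    for my $line (@$lines) {
        next if $line =~ /^#/ or $line !~ /\S/;    # comments and blank lines
        (my $row = $line) =~ s/\r?\n\z//;
        my @f = split /\t/, $row;
        next unless @f >= 12;
        push @queries, $f[0] unless exists $by_query{ $f[0] };
        push @{ $by_query{ $f[0] } },
            { line => $row, evalue => $f[10], bits => $f[11] };
    }

    # Per query: apply the e-value cutoff, rank by bit score, keep the top few.
    my @kept;
    for my $q (@queries) {
        my @hsps = sort { $b->{bits} <=> $a->{bits} }
                   grep { $_->{evalue} <= $max_evalue } @{ $by_query{$q} };
        splice @hsps, $max_hsps if @hsps > $max_hsps;
        push @kept, map { $_->{line} } @hsps;
    }
    return @kept;
}

# Usage: perl filter_hsps.pl hits.blasttable > filtered.blasttable
if (@ARGV) {
    print "$_\n" for filter_blasttable([<>]);
}
```

Because the lines are never re-formatted, nothing is lost between input
and output; the price is that none of the SearchIO object methods are
available inside the filter.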


Cheers,
Dan.


--
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>

From maj at fortinbras.us  Tue Sep 22 18:32:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 18:32:15 -0400
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com><B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu><CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
	<2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
Message-ID: <9C7D7F02BFBD4F2AA16E151B52125C93@NewLife>

Apropos this, here's something I ran across the other day:

"Just remember when using BioPerl that it was never designed
to 'round trip' your favorite formats. Rather, it was designed to
store sequence data from many widely different formats into a
common object framework and make that framework available
to other sequence manipulation tasks in a programmatic fashion."

from HOWTO:SeqIO#Caveats

Food for thought, anyway--- MAJ

----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 22, 2009 5:33 PM
Subject: Re: [Bioperl-l] Converting between allowed SearchIO formats?


> 2009/9/22 Jason Stajich <jason at bioperl.org>
>
>>
>>
>> However, the above method does not work here. Is this for some deep
>>
>> reason, or could the above method (based on the way SeqIO works) be
>>
>> made to work? I'm guessing that the SearchIO object conversion is
>>
>> simply harder to do than with SeqIO?
>>
>>
>> This is something Jason could probably speak up on, but from my perspective
>> it comes down to 'why?'.  This opens up a very hard-to-implement door
>> (converting to and from, for instance, BLAST to HMMER), which doesn't make
>> sense from the end-user perspective.  What most users want out of those
>> formats is getting at the data in an easily accessible way, to further
>> process them (filter, to GFF, etc), or to have them summarized.  the Writer
>> classes take care of the latter.
>>
>>
>> There is a very generic, all-purpose write_result in Bio::SearchIO that
>> just calls a ResultWriter object (and dies if it isn't present).  Note
>> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
>> after all. I think this kind of goes against the well-established API that
>> exists with the other write_foo implementations for the IO classes, where
>> the input/output format should match, but there you have it.
>>
>> Dan -
>> I'm confused about what you are trying to do or what is broken - are you
>> just annoyed that the API isn't the same style as Bio::SeqIO?
>>
>
> No, I'm not annoyed. I was just confused initially because it didn't work as
> 'expected', and then I was wondering why (I was just curious). I take Chris's
> point that this could be a lot of work to implement for a very marginal use
> case.
>
> Very simply, what I am trying to do is this: a) read in a blasttable, b)
> filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
> in blasttable format.
>
> I was stuck at step c, but I'm not saying anything is broken (just my
> understanding of how to use SearchIO::Writer::HSPTableWriter).
>
> I'll look again at Chris's suggestions to see if I can get code to just
> 'round trip' the blasttable format. From there I think I should be able to
> do what I want.
>
>
> Cheers,
> Dan.
>
>
> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>
> 


From clements at nescent.org  Tue Sep 22 19:15:50 2009
From: clements at nescent.org (Dave Clements)
Date: Tue, 22 Sep 2009 16:15:50 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
Message-ID: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>

Hello all,

For open source project wikis, it's nice if the home page
1) Lets new users know that this is an active project with a lot going on.
2) Encourages people to contribute to the project and the wiki.

Both the BioPython.org and GMOD.org sites include a list of links to news
items on the home page.  This is done in both sites with a MediaWiki
extension.

The GMOD.org home page also includes a list of new and recently updated wiki
pages.  This achieves both goals, by showing what's happening, and by giving
people a slight reward for updating the wiki by placing a link to the page
on the wiki.  This is also done with MediaWiki extensions.

My 2¢,

Dave C

-- 
GMOD News: http://gmod.org/wiki/GMOD_News


From David.Messina at sbc.su.se  Wed Sep 23 07:37:02 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 23 Sep 2009 13:37:02 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
Message-ID: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>

I think either Chris' version or Mark's earlier, slightly more verbose
version would work well and fulfill the goals of reducing clutter and making
it easier to find what you're looking for for visitors new and old.

I do like the idea of a newsfeed, which summarizes what's been going on
lately and lets new users know the project is active. Embedding the BioPerl
twitter feed would be an easy solution.


> The GMOD.org home page also includes a list of new and recently updated
> wiki pages.  This achieves both goals, by showing what's happening, and by
> giving people a slight reward for updating the wiki by placing a link to the
> page on the wiki.
>

I like this idea too.


Dave

From maj at fortinbras.us  Wed Sep 23 07:47:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 07:47:24 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <0AD07A69C66B4B5BB8599BA5483145D7@NewLife>

Johnathan, Dave and Dave -- thanks for these helpful comments-
I'm beginning to think there is a happy medium for this medium.
MAJ
  ----- Original Message ----- 
  From: Dave Messina 
  To: Dave Clements 
  Cc: bioperl-l at lists.open-bio.org ; Mark A. Jensen ; Chris Fields 
  Sent: Wednesday, September 23, 2009 7:37 AM
  Subject: Re: [Bioperl-l] a Main Page proposal


  I think either Chris' version or Mark's earlier, slightly more verbose version would work well and fulfill the goals of reducing clutter and making it easier to find what you're looking for for visitors new and old.


  I do like the idea of a newsfeed, which summarizes what's been going on lately and lets new users know the project is active. Embedding the BioPerl twitter feed would be an easy solution.


    The GMOD.org home page also includes a list of new and recently updated wiki pages.  This achieves both goals, by showing what's happening, and by giving people a slight reward for updating the wiki by placing a link to the page on the wiki.


  I like this idea too.


  Dave 

From biopython at maubp.freeserve.co.uk  Wed Sep 23 08:12:56 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Sep 2009 13:12:56 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>

On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> I think either Chris' version or Mark's earlier, slightly more verbose
> version would work well and fulfill the goals of reducing clutter and making
> it easier to find what you're looking for for visitors new and old.
>
> I do like the idea of a newsfeed, which summarizes what's been going on
> lately and lets new users know the project is active. Embedding the BioPerl
> twitter feed would be an easy solution.

Embedding your news feed would be just as easy:

http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom

Which (news server vs twitter feed) is preferable is down to you guys,
although for 2009 at least there has been more activity on twitter.
I'm not sure if you have the news posts re-tweeted or not (the last
news server post was back in Feb), but Biopython and the OBF
twitter accounts are doing this via twitterfeed.

Peter

From maj at fortinbras.us  Wed Sep 23 08:51:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 08:51:15 -0400
Subject: [Bioperl-l] Protein Sequence QSARs
In-Reply-To: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
References: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
Message-ID: <3B9AACAB654F4F4DBB6CE00A9B26FBF6@NewLife>

Hi Brett--
I doubt if anything this specialized exists in BioPerl.
I'd say go for it, but R may be better suited for the calculations you
want to do. For dealing with matrices, you may want to check out
the Bio::Matrix namespace.
cheers Mark
----- Original Message ----- 
From: "Brett Bowman" <bnbowman at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 07, 2009 4:17 AM
Subject: [Bioperl-l] Protein Sequence QSARs


I've been working on a script for my personal edification for annotating
protein sequence for QSARs, as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially converting a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculating the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full blown module, but I wanted to ask if A) it had been done before and I
had just missed it, and B) whether anyone other than me would find such a
module useful.
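
To make the calculation concrete, here is a minimal sketch of the lagged
auto and cross terms in plain Perl. It assumes the substitution step has
already produced the 3-by-N numeric matrix, and it uses the simple
"sum of lagged products divided by N-L" form; whether the paper
normalises differently is not something I have checked, and the
function names acc and acc_vector are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $z is a reference to a descriptor matrix: one row per descriptor, one
# column per residue (3 rows x N columns in the set-up described above).
# Returns the mean lagged product between descriptor rows $j and $k:
#   acc(j, k, L) = sum over i = 1 .. N-L of z[j][i] * z[k][i+L], / (N - L)
# With $j == $k this is the auto term, otherwise the cross term.
sub acc {
    my ($z, $j, $k, $lag) = @_;
    my $n = scalar @{ $z->[$j] };
    die "lag must be between 1 and N-1\n" unless $lag >= 1 and $lag < $n;
    my $sum = 0;
    for my $i (0 .. $n - $lag - 1) {
        $sum += $z->[$j][$i] * $z->[$k][ $i + $lag ];
    }
    return $sum / ($n - $lag);
}

# All auto and cross terms for lags 1 .. $max_lag, flattened into one list.
# The list length depends only on the number of rows and on $max_lag, not
# on the sequence length.
sub acc_vector {
    my ($z, $max_lag) = @_;
    my $rows = scalar @$z;
    my @v;
    for my $lag (1 .. $max_lag) {
        for my $j (0 .. $rows - 1) {
            for my $k (0 .. $rows - 1) {
                push @v, acc($z, $j, $k, $lag);
            }
        }
    }
    return @v;
}
```

For three descriptors and a maximum lag of L this yields 9 x L numbers
per protein, so sequences of different lengths end up as vectors of the
same size.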

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics



From maj at fortinbras.us  Wed Sep 23 09:04:48 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 09:04:48 -0400
Subject: [Bioperl-l] Fw:  problem parsing msf file
Message-ID: <4851B51372DE4761B8CC26D685B57344@NewLife>

neglected the list
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Paola Bisignano" <paola.bisignano at gmail.com>
Sent: Wednesday, September 23, 2009 9:04 AM
Subject: Re: [Bioperl-l] problem parsing msf file


> Hi Paola--
> I think you need column_from_residue_number() off the SimpleAlign object,
> and location_from_column off the LocatableSeq object. For your example, 
> try
> 
> use Bio::AlignIO;
> 
> my $alnio = Bio::AlignIO->new( -file => "my.msf", -format => "msf" );
> my $aln = $alnio->next_aln;
> 
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
> 
> my $col = $aln->column_from_residue_number( $s1->id, 28);
> my $s2coord = $s2->location_from_column( $col - 1);
> 
> Now, $s2coord should equal 4 (the coordinate of the R before the I
> that aligns with the V in sequence 1).
> MAJ
> 
> 
> ----- Original Message ----- 
> From: "Paola Bisignano" <paola.bisignano at gmail.com>
> To: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 04, 2009 8:28 AM
> Subject: [Bioperl-l] problem parsing msf file
> 
> 
>>I have a problem with the parsing of msf file...I can't find the exact
>> object of Bio::SimpleAlign for my case...
>> I have to identify residues (from a list) in aligned sequences...but
>> when I parse the alignment from fasta file, I save as msf file, where
>> I have to identify my residue (from the list, numbering as the pdb
>> file) and the residue aligned in the aligned sequences...
>> 
>> this is a piece of the file...
>> 
>> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>> 
>> Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>> Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>> 
>> //
>> 
>> 
>>                      1                                                   50
>> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
>> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>> 
>> 
>>                      51                                                 100
>> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
>> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>> 
>> 
>>                      101                                                150
>> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
>> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>> 
>> 
>>                      151                                                200
>> Sequence/23-178       QQPDML
>> 2zhz:A/1-148          GGADVL
>> 
>> for example in this I have to identify the residue that is in front of
>> Val 28 (that is in Sequence) in 2zhz:A (that, counting manually, is Ile
>> 5)....
>> Tyr4-> has no residue in front of it because the alignment starts from
>> N23 of Sequence...
>> how can I find a way to enter the residue of my sequence, and extract
>> the residue from the other????
>> 
>> 
>> I wish you all well, dear friends..and I'm actually in trouble with this..
>> Thanks for suggestions
>> 
>> Paola
>> 
>>
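
For readers who want to see what "walking" the alignment amounts to,
here is a minimal plain-Perl sketch of the two lookups: residue number
to alignment column in one row, then column back to residue in the other
row. It is only an illustration of the idea, not BioPerl's own code, and
the helper names column_of_residue and residue_at_column are made up.
Each row is given as its gapped string plus the number of its first
residue (23 for Sequence/23-178, 1 for 2zhz:A/1-148).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Alignment column (1-based) that holds residue number $resnum of a gapped
# row whose first residue is numbered $start. Returns undef (empty list in
# list context) if the residue number does not occur in this row.
sub column_of_residue {
    my ($gapped, $start, $resnum) = @_;
    my $current = $start - 1;
    my $col     = 0;
    for my $char (split //, $gapped) {
        $col++;
        next if $char =~ /[-.~]/;    # gap characters do not advance numbering
        $current++;
        return $col if $current == $resnum;
    }
    return;
}

# Residue letter and residue number found at alignment column $col of a
# gapped row, or an empty list if that column is a gap or out of range.
sub residue_at_column {
    my ($gapped, $start, $col) = @_;
    return if $col < 1 or $col > length $gapped;
    my $char = substr $gapped, $col - 1, 1;
    return if $char =~ /[-.~]/;
    my $prefix = substr $gapped, 0, $col;
    my $gaps   = ($prefix =~ tr/-.~//);    # gaps up to and including $col
    return ($char, $start + $col - $gaps - 1);
}
```

Taking the first block of the MSF file above and the numbering given in
its header (N23, D24, P25, R26, V27), column_of_residue for residue 27
of Sequence gives column 5, and residue_at_column on 2zhz:A for column 5
gives Ile 5. A column where 2zhz:A has a gap, such as column 22, returns
nothing, which is the "no residue in front of it" case.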

From cjfields at illinois.edu  Wed Sep 23 10:41:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 09:41:14 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
Message-ID: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>

On Sep 23, 2009, at 7:12 AM, Peter wrote:

> On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se 
> > wrote:
>> I think either Chris' version or Mark's earlier, slightly more  
>> verbose
>> version would work well and fulfill the goals of reducing clutter  
>> and making
>> it easier to find what you're looking for for visitors new and old.
>>
>> I do like the idea of a newsfeed, which summarizes what's been  
>> going on
>> lately and lets new users know the project is active. Embedding  
>> the BioPerl
>> twitter feed would be an easy solution.
>
> Embedding your news feed would be just as easy:
>
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom
>
> Which (news server vs twitter feed) is preferable is down to you guys,
> although for 2009 at least there has been more activity on twitter.
> I'm not sure if you have the news posts re-tweeted or not (the last
> news server post was back in Feb), but Biopython and the OBF
> twitter accounts are doing this via twitterfeed.
>
> Peter

Not to add yet more to the list, but I also think a concise list of  
projects using (or 'powered by') bioperl should be front-and-center;  
not a lot of users know when/where bioperl is used.  This applies to  
the other bio* as well, particularly biopython (seeing it popping up  
more and more).

For an example, see the biomart homepage:

http://www.biomart.org/

chris

From adlai at refenestration.com  Wed Sep 23 10:38:32 2009
From: adlai at refenestration.com (adlai burman)
Date: Wed, 23 Sep 2009 16:38:32 +0200
Subject: [Bioperl-l] Newbie: Format GenBank
Message-ID: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>

I have finally got past two major hurdles (for me) only to get stumped:
1. I have written a perl script that can take a genbank formatted text
file as a filehandle and do all sorts of nifty (for me) things with it.
2. I have gotten my BioPerl installation working on a web hosting
service so my advisor can use this through a browser.

BUT the code I have to fetch a GB record can only print it as a single
HTML line, and what I need is to assign the retrieved file to a
scalar variable. I am going blind trying to figure out how to access
(not write) the gb file from a SeqIO object and assign it to a
variable.

Here's an example of the code I have going on the server:

#!/usr/bin/perl
print "Content-type: text/html\n\n";
use Bio::SeqIO;
use Bio::DB::GenBank;

$genBank = new Bio::DB::GenBank;  # This object knows how to talk to GenBank

my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession

my $seqOut = new Bio::SeqIO(-format => 'genbank');

$seqOut->write_seq($seq);


exit;

where 'DQ897681' will be replaced by a CGI post.

I know that write_seq is not what I need, and I assume that this is a
simple problem but can anyone tell me how to assign the retrieved gb
file to a scalar?

Thanks,
Adlai

From joseguillin at hotmail.com  Tue Sep 22 10:39:52 2009
From: joseguillin at hotmail.com (Jose .)
Date: Tue, 22 Sep 2009 15:39:52 +0100
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <BLU104-W475752FF9D5EADD0269E7A0DC0@phx.gbl>


Hi Liam,
I've tried analyzing the same alignment with both programs (DNAStatistics and dnadist), using the same analysis method (Jukes-Cantor), and I got pretty much the same results:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
my $stats = Bio::Align::DNAStatistics->new();
my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                         -format => 'fasta');
my $aln = $alignin->next_aln;
my $jcmatrix = $stats-> distance (-align => $aln,
               -method => 'Jukes-Cantor');
print $jcmatrix->print_matrix;
RESULT:
A              0.00000  0.40900  0.41834  0.38044
B              0.40900  0.00000  0.41358  0.37240
C              0.41834  0.41358  0.00000  0.37809
D              0.38044  0.37240  0.37809  0.00000

I used the web-based dnadist ( http://mobyle.pasteur.fr/cgi-bin/portal.py?form=dnadist ), which is mentioned in the CPAN dnadist documentation ( http://search.cpan.org/~birney/bioperl-run-1.4/Bio/Tools/Run/PiseApplication/dnadist.pm ), setting Jukes-Cantor as Distance (D), and these are the results:

    4
A          0.000000 0.408996 0.418335 0.380436
B          0.408996 0.000000 0.413575 0.372400
C          0.418335 0.413575 0.000000 0.378086
D          0.380436 0.372400 0.378086 0.000000

The difference is due to rounding. Could it be that your analyses were run with two different methods by default? (I think dnadist uses F84 instead of Jukes-Cantor by default.)

Using F84 instead of Jukes-Cantor in dnadist gives:
    4
A          0.000000 0.470013 0.479477 0.435071
B          0.470013 0.000000 0.468730 0.417669
C          0.479477 0.468730 0.000000 0.421582
D          0.435071 0.417669 0.421582 0.000000

On the other hand, the DNAStatistics documentation offers the possibility of using F84, but it's not yet implemented:

MSG: Abstract method "Bio::Align::DNAStatistics::D_F84" is not implemented by package Bio::Align::DNAStatistics.
This is not your fault - author of Bio::Align::DNAStatistics should be blamed!


So, I think Jukes-Cantor works the same in Bio::Align::DNAStatistics and web-based dnadist; but other methods maybe not.
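
For anyone comparing the matrices above by hand, here is a minimal sketch of what a Jukes-Cantor distance boils down to, in plain Perl with no BioPerl calls. The sub name jukes_cantor is made up, and skipping columns that contain a gap or a non-ACGT character is an assumption of this sketch; Bio::Align::DNAStatistics and dnadist may treat such columns differently, so small differences from their output would not be surprising.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned DNA sequences:
#   d = -(3/4) * ln(1 - (4/3) * p)
# where p is the fraction of compared columns that differ.
# Assumption: columns where either row holds a gap or a non-ACGT
# character are skipped entirely.
sub jukes_cantor {
    my ($s1, $s2) = map { uc $_ } @_;
    die "sequences differ in length\n" unless length($s1) == length($s2);

    my ($compared, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my ($x, $y) = (substr($s1, $i, 1), substr($s2, $i, 1));
        next unless $x =~ /^[ACGT]$/ && $y =~ /^[ACGT]$/;
        $compared++;
        $diffs++ if $x ne $y;
    }
    return if $compared == 0;      # nothing to compare
    my $p = $diffs / $compared;
    return if $p >= 0.75;          # the formula is undefined here
    return -0.75 * log(1 - 4 * $p / 3);
}

# One difference in four compared columns gives p = 0.25.
printf "%.5f\n", jukes_cantor('AAAA', 'AAAT');
```

The function returns nothing (undef in scalar context) when no column can be compared or when the sequences are too divergent for the correction to apply.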
I want to thank you for letting me know about Data::Dumper. I've read the documentation and it seems very handy; I think it could help me sooner or later. I'll try it out!!

As I'm using DNAStatistics for a project, please let me know if you find what is wrong, or if I can help you further somehow.
Regards,
Jose G.


Subject: dnastatistics
From: lelbourn at science.mq.edu.au
Date: Tue, 22 Sep 2009 17:14:44 +1000
CC: maj at fortinbras.us; bioperl-l at bioperl.org; joseguillin at hotmail.com
To: jason at bioperl.org


So I also had no problem running the code as written by Jose (Bioperl 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
"The routines are not well tested and do contain errors at this point. Work is underway to correct them, but do not expect this code to give you the right answer currently!"!
So I'm using dnadist (as I think the documentation recommends), and it does produce different numbers to $stats->distance(-).
I tried write_matrix from Bio::Matrix::IO - got a message saying it hasn't been implemented yet?
And if Jose hasn't already found it, try Data::Dumper; it will change your life....
Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

Yeah it seems like more of a bioperl problem -- possible that the older code didn't recognize 'jukes-cantor' but you can try the abbreviation 'jc' -- better to just upgrade tho!

This isn't the cause of the problem but I would also encourage use of Bio::Matrix::IO for printing the matrix (use the 'write_matrix' function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ
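
A small diagnostic along these lines can show what a method actually handed back before anything is called on it. This is only a sketch in core Perl: describe_value is a made-up helper name, and the array reference below is a stand-in for a return value, not real DNAStatistics output. It would report the ARRAY(0x859f08) seen in the original report as an unblessed ARRAY reference, which is exactly the kind of value that triggers the "unblessed reference" error.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Describe what a value really is before calling methods on it.
sub describe_value {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return 'unblessed ' . ref($value) . ' reference' if ref($value);
    return 'plain scalar';
}

# Stand-in for a return value that is a plain array ref (hypothetical data).
my $result = [ [ 0, 0.409 ], [ 0.409, 0 ] ];
print describe_value($result), "\n";   # prints: unblessed ARRAY reference
```

Only when the description says "object of class ..." is it safe to call something like print_matrix on the value.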
----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Monday, September 14, 2009 8:48 AM
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign;


My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, as it is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                          -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
        print $jcmatrix;
instead of
        print $jcmatrix->print_matrix;
I get the memory reference, as normal===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l



--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org



From A.J.Pemberton at bham.ac.uk  Tue Sep 22 13:06:04 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Tue, 22 Sep 2009 18:06:04 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>

Folks,

I am experiencing problems installing bioperl-db. I followed the instructions on the website, both installing via CPAN and downloading the source tarball, and get the same error either way. I think I have missing prerequisites; the first error I get is:

Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/local/BioPerl-db-1.6.0/blib/lib 
/usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 
/usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 
/usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.

Can anyone help?

Regards,

Tony P.


**************************************************************
Mr. A. Pemberton			Tel:+44 121 414 3388
School of Biosciences,			Fax:+44 121 414 5925
The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
Birmingham B15 2TT U.K.
**************************************************************


From joseguillin at hotmail.com  Wed Sep 23 11:08:04 2009
From: joseguillin at hotmail.com (Jose .)
Date: Wed, 23 Sep 2009 16:08:04 +0100
Subject: [Bioperl-l] Bio::Matrix::IO
Message-ID: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma missing:
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also on the CPAN website: http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And on the BioPerl website: http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl beginners (like me) or absent-minded advanced BioPerl users who rely on the SYNOPSIS code.
Thank you! :)

From maj at fortinbras.us  Wed Sep 23 11:36:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:36:59 -0400
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
Message-ID: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>

Hi Tony - yes, missing prereqs are the issue with this message.
The brute force approach would be to install each of these
as they come up; you can do

$ cpan
cpan> install Array::Compare

etc., then attempt the bioperl-db install again; lather, rinse, repeat.
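
To cut down on the number of rounds, something like the following could be run first to see which modules are missing in one go. It is a sketch in core Perl only; missing_modules is a made-up helper, and apart from Array::Compare (named in the error above) the list of modules to check is for you to fill in from the bioperl-db prerequisites.

```perl
use strict;
use warnings;

# Return the names from the argument list that cannot be loaded
# with the current @INC.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # Array::Compare -> Array/Compare.pm
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# Array::Compare comes from the error message; add further names as needed.
print "missing: $_\n" for missing_modules('Array::Compare');
```

Each name it prints can then be installed from the cpan shell as shown above.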
MAJ
----- Original Message ----- 
From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 22, 2009 1:06 PM
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)


> Folks,
>
> I am experiencing problems installing bioperl-db. I followed the instructions 
> on the website both installing via CPAN and downloading the source tarball. 
> Get the same error. I think I have missing prerequistes, the first error I get 
> is:
>
> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t 
> /usr/local/BioPerl-db-1.6.0/blib/lib
> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 
> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/5.8.5 
> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/site_perl/5.8.5
> /usr/lib/perl5/site_perl 
> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl/5.8.5
> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>
> Can anyone help?
>
> Regards,
>
> Tony P.
>
>
> **************************************************************
> Mr. A. Pemberton Tel:+44 121 414 3388
> School of Biosciences, Fax:+44 121 414 5925
> The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
> Birmingham B15 2TT U.K.
> **************************************************************
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Wed Sep 23 11:46:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:46:03 -0400
Subject: [Bioperl-l] Bio::Matrix::IO
In-Reply-To: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
References: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
Message-ID: <E37AFAC689C84477817EFF38511B5709@NewLife>

thanks Jose - fixed it
MAJ
----- Original Message ----- 
From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Wednesday, September 23, 2009 11:08 AM
Subject: [Bioperl-l] Bio::Matrix::IO


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma 
missing,
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma 
missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also in the CPAN 
web:http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And the BioPerl 
web:http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl begginers (like me) or absentminded BioPerl advanced 
who rely on the SYNOPSIS code.
Thank you! :)
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Wed Sep 23 12:27:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 23 Sep 2009 17:27:26 +0100
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <4ABA4C6E.60609@gmail.com>

Hi Adlai,

In Perl you can open a string as if it was a file:

my $string;
open my $fh, '>', \$string or die $!;
my $seqOut=Bio::SeqIO->new(-fh=>$fh, -format=>'genbank');

$seqOut->write_seq($seq) should now write to the string.
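
To see the in-memory filehandle trick on its own, here is a sketch that uses only core Perl so it runs without BioPerl or a network connection. capture_to_string is a made-up helper name, and the printed text is a stand-in for whatever write_seq would emit; with BioPerl you would pass the handle to Bio::SeqIO->new(-fh => $fh, ...) as in the lines above.

```perl
use strict;
use warnings;

# Run a code block that prints to a filehandle and return everything it
# printed as one string, using an in-memory filehandle.
sub capture_to_string {
    my ($writer) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    $writer->($fh);
    close $fh or die "cannot close in-memory handle: $!";
    return $buffer;
}

my $text = capture_to_string(sub {
    my ($fh) = @_;
    # Stand-in for: Bio::SeqIO->new(-fh => $fh, -format => 'genbank')->write_seq($seq)
    print {$fh} "LOCUS       DEMO\n//\n";
});
print length($text), " characters captured\n";
```

Closing the handle before reading the string makes sure everything buffered has been written to it.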

However, are you sure this is your problem? Printing to STDOUT (which is 
what SeqIO does if you don't specify a file) should work fine with a CGI 
script. Your sequence is being displayed as one line because HTML 
ignores newline characters, but you can get around that by using a <pre> 
tag to specify pre-formatted text:

my $seqOut = new Bio::SeqIO(-format => 'genbank');
print "<pre>\n";
$seqOut->write_seq($seq);
print "</pre>\n";

Hope this helps.
Roy.

adlai burman wrote:
> I have finally got past two major hurdles (for me) only to get stumped:
> 1. I have written a perl script that can take a genbank formated text  
> file as a filehandle and do all sorts of nifty (for me) things with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
> 
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
> 
> Here's an example of the code I have going on the server:
> 
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
> 
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
> 
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
> 
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
> 
> $seqOut->write_seq($seq);
> 
> 
> exit;
> 
> where 'DQ897861' will be replaced by a CGI post.
> 
> I know that write_seq is not what I need, and I assume that this is a  
> simple problem but can anyone tell me how to assign the retrieved gb  
> file to a scaler?
> 
> Thanks,
> Adlai
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 13:47:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:51 -0500
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <16121E7E-7619-4F02-82CC-20C6F5F6B230@illinois.edu>

On Sep 23, 2009, at 9:38 AM, adlai burman wrote:

> I have finally got past two major hurdles (for me) only to get  
> stumped:
> 1. I have written a perl script that can take a genbank formated  
> text file as a filehandle and do all sorts of nifty (for me) things  
> with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
>
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
>
> Here's an example of the code I have going on the server:
>
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
>
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
>
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
>
> $seqOut->write_seq($seq);
>
> exit;
>
> where 'DQ897861' will be replaced by a CGI post.
>
> I know that write_seq is not what I need, and I assume that this is  
> a simple problem but can anyone tell me how to assign the retrieved  
> gb file to a scaler?
>
> Thanks,
> Adlai

Actually, there are two ways you can do this, one involving write_seq.

(1) The first is to just grab the raw data using Bio::DB::EUtilities:

use Bio::DB::EUtilities;

my $eutil = Bio::DB::EUtilities->new(-eutil     => 'efetch',
                                      -db        => 'nuccore',
                                      -id        => 'DQ897681',
                                      -rettype   => 'gb');

my $var = $eutil->get_Response->content;

(2) Use IO::String (see the SeqIO HOWTO), or Roy's example code.  That  
would 'filter' everything through SeqIO via next_seq/write_seq, so the  
output is what BioPerl spits out and may not be exactly the same.

chris

From cjfields at illinois.edu  Wed Sep 23 13:47:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:56 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
Message-ID: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>

Appears Array::Compare is used for Test::Warn, so it isn't a true  
requirement (probably a test_requires or somesuch).

chris

On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:

> hi Tony- missing prereqs are the issue with this message,yes-
> the brute force approach would be to install each of these
> as they come up; you can do
>
> $ cpan
> cpan> install Array::Compare
>
> etc., then attempt the bioperl-db install again; lather, rinse,  
> repeat.
> MAJ
> ----- Original Message ----- From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk 
> >
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 22, 2009 1:06 PM
> Subject: [Bioperl-l] Problems installing latest stable bioperl-db  
> (1.6)
>
>
>> Folks,
>>
>> I am experiencing problems installing bioperl-db. I followed the  
>> instructions on the website both installing via CPAN and  
>> downloading the source tarball. Get the same error. I think I have  
>> missing prerequistes, the first error I get is:
>>
>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/ 
>> local/BioPerl-db-1.6.0/blib/lib
>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 / 
>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux- 
>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64- 
>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/ 
>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>
>> Can anyone help?
>>
>> Regards,
>>
>> Tony P.
>>
>>
>> **************************************************************
>> Mr. A. Pemberton Tel:+44 121 414 3388
>> School of Biosciences, Fax:+44 121 414 5925
>> The University of Birmingham                     
>> Email:a.j.pemberton at bham.ac.uk
>> Birmingham B15 2TT U.K.
>> **************************************************************
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 16:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 15:58:37 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
Message-ID: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating it's way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made it's way into the
> last alpha release.  This is now properly cleaned up along with it's
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 16:52:09 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 16:52:09 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>

Hi Chris,

We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot regressions and noticed a small problem.  The fasta validation check for '>' in SeqIO::fasta (line 127) throws when used with Index::Fasta on Windows because the position after '>' is being indexed.  It looks like you already fixed the same problem for Linux (comment in line 190 of Index::Fasta).  Do you want me to put this into bugzilla?  Let me know if you have any questions.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Tuesday, September 22, 2009 1:29 PM
To: BioPerl List
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

The third alpha is now out and propagating its way around the
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file
autogenerated with Build.PL tests that somehow made its way into the
last alpha release.  This is now properly cleaned up along with its
test database using './Build clean'.  BTW, very nice SQLite
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 18:40:10 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 18:40:10 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>

Hi Chris,

ActivePerl.  I'll open a bug.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: Chris Fields [mailto:cjfields at illinois.edu] 
Sent: Wednesday, September 23, 2009 1:59 PM
To: Kristine Briedis
Cc: BioPerl List
Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating its way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made its way into the
> last alpha release.  This is now properly cleaned up along with its
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris


From cjfields at illinois.edu  Wed Sep 23 18:49:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 17:49:45 -0500
Subject: [Bioperl-l] BioPerl.pm and 1.6.1
References: <1253727169.18486.1336281841@webmail.messagingengine.com>
Message-ID: <1AF393BC-2352-4ADA-A4E3-3EF13B99CAE8@illinois.edu>

All,

I've recently noticed that CPAN is not grabbing the correct
descriptive information from Build.PL.  The current description is
coming from Bio::LiveSeq::IO::BioPerl, which is the first module found
whose name matches the distribution name 'BioPerl':

http://search.cpan.org/search?query=bioperl&mode=dist

Therefore we need something that acts as the description and main page  
for the distributions.  We have a bioperl.pod already, just need to  
update it and add it to trunk, and maybe release another alpha with it  
included to make sure it's working.  I also want to fix the recent  
Windows issue reported by Kristine.

Therefore, I will be adding this for core and the other
distributions per Curtis Jewell's suggestion (below).  Please let me  
know if there are any disagreements with this; I'll probably push  
another alpha out with this in the next few days (also hopefully  
containing the bug fix mentioned above).
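
A minimal sketch of the kind of placeholder module Curtis describes in
the forwarded message: it declares a package and a version and otherwise
only carries Pod. The version string, the abstract line and the
description text here are assumptions for illustration, not the final
contents.

```perl
package BioPerl;

use strict;
use warnings;

# Placeholder value (an assumption); it would track the real release number.
our $VERSION = '1.006001';

=head1 NAME

BioPerl - Perl modules for biology

=head1 DESCRIPTION

Placeholder module for the BioPerl distribution. It holds the
distribution-level documentation and contains no code of its own.

=cut

1;
```

The idea is that a module named after the distribution gives the indexer
an obvious NAME section to take the description from, and gives
'cpan BioPerl' something to resolve.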

chris

Begin forwarded message:

> From: "Curtis Jewell" <lists.perl.module-authors at csjewell.fastmail.us>
> Date: September 23, 2009 12:32:49 PM CDT
> To: "Chris Fields" <cjfields at illinois.edu>
> Subject: Re: distribution description
>
> Chris, I'd make it a BioPerl.pm that just declares a package and  
> version
> and does nothing else other than being a holder for Pod - because the
> first thing I wanted to do when I heard about it and wanted to check
> whether it worked in Strawberry is to do 'cpan BioPerl', which of
> course, blows up.
>
> --Curtis
>
> On Tue, 22 Sep 2009 22:23 -0500, "Chris Fields"  
> <cjfields at illinois.edu>
> wrote:
>> I've noticed in the last number of CPAN releases of BioPerl that the
>> description for the distribution is being pulled from one of our
>> modules (Bio::LiveSeq::IO::BioPerl).  I'm guessing this is b/c it's
>> the first match to the distribution name.
>>
>> Is there any way to make sure the description is pulled from the
>> abstract?  We're using a subclass of Module::Build and have defined
>> dist_abstract (I'm thinking of adding a BioPerl.pod to the root
>> directory just to catch this).
>>
>> chris
> --
> Curtis Jewell
> swordsman at csjewell.fastmail.us
>
> %DCL-E-MEM-BAD, bad memory
> -VMS-F-PDGERS, pudding between the ears
>
> [I use PC-Alpine, which deliberately does not display colors and  
> pictures in HTML mail]
>


From cjfields at illinois.edu  Wed Sep 23 19:00:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 18:00:55 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
Message-ID: <D704BD1B-C44B-4AB5-9C14-9F4F63A46FEE@illinois.edu>

Kristine,

I have been planning on installing a temp WinXP VM using VirtualBox,  
so this'll give me an excuse to set that up ;>

chris

On Sep 23, 2009, at 5:40 PM, Kristine Briedis wrote:

> Hi Chris,
>
> ActivePerl.  I'll open a bug.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: Wednesday, September 23, 2009 1:59 PM
> To: Kristine Briedis
> Cc: BioPerl List
> Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> Yes, that would be good.  I don't have immediate access to anything
> running WinXP/vista/7 but I can probably look into this sometime
> tomorrow or Monday.
>
> Just to make sure, is this with ActivePerl or Strawberry Perl?
>
> chris
>
> On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:
>
>> Hi Chris,
>>
>> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot
>> regressions and noticed a small problem.  The fasta validation check
>> for '>' in SeqIO::fasta (line 127) throws when used with
>> Index::Fasta on Windows because the position after '>' is being
>> indexed.  It looks like you already fixed the same problem for Linux
>> (comment in line 190 of Index::Fasta).  Do you want me to put this
>> into bugzilla?  Let me know if you have any questions.  Thanks!
>>
>> Cheers,
>> Kristine
>>
>>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Chris Fields
>> Sent: Tuesday, September 22, 2009 1:29 PM
>> To: BioPerl List
>> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>>
>> The third alpha is now out and propagating its way around the
>> intertubes:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>>
>> Pick your favorite archive here:
>>
>> http://bioperl.org/DIST/RC/
>>
>> This includes some unmerged changes from 1.6.0.  Test failures from
>> the last alpha indicated these somehow were missed, so I basically  
>> ran
>> a global diff against main trunk to check for missing commits (all
>> located in t/ as it turned out).
>>
>> Also fixed are the SeqFeature_SQLite.t failures; this is a file
>> autogenerated with Build.PL tests that somehow made its way into the
>> last alpha release.  This is now properly cleaned up along with its
>> test database using './Build clean'.  BTW, very nice SQLite
>> implementation; I may be using it!
>>
>> Please let me know if anything pops up; I'm hoping to release 1.6.1  
>> by
>> this Thursday-Friday.
>>
>> Enjoy!
>>
>> chris


From David.Messina at sbc.su.se  Thu Sep 24 05:38:19 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 24 Sep 2009 11:38:19 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
Message-ID: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>

>
> Not to add yet more to the list, but I also think a concise list of
> projects using (or 'powered by') bioperl should be front-and-center; not a
> lot of users know when/where bioperl is used.  This applies to the other
> bio* as well, particularly biopython (seeing it popping up more and more).
>


Along these lines, it'd be great to publicize not only
BioPerl-*powered* projects, but ones which interface with it, too.

Just this week, for example, there is this, which could go both on a static
page and in the newsfeed:
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1

MOODS: fast search for position weight matrix matches in DNA sequences.

Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
Department of Computer Science and Helsinki Institute for Information
Technology,
University of Helsinki, Helsinki, Finland.

SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
matching position weight matrices against DNA sequences. MOODS implements
state-of-the-art on-line matching algorithms, achieving considerably faster
scanning speed than with a simple brute-force search. MOODS is written in C++,
with bindings for the popular BioPerl and Biopython toolkits. It can easily be
adapted for different purposes and integrated into existing workflows. It can
also be used as a C++ library. AVAILABILITY: The package with documentation and
examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
source code is also available under the terms of a GNU General Public License
(GPL). CONTACT: janne.h.korhonen at helsinki.fi.

PMID: 19773334 [PubMed - as supplied by publisher]


From maj at fortinbras.us  Thu Sep 24 10:17:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 10:17:26 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
Message-ID: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>

Gurus of a db stripe:
 
ActiveState 5.10 has such a problem with BDB that they have
disabled their ppm build of the DB_File module. I know
what the *ultimate* solution is...however...

I did a quick grep of 'use DB_File' across the trunk, and 
it seems there are two categories of dependency--

(1) use of BDB is an option among other dbms
      (e.g., among the  Bio::DB::GFF::Adaptor::)

(2) BDB is the developer's personal choice
    (e.g., possibly Bio::DB::FileCache)

In Bio::DB::Fasta, AnyDBM_File is used to allow the 
user a choice. Are there fundamental reasons not to 
convert the type (2) dependencies to AnyDBM_File?
I will try to do this (on a branch) if there are no technical
objections. General derision, however, will only goad
me into action-
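
For reference, a minimal sketch of what a type (2) conversion might look
like, assuming the module only needs a plain tied hash. The backend
preference order, the temporary file name and the 'seq1' key are
assumptions for illustration; GDBM_File is left out here because its
tie() takes different open flags.

```perl
use strict;
use warnings;
use Fcntl;                       # exports O_RDWR and O_CREAT
use File::Temp qw(tempdir);

# Preference order is an assumption; AnyDBM_File uses the first backend
# in this list that can be loaded.
BEGIN { @AnyDBM_File::ISA = qw(DB_File NDBM_File SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/cache";         # hypothetical cache file

my %cache;
tie(%cache, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0644)
    or die "Cannot tie $file: $!";

$cache{seq1} = 'ACGT';           # store a value
my $value = $cache{seq1};        # fetch it back through the tie
untie %cache;

print "stored and fetched: $value\n";
```

One possible technical objection: if the fallback ends up being
SDBM_File, the length of a key plus its value is limited to about 1 KB,
so anything caching whole sequence records would still want DB_File.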

Thanks,
MAJ


From A.J.Pemberton at bham.ac.uk  Thu Sep 24 11:08:06 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Thu, 24 Sep 2009 16:08:06 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>

Chris, Mark,

Thank you, I have made significant progress with the install. I had to do a 

Cpan> force install Array::Compare

To get the module properly installed.

However, I now have a new error. When I do

Cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz

I get the following error (now only 1 of the 16 tests fails):

t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
Exception
------------- EXCEPTION -------------
MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.


Can you help with this one?

Regards,

Tony Pemberton


> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: 23 September 2009 18:48
> To: Mark A. Jensen
> Cc: Anthony Pemberton; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
> (1.6)
> 
> Appears Array::Compare is used for Test::Warn, so it isn't a true
> requirement (probably a test_requires or somesuch).
> 
> chris
> 
> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
> 
> > hi Tony- missing prereqs are the issue with this message,yes-
> > the brute force approach would be to install each of these
> > as they come up; you can do
> >
> > $ cpan
> > cpan> install Array::Compare
> >
> > etc., then attempt the bioperl-db install again; lather, rinse,
> > repeat.
> > MAJ
> > ----- Original Message ----- From: "Anthony Pemberton"
> <A.J.Pemberton at bham.ac.uk
> > >
> > To: <bioperl-l at bioperl.org>
> > Sent: Tuesday, September 22, 2009 1:06 PM
> > Subject: [Bioperl-l] Problems installing latest stable bioperl-db
> > (1.6)
> >
> >
> >> Folks,
> >>
> >> I am experiencing problems installing bioperl-db. I followed the
> >> instructions on the website both installing via CPAN and
> >> downloading the source tarball. Get the same error. I think I have
> >> missing prerequisites, the first error I get is:
> >>
> >> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/
> >> local/BioPerl-db-1.6.0/blib/lib
> >> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /
> >> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> >> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-
> >> thread-multi /usr/lib/perl5/site_perl/5.8.5
> >> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-
> >> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
> >> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/
> >> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
> >>
> >> Can anyone help?
> >>
> >> Regards,
> >>
> >> Tony P.
> >>
> >>
> >> **************************************************************
> >> Mr. A. Pemberton Tel:+44 121 414 3388
> >> School of Biosciences, Fax:+44 121 414 5925
> >> The University of Birmingham
> >> Email:a.j.pemberton at bham.ac.uk
> >> Birmingham B15 2TT U.K.
> >> **************************************************************
> >>
> >>
> >>


From jason at bioperl.org  Thu Sep 24 12:23:44 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Sep 2009 09:23:44 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
Message-ID: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>

If someone also wants to volunteer to keep up the publications page -
this is where I *had* been curating a list built up from citations and
Google Scholar searches for 'bioperl' and things that reference the
2002 paper.

Seems like this is where the static copy of that information should go
- but highlighting things on a page with a circulating list or
something that just listed recent additions to the list could be done
by the web dev gurus and could be kewl.
The current issue is that it is large, so I think PubMed plugin
rendering can be slow (or gets broken, as it seems to be now).
http://bioperl.org/wiki/BioPerl_publications
http://bioperl.org/wiki/BioPerl_publications/2008
http://bioperl.org/wiki/BioPerl_publications/2007
etc....

-jason
On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:

>>
>> Not to add yet more to the list, but I also think a concise list of
>> projects using (or 'powered by') bioperl should be front-and- 
>> center; not a
>> lot of users know when/where bioperl is used.  This applies to the  
>> other
>> bio* as well, particularly biopython (seeing it popping up more and  
>> more).
>>
>
>
> Along these lines, it'd be great to publicize not only
> BioPerl-*powered* projects, but ones which interface with it, too.
>
> Just this week, for example, there is this, which could go both on a  
> static
> page and in the newsfeed:
> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>
> MOODS: fast search for position weight matrix matches in DNA  
> sequences.
>
> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
> Department of Computer Science and Helsinki Institute for Information
> Technology,
> University of Helsinki, Helsinki, Finland.
>
> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software  
> package for
> matching position weight matrices against DNA sequences. MOODS  
> implements
> state-of-the-art on-line matching algorithms, achieving considerably  
> faster
> scanning speed than with a simple brute-force search. MOODS is  
> written in C++,
> with bindings for the popular BioPerl and Biopython toolkits. It can  
> easily be
> adapted for different purposes and integrated into existing  
> workflows. It can
> also be used as a C++ library. AVAILABILITY: The package with  
> documentation and
> examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
> source code is also available under the terms of a GNU General  
> Public License
> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>
> PMID: 19773334 [PubMed - as supplied by publisher]
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From rmb32 at cornell.edu  Thu Sep 24 12:28:08 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 24 Sep 2009 09:28:08 -0700
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <4ABB9E18.3060003@cornell.edu>

Sounds like a good idea to me.

Rob


From cjfields at illinois.edu  Thu Sep 24 12:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 11:58:32 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
	<3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
Message-ID: <2BDD197A-3DEF-44CE-9F98-6B3F117084EE@illinois.edu>

Tony,

The error should point out the problem: install Graph::Directed via  
CPAN.
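
As a rough illustration of reading that error, here is a small sketch
that turns a "Can't locate ... in @INC" message into the module name to
install. The helper name missing_module is hypothetical, and the sample
error string is shortened from the one quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a "Can't locate Foo/Bar.pm in @INC" error
# into the module name to hand to the cpan shell.
sub missing_module {
    my ($error) = @_;
    return unless $error =~ m{Can't locate (\S+)\.pm in \@INC};
    (my $module = $1) =~ s{/}{::}g;   # path separators become '::'
    return $module;
}

# Shortened copy of the error from the failing test.
my $error = q{Can't locate Graph/Directed.pm in @INC (@INC contains: t/lib t ...)};
print missing_module($error), "\n";
```

The printed name is what goes after 'install' at the cpan prompt.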

That said, we need to add that as a 'recommends' for the db package
and skip those tests if Graph::Directed isn't present.  Will do that  
now.
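
A minimal sketch of that kind of guard, assuming a Test::More based
test file. The have_module helper and the test descriptions are
hypothetical and are not the actual change to t/12ontology.t.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if the named module can be loaded.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

ok(have_module('strict'), 'helper detects an installed module');

SKIP: {
    # Skip, rather than fail, when the optional dependency is absent.
    skip 'Graph::Directed not installed', 1
        unless have_module('Graph::Directed');
    ok(Graph::Directed->can('new'), 'Graph::Directed is usable');
}
```

With this pattern a missing optional module shows up as a skipped test
in the report instead of an exception during the install.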

chris

On Sep 24, 2009, at 10:08 AM, Anthony Pemberton wrote:

> Chris, Mark,
>
> Thank you, I have made significant progress with the install. I had  
> to do a
>
> Cpan> force install Array::Compare
>
> To get the module properly installed.
>
> However, I now have a new error. When I do
>
> Cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz
>
> I get the following error (now only 1 of the 16 tests fails):
>
> t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate  
> Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/ 
> BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0- 
> xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/ 
> lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 / 
> usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/ 
> perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/ 
> vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/ 
> vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux- 
> thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/ 
> site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ 
> Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
>
>
> Can you help with this one?
>
> Regards,
>
> Tony Pemberton
>
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at illinois.edu]
>> Sent: 23 September 2009 18:48
>> To: Mark A. Jensen
>> Cc: Anthony Pemberton; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
>> (1.6)
>>
>> Appears Array::Compare is used for Test::Warn, so it isn't a true
>> requirement (probably a test_requires or somesuch).
>>
>> chris
>>
>> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
>>
>>> hi Tony- missing prereqs are the issue with this message,yes-
>>> the brute force approach would be to install each of these
>>> as they come up; you can do
>>>
>>> $ cpan
>>> cpan> install Array::Compare
>>>
>>> etc., then attempt the bioperl-db install again; lather, rinse,
>>> repeat.
>>> MAJ
>>> ----- Original Message ----- From: "Anthony Pemberton"
>> <A.J.Pemberton at bham.ac.uk
>>>>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Tuesday, September 22, 2009 1:06 PM
>>> Subject: [Bioperl-l] Problems installing latest stable bioperl-db
>>> (1.6)
>>>
>>>
>>>> Folks,
>>>>
>>>> I am experiencing problems installing bioperl-db. I followed the
>>>> instructions on the website both installing via CPAN and
>>>> downloading the source tarball. Get the same error. I think I have
>>>> missing prerequisites, the first error I get is:
>>>>
>>>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/
>>>> local/BioPerl-db-1.6.0/blib/lib
>>>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /
>>>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-
>>>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>>>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-
>>>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>>>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/
>>>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>>>
>>>> Can anyone help?
>>>>
>>>> Regards,
>>>>
>>>> Tony P.
>>>>
>>>>
>>>> **************************************************************
>>>> Mr. A. Pemberton Tel:+44 121 414 3388
>>>> School of Biosciences, Fax:+44 121 414 5925
>>>> The University of Birmingham
>>>> Email:a.j.pemberton at bham.ac.uk
>>>> Birmingham B15 2TT U.K.
>>>> **************************************************************
>>>>
>>>>
>>>>
>


From cjfields at illinois.edu  Thu Sep 24 13:50:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 12:50:34 -0500
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>

I do support doing this for sheer flexibility, but it's not an  
absolute showstopper for ActivePerl.  There is a working DB_File PPM  
available for ActivePerl 5.10.1 in the Trouchelle PPM repo:

http://trouchelle.com/ppm10/

That repo is listed in the 'Suggested' list in the latest PPM4
Preferences (Repositories tab). I had to install it to fix that WinXP
Bio::Index bug.

(Based on that, the Bio::Index modules apparently have this requirement
too; at least, tests were being skipped for lack of DB_File.)

chris

On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:

> Gurus of a db stripe:
>
> ActiveState 5.10 has such a problem with BDB that it
> disables their ppm build of the DB_File module. I know
> what the *ultimate* solution is...however...
>
> I did a quick grep of 'use DB_File' across the trunk, and
> it seems there are two categories of dependency--
>
> (1) use of BDB is an option among other dbms
>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>
> (2) BDB is the developer's personal choice
>    (e.g., possibly Bio::DB::FileCache)
>
> In Bio::DB::Fasta, AnyDBM_File is used to allow the
> user a choice. Are there fundamental reasons not to
> convert the type (2) dependencies to AnyDBM_File?
> I will try to do this (on a branch) if there are no technical
> objections. General derision, however, will only goad
> me into action-
>
> Thanks,
> MAJ
>


From cjfields at illinois.edu  Thu Sep 24 14:03:48 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 13:03:48 -0500
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
Message-ID: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>

Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
for Windows using the last alpha or bioperl-live?  I'm getting a  
pretty significant fail with the last alpha release (I've managed to  
fix the others) via my remote desktop setup (haven't set up virtualbox  
yet).  I just want to confirm this is occurring elsewhere and plan  
accordingly, namely indicating the module doesn't work with windows  
for the time being.

Build test --test-files t/SeqIO/scf.t --verbose

chris

From maj at fortinbras.us  Thu Sep 24 14:39:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:39:38 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
	<759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
Message-ID: <3715F68607084E4684A4B54E542468E4@NewLife>

All righty. I did find the trouchelle repo, but my ppm
didn't believe that DB_File was in it.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 1:50 PM
Subject: Re: [Bioperl-l] DB_File dependency and ActiveState 5.10


>I do support doing this for sheer flexibility, but it's not an  
> absolute showstopper for ActivePerl.  There is a working DB_File PPM  
> available for ActivePerl 5.10.1 in the Trouchelle PPM repo:
> 
> http://trouchelle.com/ppm10/
> 
> That repo is listed in the 'Suggested' list in the latest PPM4  
> Preferences (Repositories tag). I had to install it to fix that WinXP  
> Bio::Index bug.
> 
> (Based on that, the Bio::Index modules apparently have this requirement
> too; at least, tests were being skipped for lack of DB_File.)
> 
> chris
> 
> On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:
> 
>> Gurus of a db stripe:
>>
>> ActiveState 5.10 has such a problem with BDB that it
>> disables their ppm build of the DB_File module. I know
>> what the *ultimate* solution is...however...
>>
>> I did a quick grep of 'use DB_File' across the trunk, and
>> it seems there are two categories of dependency--
>>
>> (1) use of BDB is an option among other dbms
>>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>>
>> (2) BDB is the developer's personal choice
>>    (e.g., possibly Bio::DB::FileCache)
>>
>> In Bio::DB::Fasta, AnyDBM_File is used to allow the
>> user a choice. Are there fundamental reasons not to
>> convert the type (2) dependencies to AnyDBM_File?
>> I will try to do this (on a branch) if there are no technical
>> objections. General derision, however, will only goad
>> me into action-
>>
>> Thanks,
>> MAJ
>>
> 
> 
>

From maj at fortinbras.us  Thu Sep 24 14:40:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:40:03 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
In-Reply-To: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
References: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
Message-ID: <791B5C5CB3C34A8AAC348DC59E934198@NewLife>

aye-aye
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 2:03 PM
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?


> Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
> for Windows using the last alpha or bioperl-live?  I'm getting a  
> pretty significant fail with the last alpha release (I've managed to  
> fix the others) via my remote desktop setup (haven't set up virtualbox  
> yet).  I just want to confirm this is occurring elsewhere and plan  
> accordingly, namely indicating the module doesn't work with windows  
> for the time being.
> 
> Build test --test-files t/SeqIO/scf.t --verbose
> 
> chris
> 
>

From e.osimo at gmail.com  Fri Sep 25 03:59:10 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 25 Sep 2009 09:59:10 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com> 
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
Message-ID: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>

Dear Jason,
I have been trying to connect to
http://bioperl.org/wiki/BioPerl_publications for more than 24 hours,
but it won't work.
Emanuele


On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:

> If someone also wants to volunteer to keep up the publications page - this
> is where I *had* been curating a list built up from citations and Google
> Scholar searches for 'bioperl' and things that reference the 2002 paper.
>
> Seems like this is where the static copy of that information should go -
> but highlighting things on a page with a circulating list or something
> that just listed recent additions to the list could be done by the web dev
> gurus and could be kewl.
> The current issue is that it is large, so I think PubMed plugin rendering
> can be slow (or gets broken, as it seems to be now).
> http://bioperl.org/wiki/BioPerl_publications
> http://bioperl.org/wiki/BioPerl_publications/2008
> http://bioperl.org/wiki/BioPerl_publications/2007
> etc....
>
> -jason
>
> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>
>
>>> Not to add yet more to the list, but I also think a concise list of
>>> projects using (or 'powered by') bioperl should be front-and-center; not
>>> a lot of users know when/where bioperl is used.  This applies to the other
>>> bio* as well, particularly biopython (seeing it popping up more and more).
>>>
>>
>> Along these lines, it'd be great to publicize not only
>> BioPerl-*powered* projects, but ones which interface with it, too.
>>
>> Just this week, for example, there is this, which could go both on a
>> static page and in the newsfeed:
>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>
>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>
>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>> Department of Computer Science and Helsinki Institute for Information
>> Technology, University of Helsinki, Helsinki, Finland.
>>
>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package
>> for matching position weight matrices against DNA sequences. MOODS
>> implements state-of-the-art on-line matching algorithms, achieving
>> considerably faster scanning speed than with a simple brute-force search.
>> MOODS is written in C++, with bindings for the popular BioPerl and
>> Biopython toolkits. It can easily be adapted for different purposes and
>> integrated into existing workflows. It can also be used as a C++ library.
>> AVAILABILITY: The package with documentation and examples of usage is
>> available at http://www.cs.helsinki.fi/group/pssmfind. The source code is
>> also available under the terms of a GNU General Public License (GPL).
>> CONTACT: janne.h.korhonen at helsinki.fi.
>>
>> PMID: 19773334 [PubMed - as supplied by publisher]
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>


From hlapp at gmx.net  Fri Sep 25 07:26:37 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 25 Sep 2009 07:26:37 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
Message-ID: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>

Odd. Something's going on in the page that upsets MediaWiki. I can  
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages  
look odd.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Fri Sep 25 07:40:33 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 25 Sep 2009 12:40:33 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <320fb6e00909250440i18ee4216o80cedd418feed842@mail.gmail.com>

On Fri, Sep 25, 2009 at 12:26 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
> Odd. Something's going on in the page that upsets MediaWiki. I can actually
> pull up the page in edit mode.
>
> Is the citation extension working correctly? The year-by-year pages look
> odd.

It is working on the Biopython and BioJava pages (which use the same
server and MediaWiki installation, right?):

http://biopython.org/wiki/Documentation#Papers
http://biopython.org/wiki/Publications
http://biojava.org/wiki/BioJava:BioJavaInside

[I know there are references with a funny character in them, the extension
doesn't like accents. I normally redo those references by hand but it is
a hassle and just giving a PMID is much easier]

Peter

From maj at fortinbras.us  Fri Sep 25 08:50:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 08:50:26 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org><4AB84B8D.5080005@ieee.org><2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu><f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com><628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com><320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com><9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu><628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com><3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org><2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <4E35933353E14BB98975BCCAF79F5E0B@NewLife>

I've been playing with this. I think it's either a numbers problem (>230
references => bork) or a timeout problem. Attempting to isolate a single
"BioPerl publications/200x" page as the source of the error gives
inconsistent results, but including enough of these pages to give more than
about 230 references gives the error (using preview).


From maj at fortinbras.us  Fri Sep 25 09:08:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:08:10 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <3327C0C1167C4889A980809FD642A0A2@NewLife>

The idea I now have is that <biblio> is hitting the server too rapidly and 
getting bounced after a while.
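
If the lookups really are being bounced for arriving too quickly, the usual remedy is to space them out. A minimal sketch of that idea in Python follows; it is not how the <biblio> extension works. The `Throttle` class and the 0.5-second interval in the usage comment are assumptions made up for illustration.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock               # injectable so the class can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Hypothetical usage: call throttle.wait() before each remote lookup.
# throttle = Throttle(0.5)
# for pmid in pmids:
#     throttle.wait()
#     record = lookup(pmid)   # lookup() is a placeholder, not a real API
```

The clock and sleep functions are passed in only so the behaviour can be checked without real delays.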
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit too
> minimal, but then again I don't like scrolling up and down the page to find
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't stare
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and additional
> links on a separate page.
>
> chris
>


From maj at fortinbras.us  Fri Sep 25 09:30:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:30:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <A06AF115F63B4C558D368B730BFB441D@NewLife>

It's ugly, but it works now.

From jason at bioperl.org  Fri Sep 25 11:47:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 08:47:55 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <A06AF115F63B4C558D368B730BFB441D@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
	<A06AF115F63B4C558D368B730BFB441D@NewLife>
Message-ID: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>

thanks - yeah I had separated it by year to make it easier to update  
them since the main file was too large, but I liked having them all  
pulled in onto one page in order to see the total number of cites.  
Brian's graphic is nice but a little out of date, and only reflects a  
pubmed query.

Basically that system doesn't work well enough with biblio since it  
isn't caching the lookups very well.   We can probably do better  
somehow, but someone would have to really be dedicated to it, so I can  
kind of see now why we could use something like this to generate the  
citations so they'd be static.
http://sumsearch.uthscsa.edu/cite/

I had used Biblio extension as it was so easy but maybe it just can't  
scale for that number of needed refs as it doesn't do very good local  
caching AFAIK.
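
The kind of local caching described here could look roughly like the following. This is a sketch only, in Python, and says nothing about how the Biblio extension is actually implemented. `CitationCache`, the JSON file layout, and the `fetch` callback are all hypothetical; `fetch` stands in for whatever really retrieves a record for a PMID.

```python
import json
from pathlib import Path


class CitationCache:
    """Keep fetched citation records on disk so each PMID is looked up once."""

    def __init__(self, path, fetch):
        self.path = Path(path)   # JSON file holding the cached records
        self.fetch = fetch       # callable: PMID string -> JSON-serialisable record
        if self.path.exists():
            self.data = json.loads(self.path.read_text())
        else:
            self.data = {}

    def get(self, pmid):
        """Return the record for pmid, calling fetch only on a cache miss."""
        key = str(pmid)
        if key not in self.data:
            self.data[key] = self.fetch(key)
            # Write through immediately so later page renders reuse the record.
            self.path.write_text(json.dumps(self.data, indent=2))
        return self.data[key]
```

With something like this, rendering a page with a few hundred references would only go to the remote server for PMIDs that have not been seen before, for example the MOODS entry (PMID 19773334) the first time it is added.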

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Fri Sep 25 12:54:36 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 09:54:36 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
	<3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
Message-ID: <7275015E-45FC-4E2A-9379-89F7447DEB32@bioperl.org>

cheers- any efforts are appreciated.  I am not sure what is the best  
way to provide this info to folks without spending a ton of time  
curating.  What would be ideal is if the software worked well enough  
that a volunteer only spent time adding the info not debugging the  
display or code.  It might be that something better exists -- online  
reference management like citeulike or mendeley -- that could then be  
linked in via an API.  .... Webservices, etc will save us all, right?   
Okay not really, but at least we can try and keep this organized till  
it is clear what are alternate solutions.  Martin has stopped working  
on Biblio as far as I know and php-hacking is not my favorite pastime.

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Fri Sep 25 12:38:40 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 12:38:40 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
Message-ID: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>

I figured you really wanted the 'hundreds-o-cites' effect-- I'm just
thinking of this as a workaround until the issues are resolved. Not sure
I can devote too much time to playing with it now (procrastinating using
other projects at the mo') but I can put it in the todo list on the
Documentation Project page....
cheers MAJ
----- Original Message ----- 
From: "Jason Stajich" <jason at bioperl.org>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>; "Peter" <biopython at maubp.freeserve.co.uk>
Sent: Friday, September 25, 2009 11:47 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> thanks - yeah I had separated it by year to make it easier to update
> them since the main file was too large, but I liked having them all
> pulled in onto one page in order to see the total number of cites.
> Brian's graphic is nice but a little out of date, and only reflects a
> pubmed query.
>
> Basically that system doesn't work well enough with biblio since it
> isn't caching the lookups very well.  We can probably do better
> somehow, but someone would have to really be dedicated to it, so I can
> kind of see now why we could use something like this to generate the
> citations so they'd be static.
> http://sumsearch.uthscsa.edu/cite/
>
> I had used Biblio extension as it was so easy but maybe it just can't
> scale for that number of needed refs as it doesn't do very good local
> caching AFAIK.
>
> -jason


From jcline at ieee.org  Fri Sep 25 15:11:20 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Fri, 25 Sep 2009 14:11:20 -0500
Subject: [Bioperl-l] LIMS::Controller and LIMS::Web
Message-ID: <4ABD15D8.9020304@ieee.org>

Does anyone using the CPAN LIMS::Web or associated modules have a web site
that demonstrates the functionality?  The links in the .pod are not current.

From CPAN:

DESCRIPTION

LIMS::Controller is a versatile object-oriented Perl module designed to
control a LIMS database and its web interface. Inheriting from the
LIMS::Web::Interface and LIMS::Database::Util classes, the module
provides automation for many core and advanced functions required of a
web/database object layer, enabling rapid development of Perl CGI scripts.

-- 

## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From bosborne11 at verizon.net  Fri Sep 25 22:13:16 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:13:16 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <42FBB964C0EA44FABCB50364C567A009@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
	<42FBB964C0EA44FABCB50364C567A009@NewLife>
Message-ID: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>

Mark,

Really nice, and a significant improvement over the existing.

You've gotten good feedback, you've considered these thoughts and  
incorporated them - is it time to move the beta to Main? Yes. In my  
opinion your 'beta' is far superior - just do it.

Brian O.


On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

> A nearly completely minimal solution is at Main Page Beta
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se 
> >
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 21, 2009 1:03 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Hi Mark,
>> Thanks for taking on this (much needed) refresh.
>> I think your current version is substantially better than what we  
>> have now.
>> Still, I'd argue that something much more concise like the  
>> Biopython page
>> would make a bigger impact on visitors' ability to find what  
>> they're looking
>> for.
>> It's not that the details you have under each section shouldn't be
>> available, but rather that they could be clicked through to instead  
>> of being
>> on the front page.
>> The About section is a good example. I would bet most visitors to the
>> BioPerl website skip over the About section because they already  
>> know what
>> BioPerl is, and that section has the most valuable real estate on  
>> the page.
>> Those who don't know and are curious will probably be able to find  
>> it (the
>> word About on the front page of a website has become an idiom for
>> "click here to read the details about this").
>> Dave


From maj at fortinbras.us  Fri Sep 25 22:22:49 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:22:49 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
Message-ID: <ACA5C04C052442259262125A5F0B8E74@NewLife>

Cheers, Brian-- I am becoming swayed now by Chris' whack 
at it, on his talk page. My thought is that we'll hammer out the 
final version after the release, then pull the trigger-- Your thoughts?
MAJ
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 10:13 PM
Subject: Re: [Bioperl-l] a Main Page proposal


> Mark,
> 
> Really nice, and a significant improvement over the existing.
> 
> You've gotten good feedback, you've considered these thoughts and  
> incorporated them - is it time to move the beta to Main? Yes. In my  
> opinion your 'beta' is far superior - just do it.
> 
> Brian O.
> 
> 

From maj at fortinbras.us  Fri Sep 25 22:45:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:45:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <E53214D989154E8184BA97573C925DF9@NewLife>

sounds good-- I can make the changes (soon) and we'll tweak it from the echte page
(unless I hear diff'rnt)
cheers MAJ
  ----- Original Message ----- 
  From: Brian Osborne 
  To: Mark A. Jensen 
  Cc: BioPerl List 
  Sent: Friday, September 25, 2009 10:42 PM
  Subject: Re: [Bioperl-l] a Main Page proposal


  Mark,


  I don't love the italics in the version that Chris made but that's just personal preference. He's right in thinking that putting more in the top of the page is good: less scrolling.


  One could color the backgrounds of his tables, that might look nice.


  Either way, or a combination of both, is preferable to what we have. There really is no need to wait since the current page is abysmal. I can say that freely since I'm probably one of its authors!


  One thought though: move the "search" up to a center-left location, below "main links". The Wiki search is pretty good at finding pages so if someone doesn't find what they're looking for in the main section they might be drawn to search for it.


  Brian O.




From bosborne11 at verizon.net  Fri Sep 25 22:42:38 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:42:38 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <ACA5C04C052442259262125A5F0B8E74@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
Message-ID: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>

Mark,

I don't love the italics in the version that Chris made but that's  
just personal preference. He's right in thinking that putting more in  
the top of the page is good: less scrolling.

One could color the backgrounds of his tables, that might look nice.

Either way, or a combination of both, is preferable to what we have.  
There really is no need to wait since the current page is abysmal. I  
can say that freely since I'm probably one of its authors!

One thought though: move the "search" up to a center-left location,  
below "main links". The Wiki search is pretty good at finding pages so  
if someone doesn't find what they're looking for in the main section  
they might be drawn to search for it.

Brian O.


On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:

> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
> his talk page. My thought is that we'll hammer out the final version  
> after the release, then pull the trigger-- Your thoughts?
> MAJ


From cjfields at illinois.edu  Sat Sep 26 00:04:57 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:04:57 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <68A162A4-45F1-4ADC-87C9-57E388DF2666@illinois.edu>

Brian, Mark,

Agreed about the italics; there's a lot more that can be done with  
tables if needed:

http://meta.wikimedia.org/wiki/Help:Table

I say go ahead and pull the trigger.  No need to wait 'til 1.6.1 on  
this, the sooner it's fixed the better.  We can tweak the rest (add  
News updates, etc) along the way.

chris

On Sep 25, 2009, at 9:42 PM, Brian Osborne wrote:

> Mark,
>
> I don't love the italics in the version that Chris made but that's  
> just personal preference. He's right in thinking that putting more  
> in the top of the page is good: less scrolling.
>
> One could color the backgrounds of his tables, that might look nice.
>
> Either way, or a combination of both, is preferable to what we have.  
> There really is no need to wait since the current page is abysmal. I  
> can say that freely since I'm probably one of its authors!
>
> One thought though: move the "search" up to a center-left location,  
> below "main links". The Wiki search is pretty good at finding pages  
> so if someone doesn't find what they're looking for in the main  
> section they might be drawn to search for it.
>
> Brian O.
>
>


From cjfields at illinois.edu  Sat Sep 26 00:52:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:52:35 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 4 released
Message-ID: <2EDBBBF5-2109-456A-B768-178B012A8192@illinois.edu>

All,

Core 1.6.0 alpha 4 is now floating about on the intertubes and CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_4/

http://bioperl.org/DIST/RC/

So far this is passing all tests for ActivePerl on WinXP once DB_File  
is installed.  I'll try running some tests for Strawberry Perl, but no  
promises.

At this late stage any additional updates will only be doc tweaks and
small bug fixes prior to 1.6.1.  The only renaming issue is that I need
to rename BioPerl.pod to BioPerl.pm and add a simple VERSION to it (per
Curtis Jewell's suggestion).  I may post a very short alpha 5 to test
that, with 1.6.1 posted by Sunday.

Enjoy!

chris

From e.osimo at gmail.com  Sun Sep 27 05:00:17 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sun, 27 Sep 2009 11:00:17 +0200
Subject: [Bioperl-l] setting a strand in Bio::Graphics
Message-ID: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>

Hello,
I've tried all the arrows suggested in
http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
but I can't figure out how to tell in the options of $panel->add_track the
strand of the feature I'm adding.
I'm drawing DNA elements from a local DB, and I have a field "strand" which
can be + or -.
Please help!
Emanuele

From maj at fortinbras.us  Sun Sep 27 20:54:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 27 Sep 2009 20:54:04 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
Message-ID: <6CF05E74FEAE45679CDEDF48B7E15856@NewLife>

Emos- Without the code, I can only guess, but you might not be providing
the options correctly. Have a look at
http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
for something that may help.
MAJ
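
One possible approach, sketched under assumptions: in Bio::Graphics the
strand normally belongs to the feature (-strand => 1 or -1) rather than
to the track, and the 'generic' glyph can then be asked to show it with
-strand_arrow => 1. The rows, names, coordinates and the output file
name below are hypothetical stand-ins for the local DB; the drawing part
only runs if Bio::Graphics is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a database strand field ('+' or '-') to BioPerl's numeric
# convention: 1 = forward, -1 = reverse, 0 = unknown/unstranded.
sub strand_to_int {
    my ($strand) = @_;
    return  1 if defined $strand && $strand eq '+';
    return -1 if defined $strand && $strand eq '-';
    return  0;
}

# Hypothetical rows, standing in for the result of a local DB query.
my @rows = (
    { name => 'elementA', start => 100, end => 400, strand => '+' },
    { name => 'elementB', start => 550, end => 900, strand => '-' },
);

# Constructor arguments for Bio::SeqFeature::Generic, built from one row.
sub feature_args {
    my ($row) = @_;
    return (
        -start        => $row->{start},
        -end          => $row->{end},
        -strand       => strand_to_int($row->{strand}),
        -display_name => $row->{name},
    );
}

# Draw only when Bio::Graphics is available on this machine.
if (eval { require Bio::Graphics; require Bio::SeqFeature::Generic; 1 }) {
    my @features =
        map { Bio::SeqFeature::Generic->new(feature_args($_)) } @rows;

    my $panel = Bio::Graphics::Panel->new(
        -length    => 1000,
        -width     => 800,
        -pad_left  => 10,
        -pad_right => 10,
    );

    # The strand comes from each feature; -strand_arrow makes it visible.
    $panel->add_track(
        \@features,
        -glyph        => 'generic',
        -label        => 1,
        -strand_arrow => 1,
    );

    open my $out, '>', 'strand_demo.png'
        or die "Cannot write strand_demo.png: $!";
    binmode $out;
    print {$out} $panel->png;
    close $out;
}
```

The point of the sketch is only that add_track itself takes no strand
option; each feature carries its own strand, so '+' and '-' elements can
share one track.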
----- Original Message ----- 
From: "Emanuele Osimo" <e.osimo at gmail.com>
To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Sunday, September 27, 2009 5:00 AM
Subject: [Bioperl-l] setting a strand in Bio::Graphics


> Hello,
> I've tried all the arrows suggested in
> http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
> but I can't figure out how to tell in the options of $panel->add_track the
> strand of the feature I'm adding.
> I'm drawing DNA elements from a local DB, and I have a field "strand" which
> can be + or -.
> Please help!
> Emanuele

From cjfields at illinois.edu  Mon Sep 28 00:34:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 27 Sep 2009 23:34:01 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 5 released
Message-ID: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>

All,

The last alpha for the 1.6.1 release is out and should be propagating  
around CPAN now.  This should be a quick one (it has a few last-minute  
bug fixes for some problems that popped up on CPAN RT and fixes one  
mistake I made in the last alpha).

You can currently get it here (.tar.gz only for now):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz

The final 1.6.1 release should drop in the next day or two.

chris

From adsj at novozymes.com  Mon Sep 28 03:51:15 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Mon, 28 Sep 2009 09:51:15 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
Message-ID: <87hbunv764.fsf@topper.koldfront.dk>

  Hi.


I am wondering whether this is a buglet or just a case of "Don't do
that":

If I set a very long /label on a feature and output the sequence in EMBL
format, the qualifier value gets wrapped, but not quoted.

When BioPerl reads such a file, an exception is thrown.

I probably shouldn't be setting very long labels... But oughtn't BioPerl
throw an exception when a too long label is set, or automatically quote
the value when it is long enough to be wrapped, or know how to read a
wrapped yet unquoted value?

I will be happy to try and provide a patch for whichever solution is
preferred.

Here is an example script:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use IO::String;

  use Bio::Seq;
  use Bio::SeqFeature::Generic;
  use Bio::SeqIO;

  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";

  my $seq=Bio::Seq->new(-seq=>'ATG');
  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
  $seq->add_SeqFeature($feature);

  my $out_string=out($seq);
  print $out_string;

  my $fh=IO::String->new($out_string);
  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
  my $in_seq=$in->next_seq;

  print "Done\n";

  sub out {
      my ($seq)=@_;

      my $string='';
      my $fh=IO::String->new($string);
      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
      $out->write_seq($seq);

      return $string;
  }

Which gives this output when run:

  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  FT                   youthink
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //

  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: Can't see new qualifier in: youthink
  from:
  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  youthink

  STACK: Error::throw
  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
  -----------------------------------------------------------

If I change the value to include double quotes (simulating embl.pm
quoting the value), BioPerl can read the EMBL string it produces fine:

  -----------------------------------------------------------
  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl 
  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
  FT                   ntyouthink""
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //
  Done
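
The second option listed above (quote the value once it is long enough to be wrapped) could be sketched roughly as follows. This is only an illustration in plain Perl: `format_qualifier` is a hypothetical helper, not embl.pm's actual code, and the text width of 59 characters is an assumption read off the output shown above.

```perl
use strict;
use warnings;

# Hypothetical helper, not BioPerl's own code: format one feature-table
# qualifier and quote the value whenever it will not fit on a single line.
sub format_qualifier {
    my ($key, $value, $width) = @_;
    $width ||= 59;                            # assumed text width, read off the output above
    my $text = "/$key=$value";
    if (length($text) > $width) {
        (my $escaped = $value) =~ s/"/""/g;   # assumption: embedded quotes are doubled
        $text = qq{/$key="$escaped"};
    }
    my @chunks = $text =~ /(.{1,$width})/g;   # hard-wrap at $width characters
    return map { 'FT' . (' ' x 19) . $_ } @chunks;
}

print "$_\n" for format_qualifier(
    label => 'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
```

Short values pass through unquoted, so only labels that would otherwise be split across lines change their appearance.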


  Best regards,

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From paola_bisignano at yahoo.it  Mon Sep 28 06:00:07 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Mon, 28 Sep 2009 10:00:07 +0000 (GMT)
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
Message-ID: <504748.72296.qm@web25704.mail.ukl.yahoo.com>

Hi dear friends,


I used Bio::AlignIO to parse the msf file and, as you suggested, the
method column_from_residue_number to obtain the alignment position of
the residues of interest (those in contact with my ligand). Now I have
to check the residue type: I query with the residue number from the
PDB, and I want the script to return the residue as well. So if I want
to know the position of Ala21, I do:

my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);

This returns the position (e.g. 5), but I want to check whether
position 5 of the alignment holds an A (for Ala). I looked in the
documentation, but I couldn't find anything for that.

Thank you all for the help you have given and will give me,

best regards,

paola


From David.Messina at sbc.su.se  Mon Sep 28 07:28:27 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 13:28:27 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
References: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com>

Hi Paola,

> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
>
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
>
> my $col = $aln->column_from_residue_number( $s1->id, 21);


  # extract sequences and check values for the alignment column $col
  foreach my $seq ($aln->each_seq) {
      my $res = $seq->subseq($col, $col);
      if ($res eq 'A') {
          # do something
      }
  }


Please try the above code. I haven't tested it, but I think it will do what
you want.
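
For anyone who wants to see the idea without BioPerl at hand, here is a minimal plain-Perl sketch of the two steps: map a residue number to an alignment column by skipping gaps, then read the character in that column. The helper names `column_for_residue` and `residue_at_column` and the example row are made up for illustration, and treating '-' and '.' as gap characters is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the idea behind column_from_residue_number:
# walk the gapped string and count only non-gap characters.
sub column_for_residue {
    my ($gapped, $resnum, $start) = @_;
    $start = 1 unless defined $start;    # number of the first residue in the row
    my $seen = $start - 1;
    for my $col (1 .. length $gapped) {
        my $char = substr($gapped, $col - 1, 1);
        next if $char =~ /[-.]/;         # gap characters (assumed: '-' and '.')
        return $col if ++$seen == $resnum;
    }
    return;                              # residue not present in this row
}

# Hypothetical helper: the character found in a 1-based alignment column.
sub residue_at_column {
    my ($gapped, $col) = @_;
    return substr($gapped, $col - 1, 1);
}

my $row = 'MK--ALV-T';                   # made-up aligned sequence
my $col = column_for_residue($row, 3);
my $res = residue_at_column($row, $col);
print "residue 3 is in column $col and is '$res'\n";
```

If the row starts at a residue number other than 1, pass that starting number as the third argument so the count lines up with the PDB numbering.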

Best,
Dave

PS - I found that code in the documentation for Bio::Align::AlignI. Right
now there is an effort to improve the BioPerl documentation, and it would be
helpful if you could let us know where you looked for the answer to your
question so we can try to make it easier to find.

Did you look in Bio::AlignIO? Did you also look anywhere else?

Thanks for your help!

From David.Messina at sbc.su.se  Mon Sep 28 08:05:58 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 14:05:58 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <678730.88068.qm@web25708.mail.ukl.yahoo.com>
References: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com> 
	<678730.88068.qm@web25708.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280505l2c5f02b7k8387d5dfd3643575@mail.gmail.com>

On Mon, Sep 28, 2009 at 13:56, Paola Bisignano <paola_bisignano at yahoo.it> wrote:

> yes, I had a look at
> http://doc.bioperl.org/releases/bioperl-1.0/Bio/AlignIO.html
>
> but I didn't find your suggestion


> thanks,
> I'll try it in a while...
> sorry, I did not search in AlignI...


No problem, Paola -- thanks for letting us know.

Dave

From maj at fortinbras.us  Mon Sep 28 10:32:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 10:32:39 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
	<6CF05E74FEAE45679CDEDF48B7E15856@NewLife>
	<2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
Message-ID: <A45CF4D6E34B405B86E5F2DF651B8964@NewLife>

Now that's what I call user-friendly.
  ----- Original Message ----- 
  From: Emanuele Osimo 
  To: Mark A. Jensen 
  Sent: Monday, September 28, 2009 10:28 AM
  Subject: Re: [Bioperl-l] setting a strand in Bio::Graphics


  Hello everyone,
  thank you, I found what I needed. You have to add

  -strand_arrow => 1

  in $panel->add_track, and 

  -strand        => 1,    # or -1 for the minus strand

  in $feature = Bio::SeqFeature::Generic->new options.
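
Since the database field in question holds '+' or '-', a small conversion step is needed before the value can be passed as -strand. A minimal sketch, where `strand_value` is a hypothetical helper name and not part of Bio::Graphics or BioPerl:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a '+' / '-' database field into the numeric
# strand value (1 or -1) that is passed as -strand.
sub strand_value {
    my ($symbol) = @_;
    my %strand_for = ('+' => 1, '-' => -1);
    die "unknown strand symbol '$symbol'\n" unless exists $strand_for{$symbol};
    return $strand_for{$symbol};
}

for my $symbol ('+', '-') {
    printf "%s => %d\n", $symbol, strand_value($symbol);
}
```

The result would then go into the feature constructor as -strand => strand_value($field), with $field standing for whatever variable holds the database value.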

  Thanks
  Emanuele


  On Mon, Sep 28, 2009 at 02:54, Mark A. Jensen <maj at fortinbras.us> wrote:

    Emos- Without the code, I can only guess, but you might not be providing
    the options correctly. Have a look at
    http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
    for something that may help.
    MAJ
    ----- Original Message -----
    From: "Emanuele Osimo" <e.osimo at gmail.com>
    To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
    Sent: Sunday, September 27, 2009 5:00 AM
    Subject: [Bioperl-l] setting a strand in Bio::Graphics


      Hello,
      I've tried all the arrows suggested in
      http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
      but I can't figure out how to tell in the options of $panel->add_track the
      strand of the feature I'm adding.
      I'm drawing DNA elements from a local DB, and I have a field "strand" which
      can be + or -.
      Please help!
      Emanuele

      _______________________________________________
      Bioperl-l mailing list
      Bioperl-l at lists.open-bio.org
      http://lists.open-bio.org/mailman/listinfo/bioperl-l


From paolo.pavan at gmail.com  Mon Sep 28 11:51:52 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 28 Sep 2009 17:51:52 +0200
Subject: [Bioperl-l] BioPerl object deep copy
Message-ID: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>

Hi all,
I would just like a programming hint: is there a way in
bioperl (or just in perl) to get a deep copy or a clone of an object?
That is, I get a new object with all the fields copied one by one.

At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?

Thank you,
Paolo

From s.denaxas at gmail.com  Mon Sep 28 11:56:09 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Mon, 28 Sep 2009 16:56:09 +0100
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <bba689ec0909280856q3fa3c8b1pf5b5dd48bc493eb4@mail.gmail.com>

Hi Paolo,

You can use Clone [1]. Blindly cloning blessed objects though is not a
good idea so make sure you know what each one instantiates.

Spiros

[1] http://perldoc.net/Clone.pm
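
To make the difference between a shallow and a deep copy concrete, here is a minimal sketch. Note the swap: it uses dclone from the core Storable module rather than Clone, so that it runs without installing anything from CPAN; Clone's clone() is called in much the same way. The blessed hash and the class name My::Record are made up and only stand in for a real BioPerl object.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A made-up blessed structure standing in for a BioPerl object.
my $orig = bless {
    id       => 'seq1',
    features => [ { start => 1, end => 3 } ],
}, 'My::Record';

my %shallow = %$orig;            # copies only the top level; nested refs are shared
my $deep    = dclone($orig);     # recursively copies everything and keeps the class

$deep->{features}[0]{end} = 99;  # changes the copy only
print "original end: $orig->{features}[0]{end}\n";
print "class of copy: ", ref($deep), "\n";
```

The shallow copy still points at the same features array as the original, which is exactly the sharing a deep copy avoids.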

On Mon, Sep 28, 2009 at 4:51 PM, Paolo Pavan <paolo.pavan at gmail.com> wrote:
> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
>
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
>
> Thank you,
> Paolo

From maj at fortinbras.us  Mon Sep 28 12:05:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 12:05:42 -0400
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <5A61641A14AE4D80A495047A56659894@NewLife>

For some relatively careful examples of cloning code, 
you can look at the source for 
Bio::Tree::TreeFunctionsI::clone
and 
Bio::Restriction::Enzyme::clone (not clone_depr)
MAJ

----- Original Message ----- 
From: "Paolo Pavan" <paolo.pavan at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 28, 2009 11:51 AM
Subject: [Bioperl-l] BioPerl object deep copy


> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
> 
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
> 
> Thank you,
> Paolo

From cjfields at illinois.edu  Mon Sep 28 12:29:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 11:29:14 -0500
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <5A61641A14AE4D80A495047A56659894@NewLife>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
	<5A61641A14AE4D80A495047A56659894@NewLife>
Message-ID: <05BB0DB4-6017-40A1-92B2-6F441CCACDC6@illinois.edu>

As Spiros points out, Clone works in almost all cases and is very fast  
(XS-based I think).  IIRC the only time it borks out is if there is a  
code ref, as with Bio::Tree::Tree, but if it doesn't work you should  
get an error indicating the problem.
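
A failure of this kind is easy to trap with eval. The sketch below again uses Storable's dclone instead of Clone, because Storable is in core and is known to refuse code references by default; it shows Storable's behaviour only and makes no claim about what Clone does. The data structure is made up.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Made-up structure holding a code reference.
my $with_code = { name => 'tree', callback => sub { return 42 } };

# By default Storable refuses to serialise CODE references, so dclone dies;
# eval traps the error so the caller can fall back to a hand-written clone.
my $copy = eval { dclone($with_code) };
print defined $copy ? "copied\n" : "deep copy failed: $@";
```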

chris

On Sep 28, 2009, at 11:05 AM, Mark A. Jensen wrote:

> For some relatively careful examples of cloning code, you can look  
> at the source for Bio::Tree::TreeFunctionsI::clone
> and Bio::Restriction::Enzyme::clone (not clone_depr)
> MAJ
>
> ----- Original Message ----- From: "Paolo Pavan" <paolo.pavan at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 28, 2009 11:51 AM
> Subject: [Bioperl-l] BioPerl object deep copy
>
>
>> Hi all,
>> I would like to have just a programming hint, there is a way in
>> bioperl (or just in perl) to get an deep copy or a clone of an  
>> object?
>> That is, I get a new object with all the fields copied one by one.
>> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant  
>> object?
>> Thank you,
>> Paolo


From cjfields at illinois.edu  Mon Sep 28 13:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:00:09 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 6 released (?!?)
In-Reply-To: <20090928063013.GB1081@kunpuu.plessy.org>
References: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>
	<20090928063013.GB1081@kunpuu.plessy.org>
Message-ID: <CFD37E37-2B74-402F-BA0F-898A1642FFE8@illinois.edu>

Charles (and everyone else),

This bug was a bit sneaky.  The tests were skipped on pretty much every
system b/c they required both DB_File and BerkeleyDB (i.e. unless both
were installed, the tests were skipped).  I committed a fix
for it; unfortunately that means I need to set up another alpha for
testing, so...

The final final alpha has just been uploaded to CPAN and is now  
available here:

http://bioperl.org/DIST/RC/BioPerl-1.6.0_6.tar.gz

The final 1.6.1 release should still be in the next day or two, just  
awaiting some test reports via CPAN...
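
To illustrate why a skip condition that needs both DB_File and BerkeleyDB hides failures on most machines, here is a small plain-Perl sketch contrasting two skip policies. It is not the fix that was committed; the helper names `have_module`, `skip_unless_both` and `skip_unless_either` are hypothetical.

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded, 0 otherwise.
sub have_module {
    my ($module) = @_;
    return eval("require $module; 1") ? 1 : 0;
}

# Two possible skip policies for a test that touches two backends.
sub skip_unless_both   { my @have = @_; return !( $have[0] && $have[1] ) ? 1 : 0 }
sub skip_unless_either { my @have = @_; return !( $have[0] || $have[1] ) ? 1 : 0 }

my @have = map { have_module($_) } qw(DB_File BerkeleyDB);
print "DB_File: $have[0], BerkeleyDB: $have[1]\n";
print "skip when both are required:  ", skip_unless_both(@have),   "\n";
print "skip when either is enough:   ", skip_unless_either(@have), "\n";
```

With only one of the two modules installed, the first policy skips the whole test file while the second still runs it.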

chris

On Sep 28, 2009, at 1:30 AM, Charles Plessy wrote:

> On Sun, Sep 27, 2009 at 11:34:01PM -0500, Chris Fields wrote:
>>
>> http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz
>>
>
> Hi Chris,
>
> I have the following errors when building bioperl with perl 5.10.1  
> on Debian:
>
> Test Summary Report
> -------------------
> t/LocalDB/Registry.t                       (Wstat: 2304 Tests: 13 Failed: 1)
>  Failed test:  13
>  Non-zero exit status: 9
>  Parse errors: Bad plan.  You planned 14 tests but ran 13.
> t/RemoteDB/EUtilities.t                    (Wstat: 256 Tests: 309 Failed: 1)
>  Failed test:  309
>  Non-zero exit status: 1
> t/Tools/Run/RemoteBlast.t                  (Wstat: 65280 Tests: 13 Failed: 0)
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 16 tests but ran 13.
> Files=329, Tests=20766, 434 wallclock secs ( 2.64 usr  0.51 sys + 100.55 cusr  6.24 csys = 109.94 CPU)
> Result: FAIL
>
>
> t/Align/AlignStats.t ......................... ok
> t/Align/AlignUtil.t .......................... ok
> t/Align/SimpleAlign.t ........................ ok
> t/Align/TreeBuild.t .......................... ok
> t/Align/Utilities.t .......................... ok
> t/AlignIO/AlignIO.t .......................... ok
> t/AlignIO/arp.t .............................. ok
> t/AlignIO/bl2seq.t ........................... ok
> t/AlignIO/clustalw.t ......................... ok
> t/AlignIO/emboss.t ........................... ok
> t/AlignIO/fasta.t ............................ ok
> t/AlignIO/largemultifasta.t .................. ok
> t/AlignIO/maf.t .............................. ok
> t/AlignIO/mase.t ............................. ok
> t/AlignIO/mega.t ............................. ok
> t/AlignIO/meme.t ............................. ok
> t/AlignIO/metafasta.t ........................ ok
> t/AlignIO/msf.t .............................. ok
> t/AlignIO/nexus.t ............................ ok
> t/AlignIO/pfam.t ............................. ok
> t/AlignIO/phylip.t ........................... ok
> t/AlignIO/po.t ............................... ok
> t/AlignIO/prodom.t ........................... ok
> t/AlignIO/psi.t .............................. ok
> t/AlignIO/selex.t ............................ ok
> t/AlignIO/stockholm.t ........................ ok
> t/AlignIO/xmfa.t ............................. ok
> t/Alphabet.t ................................. ok
> t/Annotation/Annotation.t .................... ok
> t/Annotation/AnnotationAdaptor.t ............. ok
> t/Assembly/Assembly.t ........................ ok
> t/Assembly/ContigSpectrum.t .................. ok
> t/Biblio/Biblio.t ............................ ok
> t/Biblio/References.t ........................ ok
> t/Biblio/biofetch.t .......................... ok
> t/Biblio/eutils.t ............................ ok
> t/ClusterIO/ClusterIO.t ...................... ok
> t/ClusterIO/SequenceFamily.t ................. ok
> t/ClusterIO/unigene.t ........................ ok
> t/Coordinate/CoordinateGraph.t ............... ok
> t/Coordinate/CoordinateMapper.t .............. ok
> t/Coordinate/GeneCoordinateMapper.t .......... ok
> t/LiveSeq/Chain.t ............................ ok
> t/LiveSeq/LiveSeq.t .......................... ok
> t/LiveSeq/Mutation.t ......................... ok
> t/LiveSeq/Mutator.t .......................... ok
> t/LocalDB/BioDBGFF.t ......................... ok
> t/LocalDB/BlastIndex.t ....................... ok
> t/LocalDB/DBFasta.t .......................... ok
> t/LocalDB/DBQual.t ........................... ok
> t/LocalDB/Flat.t ............................. ok
> t/LocalDB/Index.t ............................ ok
> t/LocalDB/Registry.t ......................... 1/14
> --------------------- WARNING ---------------------
> MSG:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
> STACK: Bio::SeqIO::fasta::next_seq Bio/SeqIO/fasta.pm:127
> STACK: Bio::DB::Flat::BDB::get_Seq_by_id Bio/DB/Flat/BDB.pm:143
> STACK: Bio::DB::Failover::get_Seq_by_id Bio/DB/Failover.pm:122
> STACK: t/LocalDB/Registry.t:69
> -----------------------------------------------------------
>
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: No sequence retrieved by database Bio::DB::Flat::BDB::fasta
> ---------------------------------------------------
>
> #   Failed test at t/LocalDB/Registry.t line 70.
> Can't call method "seq" on an undefined value at t/LocalDB/Registry.t line 71, <GEN17> line 1.
> # Looks like you planned 14 tests but ran 13.
> # Looks like you failed 1 test of 13 run.
> # Looks like your test exited with 9 just after 13.
> t/LocalDB/Registry.t ......................... Dubious, test returned 9 (wstat 2304, 0x900)
> Failed 2/14 subtests
> t/LocalDB/SeqFeature.t ....................... ok
> t/LocalDB/transfac_pro.t ..................... ok
> t/Map/Cyto.t ................................. ok
> t/Map/Linkage.t .............................. ok
> t/Map/Map.t .................................. ok
> t/Map/MapIO.t ................................ ok
> t/Map/MicrosatelliteMarker.t ................. ok
> t/Map/Physical.t ............................. ok
> t/Matrix/IO/masta.t .......................... ok
> t/Matrix/IO/psm.t ............................ ok
> t/Matrix/InstanceSite.t ...................... ok
> t/Matrix/Matrix.t ............................ ok
> t/Matrix/ProtMatrix.t ........................ ok
> t/Matrix/ProtPsm.t ........................... ok
> t/Matrix/SiteMatrix.t ........................ ok
> t/Ontology/GOterm.t .......................... ok
> t/Ontology/GraphAdaptor.t .................... ok
> t/Ontology/IO/go.t ........................... ok
> t/Ontology/IO/interpro.t ..................... ok
> t/Ontology/IO/obo.t .......................... ok
> t/Ontology/Ontology.t ........................ ok
> t/Ontology/OntologyEngine.t .................. ok
> t/Ontology/OntologyStore.t ................... ok
> t/Ontology/Relationship.t .................... ok
> t/Ontology/RelationshipType.t ................ ok
> t/Ontology/Term.t ............................ ok
> t/Perl.t ..................................... ok
> t/Phenotype/Correlate.t ...................... ok
> t/Phenotype/MeSH.t ........................... ok
> t/Phenotype/Measure.t ........................ ok
> t/Phenotype/MiniMIMentry.t ................... ok
> t/Phenotype/OMIMentry.t ...................... ok
> t/Phenotype/OMIMentryAllelicVariant.t ........ ok
> t/Phenotype/OMIMparser.t ..................... ok
> t/Phenotype/Phenotype.t ...................... ok
> t/PodSyntax.t ................................ ok
> t/PopGen/Coalescent.t ........................ ok
> t/PopGen/HtSNP.t ............................. ok
> t/PopGen/MK.t ................................ ok
> t/PopGen/PopGen.t ............................ ok
> t/PopGen/PopGenSims.t ........................ ok
> t/PopGen/TagHaplotype.t ...................... ok
> t/RemoteDB/BioFetch.t ........................ ok
> t/RemoteDB/CUTG.t ............................ ok
> t/RemoteDB/EMBL.t ............................ ok
> t/RemoteDB/EUtilities.t ...................... 309/309
> #   Failed test 'EPost to EFetch'
> #   at t/RemoteDB/EUtilities.t line 159.
> #          got: '0'
> #     expected: '5'
> # Looks like you failed 1 test of 309.
> t/RemoteDB/EUtilities.t ...................... Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/309 subtests
> t/RemoteDB/EntrezGene.t ...................... ok
> t/RemoteDB/GenBank.t ......................... ok
> t/RemoteDB/GenPept.t ......................... ok
> t/RemoteDB/HIV/HIV.t ......................... ok
> t/RemoteDB/HIV/HIVAnnotProcessor.t ........... ok
> t/RemoteDB/HIV/HIVQuery.t .................... 22/41 Use of uninitialized value $rest[0] in join or string at (eval 68) line 15.
> t/RemoteDB/HIV/HIVQuery.t .................... ok
> t/RemoteDB/HIV/HIVQueryHelper.t .............. ok
> t/RemoteDB/MeSH.t ............................ ok
> t/RemoteDB/Query/GenBank.t ................... ok
> t/RemoteDB/RefSeq.t .......................... ok
> t/RemoteDB/SeqHound.t ........................ ok
> t/RemoteDB/SeqRead_fail.t .................... ok
> t/RemoteDB/SeqVersion.t ...................... ok
> t/RemoteDB/SwissProt.t ....................... ok
> t/RemoteDB/Taxonomy.t ........................ ok
> t/Restriction/Analysis-refac.t ............... ok
> t/Restriction/Analysis.t ..................... ok
> t/Restriction/Gel.t .......................... ok
> t/Restriction/IO.t ........................... ok
> t/Root/Exception.t ........................... ok
> t/Root/RootI.t ............................... ok
> t/Root/RootIO.t .............................. ok
> t/Root/Storable.t ............................ ok
> t/Root/Tempfile.t ............................ ok
> t/Root/Utilities.t ........................... ok
> t/SearchDist.t ............................... skipped: The optional module Bio::Ext::Align (or dependencies thereof) was not installed
> t/SearchIO/CigarString.t ..................... ok
> t/SearchIO/SearchIO.t ........................ ok
> t/SearchIO/SimilarityPair.t .................. ok
> t/SearchIO/Tiling.t .......................... ok
> t/SearchIO/Writer/GbrowseGFF.t ............... ok
> t/SearchIO/Writer/HSPTableWriter.t ........... ok
> t/SearchIO/Writer/HTMLWriter.t ............... ok
> t/SearchIO/Writer/HitTableWriter.t ........... ok
> t/SearchIO/blast.t ........................... ok
> t/SearchIO/blast_pull.t ...................... ok
> t/SearchIO/blasttable.t ...................... ok
> t/SearchIO/blastxml.t ........................ ok
> t/SearchIO/cross_match.t ..................... ok
> t/SearchIO/erpin.t ........................... ok
> t/SearchIO/exonerate.t ....................... ok
> t/SearchIO/fasta.t ........................... ok
> t/SearchIO/gmap_f9.t ......................... ok
> t/SearchIO/hmmer.t ........................... ok
> t/SearchIO/hmmer_pull.t ...................... ok
> t/SearchIO/infernal.t ........................ ok
> t/SearchIO/megablast.t ....................... ok
> t/SearchIO/psl.t ............................. ok
> t/SearchIO/rnamotif.t ........................ ok
> t/SearchIO/sim4.t ............................ ok
> t/SearchIO/waba.t ............................ ok
> t/SearchIO/wise.t ............................ ok
> t/Seq/DBLink.t ............................... ok
> t/Seq/EncodedSeq.t ........................... ok
> t/Seq/LargeLocatableSeq.t .................... ok
> t/Seq/LargePSeq.t ............................ ok
> t/Seq/LocatableSeq.t ......................... ok
> t/Seq/MetaSeq.t .............................. ok
> t/Seq/PrimaryQual.t .......................... ok
> t/Seq/PrimarySeq.t ........................... ok
> t/Seq/PrimedSeq.t ............................ ok
> t/Seq/Quality.t .............................. ok
> t/Seq/Seq.t .................................. ok
> t/Seq/WithQuality.t .......................... ok
> t/SeqEvolution.t ............................. ok
> t/SeqFeature/FeatureIO.t ..................... ok
> t/SeqFeature/Location.t ...................... ok
> t/SeqFeature/LocationFactory.t ............... ok
> t/SeqFeature/Primer.t ........................ ok
> t/SeqFeature/Range.t ......................... ok
> t/SeqFeature/RangeI.t ........................ ok
> t/SeqFeature/SeqAnalysisParser.t ............. ok
> t/SeqFeature/SeqFeatAnnotated.t .............. ok
> t/SeqFeature/SeqFeatCollection.t ............. ok
> t/SeqFeature/SeqFeature.t .................... ok
> t/SeqFeature/SeqFeaturePrimer.t .............. ok
> t/SeqFeature/Unflattener.t ................... ok
> t/SeqFeature/Unflattener2.t .................. ok
> t/SeqIO.t .................................... ok
> t/SeqIO/Handler.t ............................ ok
> t/SeqIO/MultiFile.t .......................... ok
> t/SeqIO/Multiple_fasta.t ..................... ok
> t/SeqIO/SeqBuilder.t ......................... ok
> t/SeqIO/Splicedseq.t ......................... ok
> t/SeqIO/abi.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/ace.t ................................ ok
> t/SeqIO/agave.t .............................. ok
> t/SeqIO/alf.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/asciitree.t .......................... ok
> t/SeqIO/bsml.t ............................... ok
> t/SeqIO/bsml_sax.t ........................... ok
> t/SeqIO/chadoxml.t ........................... ok
> t/SeqIO/chaos.t .............................. ok
> t/SeqIO/chaosxml.t ........................... ok
> t/SeqIO/ctf.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/embl.t ............................... ok
> t/SeqIO/entrezgene.t ......................... ok
> t/SeqIO/excel.t .............................. ok
> t/SeqIO/exp.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/fasta.t .............................. ok
> t/SeqIO/fastq.t .............................. ok
> t/SeqIO/flybase_chadoxml.t ................... ok
> t/SeqIO/game.t ............................... ok
> t/SeqIO/gcg.t ................................ ok
> t/SeqIO/genbank.t ............................ ok
> t/SeqIO/interpro.t ........................... ok
> t/SeqIO/kegg.t ............................... ok
> t/SeqIO/largefasta.t ......................... ok
> t/SeqIO/lasergene.t .......................... ok
> t/SeqIO/locuslink.t .......................... ok
> t/SeqIO/metafasta.t .......................... ok
> t/SeqIO/phd.t ................................ ok
> t/SeqIO/pir.t ................................ ok
> t/SeqIO/pln.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/qual.t ............................... ok
> t/SeqIO/raw.t ................................ ok
> t/SeqIO/scf.t ................................ ok
> t/SeqIO/strider.t ............................ ok
> t/SeqIO/swiss.t .............................. ok
> t/SeqIO/tab.t ................................ ok
> t/SeqIO/table.t .............................. ok
> t/SeqIO/tigr.t ............................... ok
> t/SeqIO/tigrxml.t ............................ ok
> t/SeqIO/tinyseq.t ............................ ok
> t/SeqIO/ztr.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqTools/Backtranslate.t ................... ok
> t/SeqTools/CodonTable.t ...................... ok
> t/SeqTools/ECnumber.t ........................ ok
> t/SeqTools/GuessSeqFormat.t .................. ok
> t/SeqTools/OddCodes.t ........................ ok
> t/SeqTools/SeqPattern.t ...................... ok
> t/SeqTools/SeqStats.t ........................ ok
> t/SeqTools/SeqUtils.t ........................ ok
> t/SeqTools/SeqWords.t ........................ ok
> t/Species.t .................................. ok
> t/Structure/IO.t ............................. ok
> t/Structure/Structure.t ...................... ok
> t/Symbol.t ................................... ok
> t/TaxonTree.t ................................ skipped: All tests are being skipped, probably because the module(s) being tested here are now deprecated
> t/Tools/Alignment/Consed.t ................... ok
> t/Tools/Analysis/DNA/ESEfinder.t ............. ok
> t/Tools/Analysis/Protein/Domcut.t ............ ok
> t/Tools/Analysis/Protein/ELM.t ............... ok
> t/Tools/Analysis/Protein/GOR4.t .............. ok
> t/Tools/Analysis/Protein/HNN.t ............... ok
> t/Tools/Analysis/Protein/Mitoprot.t .......... ok
> t/Tools/Analysis/Protein/NetPhos.t ........... ok
> t/Tools/Analysis/Protein/Scansite.t .......... ok
> t/Tools/Analysis/Protein/Sopma.t ............. ok
> t/Tools/EMBOSS/Palindrome.t .................. ok
> t/Tools/EUtilities/EUtilParameters.t ......... ok
> t/Tools/EUtilities/egquery.t ................. ok
> t/Tools/EUtilities/einfo.t ................... ok
> t/Tools/EUtilities/elink_acheck.t ............ ok
> t/Tools/EUtilities/elink_lcheck.t ............ ok
> t/Tools/EUtilities/elink_llinks.t ............ ok
> t/Tools/EUtilities/elink_ncheck.t ............ ok
> t/Tools/EUtilities/elink_neighbor.t .......... ok
> t/Tools/EUtilities/elink_neighbor_history.t .. ok
> t/Tools/EUtilities/elink_scores.t ............ ok
> t/Tools/EUtilities/epost.t ................... ok
> t/Tools/EUtilities/esearch.t ................. ok
> t/Tools/EUtilities/espell.t .................. ok
> t/Tools/EUtilities/esummary.t ................ ok
> t/Tools/Est2Genome.t ......................... ok
> t/Tools/FootPrinter.t ........................ ok
> t/Tools/GFF.t ................................ ok
> t/Tools/Geneid.t ............................. ok
> t/Tools/Genewise.t ........................... ok
> t/Tools/Genomewise.t ......................... ok
> t/Tools/Genpred.t ............................ ok
> t/Tools/Hmmer.t .............................. ok
> t/Tools/IUPAC.t .............................. ok
> t/Tools/Lucy.t ............................... ok
> t/Tools/Match.t .............................. ok
> t/Tools/Phylo/Gerp.t ......................... ok
> t/Tools/Phylo/Molphy.t ....................... ok
> t/Tools/Phylo/PAML.t ......................... ok
> t/Tools/Phylo/Phylip/ProtDist.t .............. ok
> t/Tools/Primer3.t ............................ ok
> t/Tools/Promoterwise.t ....................... ok
> t/Tools/Pseudowise.t ......................... ok
> t/Tools/QRNA.t ............................... ok
> t/Tools/RandDistFunctions.t .................. ok
> t/Tools/RepeatMasker.t ....................... ok
> t/Tools/Run/RemoteBlast.t .................... 13/16
> --------------------- WARNING ---------------------
> MSG: Server failed to return any data
> ---------------------------------------------------
> # Looks like you planned 16 tests but ran 13.
> t/Tools/Run/RemoteBlast.t .................... Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 3/16 subtests
> t/Tools/Run/RemoteBlast_rpsblast.t ........... ok
> t/Tools/Run/StandAloneBlast.t ................ ok
> t/Tools/Run/WrapperBase.t .................... ok
> t/Tools/Seg.t ................................ ok
> t/Tools/SiRNA.t .............................. ok
> t/Tools/Sigcleave.t .......................... ok
> t/Tools/Signalp.t ............................ ok
> t/Tools/Signalp/ExtendedSignalp.t ............ ok
> t/Tools/Sim4.t ............................... ok
> t/Tools/Spidey/Spidey.t ...................... ok
> t/Tools/TandemRepeatsFinder.t ................ ok
> t/Tools/TargetP.t ............................ ok
> t/Tools/Tmhmm.t .............................. ok
> t/Tools/ePCR.t ............................... ok
> t/Tools/pICalculator.t ....................... ok
> t/Tools/rnamotif.t ........................... skipped: All tests  
> are being skipped, probably because the module(s) being tested here  
> are now deprecated
> t/Tools/tRNAscanSE.t ......................... ok
> t/Tree/Compatible.t .......................... ok
> t/Tree/Node.t ................................ ok
> t/Tree/PhyloNetwork/Factory.t ................ ok
> t/Tree/PhyloNetwork/GraphViz.t ............... ok
> t/Tree/PhyloNetwork/MuVector.t ............... ok
> t/Tree/PhyloNetwork/PhyloNetwork.t ........... ok
> t/Tree/PhyloNetwork/RandomFactory.t .......... skipped: The optional  
> module Math::Random (or dependencies thereof) was not installed
> t/Tree/PhyloNetwork/TreeFactory.t ............ ok
> t/Tree/RandomTreeFactory.t ................... ok
> t/Tree/Tree.t ................................ ok
> t/Tree/TreeIO.t .............................. ok
> t/Tree/TreeIO/lintree.t ...................... ok
> t/Tree/TreeIO/newick.t ....................... ok
> t/Tree/TreeIO/nexus.t ........................ ok
> t/Tree/TreeIO/nhx.t .......................... ok
> t/Tree/TreeIO/phyloxml.t ..................... ok
> t/Tree/TreeIO/svggraph.t ..................... 1/4 Use of  
> uninitialized value $txt[0] in join or string at /usr/share/perl5/ 
> SVG/Element.pm line 1195, <GEN0> line 1.
> t/Tree/TreeIO/svggraph.t ..................... ok
> t/Tree/TreeIO/tabtree.t ...................... ok
> t/Tree/TreeStatistics.t ...................... ok
> t/Variation/AAChange.t ....................... ok
> t/Variation/AAReverseMutate.t ................ ok
> t/Variation/Allele.t ......................... ok
> t/Variation/DNAMutation.t .................... ok
> t/Variation/RNAChange.t ...................... ok
> t/Variation/SNP.t ............................ ok
> t/Variation/SeqDiff.t ........................ ok
> t/Variation/Variation_IO.t ................... ok
>
>
> Cheers,
>
> -- 
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Mon Sep 28 13:28:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:28:29 -0500
Subject: [Bioperl-l] Policy on tests
Message-ID: <00F31D5F-D531-4A5E-A11E-F7B67283FA8B@illinois.edu>

All,

This is a bit of a rant related to the spate of alphas I've had to
release over the last few weeks.  We have a fairly loose policy on
testing; for instance, most CPAN installations should not run network-
or DB-dependent tests or other developer-dependent tests by default
(POD formatting, for instance), and tests for a 'recommended' module
should be skipped.  That is currently in place.

However, I do think all tests that are skipped need to be reported  
somehow, and optional tests should NOT skip if they are off by default  
and are specifically requested.  This is not currently the behavior.   
So far I have been bitten twice by this.

The last instance was with the latest alpha, where OBDA-related tests  
were mistakenly skipped when BerkeleyDB wasn't installed.  As it turns  
out, BerkeleyDB isn't required, but (according to standard test  
harness output) t/LocalDB/Registry.t passed w/o reporting any problems  
when in reality it silently skipped over 90% of the tests (this is  
only seen with --verbose output).  In the past I have also run into  
network tests silently passing when the remote server was not in  
service anymore (IIRC this was with XEMBL modules, which are no longer  
in the distribution).

 From my point of view, speaking as both a user and developer, I need  
to know when these tests are skipped or fail.  In instances where I  
specifically request a set of tests to be run and a test fails, they  
*should* fail quite loudly and catastrophically (i.e. if there is a  
server-side issue, a problem with DB connection, etc).  They shouldn't
be skipped over if a problem arises; otherwise, if it is a legitimate
bug, it silently passes.  If it is something I haven't set up correctly (a
DB connection, for instance) I would like to know about it via the  
test failures.
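
A minimal sketch of the behaviour argued for above, in plain Perl.  The
helper name test_action and its two flags are hypothetical; they are not
part of BioPerl or Test::More.  The idea: a test set that is off by
default is skipped (and should be reported as skipped), but once it is
explicitly requested, a missing server or DB connection becomes a
failure rather than a silent skip.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): decide what an optional test
# set should do.
#   $requested - true if the user explicitly asked for these tests
#   $available - true if the resource they need (server, DB) is usable
sub test_action {
    my ($requested, $available) = @_;
    return 'skip' unless $requested;       # off by default: skip, but say so
    return $available ? 'run' : 'fail';    # requested: never skip silently
}

# Print the full decision table.
for my $requested (0, 1) {
    for my $available (0, 1) {
        printf "requested=%d available=%d -> %s\n",
            $requested, $available, test_action($requested, $available);
    }
}
```

In a real test script the 'skip' branch would probably map onto
Test::More's "plan skip_all => $reason", so the harness shows why nothing
ran, and the 'fail' branch onto a failing test or BAIL_OUT($reason).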

Am I the only one thinking along these lines?  Should we come up with  
a simple policy on how we're setting up and running tests?

chris

From paola.bisignano at gmail.com  Mon Sep 28 05:50:52 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Mon, 28 Sep 2009 11:50:52 +0200
Subject: [Bioperl-l] parsing msf file
Message-ID: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>

Hi dear friends,

I used Bio::AlignIO to parse the msf file and, as you suggested, the
method column_from_residue_number to obtain the alignment position of
the residues of interest (those in contact with my ligand).  Now I
have to check the residue itself: I want to extract the residue
type.  I query using the residue number from the PDB, and I want the
script to return the residue as well.  So if I want to know the
position of ala21, I will do:

my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);

and it will return the position (e.g. 5), but I want to check whether
position 5 of the alignment holds an A (for ala).  I looked in the
documentation, but I couldn't find anything for that.


Thank you all for the help you have given and will give me,

best regards,

paola

From maj at fortinbras.us  Mon Sep 28 21:25:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 21:25:33 -0400
Subject: [Bioperl-l] parsing msf file
In-Reply-To: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
References: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
Message-ID: <1C5008B41F6D4BFF9F5160633D284442@NewLife>

Hi Paola--
I think you're saying you want to see if A is present in other 
sequences in the alignment at alignment column 5. Here's
where you use location_from_column, which is a method 
of the sequence objects themselves. The idea is to do 

# $col is obtained as in your script...
for my $seq ($aln->each_seq) {
  # map the alignment column back to this sequence's own coordinates
  if ( $seq->subseq( $seq->location_from_column($col) ) eq 'A') {
     print "si!";
  }
  else {
     print "no!";
  }
}

You might find the code at 
http://www.bioperl.org/wiki/Site_entropy_in_an_alignment
helpful since it uses these principles. 
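
To make the two coordinate systems concrete, here is a small
self-contained sketch in plain Perl that works on gapped strings
directly.  It is not BioPerl's implementation; the helper names
column_of_residue and char_at_column and the toy alignment are
assumptions made up for illustration.

```perl
use strict;
use warnings;

# Map a residue number to its 1-based alignment column in a gapped string.
# $start is the residue number of the first non-gap character (for example
# 9 for an entry named "Sequence/9-273").  Returns undef if the residue
# number does not occur in the string.
sub column_of_residue {
    my ($gapped, $start, $resnum) = @_;
    my $current = $start - 1;
    for my $col (1 .. length($gapped)) {
        my $char = substr($gapped, $col - 1, 1);
        next if $char =~ /[-.]/;    # gap characters carry no residue
        $current++;
        return $col if $current == $resnum;
    }
    return;
}

# Character (residue or gap) found at a 1-based alignment column.
sub char_at_column {
    my ($gapped, $col) = @_;
    return substr($gapped, $col - 1, 1);
}

# Toy two-sequence alignment, both numbered from residue 1.
my %aln = (seq1 => 'MK-AVLT', seq2 => 'MKQA-LT');

# Column that holds residue 3 of seq1, then what each sequence has there.
my $col = column_of_residue($aln{seq1}, 1, 3);
for my $id (sort keys %aln) {
    my $char = char_at_column($aln{$id}, $col);
    print "$id column $col: $char ", ($char eq 'A' ? 'si!' : 'no!'), "\n";
}
```

With real BioPerl objects the same lookup should be possible as
substr($seq->seq, $col - 1, 1), since seq() on an aligned sequence
returns the gapped string; a '-' there means the sequence has a gap in
that column.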
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola.bisignano at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 28, 2009 5:50 AM
Subject: [Bioperl-l] parsing msf file


> Hi dear friends,
> 
> I used Bio::AlignIO to parse msf file, using method
> colum_from_residue_number, as you suggested to obtain the position in
> the alignment of  residues of interest (in contact with my ligand) and
> I have to do a check of the residue:
> I want to extract the type of the residue...I ask my question using
> the number of the residue in the PDB, and i want the script return
> also the residue so if I want to know the position af ala21, I  will
> do:
> 
> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
> 
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
> 
> my $col = $aln->column_from_residue_number( $s1->id, 21)
> 
> and It will return the position (es. 5) but I want to check if in
> position 5 of the alignment there is A (for ala)....I looked in
> documentation, but I couldn't find anything for that
> 
> 
> Thank you all for help you gave and will give to me,
> 
> best regards,
> 
> paola
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>

From martin.senger at gmail.com  Tue Sep 29 01:31:41 2009
From: martin.senger at gmail.com (Martin Senger)
Date: Tue, 29 Sep 2009 13:31:41 +0800
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>

> Martin has stopped working on Biblio as far as I know and php-hacking is
> not my favorite pastime.


That's true. I can still revive the code - but the question is (always has
been) where to host the server (of the web services providing the biblio
data). It was hosted, and maintained, at EBI. But I do not know if EBI is
still maintaining it, or willing to do so.

Cheers,
Martin

-- 
Martin Senger
email: martin.senger at gmail.com,m.senger at cgiar.org
skype: martinsenger

From jason at bioperl.org  Tue Sep 29 01:43:30 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 28 Sep 2009 22:43:30 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
References: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
Message-ID: <E9D67D22-ABC3-4199-B8D9-E0675197B9BF@bioperl.org>

hah! I actually meant the Biblio.php MediaWiki plugin by Martin Jambon  
-- but hey the Bio::Biblio db stuff should be discussed too.

-jason
On Sep 28, 2009, at 10:31 PM, Martin Senger wrote:

>> Martin has stopped working on Biblio as far as I know and php- 
>> hacking is
>> not my favorite pastime.
>
>
> That's true. I can still revive the code - but the question is  
> (always has
> been) where to host the server (of the web services providing the  
> biblio
> data). It was hosted, and maintained, at EBI. But I do not know if  
> EBI is
> still maintaining it, or willing to do so.
>
> Cheers,
> Martin
>
> -- 
> Martin Senger
> email: martin.senger at gmail.com,m.senger at cgiar.org
> skype: martinsenger

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep 29 14:01:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 13:01:29 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
Message-ID: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>

We are pleased to announce the availability of BioPerl 1.6.1, the  
latest release of BioPerl core code.  You can grab it here:

Via CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Via the BioPerl website:

http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip

The PPM for Windows should also finally be available this week,  
ActivePerl problems permitting (we will post more information when it  
becomes available).

Tons of bug fixes and changes have been incorporated into this  
release.  For a more complete change list please see the 'Changes'  
file included with the distribution.

A few highlights:

* FASTQ parsing and interconversion of the three FASTQ variants  
(Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
* Significant refactoring of Bio::Restriction methods
* Complete refactoring of Bio::Search-related tiling code, including  
HOWTO documentation
* GBrowse-related fixes
    - berkeleydb database now autoindexes wig files and locks correctly
    - add Pg, SQLite, and faster BerkeleyDB implementations
* Infernal 1.0 output is now parsed
* New SearchIO-based parser for gmap -f9 output
* BLAST XML parsing essentially complete
* Installation via CPANPLUS should now work
* For those using Strawberry Perl on Windows, the latest build is  
expected to pass all tests.
* 'raw' sequence format now parsed by line or optionally as a single  
sequence
* SCF parsing/writing now round-trips
* Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
* Bio::Tools::SeqPattern now has a backtranslate() method
* Bio::Tree::Statistics now has methods to calculate Fitch-based  
score, internal trait values, statratio(), sum of leaf distances  
[heikki]
* scripts
    - update to bp_seqfeature_load for SQLite [lstein]
    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
    - fastam9_to_table - fix for MPI output [jason]
    - gccalc - total stats [jason]
    - einfo  - simple script to find up-to-date NCBI database list,  
list field and link values for a specific database

We will shortly release updates for BioPerl-db, BioPerl-run, and  
BioPerl-network.  Enjoy!

chris

From rmb32 at cornell.edu  Tue Sep 29 14:22:03 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 29 Sep 2009 11:22:03 -0700
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC2504B.1000707@cornell.edu>

Chris Fields wrote:
 > We are pleased to announce the availability of BioPerl 1.6.1, the
 > latest release of BioPerl core code.

Hooray!  You rock Chris!  Tremendous thanks for your many hours of work 
to get it out the door!

Rob


From scott at scottcain.net  Tue Sep 29 14:23:08 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 29 Sep 2009 14:23:08 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <536f21b00909291123h12a7c941tdd3edb7fadbb1149@mail.gmail.com>

Chris,

Congratulations and thanks so much for the time and effort that went into this.

Scott


On Tue, Sep 29, 2009 at 2:01 PM, Chris Fields <cjfields at illinois.edu> wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest
> release of BioPerl core code.  You can grab it here:
>
> Via CPAN:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>
> Via the BioPerl website:
>
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>
> The PPM for Windows should also finally be available this week, ActivePerl
> problems permitting (we will post more information when it becomes
> available).
>
> Tons of bug fixes and changes have been incorporated into this release.  For
> a more complete change list please see the 'Changes' file included with the
> distribution.
>
> A few highlights:
>
> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger,
> Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including HOWTO
> documentation
> * GBrowse-related fixes
>   - berkeleydb database now autoindexes wig files and locks correctly
>   - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is expected
> to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based score,
> internal trait values, statratio(), sum of leaf distances [heikki]
> * scripts
>   - update to bp_seqfeature_load for SQLite [lstein]
>   - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>   - fastam9_to_table - fix for MPI output [jason]
>   - gccalc - total stats [jason]
>   - einfo  - simple script to find up-to-date NCBI database list, list field
> and link values for a specific database
>
> We will shortly release updates for BioPerl-db, BioPerl-run, and
> BioPerl-network.  Enjoy!
>
> chris


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From hlapp at gmx.net  Tue Sep 29 15:56:58 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 15:56:58 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>

Congrats from me too - awesome Chris, and thanks on behalf of the  
project!

	-hilmar

On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:

> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
>
> Via CPAN:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>
> Via the BioPerl website:
>
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when  
> it becomes available).
>
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
>
> A few highlights:
>
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>   - berkeleydb database now autoindexes wig files and locks correctly
>   - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>   - update to bp_seqfeature_load for SQLite [lstein]
>   - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>   - fastam9_to_table - fix for MPI output [jason]
>   - gccalc - total stats [jason]
>   - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
>
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep 29 16:38:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 15:38:04 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
Message-ID: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>

No prob.  Next up is db, run, and network!

chris

On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:

> Congrats from me too - awesome Chris, and thanks on behalf of the  
> project!
>
> 	-hilmar
>
> On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
>
>> We are pleased to announce the availability of BioPerl 1.6.1, the  
>> latest release of BioPerl core code.  You can grab it here:
>>
>> Via CPAN:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>>
>> Via the BioPerl website:
>>
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
>> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>>
>> The PPM for Windows should also finally be available this week,  
>> ActivePerl problems permitting (we will post more information when  
>> it becomes available).
>>
>> Tons of bug fixes and changes have been incorporated into this  
>> release.  For a more complete change list please see the 'Changes'  
>> file included with the distribution.
>>
>> A few highlights:
>>
>> * FASTQ parsing and interconversion of the three FASTQ variants  
>> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
>> * Significant refactoring of Bio::Restriction methods
>> * Complete refactoring of Bio::Search-related tiling code,  
>> including HOWTO documentation
>> * GBrowse-related fixes
>>  - berkeleydb database now autoindexes wig files and locks correctly
>>  - add Pg, SQLite, and faster BerkeleyDB implementations
>> * Infernal 1.0 output is now parsed
>> * New SearchIO-based parser for gmap -f9 output
>> * BLAST XML parsing essentially complete
>> * Installation via CPANPLUS should now work
>> * For those using Strawberry Perl on Windows, the latest build is  
>> expected to pass all tests.
>> * 'raw' sequence format now parsed by line or optionally as a  
>> single sequence
>> * SCF parsing/writing now round-trips
>> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
>> * Bio::Tools::SeqPattern now has a backtranslate() method
>> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
>> score, internal trait values, statratio(), sum of leaf distances  
>> [heikki]
>> * scripts
>>  - update to bp_seqfeature_load for SQLite [lstein]
>>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>>  - fastam9_to_table - fix for MPI output [jason]
>>  - gccalc - total stats [jason]
>>  - einfo  - simple script to find up-to-date NCBI database list,  
>> list field and link values for a specific database
>>
>> We will shortly release updates for BioPerl-db, BioPerl-run, and  
>> BioPerl-network.  Enjoy!
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>


From cjfields at illinois.edu  Tue Sep 29 17:11:33 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 16:11:33 -0500
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
Message-ID: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>

Right now all our subdistributions have a naming scheme like
BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the prior
method of naming doesn't quite match the syntax of other distributions:

Win32-Console
Win32-EventLog
MooseX-Aliases
etc etc

I'll go ahead and make these changes unless there is rabid dissent ;>

chris

From bix at sendu.me.uk  Tue Sep 29 15:06:17 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Sep 2009 20:06:17 +0100
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC25AA9.5080803@sendu.me.uk>

Chris Fields wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> release of BioPerl core code.  You can grab it here:

Great job Chris. *cheers*

From hlapp at gmx.net  Tue Sep 29 17:49:07 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 17:49:07 -0400
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
In-Reply-To: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
References: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
Message-ID: <6C5CBE0E-EDA5-4079-BFD7-DEE95E8C749C@gmx.net>

Fine with me :-)

	-hilmar

On Sep 29, 2009, at 5:11 PM, Chris Fields wrote:

> Right now all our subdistributions have a naming scheme like
> BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
> BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the
> prior method of naming doesn't quite match the syntax of other
> distributions:
>
> Win32-Console
> Win32-EventLog
> MooseX-Aliases
> etc etc
>
> I'll go ahead and make these changes unless there is rabid dissent ;>
>
> chris
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From maj at fortinbras.us  Tue Sep 29 18:33:23 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 29 Sep 2009 18:33:23 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <5D35D16E84554CA687C6CA4758806884@NewLife>

Gnarly, dude.
MAJ
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 29, 2009 2:01 PM
Subject: [Bioperl-l] BioPerl 1.6.1 released


> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
> 
> Via CPAN:
> 
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> 
> Via the BioPerl website:
> 
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> 
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when it  
> becomes available).
> 
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
> 
> A few highlights:
> 
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>    - berkeleydb database now autoindexes wig files and locks correctly
>    - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>    - update to bp_seqfeature_load for SQLite [lstein]
>    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>    - fastam9_to_table - fix for MPI output [jason]
>    - gccalc - total stats [jason]
>    - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
> 
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
> 
> chris

From cjfields at illinois.edu  Tue Sep 29 23:54:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 22:54:04 -0500
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <87hbunv764.fsf@topper.koldfront.dk>
References: <87hbunv764.fsf@topper.koldfront.dk>
Message-ID: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>

Adam,

Not sure, but this could be a case of 'both'.  Qualifier values that
are quoted and those that aren't are currently distinguished via a
global hash lookup (%FTQUAL_NO_QUOTE) due to the way the parser works;
there is some logic behind this, I just can't quite recall at the
moment why it is this way.  You could set a hash key for the label in
cases where it isn't quoted; that should work.  You can also test out
the Bio::SeqIO::embldriver version (-format => 'embldriver').

If the above doesn't work out it's worth filing a bug for this  
behavior, though I'm not sure how easy it will be to fix.
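
For discussion, one of the options Adam raises (quote automatically once
the value is long enough to wrap) could look roughly like the plain-Perl
sketch below.  This is not the code in Bio::SeqIO::embl; the function
name format_qualifier, the %NO_QUOTE subset, and the 59-character content
width are assumptions for illustration only.

```perl
use strict;
use warnings;

# Illustrative subset of qualifiers normally written without quotes
# (stands in for %FTQUAL_NO_QUOTE; not the real list).
my %NO_QUOTE = map { $_ => 1 } qw(label codon_start number);

# Hypothetical formatter: returns the FT lines for one qualifier.
# The value is quoted if its tag is normally quoted, or if the unquoted
# text would not fit on a single line and therefore has to be wrapped.
sub format_qualifier {
    my ($tag, $value, $width) = @_;
    $width ||= 59;    # assumed content width after the "FT" prefix
    my $text = "/$tag=$value";
    if (!$NO_QUOTE{$tag} || length($text) > $width) {
        $text = qq{/$tag="$value"};
    }
    my @lines;
    while (length $text) {
        # take up to $width characters off the front of $text
        push @lines, 'FT' . (' ' x 19) . substr($text, 0, $width, '');
    }
    return @lines;
}

print "$_\n" for format_qualifier('label', 'short');
print "$_\n"
    for format_qualifier('label',
        'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
```

The short label stays unquoted on one line; the long one comes out
quoted and wrapped, which is the form Adam reports the reader already
accepts.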

chris

On Sep 28, 2009, at 2:51 AM, Adam Sjøgren wrote:

>  Hi.
>
>
> I am wondering whether this is a buglet or just a case of "Don't do
> that":
>
> If I set a very long /label on a feature and output the sequence in  
> EMBL
> format, the qualifier value gets wrapped, but not quoted.
>
> When BioPerl reads such a file, an exception is thrown.
>
> I probably shouldn't be setting very long labels... But oughtn't  
> BioPerl
> throw an exception when a too long label is set, or automatically  
> quote
> the value when it is long enough to be wrapped, or know how to read a
> wrapped yet unquoted value?
>
> I will be happy to try and provide a patch for whichever solution is
> preferred.
>
> Here is an example script:
>
>  #!/usr/bin/perl
>
>  use strict;
>  use warnings;
>
>  use IO::String;
>
>  use Bio::Seq;
>  use Bio::SeqFeature::Generic;
>  use Bio::SeqIO;
>
>  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";
>
>  my $seq=Bio::Seq->new(-seq=>'ATG');
>  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
>  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
>  $seq->add_SeqFeature($feature);
>
>  my $out_string=out($seq);
>  print $out_string;
>
>  my $fh=IO::String->new($out_string);
>  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>  my $in_seq=$in->next_seq;
>
>  print "Done\n";
>
>  sub out {
>      my ($seq)=@_;
>
>      my $string='';
>      my $fh=IO::String->new($string);
>      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>      $out->write_seq($seq);
>
>      return $string;
>  }
>
> Which gives this output when run:
>
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  FT                   youthink
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>
>  ------------- EXCEPTION: Bio::Root::Exception -------------
>  MSG: Can't see new qualifier in: youthink
>  from:
>  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  youthink
>
>  STACK: Error::throw
>  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
>  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
>  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
>  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
>  -----------------------------------------------------------
>
> If I change the value to include "-quotes ("simulating" that embl.pm
> quotes the value), BioPerl can read the EMBL string it produces fine:
>
>  -----------------------------------------------------------
>  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
>  FT                   ntyouthink""
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                       3
>  //
>  Done
>
>
>  Best regards,
>
>     Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From adsj at novozymes.com  Wed Sep 30 05:50:36 2009
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Wed, 30 Sep 2009 11:50:36 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu> (Chris
	Fields's message of "Tue, 29 Sep 2009 22:54:04 -0500")
References: <87hbunv764.fsf@topper.koldfront.dk>
	<86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>
Message-ID: <87vdj0g3rn.fsf@topper.koldfront.dk>

On Tue, 29 Sep 2009 22:54:04 -0500, Chris wrote:

> Not sure, but this could be a case of 'both'. Labels that are quoted
> and aren't are currently distinguished via a global hash lookup
> (%FTQUAL_NO_QUOTE) due to the way the parser works; there is some
> logic behind this, just can't quite recall at the moment why it is
> this way.

Yes, I saw that there is a number of qualifiers that aren't quoted
automatically.

The very easy "fix" for me would be to simply remove "label" from
%FTQUAL_NO_QUOTE, but I'm not really sure what the reason for not
quoting all values is, so I was hesitant to just propose that.

> You could set a hash key for the label in cases where it isn't quoted,
> that should work. You can also test out the Bio::SeqIO::embldriver
> version (-format => 'embldriver').

Ah, embldriver reads the wrapped qualifier when it isn't quoted without
problem. Nice! I hadn't noticed embldriver.

I wonder which one is correct in this case?
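
For what it's worth, the difference seems to come down to how a line without
a leading "/" is treated inside a feature. A minimal sketch in plain Perl
(this is not the embl.pm or embldriver code; join_qualifier_lines is a
hypothetical helper, and it assumes the "FT" prefix and the indentation have
already been stripped) that treats such a line as a continuation of the
previous qualifier value:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: takes the qualifier lines of
# one feature (prefix and indentation already removed) and returns a list
# of [name, value] pairs.
sub join_qualifier_lines {
    my @lines = @_;
    my @quals;
    for my $line (@lines) {
        if ($line =~ m{^/(\w+)=(.*)$}) {
            # A line starting with "/name=" opens a new qualifier.
            push @quals, [$1, $2];
        }
        elsif (@quals) {
            # Anything else continues the previous value, quoted or not.
            $quals[-1][1] .= $line;
        }
    }
    # Strip one pair of surrounding double quotes, if present.
    for my $q (@quals) {
        $q->[1] =~ s/^"(.*)"$/$1/;
    }
    return @quals;
}

my @quals = join_qualifier_lines(
    '/label=averylonglabelthisisindeedbutitoughttoworkanywaydont',
    'youthink',
);
print "$_->[0] = $_->[1]\n" for @quals;
```

Values are joined without a separator here, which suits single-token values
like /label; free-text qualifiers would need a space put back between lines.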

And should I switch to using embldriver to read, or does it make sense
to try and concoct a patch that changes embl?


  Thanks for the feedback!

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From sidd.basu at gmail.com  Wed Sep 30 13:24:53 2009
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 30 Sep 2009 12:24:53 -0500
Subject: [Bioperl-l]  Re: BioPerl 1.6.1 released
In-Reply-To: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
	<5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
Message-ID: <4ac39469.0637560a.5a63.1fee@mx.google.com>

Congrats chris,  really appreciate your time and effort.

-siddhartha

On Tue, 29 Sep 2009, Chris Fields wrote:

> No prob.  Next up is db, run, and network!
>
> chris
>
> On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:
>
> > Congrats from me too - awesome Chris, and thanks on behalf of the project!
> >
> > 	-hilmar
> >
> > On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
> >
> >> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> >> release of BioPerl core code.  You can grab it here:
> >>
> >> Via CPAN:
> >>
> >> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> >>
> >> Via the BioPerl website:
> >>
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> >> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> >>
> >> The PPM for Windows should also finally be available this week, 
> >> ActivePerl problems permitting (we will post more information when it 
> >> becomes available).
> >>
> >> Tons of bug fixes and changes have been incorporated into this release.  
> >> For a more complete change list please see the 'Changes' file included 
> >> with the distribution.
> >>
> >> A few highlights:
> >>
> >> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger, 
> >> Illumina, Solexa) now works (a concerted OBF effort!)
> >> * Significant refactoring of Bio::Restriction methods
> >> * Complete refactoring of Bio::Search-related tiling code, including 
> >> HOWTO documentation
> >> * GBrowse-related fixes
> >>  - berkeleydb database now autoindexes wig files and locks correctly
> >>  - add Pg, SQLite, and faster BerkeleyDB implementations
> >> * Infernal 1.0 output is now parsed
> >> * New SearchIO-based parser for gmap -f9 output
> >> * BLAST XML parsing essentially complete
> >> * Installation via CPANPLUS should now work
> >> * For those using Strawberry Perl on Windows, the latest build is 
> >> expected to pass all tests.
> >> * 'raw' sequence format now parsed by line or optionally as a single 
> >> sequence
> >> * SCF parsing/writing now round-trips
> >> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> >> * Bio::Tools::SeqPattern now has a backtranslate() method
> >> * Bio::Tree::Statistics now has methods to calculate Fitch-based score, 
> >> internal trait values, statratio(), sum of leaf distances [heikki]
> >> * scripts
> >>  - update to bp_seqfeature_load for SQLite [lstein]
> >>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
> >>  - fastam9_to_table - fix for MPI output [jason]
> >>  - gccalc - total stats [jason]
> >>  - einfo  - simple script to find up-to-date NCBI database list, list 
> >> field and link values for a specific database
> >>
> >> We will shortly release updates for BioPerl-db, BioPerl-run, and 
> >> BioPerl-network.  Enjoy!
> >>
> >> chris
> >>
> >
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
> >
> >

From antonina.iagovitina at epfl.ch  Wed Sep 30 14:09:17 2009
From: antonina.iagovitina at epfl.ch (Antonina Iagovitina)
Date: Wed, 30 Sep 2009 20:09:17 +0200
Subject: [Bioperl-l] assistance with bioperl
Message-ID: <4AC39ECD.6060405@epfl.ch>

Here is the error message I get when I try to align a sequence to an existing
alignment. Please help
I am using Windows XP and Clustalw version1.83

 MSG:
 ERROR: Could not open sequence file (-profile) 
 No. of seqs. read = -1. No alignment!
 
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::Tools::Run::Alignment::Clustalw;

my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
$aln = $str->next_aln();
$str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
$seq = $str1->next_seq();
$aln = $factory->profile_align($aln,$seq);
end


From maj at fortinbras.us  Wed Sep 30 14:24:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 30 Sep 2009 14:24:59 -0400
Subject: [Bioperl-l] assistance with bioperl
In-Reply-To: <4AC39ECD.6060405@epfl.ch>
References: <4AC39ECD.6060405@epfl.ch>
Message-ID: <569E83EDBFE044638187504E5E7A8C11@NewLife>

Antonina--
Try the following:
Make sure that cysprot1a.msf and cysprot1b.fa are in the current directory, 
or use full path names for the files. 
MAJ
----- Original Message ----- 
From: "Antonina Iagovitina" <antonina.iagovitina at epfl.ch>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 30, 2009 2:09 PM
Subject: [Bioperl-l] assistance with bioperl


> Here is the error message I get when I try to align a sequence to an existing
> alignment. Please help
> I am using Windows XP and Clustalw version1.83
> 
> MSG:
> ERROR: Could not open sequence file (-profile) 
> No. of seqs. read = -1. No alignment!
> 
> use Bio::AlignIO;
> use Bio::SeqIO;
> use Bio::Seq;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
> $aln = $str->next_aln();
> $str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
> $seq = $str1->next_seq();
> $aln = $factory->profile_align($aln,$seq);
> end
> 

From me at miguel.weapps.com  Wed Sep 30 18:16:38 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 17:16:38 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}

And I would like to remove the symbols (is not compatible with MrBayes  
v3.1.2: "Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From jason at bioperl.org  Wed Sep 30 18:40:33 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 30 Sep 2009 15:40:33 -0700
Subject: [Bioperl-l] Nexus symbols
In-Reply-To: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
References: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
Message-ID: <483DB389-9332-4573-84C7-3AF09AC2BACA@bioperl.org>

-show_symbols => 0

If you use the bp_sreformat.pl script, specify --special="mrbayes"; it will
set both the endblock and show_symbols values to 0.


perldoc Bio::AlignIO::nexus

        new

         Title   : new
         Usage   : $alignio = Bio::AlignIO->new(-format => 'nexus',
                                                -file   => 'filename');
         Function: returns a new Bio::AlignIO object to handle clustalw files
         Returns : Bio::AlignIO::clustalw object
         Args    : -verbose => verbosity setting (-1,0,1,2)
                   -file    => name of file to read in or with ">" - writeout
                   -fh      => alternative to -file param - provide a filehandle
                               to read from/write to
                   -format  => type of Alignment Format to process or produce

                   Customization of nexus flavor output

                   -show_symbols => print the symbols="ATGC" in the data definition
                                    (MrBayes does not like this)
                                    boolean [default is 1]
                   -show_endblock => print an 'endblock;' at the end of the data
                                    (MrBayes does not like this)
                                    boolean [default is 1]

On Sep 30, 2009, at 3:16 PM, Luis M Rodriguez-R wrote:

> Dear all,
>
> Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
> from the "format" line in the Nexus output of Bio::AlignIO?
>
> My code (snippet) is:
>
> my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
> my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
> while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}
>
> And I would like to remove the symbols (is not compatible with  
> MrBayes v3.1.2: "Could not find parameter "symbols"").
>
> Also, it would be nice to be able to change the TITLE comment.
>
> Thanks all!
> Regards,
>
> Luis M. Rodriguez-R
> [http://bioinf.uniandes.edu.co/~miguel/]
> ---------------------------------
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Colombia
> [http://bioinf.uniandes.edu.co]
>
> + 57 1 3394949 ext 2619
> luisrodr at uniandes.edu.co
> me at miguel.weapps.com
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From me at miguel.weapps.com  Wed Sep 30 16:51:04 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 15:51:04 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <788222E4-FCCC-4D4D-880B-1F5156945DB8@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}

And I would like to remove the symbols (is not compatible with MrBayes  
v3.1.2: "Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From paola_bisignano at yahoo.it  Tue Sep  1 08:20:25 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 1 Sep 2009 12:20:25 +0000 (GMT)
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
Message-ID: <154614.75143.qm@web25706.mail.ukl.yahoo.com>

Hi, 

I'm trying to parse fasta files that contain a couple of alignments. I need to identify my residues in the alignment. I have separate lists derived from parsing ligplot files, so I have to manipulate strings, but I don't know how to start; it seems complicated.
I used Bio::AlignIO to parse the fasta file, so I can have a parsed file in msf or clustalW format.

here an example:
CLUSTAL W(1.81) multiple sequence alignment


Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
                       *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:


Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
                       ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*

Say I choose two residues, for example: how can I extract them, starting from their position in the pdb file?
I need to walk to them in my sequence.

I don't know if this is clear, because I cannot explain the question correctly in English. Are there any Italians?
Could anyone help me?


From scott at scottcain.net  Tue Sep  1 09:21:25 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 1 Sep 2009 09:21:25 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
Message-ID: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>

Hello all,

I just wanted to send out a general announcement about a change that  
is coming for perl modules that are distributed with the gmod/chado  
package.  There are some modules, notably Class::DBI classes that are  
automatically generated, that are currently in the Chado namespace.   
This move has been requested by the CPAN maintainers.  So any  
Chado::*  modules will become Bio::Chado::*, except for the Class::DBI  
classes, which will become Bio::Chado::CDBI::*.

This will probably affect relatively few users, though ModWare in its  
current incarnation will need to be updated.

Scott

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From biopython at maubp.freeserve.co.uk  Tue Sep  1 11:33:13 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Sep 2009 16:33:13 +0100
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
Message-ID: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>> The two conversions to solexa are still failing.  I'm not sure but I think
>> it's something fairly simple, but I can't work on it until Friday (got too
>> many other things on my plate ATM).  If I get stumped I'll post a message.
>
> ...
>
> This should narrow it down - the bug is in mapping PHRED
> scores (from either Sanger or Illumina 1.3+ files) to the
> Solexa encoding.
>
> Peter

Hi Chris,

I've just noticed BioPerl is treating invalid characters in the quality
string as a warning condition (not an error):
http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html

It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
(character "!" or "@" respectively) which is reasonable. For fastq-solexa
to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
used - a bug?
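
For reference, the mapping in question can be sketched in a few lines of
plain Perl. This is not the Bio::SeqIO::fastq code; the function names are
made up for illustration, and it assumes the usual definitions:
Q_solexa = 10*log10(10**(Q_phred/10) - 1), a Solexa floor of -5, and an
ASCII offset of 64 for the Solexa encoding.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: convert a PHRED score to a (fractional) Solexa score.
sub phred_to_solexa {
    my ($phred) = @_;
    # PHRED 0 has no finite Solexa equivalent, so it is pinned to the floor.
    return -5 if $phred <= 0;
    my $solexa = 10 * log(10 ** ($phred / 10) - 1) / log(10);
    return $solexa < -5 ? -5 : $solexa;
}

# Hypothetical helper: the fastq-solexa character for a PHRED score.
sub solexa_char {
    my ($phred) = @_;
    # Round to the nearest integer, then apply the ASCII offset of 64.
    return chr(floor(phred_to_solexa($phred) + 0.5) + 64);
}

for my $phred (0, 1, 10, 40) {
    printf "PHRED %2d -> Solexa %6.2f -> '%s'\n",
        $phred, phred_to_solexa($phred), solexa_char($phred);
}
```

Under these assumptions PHRED 0 ends up as Solexa -5, i.e. ASCII 59 (";"),
which is the value I would have expected for the invalid characters.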

Also, in all these cases there is currently a spurious "data loss" warning:

$ ./bioperl_sanger2sanger.pl < error_qual_null.fastq

--------------------- WARNING ---------------------
MSG: Unknown symbol with ASCII value 0 outside of quality range,
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Data loss for sanger: following values exceed max 93

---------------------------------------------------
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYY!YYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_397_389
GGTTTGAGAAAGAGAAATGAGATAA
+
YYYYYYYYYWYYYYWWYYYWYWYWW
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYYYYYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_362_549
GGAAACAAAGTTTTTCTCAACATAG
+
YYYYYYYYYYYYYYYYYYWWWWYWY
@SLXA-B3_649_FC8437_R1_1_1_183_714
GTATTATTTAATGGCATACACTCAA
+
YYYYYYYYYYWYYYYWYWWUWWWQQ

Regards,

Peter


From jason at bioperl.org  Tue Sep  1 11:49:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 1 Sep 2009 08:49:00 -0700
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
In-Reply-To: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
References: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
Message-ID: <90DACEE3-BC71-4D82-A8FF-6441A720BC76@bioperl.org>

I think you might want to use the column_from_residue_number method  
that is part of Bio::SimpleAlign - it lets you get the column from an  
alignment based on the sequence residue, doing some math along the way  
to deal with gaps. That is the residue -> alignment direction.  If you  
are starting at the alignment and want to get the residue's position  
you will use the location_from_column on a particular sequence so

     # select somehow a sequence from the alignment, e.g.
     my $seq = $aln->get_seq_by_pos(1);
     #$loc is undef or Bio::LocationI object
     my $loc = $seq->location_from_column(5);
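
To show the kind of gap arithmetic involved in the residue -> alignment
direction, here is a minimal sketch in plain Perl. It is not how
Bio::SimpleAlign implements column_from_residue_number; residue_to_column is
a hypothetical helper that assumes "-" and "." mark gaps and that the start
residue of the row is taken from its name (6 for 2pl0:A/6-268).

```perl
use strict;
use warnings;

# Hypothetical helper: map a residue number to a 1-based alignment column.
# $gapped is one row of the alignment, $start the number of its first residue.
sub residue_to_column {
    my ($gapped, $start, $resnum) = @_;
    my $current = $start - 1;   # number of the last residue seen so far
    my $col     = 0;            # current alignment column
    for my $char (split //, $gapped) {
        $col++;
        next if $char eq '-' || $char eq '.';   # gaps use a column, not a residue
        $current++;
        return $col if $current == $resnum;
    }
    return;   # residue is not part of this row
}

my $row = 'DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ';
my $col = residue_to_column($row, 6, 39);
print "residue 39 is in column $col: ", substr($row, $col - 1, 1), "\n";
```

The column found this way can then be used to pick the aligned residue out
of the other row of the alignment with the same substr call.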

-jason

On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:

> Hi,
>
> I'm trying to parse fasta files, where I have couple of  
> alignments....I need to identify my residue in my alignment......I  
> have separate lists that derived from ligplot parsing files.. so I  
> have to manipulate string...but I don't now how to start..it seems  
> complicated..
> I used Bio::AlignIO to parse the fasta file, so I can have a parsed  
> file in msf or clustalW forma
>
> here an example:
> CLUSTAL W(1.81) multiple sequence alignment
>
>
> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>
>
> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>
> I  choose two residue for example...how can I extract  
> them...starting from their position in the pdb file?
> I need to walk...to my sequence
>
> I don't know if it is clear because I cannot explain the question  
> correctly in english...are there any Italians?
> could anyone help me?
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  1 12:05:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 11:05:14 -0500
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
	<320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
Message-ID: <FB130819-94C6-419F-AD3D-BAEEDDE77737@illinois.edu>


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but  
>>> I think
>>> it's something fairly simple, but I can't work on it until Friday  
>>> (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a  
>>> message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the  
> quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq- 
> solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does  
> not get
> used - a bug?
>
> Also, in all these cases there is currently a spurious "data loss"  
> warning:
>
> $ ./bioperl_sanger2sanger.pl < error_qual_null.fastq
>
> --------------------- WARNING ---------------------
> MSG: Unknown symbol with ASCII value 0 outside of quality range,
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: Data loss for sanger: following values exceed max 93
>
> ---------------------------------------------------
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYY!YYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_397_389
> GGTTTGAGAAAGAGAAATGAGATAA
> +
> YYYYYYYYYWYYYYWWYYYWYWYWW
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYYYYYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_362_549
> GGAAACAAAGTTTTTCTCAACATAG
> +
> YYYYYYYYYYYYYYYYYYWWWWYWY
> @SLXA-B3_649_FC8437_R1_1_1_183_714
> GTATTATTTAATGGCATACACTCAA
> +
> YYYYYYYYYYWYYYYWYWWUWWWQQ
>
> Regards,
>
> Peter

Right, per off-list discussion this can be changed (I would rather it  
die there anyway).

chris


From marcelo011982 at gmail.com  Tue Sep  1 13:33:51 2009
From: marcelo011982 at gmail.com (Marcelo Iwata)
Date: Tue, 1 Sep 2009 14:33:51 -0300
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
Message-ID: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>

Hi

I ran blastn with these arguments:

../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001 -o Out2Blast.txt -a 8

and I want a script that removes overlapping sequences from the results.
For example, if unigene A has hit->start and hit->end at 1 and 4, and
B is at 2 and 3, respectively, the script should remove the second one.

I want to know whether such a script already exists and, if not, whether
there is a library that deals with this problem.

I know that Bio::DB::GFF has overlapping_features. But if something exists
that works directly with the blast format, that would be better for me.

thanks in advance


From cjfields at illinois.edu  Tue Sep  1 14:10:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 13:10:30 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
Message-ID: <7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>

Marcelo,

Do you mean tiling?  See:

http://www.bioperl.org/wiki/HOWTO:Tiling
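
If the goal is only to drop hits that lie entirely inside another hit (as in
the 1-4 versus 2-3 example), rather than full tiling, that can be sketched in
plain Perl. The hash keys and the remove_contained name are assumptions for
illustration only; with Bio::SearchIO the coordinates would come from the
start and end of each hit or HSP.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only hits that are not contained in another hit.
# Each hit is a hash reference with name, start and end keys.
sub remove_contained {
    # Sort by start ascending; for equal starts, longest hit first.
    my @hits = sort {
        $a->{start} <=> $b->{start} || $b->{end} <=> $a->{end}
    } @_;

    my @kept;
    my $max_end;   # largest end coordinate among the hits kept so far
    for my $hit (@hits) {
        # Every kept hit starts at or before this one, so this hit is
        # contained exactly when its end does not pass $max_end.
        next if defined $max_end && $hit->{end} <= $max_end;
        push @kept, $hit;
        $max_end = $hit->{end};
    }
    return @kept;
}

my @kept = remove_contained(
    { name => 'A', start => 1, end => 4 },
    { name => 'B', start => 2, end => 3 },
    { name => 'C', start => 3, end => 6 },
);
print join(', ', map { $_->{name} } @kept), "\n";
```

Hits that merely overlap (like A and C here) are both kept; deciding what to
do with those is where the tiling code comes in.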

chris

On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:

> Hi
>
> I've made a blastn with such arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001 -o Out2Blast.txt -a 8
>
> and i want a script that removes overlapped sequences from the  
> results..
> For example, if a unigene A has the hit->start  and hit-end as 1 and  
> 4, and
> the B is at 2 and 3, respectively, the script remove second one.
>
> I want to know if it already exist, and if not, is there a library  
> that
> works with such issue.
>
> I know that at Bio::DB::gff we have overlapping_features. But , if  
> something
> directly exist (works with blast format), is better for me.
>
> thanks in advance


From cain.cshl at gmail.com  Tue Sep  1 15:47:50 2009
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 1 Sep 2009 15:47:50 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
In-Reply-To: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
References: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
Message-ID: <0CA5287E-BE85-4E7F-8ED3-B453092FACB1@gmail.com>

Hi Don,

I just wanted to let you know that I also updated the code in  
GMODTools, but I don't have a simple way to test it; perhaps you  
should take a look at the cvs diff to make sure what I did makes sense.

Thanks,
Scott

On Sep 1, 2009, at 9:21 AM, Scott Cain wrote:

> Hello all,
>
> I just wanted to send out a general announcement about a change that  
> is coming for perl modules that are distributed with the gmod/chado  
> package.  There are some modules, notably Class::DBI classes that  
> are automatically generated, that are currently in the Chado  
> namespace.  This move has been requested by the CPAN maintainers.   
> So any Chado::*  modules will become Bio::Chado::*, except for the  
> Class::DBI classes, which will become Bio::Chado::CDBI::*.
>
> This will probably affect relatively few users, though ModWare in  
> its current incarnation will need to be updated.
>
> Scott
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From maj at fortinbras.us  Wed Sep  2 00:19:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 00:19:30 -0400
Subject: [Bioperl-l] bioperl invades emacs
Message-ID: <56DB0DEEB22645DE94DE0E912A889409@NewLife>

Hi All, 

As part of the Documentation Project, I've written a full-
fledged minor mode for emacs, bioperl-mode. It allows 
the user to access BP pod while coding, using keyboard
shortcuts or menus. Pod pops up in a new view buffer,
which is itself active for quick pod searching. You can 
get the whole pod, pieces of pod, or even the pod headers
of individual methods. 

The best feature (IMHO) is the completion facility. This
not only saves typing, but allows browsing and follow-your-nose
programming (exactly the technique I used to make bioperl-mode,
thanks to the Extensible Self-Documenting Editor).

It's very easy to install, requires only one additional line 
in your .emacs file, and directly infects perl-mode 
(if you so choose) so it's available whenever you
open .pl or .pm files.

For details, screenshots, download and install info,
and soporific design details, see
http://www.bioperl.org/wiki/Emacs_bioperl-mode

Send me the bugs!
cheers, 
MAJ


From rmb32 at cornell.edu  Wed Sep  2 00:31:15 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 01 Sep 2009 21:31:15 -0700
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <4A9DF513.1020607@cornell.edu>

Wow.  Bravo!

Rob


From cjfields at illinois.edu  Wed Sep  2 00:31:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 23:31:46 -0500
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <2A49147F-17B4-42EB-A170-52DA009D7E1C@illinois.edu>

Very cool!  Thanks Mark!

chris

On Sep 1, 2009, at 11:19 PM, Mark A. Jensen wrote:

> Hi All,
>
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which is itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
>
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
>
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so it's available whenever you
> open .pl or .pm files.
>
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>
> Send me the bugs!
> cheers,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From Russell.Smithies at agresearch.co.nz  Wed Sep  2 01:01:34 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 2 Sep 2009 17:01:34 +1200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>

emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell




From maj at fortinbras.us  Wed Sep  2 08:28:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:28:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <4A9E2638.8020203@pasteur.fr>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<4A9E2638.8020203@pasteur.fr>
Message-ID: <AC0A7CC6F808466CB15D267CC86AEEE3@NewLife>

Hi Emmanuel-- I'll look into this and report back- thanks!
MAJ
----- Original Message ----- 
From: "Emmanuel Quevillon" <tuco at pasteur.fr>
To: "Mark A. Jensen" <maj at fortinbras.us>
Sent: Wednesday, September 02, 2009 4:00 AM
Subject: Re: [Bioperl-l] bioperl invades emacs


> 
> Hi Mark,
> 
> Great, great job.
> But I am using XEmacs and no .emacs file is present in my home
> directory. So is there a trick to make your bioperl-mode work
> under XEmacs?
> 
> Thanks for your help
> 
> Regards
> 
> Emmanuel
> -- 
> -------------------------
> Emmanuel Quevillon
> Biological Software and Databases Group
> Institut Pasteur
> +33 1 44 38 95 98
> tuco at_ pasteur dot fr
> -------------------------
> 
>


From maj at fortinbras.us  Wed Sep  2 08:07:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:07:14 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
Message-ID: <B9B317F95CA44F0C9335450D3FDDEC73@NewLife>

I only know one command in vi --- :q
MAJ
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'BioPerl List'" 
<bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 1:01 AM
Subject: RE: [Bioperl-l] bioperl invades emacs


emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell




From hlapp at gmx.net  Wed Sep  2 11:51:18 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 2 Sep 2009 11:51:18 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <73A8B147-7605-4E2E-98AF-F3B09AD6046F@gmx.net>

Very nice!! -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Wed Sep  2 16:23:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 2 Sep 2009 15:23:01 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
	<7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>
	<1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
Message-ID: <E39D878B-A6F1-441A-A511-7CA0FF0D1319@illinois.edu>

Marcelo,

(Make sure to keep responses on the main list)

The new Tiling stuff is in bioperl-live (subversion code); it hasn't  
been released yet but should appear in BioPerl 1.6.1 (an alpha will be  
out this week).

chris

On Sep 2, 2009, at 3:20 PM, Marcelo Iwata wrote:

> Thanks, Chris.
> I went to CPAN search to download Bio::Search::Tiling, and it pointed  
> me to the BioPerl core distribution:
> BioPerl-1.6.0.tar.gz
> at http://search.cpan.org/~cjfields/BioPerl-1.6.0/Bio/Search/BlastStatistics.pm
>
> I downloaded it and upgraded my BioPerl version, but I still cannot find  
> MapTiling.pm.
>
> Could this be the result of some kind of error during the upgrade?
> Thanks.
>
>
> On Tue, Sep 1, 2009 at 3:10 PM, Chris Fields <cjfields at illinois.edu>  
> wrote:
> Marcelo,
>
> Do you mean tiling?  See:
>
> http://www.bioperl.org/wiki/HOWTO:Tiling
>
> chris
>
>
> On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:
>
> Hi
>
> I ran blastn with these arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt -e 0.00001 -o Out2Blast.txt -a 8
>
> and I want a script that removes overlapping sequences from the  
> results.
> For example, if unigene A has hit->start and hit->end of 1 and 4, and  
> B has 2 and 3, respectively, the script removes the second one.
>
> I want to know whether this already exists and, if not, whether there is  
> a library that handles this kind of problem.
>
> I know that Bio::DB::GFF has overlapping_features. But if something  
> exists that works directly with the BLAST format, that would be better  
> for me.
>
> thanks in advance
>
>
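
The containment rule in Marcelo's example (A at 1-4 keeps, B at 2-3 goes) can be sketched in plain shell while waiting for the tiling code. This is only a rough sketch, not the Bio::Search::Tiling approach: it assumes the search is rerun with tabular output (blastall -m 8), where column 2 is the subject id and columns 9 and 10 are the hit start and end. The function name drop_contained_hits and the input file name in the comment are made up for illustration.

```shell
# drop_contained_hits: hypothetical helper, not part of BioPerl or BLAST.
# Reads BLAST tabular (-m 8) lines on stdin and prints only the hits whose
# subject range is NOT fully contained in another hit on the same subject.
# Example call (file name is an assumption):
#   drop_contained_hits < Out2Blast.tab
drop_contained_hits() {
    # 1. Prefix each line with subject, normalized start, normalized end
    #    (minus-strand hits have start > end, so swap them).
    awk -F '\t' 'BEGIN { OFS = "\t" }
        { s = $9 + 0; e = $10 + 0
          if (s > e) { t = s; s = e; e = t }
          print $2, s, e, $0 }' |
    # 2. Sort by subject, then start ascending, then end descending, so a
    #    containing hit always comes before the hits it contains.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3nr |
    # 3. Keep a hit only if it extends past the furthest end seen so far
    #    on this subject; then strip the three helper columns again.
    awk -F '\t' '
        $1 != subject { subject = $1; max_end = -1 }
        $3 + 0 > max_end {
            max_end = $3 + 0
            line = $4
            for (i = 5; i <= NF; i++) line = line "\t" $i
            print line
        }'
}
```

Hits that merely overlap (for example 1-4 and 3-6) are both kept; only hits lying entirely inside an earlier one are dropped, and exact duplicates of a range are dropped too.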


From maj at fortinbras.us  Wed Sep  2 21:04:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 21:04:06 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>

Thanks everyone for your comments so far, on and off-list. 
(You're a terrific audience. I also code for weddings and 
bar mitzvahs. Tip your servers.)
The howto page now has a "Known Issues" section, and
I will be working to eliminate those in the next couple of 
days. 

cheers Mark


From jessica.sun at gmail.com  Tue Sep  1 11:25:36 2009
From: jessica.sun at gmail.com (jsun529)
Date: Tue, 1 Sep 2009 08:25:36 -0700 (PDT)
Subject: [Bioperl-l]  covert CDS coordinates with Gene coordinates
Message-ID: <25242395.post@talk.nabble.com>


Dear all,
  I would like to know how to convert between CDS coordinates and gene
coordinates using Bio::Coordinate::GeneMapper.
The documentation is not very clear, and a working example would help a lot
in using the objects returned from the BioPerl functions and getting the
values out in a readable format.

Thanks,

-- 
View this message in context: http://www.nabble.com/covert-CDS-coordinates-with-Gene-coordinates-tp25242395p25242395.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From pg4 at sanger.ac.uk  Wed Sep  2 19:35:07 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 00:35:07 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
Message-ID: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>


Hello Mark,

It sounds fantastic.

Unfortunately, I was unable to use it:

It does not find pod2text on my Mac OS X machine, and it fails to find my 
BioPerl paths on Linux (probably due to a bug in the PERL5LIB parsing, but 
I am a Lisp novice so I could be wrong).

==  macosX ==

On my MacBook (Mac OS X 10.5, Emacs 22.3) it does not find pod2text:
GNU Emacs 22.3.1 (i386-apple-darwin9.6.0, X toolkit)

   - I have installed your modules in my local-lisp directory and added the 
require, and now Emacs fails with the error:

   File error: Searching for program, invalid argument, pod2text

   - I have pod2text in /usr/bin, which is in my $PATH (I use the fink 
Emacs in non-window mode), but the same happens with Carbon Emacs.

==  debian etch with an old emacs 21 ==

GNU Emacs 21.4.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 
2007-06-19 on ninsei, modified by Debian

It loads OK, but when asking for the pods

[pod] Namespace: Bio::

it does not autocomplete from there, and if I put the cursor over a 'use 
Bio::xxx' and select [BP Docs] 'view methods' or 'view pod', it says 'no 
match':

# [pod mth] Namespace: Bio::PrimarySeq [No match]

Reading bioperl-mode.el and bioperl-init.el, I see that the variable 
that stores the path to BioPerl has no paths added apart from the 
current directory:

# C-h v bioperl-module-path [RET] => bioperl-module-path's value is "."


== bug when parsing perl5lib? ==

Please correct me if I am wrong, but the extraction of the BioPerl paths 
from PERL5LIB in bioperl-init.el is not working for me on Linux.

While debugging bioperl-init.el:
# (setq pth (getenv "PERL5LIB"))
#  "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
# (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
# nil

No file is found because it is looking for all the paths 
concatenated together with a '/Bio' at the end:

   libpath1:libpath2:libpath3/Bio

'concat' adds /Bio to pth, which is a single string holding all the 
PERL5LIB paths. Should PERL5LIB instead be split on ':' (Unix) or ';' 
(Windows), with /Bio appended to each piece and then tested for existence?

For example, on Unix:

--- code --
(defun addbio (bio_path)
   "Append /Bio to BIO_PATH."
   (concat bio_path "/" "Bio"))

(mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
-- end code ---

This would give a list of t and nil values, one for each BioPerl (and 
Ensembl) path:
(t t nil t t t t t t nil nil nil ...)

Regards, and thanks for the modules; they will be very useful.

    -Pablo

=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From maj at fortinbras.us  Wed Sep  2 22:34:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 22:34:59 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2669F98293CC4473ADAB8B80F93351FF@NewLife>

Thanks for all this work, Pablo. I am working hard on Emacs 21
backward compatibility. I will try some Mac-friendly paths
and look at the PERL5LIB issue.

The "No match" results seem to stem from a failure to
find the Bio tree; there is a workaround for this on
the wiki page as of right now. This will probably
not help with the Emacs 21 problems, but the next commit
(tomorrow) will likely solve those. I will post to this
thread when that happens.
cheers Mark


From maj at fortinbras.us  Thu Sep  3 00:21:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 00:21:14 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <203092FB050648AA9F256788068F0A16@NewLife>

Hi Pablo and all-
Try the latest revision (>=16081) with your debian/Emacs 21. Set
the variable bioperl-module-path to the directory above the
Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
again there. Tomorrow, MacOS
cheers,
Mark


From tuco at pasteur.fr  Thu Sep  3 05:56:45 2009
From: tuco at pasteur.fr (Emmanuel Quevillon)
Date: Thu, 03 Sep 2009 11:56:45 +0200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<5009BD4ADDC94A03866AC4D4813907EB@NewLife>
Message-ID: <4A9F92DD.2010701@pasteur.fr>

Mark A. Jensen wrote:
> Thanks everyone for your comments so far, on and off-list. (You're a
> terrific audience. I also code for weddings and bar mitzvahs. Tip your
> servers.)
> The howto page now has a "Known Issues" section, and
> I will be working to eliminate those in the next couple of days.
> cheers Mark

Hi Mark,

Thanks for your help. I decided to remove XEmacs :) and replace it
with Emacs. In fact, as I am running Ubuntu, it was a mess to work out
where to put the .el files etc. and how to make it work.
So I removed everything (a bit rude) and reinstalled emacs-22.

Here is what I did after that:

$ cd /usr/share/emacs
$ cd 22.2
$ cp BIOPERL-MODE/etc/* etc/
$ cd site-lisp (which is a symlink to /usr/share/emacs22/site-lisp)
$ sudo mkdir bioperl-mode
$ cp BIOPERL-MODE/site-lisp/* bioperl-mode
$ cd ~
$ touch .emacs
$ cat .xemacs/init.el (with require 'bioperl-mode) > .emacs
$ cat .xemacs/custom.el >> .emacs (The file with my other emacs
stuff, e.g. Template Toolkit mode)

And it is all done and working perfectly!!

Thanks for this great file Mark

Regards

Emmanuel

-- 
-------------------------
Emmanuel Quevillon
Biological Software and Databases Group
Institut Pasteur
+33 1 44 38 95 98
tuco at_ pasteur dot fr
-------------------------


From maj at fortinbras.us  Thu Sep  3 07:22:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 07:22:31 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2465B400494242AEAB5F578BD6BB5301@NewLife>

I get it now-- you're right. I'll take care of that-
cheers
MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works OK in Linux 
> Emacs 21.4 with the latest revision.
>
> About the PERL5LIB issue, sorry for not reporting the platform: the report 
> was on Linux, not on Mac OS X. In the wiki you have a comment about the Mac 
> OS X separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multi-path 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki:
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. My message was probably misleading. I wrongly appended '/Bio' 
> to the path itself instead of to a temporary variable for testing with 
> file-exists-p, and that probably gave you the impression that the point was 
> to have /Bio added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> Wouldn't this leave pth with a *string* like "lib/path1:lib/path2:lib/path3" 
> on Linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' to the end of the whole string 
> 'lib/path1:lib/path2:lib/path3', and that path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, after first splitting the pth string on ':' (Linux/OS X) or the 
> equivalent on Windows?
>
> Sorry for not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpaht1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to the pth that is a string with all the PERL5LIB paths. 
>>> Should this concat rather be applied to the splited perl5lib by ':' in unix 
>>> or ';' in windows and then tested for the existence of files?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "apend /Bio to each path"
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in the list of T and F bioperl (and ensembl) paths
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>>
>>> =====================================================================
>>>                      Pablo Marin-Garcia, PhD
>>>
>>>                     \\//          (Argiope bruennichi
>>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>>                     //\\
>>>
>>> Sanger Institute                |  PostDoc / Computer Biologist
>>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>>> Hinxton, Cambridge CB10 1HH     |  room : N333
>>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>>> ====================================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>>> a charity registered in England with number 1021457 and a company registered 
>>> in England with number 2742969, whose registered office is 215 Euston Road, 
>>> London, NW1 2BE. _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>
>
> =====================================================================
>                      Pablo Marin-Garcia, PhD
>
>                     \\//          (Argiope bruennichi
>                \/\/`(||>O:'\/\/   with stabilimentum)
>                     //\\
>
> Sanger Institute                |  PostDoc / Computer Biologist
> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
> Hinxton, Cambridge CB10 1HH     |  room : N333
> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
> ====================================================================
>
>
>
>
>
>
>
>
>
>
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a 
> charity registered in England with number 1021457 and a company registered in 
> England with number 2742969, whose registered office is 215 Euston Road, 
> London, NW1 2BE.
> 


From maj at fortinbras.us  Thu Sep  3 08:34:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 08:34:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <736B3399B3754D4C9B1BB66414160D95@NewLife>

Hi All, 

The following bioperl-mode issues are resolved in r16020:

- compatibility with Emacs 21
- correct parsing of PERL5LIB
- Bio module search now includes PATH components 
  (after the PERL5LIB search)
- an informative error is now raised if completion is attempted
  without a valid bioperl-module-path

Thanks for your patience and your bug reports-
cheers
MAJ

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 12:19 AM
Subject: [Bioperl-l] bioperl invades emacs


> Hi All, 
> 
> As part of the Documentation Project, I've written a full-fledged
> minor mode for emacs, bioperl-mode. It allows 
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which is itself active for quick pod searching. You can 
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods. 
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line 
> in your .emacs file, and directly infects perl-mode 
> (if you so choose) so it's available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers, 
> MAJ


From neetisomaiya at gmail.com  Fri Sep  4 02:49:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 12:19:58 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>

Hi,

I have an input list of gene names (can get gene ids from a local db
if required).
I need to fetch sequences of these genes. Can someone please guide me
as to how this can be done using perl/bioperl?

Any help will be deeply appreciated.

Thanks.

-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Fri Sep  4 05:17:17 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 14:47:17 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
Message-ID: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>

Thanks for the link.
So I need only the following lines of code to get the sequence?

use Bio::DB::GenBank;
$db_obj = Bio::DB::GenBank->new;
$seq_obj = $db_obj->get_Seq_by_id(2);

How do I print the sequence?
$seq_obj->seq ??
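
A minimal sketch of the fetch-and-print step, assuming BioPerl is installed and NCBI is reachable. seq() returns the residues as a plain string, so printing it is enough. The helper name fetch_and_print is made up for this illustration, and the identifier is whatever GI or accession is passed on the command line.

```perl
use strict;
use warnings;

# Fetch one record from GenBank and print it as plain FASTA-style text.
# fetch_and_print is a hypothetical helper name used only for this sketch.
sub fetch_and_print {
    my ($id) = @_;
    require Bio::DB::GenBank;    # loaded on demand; needs BioPerl and network access
    my $db = Bio::DB::GenBank->new;
    # get_Seq_by_id throws if nothing is found, so trap the exception.
    my $seq = eval { $db->get_Seq_by_id($id) };
    unless ($seq) {
        warn "could not fetch '$id': $@";
        return;
    }
    # display_id() is the record name; seq() is the sequence as a string.
    print '>', $seq->display_id, "\n", $seq->seq, "\n";
    return $seq;
}

# Usage: perl fetch_seq.pl <GI-or-accession> [...]
fetch_and_print($_) for @ARGV;
```

Run without arguments the script does nothing, so it can be syntax-checked on a machine without BioPerl.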

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
>
> Retrieving a sequence from a database : BioPerl HOWTO
> http://bit.ly/RWIot
>
> Trust this helps,
> Khader Shameer
> NCBS - TIFR


From neetisomaiya at gmail.com  Fri Sep  4 06:13:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 15:43:58 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
Message-ID: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>

Thanks for the replies.

So fetching the sequence by accession/GI worked for me. Now, can anyone
tell me the easiest way to get the GI/accession of a gene from its gene
id or gene name?

-Neeti
Even my blood says, B positive




From e.osimo at gmail.com  Fri Sep  4 08:05:48 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 4 Sep 2009 14:05:48 +0200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com> 
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com> 
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
Message-ID: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>

Try this:
http://david.abcc.ncifcrf.gov/conversion.jsp

Emanuele




From neetisomaiya at gmail.com  Fri Sep  4 08:21:19 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 17:51:19 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>

Thanks. It's an interesting tool.

But I want to do this programmatically.

I have gene ids to start with. I can't find a method that gets the
sequence directly with a gene id as input, so I am using the method that
gets the sequence with an accession as input, for which I first need to
know the accessions for my gene ids. Is this the right approach? My main
aim is to get the nucleotide sequence of a gene from its Entrez gene
id/gene name. Please guide me. I am confused.

-Neeti
Even my blood says, B positive




From paola_bisignano at yahoo.it  Fri Sep  4 08:32:02 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Fri, 4 Sep 2009 12:32:02 +0000 (GMT)
Subject: [Bioperl-l] problem parsing msf....:second part...I cannot solve
	sorry sorry
Message-ID: <330845.85818.qm@web25704.mail.ukl.yahoo.com>

I have a problem parsing an msf file: I can't find the exact method of
Bio::SimpleAlign for my case.

I have to identify residues (from a list) in aligned sequences. I parse
the alignment from a fasta file and save it as an msf file, in which I
have to identify each residue from my list (numbered as in the pdb file)
and the residue aligned with it in the other sequence.

This is a piece of the file:

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//

                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL

                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL

                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA

                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 (which is in Sequence); counting manually, it is
Ile 5.

Tyr 4 has no residue aligned with it, because the alignment starts from
N23 of Sequence.

How can I enter a residue of my sequence and extract the corresponding
residue from the other?
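
One way to do the lookup is to walk the two gapped strings column by column, counting only non-gap characters in each. The sketch below uses plain Perl on the first block of the alignment above; aligned_residue is a hypothetical helper, not a BioPerl method, and the start numbers 23 and 1 are taken from the Sequence/23-178 and 2zhz:A/1-148 names. With that numbering the Val in the fifth column is residue 27, not 28, so the pdb numbering may be offset by one from the range in the file; that offset is not handled here. Bio::SimpleAlign also has column_from_residue_number and Bio::LocatableSeq has location_from_column, which may be a better fit once the alignment is loaded with Bio::AlignIO.

```perl
use strict;
use warnings;

# Map a residue number in one aligned sequence to the residue that shares
# its column in the other sequence. Hypothetical helper, not part of BioPerl.
#   $query, $target    : gapped strings of equal length ('-' or '.' is a gap)
#   $q_start, $t_start : residue number of the first non-gap character
# Returns (query residue, target residue, target number); the last two are
# undef when the query residue faces a gap. Returns an empty list when the
# residue number is outside the aligned range.
sub aligned_residue {
    my ($query, $q_start, $target, $t_start, $resnum) = @_;
    my ($q_num, $t_num) = ($q_start - 1, $t_start - 1);
    for my $col (0 .. length($query) - 1) {
        my $q = substr($query,  $col, 1);
        my $t = substr($target, $col, 1);
        my $q_gap = $q =~ /[-.]/;
        my $t_gap = $t =~ /[-.]/;
        $q_num++ unless $q_gap;    # count residues, never gaps
        $t_num++ unless $t_gap;
        next if $q_gap || $q_num != $resnum;
        return $t_gap ? ($q, undef, undef) : ($q, $t, $t_num);
    }
    return;
}

# First block of the MSF alignment, with the column spacing removed.
my $seq = 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQVLSNELEEIQQLLFDCGHDL';
my $pdb = 'DDARIAAIGDVDELNSQIGVL--LAEPLPDDVRAALSAIQHDLFDLGGEL';

my ($from, $to, $to_num) = aligned_residue($seq, 23, $pdb, 1, 27);
printf "%s%d pairs with %s%d\n", $from, 27, $to, $to_num;   # V27 pairs with I5
```

Residues that face a gap (for example K44 of Sequence) come back with an undefined partner, so the caller can tell "no aligned residue" apart from "number out of range".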


I wish you all the best, dear friends. I'm really in trouble with this.

Thanks for any suggestions,


Paola


From neetisomaiya at gmail.com  Fri Sep  4 08:40:10 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 18:10:10 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
Message-ID: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>

Hi,

Thanks for your reply. I saw this before and wanted to try it, but I
am unable to install the EUtilities module. When I search on CPAN,
the download option for this module gives me the entire bioperl package.
Can I not get a tar.gz file of this module alone, which I can
gunzip, untar and then run make and so on to install it? I don't want
to install the entire bioperl again as I am using an older version. Any
suggestions?

-Neeti
Even my blood says, B positive




From cjfields at illinois.edu  Fri Sep  4 08:30:42 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:30:42 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>

Neeti,

Something like this?

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

chris



From cjfields at illinois.edu  Fri Sep  4 08:49:19 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:49:19 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
Message-ID: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>

Neeti,

Sorry, it's a package deal (and Bio::DB::EUtilities relies on several  
other modules).  I am planning on spinning it out at some point into  
its own package, but for now the easiest way to install it is via 1.6  
from CPAN or by downloading the nightly build:

http://www.bioperl.org/DIST/nightly_builds/

chris



From pg4 at sanger.ac.uk  Thu Sep  3 04:01:26 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 09:01:26 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <203092FB050648AA9F256788068F0A16@NewLife>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
Message-ID: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>

On Thu, 3 Sep 2009, Mark A. Jensen wrote:

> Hi Pablo and all-
> Try the latest revision (>=16081) with your debian/Emacs 21. Set
> the variable bioperl-module-path to the directory above the
> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
> again there. Tomorrow, MacOS
> cheers,
> Mark

Hello Mark,

after setting bioperl-module-path manually, your module works ok in 
linux emacs 21.4 with latest revision.

About the perl5lib issue, sorry about not reporting the platform: the 
report was on linux not in mac os X. In the wiki you have a comment about 
mac OS X separator:

[wiki] The problem Pablo was running into is definitely the Mac OS X path 
[wiki] separator issue.

Here I was refering to ':' as the 'path seprator' for linux multipath 
environmental vars not the systems directory separator [:/\].

Also from the wiki

[wiki] I think this is ok as it is, since bioperl-module-path is meant to 
[wiki] point to the directory above Bio

This is right. My message was probably misleading: I wrongly appended 
'/Bio' to the path itself instead of to a temp variable for testing with 
file-exists-p, which probably gave you the impression that the point was to 
have '/Bio' added to the path. Sorry about that.

Instead, my main point was about the line where you capture PERL5LIB:

[code] (if (setq pth (getenv "PERL5LIB"))

Wouldn't this leave pth holding a single *string* like 
"lib/path1:lib/path2:lib/path3" on Linux?

Then, when you test:

[code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))

it would append '/Bio' to the end of the whole string 
'lib/path1:lib/path2:lib/path3', and that combined path obviously does not 
exist.

Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
lib/path, first splitting the pth string on ':' (Linux/OS X) or the 
equivalent separator on Windows?
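
A minimal sketch of the per-entry check described above, written in Python only so that it is self-contained and runnable; it is not the bioperl-init.el code. The function name find_bioperl_roots and the throwaway directory layout are hypothetical, chosen for illustration.

```python
import os
import tempfile


def find_bioperl_roots(perl5lib, sep=os.pathsep):
    """Return the PERL5LIB entries that contain a 'Bio' subdirectory."""
    roots = []
    for entry in perl5lib.split(sep):
        # Skip empty entries (e.g. from a trailing separator).
        if entry and os.path.isdir(os.path.join(entry, "Bio")):
            roots.append(entry)
    return roots


# Demonstration with temporary directories (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    with_bio = os.path.join(tmp, "bioperl-live")
    without_bio = os.path.join(tmp, "ensembl")
    os.makedirs(os.path.join(with_bio, "Bio"))
    os.makedirs(without_bio)
    perl5lib = os.pathsep.join([without_bio, with_bio])

    # Testing the whole multi-path string at once: no such directory.
    print(os.path.isdir(os.path.join(perl5lib, "Bio")))
    # Testing each entry separately finds the BioPerl checkout.
    print(find_bioperl_roots(perl5lib) == [with_bio])
```

Using os.pathsep rather than a hard-coded ':' is one way to cover the Windows ';' case mentioned above.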

Sorry for not being very clear in my first report.


    -Pablo


>> == bug when parsing perl5lib? ==
>> 
>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>> Bioperl paths from PERL5LIB this is not working for me in linux.
>> 
>> While debugging bioperl-init.el:
>> # (setq pth (getenv "PERL5LIB"))
>> # 
>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>> # nil
>> 
>> No file is found because it is looking for all the paths concatenated 
>> together with a '/Bio' at the end:
>>
>>   libpath1:libpath2:libpath3/Bio
>> 
>> 'concat' adds /Bio to pth, which is a single string holding all the 
>> PERL5LIB paths. Shouldn't this concat rather be applied to each element of 
>> PERL5LIB after splitting on ':' in unix or ';' in windows, with each 
>> result then tested for existence?
>> 
>> for example in unix:
>> 
>> --- code --
>> (defun addbio (bio_path)
>>   "Append /Bio to BIO_PATH."
>>   (concat bio_path "/" "Bio"))
>> 
>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>> -- end code ---
>> 
>> This would result in a list of t and nil values, one per bioperl (and ensembl) path:
>> (t t nil t t t t t t nil nil nil ...)
>> 
>> 
>> Regards and thanks for the modules they would be very useful.
>>
>>    -Pablo


=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From paola.bisignano at gmail.com  Fri Sep  4 08:28:03 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Fri, 4 Sep 2009 14:28:03 +0200
Subject: [Bioperl-l] problem parsing msf file
Message-ID: <e9cf89740909040528j69e5f8e6ka9d550840a4e0f9a@mail.gmail.com>

I have a problem parsing an msf file: I can't find the right
Bio::SimpleAlign method for my case.
I have to identify residues (from a list) in aligned sequences. I
parse the alignment from a fasta file and save it as an msf file, in which
I have to identify my residue (from the list, numbered as in the pdb
file) and the residue aligned with it in the other sequence.

this is a piece of the file...

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//


                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL


                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL


                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA


                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is aligned
with Val 28 of Sequence (counting manually, it is Ile 5).
Tyr 4 has no residue aligned with it, because the alignment starts from
N23 of Sequence.
How can I enter a residue of my sequence and extract the aligned
residue from the other one?
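
The lookup asked about above can be sketched without any library, as plain counting over the gapped strings. This is a hedged illustration in Python, not BioPerl code: the function names are hypothetical, the start numbers (23 and 1) are taken from the `/23-178` and `/1-148` labels, and only the first 50-column block of the alignment is used. With that start, the valine in column 5 is numbered 27 in this sketch; adjust `start` if the PDB numbering differs.

```python
def column_of_residue(aligned, start, resnum, gap_chars="-."):
    """Return the 1-based alignment column holding residue `resnum`, or None."""
    number = start - 1
    for col, ch in enumerate(aligned, start=1):
        if ch not in gap_chars:
            number += 1
            if number == resnum:
                return col
    return None  # residue lies outside the aligned range


def residue_at_column(aligned, start, col, gap_chars="-."):
    """Return (letter, residue number) at column `col`, or None for a gap."""
    ch = aligned[col - 1]
    if ch in gap_chars:
        return None
    gaps_before = sum(aligned[:col - 1].count(g) for g in gap_chars)
    return ch, start + (col - 1) - gaps_before


# First block of the alignment shown above, with the spacing removed.
query = "NDPRVAAYGEVDELNSWVGYTKSLINSHTQVLSNELEEIQQLLFDCGHDL"   # Sequence/23-178
target = "DDARIAAIGDVDELNSQIGVL--LAEPLPDDVRAALSAIQHDLFDLGGEL"  # 2zhz:A/1-148

col = column_of_residue(query, 23, 27)
print(col, query[col - 1], residue_at_column(target, 1, col))
```

A residue numbered below the start of the aligned range (such as residue 4 here) yields None from column_of_residue, which matches the "no residue aligned with it" case.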


Best wishes to you all, dear friends. I'm really in trouble with this.
Thanks for suggestions

Paola


From jason at bioperl.org  Fri Sep  4 12:04:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 4 Sep 2009 09:04:05 -0700
Subject: [Bioperl-l] Fwd:  help parsing msf file or clustalW file reports
References: <369662.74237.qm@web25701.mail.ukl.yahoo.com>
Message-ID: <B5AEEBAD-22D3-40B6-AD06-17E268DFAFDD@bioperl.org>

Paola - it is important to continue to email the mailing list for your  
help.  I'm hoping another person on the list can help as I am swamped  
right now.
-jason

Begin forwarded message:

> From: Paola Bisignano <paola_bisignano at yahoo.it>
> Date: September 4, 2009 5:48:22 AM PDT
> To: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
>
> Hi Jason, thanks for your answer. For two days I have been
> re-studying the BioPerl synopsis and object-oriented programming. I
> understand what you mean, but I have some problems: I don't actually
> know how to start parsing this kind of file. I generated this msf or
> clustalW file by parsing a fasta file of multiple paired sequences
> and extracting only the paired sequences I want, i.e. homologous
> proteins that have the same ligand published in the PDB.
>
> I have a problem parsing an msf file: I can't find the right
> Bio::SimpleAlign method for my case.
> I have to identify residues (from a list) in aligned sequences. I
> parse the alignment from a fasta file and save it as an msf file, in which
> I have to identify my residue (from the list, numbered as in the pdb
> file) and the residue aligned with it in the other sequence.
>
> this is a piece of the file...
>
> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>
>  Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>  Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>
> //
>
>                       1                                                   50
> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>
>                       51                                                 100
> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>
>                       101                                                150
> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>
>                       151                                                200
> Sequence/23-178       QQPDML
> 2zhz:A/1-148          GGADVL
>
> For example, here I have to identify the residue in 2zhz:A that is aligned
> with Val 28 of Sequence (counting manually, it is Ile 5).
> Tyr 4 has no residue aligned with it, because the alignment starts from
> N23 of Sequence.
> How can I enter a residue of my sequence and extract the aligned
> residue from the other one?
>
> Best wishes to you all, dear friends. I'm really in trouble with this.
> Thanks for suggestions
>
> --- On Tue 1/9/09, Jason Stajich <jason at bioperl.org> wrote:
>
> From: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
> To: "Paola Bisignano" <paola_bisignano at yahoo.it>
> Cc: bioperl-l at lists.open-bio.org
> Date: Tuesday 1 September 2009, 17:49
>
> I think you might want to use the column_from_residue_number method  
> that is part of Bio::SimpleAlign - it lets you get the column from  
> an alignment based on the sequence residue, doing some math along  
> the way to deal with gaps. That is the residue -> alignment  
> direction.  If you are starting at the alignment and want to get the  
> residue's position you will use the location_from_column on a  
> particular sequence so
>
>     # select somehow a sequence from the alignment, e.g.
>     my $seq = $aln->get_seq_by_pos(1);
>     #$loc is undef or Bio::LocationI object
>     my $loc = $seq->location_from_column(5);
>
> -jason
>
> On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:
>
>> Hi,
>>
>> I'm trying to parse fasta files, where I have a couple of
>> alignments. I need to identify my residues in the alignment. I
>> have separate lists derived from parsing ligplot files, so I
>> have to manipulate strings, but I don't know how to start; it seems
>> complicated.
>> I used Bio::AlignIO to parse the fasta file, so I can have a parsed
>> file in msf or clustalW format.
>>
>> here an example:
>> CLUSTAL W(1.81) multiple sequence alignment
>>
>>
>> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
>> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>>
>>
>> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
>> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>>
>> I choose two residues, for example: how can I extract
>> them, starting from their positions in the pdb file?
>> I need to walk along my sequence.
>>
>> I don't know if it is clear because I cannot explain the question
>> correctly in English... are there any Italians?
>> Could anyone help me?
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Fri Sep  4 16:15:09 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 4 Sep 2009 16:15:09 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <deaa866a0909041315y4282d811g3047ab153812014d@mail.gmail.com>

On 9/4/09, Emanuele Osimo <e.osimo at gmail.com> wrote:
> Try this:
> http://david.abcc.ncifcrf.gov/conversion.jsp
>

It may be just me, but I've tried this in both Firefox and Opera and
it will not work without Javascript enabled.  Most "intelligent" sites
now tell you that Javascript must be enabled if they require it to
work properly.  More intelligent sites (such as Google's gmail) allow
you to toggle back and forth between Javascript & non-Javascript
implementations.

Note that, IMO, running with Javascript enabled for all sites all the
time is a bad idea (potentially for security reasons, but clearly for
sleep / suspend / power consumption reasons, and finally because: do you
*really* trust that the Javascript, your DNS provider, and the sites
hosting the scripts are 100% secure?).  The only options that seem
generally available at this time are to run Firefox with NoScript,
enabling scripts for selected sites, or to run two browser instances,
one with Javascript enabled and one with it disabled (and to only use
the Javascript-enabled browser on sites with a high probability of
being secure).


From lsbrath at gmail.com  Fri Sep  4 18:12:34 2009
From: lsbrath at gmail.com (Mgavi Brathwaite)
Date: Fri, 4 Sep 2009 18:12:34 -0400
Subject: [Bioperl-l] bio:graphics
Message-ID: <69367b8f0909041512l77b2431aqb89f57f82adae1@mail.gmail.com>

Hello,

I need to grab features (source, gene, CDS, primer_bind) from a GenBank file
and add features (5' and 3' UTR, misc_feature) to generate an image. The
image has two tracks, each with multiple features. How
do I display different colors for the different features on the same track?
In my case the 5' UTR, CDS, and 3' UTR are on the same track. I want the UTRs to
have one color and the CDS another.

I also need to grab the start and end info from the primer_bind feature
based on the /note tag values, in my case 'HUF' and 'HDF'. Code:

if ( $feat->primary_tag eq 'primer_bind' && $feat->has_tag('note') ) {
    # get_tag_values() returns a list, so test each /note value
    if ( grep { $_ eq 'HDF' } $feat->get_tag_values('note') ) {
        $pb_start = $feat->start;
        $pb_end   = $feat->end;
    }
}


I want to make sure that I am moving in the right direction.  Can someone
help me out?

M


From neetisomaiya at gmail.com  Sat Sep  5 00:52:11 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sat, 5 Sep 2009 10:22:11 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
Message-ID: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>

Ok, so I reinstalled bioperl and was able to run the EUtilities code
for my gene id.
But I am facing two issues :-

1) When I give multiple gene ids, it still returns data of only the
first gene id

2) The script returns the entire entry, and I am not able to figure
out how to just fetch the sequence, and if possible, in FASTA format.
I could not figure it out from the documentation.

Thanks.

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:19 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Sorry, it's a package deal (and Bio::DB::EUtilities relies on several other
> modules).  I am planning on spinning it out at some point into it's own
> package, but for now the easiest way to install is via 1.6 off CPAN or
> downloading the nightly build:
>
> http://www.bioperl.org/DIST/nightly_builds/
>
> chris
>
> On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:
>
>> Hi,
>>
>> Thanks for your reply. I saw this before and wanted to try this, but I
>> am unable to install this module of EUtilities. When I search on CPAN,
>> it gives me the entire bioperl package in the download option of this
>> module. Can I not get a tar.gz file of this module alone, which I can
>> gzip, untar and then run the make and all to install it? I dont want
>> to install entire bioperl again as I am using an older version. Any
>> suggestions?
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Something like this?
>>>
>>>
>>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>>>
>>> chris
>>>
>>> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>>>
>>>> Thanks. Its an interesting tool.
>>>>
>>>> But I want to do this programatically.
>>>>
>>>> I have gene ids to start with. Cant find a method to directly get
>>>> sequence with gene id as input. So using the method of getting
>>>> sequence with accession as input, for which I need to know accessions
>>>> for my gene ids first. Is this a right approach? Please guide me. My
>>>> main aim is to get the nucleotide sequence of a gene from ids entrez
>>>> gene id/gene name. PLease guide me. I am confused.
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>>>>
>>>>> Try this:
>>>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>>>
>>>>> Emanuele
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Thanks for the replies.
>>>>>>
>>>>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>>>> id/gene name?
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for the link.
>>>>>>> So I need only the following lines of code to get the sequence?
>>>>>>>
>>>>>>> use Bio::DB::GenBank;
>>>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>>>
>>>>>>> How do I print the sequence?
>>>>>>> $seq_obj->seq ??
>>>>>>>
>>>>>>> -Neeti
>>>>>>> Even my blood says, B positive
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>>>> http://bit.ly/RWIot
>>>>>>>>
>>>>>>>> Trust this helps,
>>>>>>>> Khader Shameer
>>>>>>>> NCBS - TIFR
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an input list of gene names (can get gene ids from a local
>>>>>>>>> db
>>>>>>>>> if required).
>>>>>>>>> I need to fetch sequences of these genes. Can someone please guide
>>>>>>>>> me
>>>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>>>
>>>>>>>>> Any help will be deeply appreciated.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> -Neeti
>>>>>>>>> Even my blood says, B positive


From ybolo001 at student.ucr.edu  Sat Sep  5 03:37:58 2009
From: ybolo001 at student.ucr.edu (Eugene Bolotin)
Date: Sat, 5 Sep 2009 00:37:58 -0700
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <941fcc750909050037n3c0f4fc5u89fcf4f5c3e5f34d@mail.gmail.com>

OK,
this is what I would do.
Download the database of gene names and sequences in FASTA format.
Then loop through it with BioPerl.
Store your gene names in a hash and match them (e.g. with a regex) against
$seq->display_name(), which should match them up with the gene ids;
$seq->seq() returns the sequence.
Print out the ones that match.
Good luck.
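
A rough sketch of the filter-a-local-FASTA approach described above, in plain Python rather than BioPerl so that it runs on its own. The helper name read_fasta, the gene names, and the tiny in-memory FASTA text are all made up for illustration, and matching is done exactly on the first word of each header (an assumption; a regex could be used instead).

```python
import io


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # The identifier is the first word after '>'.
            name, chunks = (header[0] if header else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Hypothetical gene names to look up, and a stand-in for the downloaded file.
wanted = {"geneA", "geneC"}
fasta_text = ">geneA some description\nATGC\nGGTA\n>geneB\nTTTT\n>geneC\nCCGG\n"

hits = {name: seq
        for name, seq in read_fasta(io.StringIO(fasta_text))
        if name in wanted}

# Print the matching records back out in FASTA format.
for name, seq in hits.items():
    print(f">{name}\n{seq}")
```

For a real file, io.StringIO(fasta_text) would be replaced by an opened file handle.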

On Thu, Sep 3, 2009 at 11:49 PM, Neeti Somaiya<neetisomaiya at gmail.com> wrote:
> Hi,
>
> I have an input list of gene names (can get gene ids from a local db
> if required).
> I need to fetch sequences of these genes. Can someone please guide me
> as to how this can be done using perl/bioperl?
>
> Any help will be deeply appreciated.
>
> Thanks.
>
> -Neeti
> Even my blood says, B positive




-- 
Eugene Bolotin
Ph.D. candidate
Genetics Genomics and Bioinformatics
University of California Riverside
ybolo001 at student.ucr.edu
Dr. Frances Sladek Lab


From maj at fortinbras.us  Sat Sep  5 08:53:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 5 Sep 2009 08:53:12 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org><alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk><203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <E63F6D209AF1432C9B9CAFF6F6182F9C@NewLife>

Hi Pablo-- You're right about the PERL5LIB issue; I had
not set up the module path to handle multiple paths as you
describe. I am working hard on an implementation that can
handle multiple paths; I hope to have it out next week --cheers MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works ok in linux 
> emacs 21.4 with latest revision.
>
> Regarding the PERL5LIB issue, sorry for not reporting the platform: the report 
> was on Linux, not Mac OS X. In the wiki you have a comment about the Mac OS X 
> separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multi-path 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. My message was probably misleading: I wrongly appended '/Bio' 
> to the path itself instead of to a temp variable for testing with 
> file-exists-p, which probably gave you the impression that the point was to 
> have '/Bio' added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> Wouldn't this leave pth holding a single *string* like 
> "lib/path1:lib/path2:lib/path3" on Linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' to the end of the whole string 
> 'lib/path1:lib/path2:lib/path3', and that combined path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, first splitting the pth string on ':' (Linux/OS X) or the 
> equivalent separator on Windows?
>
> Sorry for not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpath1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to pth, which is a single string holding all the 
>>> PERL5LIB paths. Shouldn't this concat rather be applied to each element of 
>>> PERL5LIB after splitting on ':' in unix or ';' in windows, with each 
>>> result then tested for existence?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "Append /Bio to BIO_PATH."
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in a list of t and nil values, one per bioperl (and ensembl) path:
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>>
>>> =====================================================================
>>>                      Pablo Marin-Garcia, PhD
>>>
>>>                     \\//          (Argiope bruennichi
>>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>>                     //\\
>>>
>>> Sanger Institute                |  PostDoc / Computer Biologist
>>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>>> Hinxton, Cambridge CB10 1HH     |  room : N333
>>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>>> ====================================================================
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>>> a charity registered in England with number 1021457 and a company registered 
>>> in England with number 2742969, whose registered office is 215 Euston Road, 
>>> London, NW1 2BE. _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>
>
> 
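
The per-directory check suggested above can also be tried outside Emacs. What follows is only a minimal Perl sketch of the same idea (split PERL5LIB on the path-list separator, then test each directory for a Bio subdirectory). It is not the bioperl-init.el code, and the name bioperl_dirs is made up for illustration.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Return every directory in a PERL5LIB-style string that contains a Bio/
# subdirectory. $sep defaults to the platform's path-list separator
# (':' on Unix-like systems, ';' on Windows).
sub bioperl_dirs {
    my ($perl5lib, $sep) = @_;
    $sep = $Config{path_sep} unless defined $sep;
    return grep { -d File::Spec->catdir($_, 'Bio') }
           grep { length }
           split /\Q$sep\E/, $perl5lib;
}

# Print each PERL5LIB entry that sits directly above a Bio directory.
my @found = bioperl_dirs($ENV{PERL5LIB} // '');
print "$_\n" for @found;
```

Each line printed is a candidate for bioperl-module-path, which is meant to point to the directory above Bio.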


From cjfields at illinois.edu  Sat Sep  5 09:44:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 5 Sep 2009 08:44:54 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>

On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:

> Ok, so I reinstalled bioperl and was able to run the EUtilities code
> for my gene id.
> But I am facing two issues :-
>
> 1) When I give multiple gene ids, it still returns data of only the
> first gene id

This sounds like it's not iterating correctly.  You'll need to post  
your version of the script.

> 2) The script returns the entire entry, and I am not able to figure
> out how to just fetch the sequence, and if possible, in FASTA format.
> I could not figure it out from the documentation.

I recall this working last time I used it (I think June or July).   
Could you post the script you are using?

(realize this is a holiday weekend in the states, so you might have a  
delayed response from me or others)

> Thanks.
>
> -Neeti
> Even my blood says, B positive

chris


From neetisomaiya at gmail.com  Sun Sep  6 12:15:09 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sun, 6 Sep 2009 21:45:09 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
	<218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
Message-ID: <764978cf0909060915t7a2e6e45v4bb194b9cad18e18@mail.gmail.com>

Hi,

Thanks for the reply.

I am using the script exactly as it is given here :

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

-Neeti
Even my blood says, B positive




From Russell.Smithies at agresearch.co.nz  Sun Sep  6 19:00:24 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 7 Sep 2009 11:00:24 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C50D0@exchsth.agresearch.co.nz>

Grab the gene2accession list from here and do lookups.
Probably the fastest and easiest way.
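
A lookup along these lines could be sketched as follows. It assumes gene2accession is the tab-delimited table from NCBI with the GeneID in the second column, the RNA accession.version in the fourth and the genomic accession.version in the eighth. Those positions are an assumption, so check them against the header line of your own copy. lookup_accessions is a made-up helper name.

```perl
use strict;
use warnings;

# Assumed 0-based column positions in the tab-delimited gene2accession table;
# verify them against the header line of the file you downloaded.
my ($GENE_ID_COL, $RNA_ACC_COL, $GENOMIC_ACC_COL) = (1, 3, 7);

# Read rows from $fh and collect accessions for the Gene ids in %$wanted.
# Returns a hash ref: gene id => { rna => [...], genomic => [...] }.
sub lookup_accessions {
    my ($fh, $wanted) = @_;
    my (%hits, %seen);
    while (my $line = <$fh>) {
        next if $line =~ /^#/;                 # skip header/comment lines
        chomp $line;
        my @col = split /\t/, $line;
        my $gene_id = $col[$GENE_ID_COL];
        next unless defined $gene_id && $wanted->{$gene_id};
        for my $pair ([rna => $RNA_ACC_COL], [genomic => $GENOMIC_ACC_COL]) {
            my ($kind, $idx) = @$pair;
            my $acc = $col[$idx];
            next if !defined $acc || $acc eq '-';          # '-' marks an empty field
            next if $seen{$gene_id}{$kind}{$acc}++;        # report each accession once
            push @{ $hits{$gene_id}{$kind} }, $acc;
        }
    }
    return \%hits;
}

# Usage: perl lookup.pl gene2accession 2 4693 3064
if (@ARGV >= 2) {
    my ($file, @ids) = @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my %wanted = map { ($_ => 1) } @ids;
    my $hits = lookup_accessions($fh, \%wanted);
    for my $id (@ids) {
        for my $kind (sort keys %{ $hits->{$id} || {} }) {
            print join("\t", $id, $kind, @{ $hits->{$id}{$kind} }), "\n";
        }
    }
}
```

The accessions found this way can then be passed to the get-sequence-by-accession code that already worked.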


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz

Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T  +64 3 489 3809
F  +64 3 489 9174
www.agresearch.co.nz 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Saturday, 5 September 2009 12:21 a.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> Thanks. It's an interesting tool.
> 
> But I want to do this programmatically.
> 
> I have gene ids to start with. I can't find a method to directly get
> the sequence with a gene id as input, so I am using the method of getting
> the sequence with an accession as input, for which I first need to know the
> accessions for my gene ids. Is this the right approach? My main aim is to
> get the nucleotide sequence of a gene from its Entrez gene id or gene
> name. Please guide me; I am confused.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> > Try this:
> > http://david.abcc.ncifcrf.gov/conversion.jsp
> >
> > Emanuele
> >
> >
> > On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >>
> >> Thanks for the replies.
> >>
> >> So the get seq by accession/GI worked for me. Now can anyone tell me
> >> the easiest way to get the GI /Accession of a gene from the gene
> >> id/gene name?
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
> >> wrote:
> >> > Thanks for the link.
> >> > So I need only the following lines of code to get the sequence?
> >> >
> >> > use Bio::DB::GenBank;
> >> > $db_obj = Bio::DB::GenBank->new;
> >> > $seq_obj = $db_obj->get_Seq_by_id(2);
> >> >
> >> > How do I print the sequence?
> >> > $seq_obj->seq ??
> >> >
> >> > -Neeti
> >> > Even my blood says, B positive
> >> >
> >> >
> >> >
> >> > On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
> >> >>
> >> >> Retrieving a sequence from a database : BioPerl HOWTO
> >> >> http://bit.ly/RWIot
> >> >>
> >> >> Trust this helps,
> >> >> Khader Shameer
> >> >> NCBS - TIFR
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I have an input list of gene names (can get gene ids from a local db
> >> >>> if required).
> >> >>> I need to fetch sequences of these genes. Can someone please guide me
> >> >>> as to how this can be done using perl/bioperl?
> >> >>>
> >> >>> Any help will be deeply appreciated.
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> -Neeti
> >> >>> Even my blood says, B positive
> >> >>> _______________________________________________
> >> >>> Bioperl-l mailing list
> >> >>> Bioperl-l at lists.open-bio.org
> >> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bnbowman at gmail.com  Mon Sep  7 04:17:25 2009
From: bnbowman at gmail.com (Brett Bowman)
Date: Mon, 7 Sep 2009 01:17:25 -0700
Subject: [Bioperl-l] Protein Sequence QSARs
Message-ID: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>

I've been working on a script, for my personal edification, that annotates
protein sequences for QSARs as described in the paper below, because I
didn't see anything in BioPerl to do it for me.  Essentially, it converts a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculates the auto- and cross-correlation values
at a lag of L amino acids.  I was considering turning it into a
full-blown module, but I wanted to ask A) whether it had been done before and
I had just missed it, and B) whether anyone other than me would find such a
module useful.
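
For anyone who wants to picture the calculation, here is a rough Perl sketch. It assumes the usual lagged-product average (sum of z_j[i] * z_k[i+L] over i, divided by N-L), which is only one possible normalization and may not match the script described above. The descriptor values are made-up placeholders, not the published scales, and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Placeholder descriptor triples for a few residues. These numbers are made up
# for illustration and are NOT the published scales; substitute the real table.
my %DESCRIPTORS = (
    A => [  0.1, -1.0,  0.2 ],
    G => [  2.0, -4.0,  0.5 ],
    K => [  2.5,  1.5, -3.0 ],
    L => [ -4.0, -1.0, -1.5 ],
);

# Turn a protein string of length N into a 3-by-N matrix (array of three rows).
sub encode_sequence {
    my ($seq) = @_;
    my @rows = ([], [], []);
    for my $aa (split //, uc $seq) {
        my $d = $DESCRIPTORS{$aa} or die "no descriptors for residue '$aa'\n";
        push @{ $rows[$_] }, $d->[$_] for 0 .. 2;
    }
    return \@rows;
}

# Lagged product average between descriptor rows $j and $k.
# With $j == $k this is the auto term; otherwise it is the cross term.
sub cross_covariance {
    my ($z, $j, $k, $lag) = @_;
    my $n = scalar @{ $z->[$j] };
    die "lag must be between 1 and N-1\n" if $lag < 1 || $lag >= $n;
    my $sum = 0;
    $sum += $z->[$j][$_] * $z->[$k][$_ + $lag] for 0 .. $n - $lag - 1;
    return $sum / ($n - $lag);
}

# Print all nine auto/cross terms at lag 1 for a short toy sequence.
my $z = encode_sequence('AGKLLKGA');
for my $j (0 .. 2) {
    for my $k (0 .. 2) {
        printf "lag 1, z%d vs z%d: %.3f\n",
            $j + 1, $k + 1, cross_covariance($z, $j, $k, 1);
    }
}
```

Repeating the loop for each lag of interest gives a fixed-length description of a sequence regardless of N.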

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics


From neetisomaiya at gmail.com  Mon Sep  7 06:04:06 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 7 Sep 2009 15:34:06 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
Message-ID: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>

I tried using EntrezGene instead of GenBank, as is given in the link
that you sent :

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html

use Bio::DB::EntrezGene;

    my $db = Bio::DB::EntrezGene->new;

    my $seq = $db->get_Seq_by_id(2); # Gene id

    # or ...

    my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
    while ( my $seq = $seqio->next_seq ) {
	    print "id is ", $seq->display_id, "\n";
    }

This doesn't seem to work.


-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> Hello,
> have you tried this?
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>
> Emanuele
>
> On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> I have an input list of gene names (can get gene ids from a local db
>> if required).
>> I need to fetch sequences of these genes. Can someone please guide me
>> as to how this can be done using perl/bioperl?
>>
>> Any help will be deeply appreciated.
>>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From Russell.Smithies at agresearch.co.nz  Mon Sep  7 16:26:04 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 08:26:04 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>

This example code from the wiki _definitely_ works:
http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
=========================================

use strict;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
$db->verbose(1); ###

my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig, $from, $to) = $ann->url =~
            /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$contig\t$from\t$to\n";
    }
}

======================================

So if it doesn't work for you, there are a few things you need to check:
* what version of BioPerl are you using?
* are you behind a firewall?
* are you using a proxy?
* do you need to submit a username/password for either of the two above?
* turn on 'verbose' messages; it may help you debug
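
A quick way to collect answers to the first few questions is a small diagnostic script. This is only a sketch: it assumes BioPerl exposes its version through Bio::Root::Version and that any proxy is configured through the usual *_proxy environment variables, which may not be how your site does it. The sub names are made up.

```perl
use strict;
use warnings;

# Report the installed BioPerl version, or undef if BioPerl cannot be loaded.
sub bioperl_version {
    return eval { require Bio::Root::Version; Bio::Root::Version->VERSION };
}

# Pick out proxy-related settings from an environment hash
# (http_proxy, https_proxy, ftp_proxy, no_proxy, in either case).
sub proxy_settings {
    my ($env) = @_;
    my %found;
    for my $name (keys %$env) {
        $found{$name} = $env->{$name}
            if $name =~ /^(?:https?|ftp|no)_proxy$/i;
    }
    return \%found;
}

my $version = bioperl_version();
print "BioPerl version: ", (defined $version ? $version : 'not found'), "\n";

my $proxy = proxy_settings(\%ENV);
print "$_=$proxy->{$_}\n" for sort keys %$proxy;
print "no proxy variables set\n" unless %$proxy;
```

Posting this output along with the verbose messages makes it much easier for the list to help.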


If you're still having problems, get back to me and I'll see if I can help.

--Russell




From cjfields at illinois.edu  Mon Sep  7 16:56:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 15:56:03 -0500
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
Message-ID: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>

All,

I have updated the Changes file in bioperl-live in preparation for  
1.6.1.  The initial release will be an alpha, 1.6.0_1 (probably  
landing about mid-week), and based on CPAN tests, etc the final 1.6.1  
release next week.  I'll start merging changes over from trunk  
tonight, fixing last-minute bugs, etc.  I'm running my work using perl  
5.10.1 (64-bit) on Mac and will likely run these remotely on our local  
linux cluster.  Win tests are gladly welcome (this should work on  
Strawberry Perl now).

I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase,  
Robert Buels, Jay Hannah, Heikki, Sendu come to mind) look over the  
file to update it.  There are a few weak spots in there where I didn't  
make the code change or additions, or where a particular bug was fixed  
but not mentioned.  In particular:

1) Google Summer of Code work from Chase (Mark, Chase)
2) GMOD-related fixes (Lincoln, Scott)
3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
4) Tiling, Restriction refactors (Mark)

Also, please make changes to AUTHORS, etc as needed.

Thanks!

chris


From maj at fortinbras.us  Mon Sep  7 17:21:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 7 Sep 2009 17:21:04 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <29B3F9DC91A1422A89629790DD8CC313@NewLife>

aye-aye skipper--- 


From cjfields at illinois.edu  Tue Sep  8 00:23:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:23:26 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
Message-ID: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>

All,

I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
installed by default), and the latest from Bio::Phylo svn:

https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl

At this moment only the Bio::Phylo code from svn is working with  
BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
1.7_RC9 has some kind of versioning issue (again, all tests fail).   
The problem: CPAN will always install 1.6 (the others are RC, so they  
won't be installed unless the full path is used).  Even so, nothing on  
CPAN even works; one must use the latest Bio::Phylo SVN code.

ATM I'm just not seeing how this can be released with 1.6.1 right now,  
unless one of the following occurs:

1) Rutger V. drops a quick non-RC release to CPAN,
2) check for the minimal working Bio::Phylo version and safely skip  
any Nexml-related tests unless the proper version is present (not easy  
with a $VERSION like '1.7_RC9'),
3) push Nexml into its own distribution (something we were planning  
on anyway with a number of modules)

As for #3 above, I think it probably belongs in a larger bioperl-phylo  
as Mark had previously proposed.  I'm open to just about any solution.
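
For option 2, one way to cope with a $VERSION like '1.7_RC9' is to compare it by hand rather than numerically. The sketch below is an assumption about how such strings are laid out (dotted numbers, optional _RCn suffix) and treats a release candidate as older than the final release with the same number. The sub names and the required version '1.7' are placeholders, not a statement about which Bio::Phylo release actually works.

```perl
use strict;
use warnings;

# Split a version string such as '1.7_RC9' into (numeric part, RC number).
# A string without an _RC suffix is treated as a final release (RC undef).
sub parse_version {
    my ($v) = @_;
    my ($num, $rc) = $v =~ /^(\d+(?:\.\d+)*)(?:_RC(\d+))?$/ or return;
    return ($num, $rc);
}

# Compare two version strings; returns -1, 0 or 1 like <=>.
# Numeric parts are compared component by component, so '1.10' > '1.9'.
sub version_cmp {
    my ($a_str, $b_str) = @_;
    my ($a_num, $a_rc) = parse_version($a_str) or die "bad version '$a_str'\n";
    my ($b_num, $b_rc) = parse_version($b_str) or die "bad version '$b_str'\n";
    my @x = split /\./, $a_num;
    my @y = split /\./, $b_num;
    while (@x || @y) {
        my $cmp = (shift(@x) // 0) <=> (shift(@y) // 0);
        return $cmp if $cmp;
    }
    # Same numeric part: a final release sorts after any release candidate.
    return 0  if !defined $a_rc && !defined $b_rc;
    return 1  if !defined $a_rc;
    return -1 if !defined $b_rc;
    return $a_rc <=> $b_rc;
}

# Decide whether to run or skip, using a placeholder minimum of '1.7'.
for my $installed ('1.6', '1.7_RC9') {
    my $ok = version_cmp($installed, '1.7') >= 0;
    print "$installed: ", ($ok ? "run Nexml tests" : "skip Nexml tests"), "\n";
}
```

In a test file the same comparison could guard a skip of the Nexml tests when the installed Bio::Phylo is too old or cannot be parsed.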

chris


From neetisomaiya at gmail.com  Tue Sep  8 00:27:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 8 Sep 2009 09:57:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>

I actually want the nucleotide sequence of the gene. I thought
Bio::DB::EntrezGene would give me a seq_obj for an Entrez gene id, and
that calling the seq method on it ($seq_obj->seq()) would give me the actual
genomic nucleotide sequence of the gene. But this doesn't happen. I am
able to print the gene symbol using $seq_obj->display_id and to do
other things, but I wanted the gene's nucleotide sequence.

-Neeti
Even my blood says, B positive




From Russell.Smithies at agresearch.co.nz  Tue Sep  8 00:41:47 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 16:41:47 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>

That bit of code gave you the accession, start and end for the sequence, so you just need to download it.
Bio::DB::EUtilities can do that for you.

Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences ?


--Russell

==================
#!perl -w

use strict;
use Bio::DB::EntrezGene;
use Bio::DB::EUtilities;

no warnings 'deprecated';

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
#$db->verbose(1);
my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier (accession), the start, and the stop
        my ($acc, $from, $to) = $ann->url =~
            /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        next unless defined $acc;   # URL did not have the expected form
        print "$acc\t$from\t$to\n";

        # retrieve that region of the sequence in FASTA format
        my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                               -db      => 'nucleotide',
                                               -rettype => 'fasta');
        $fetcher->set_parameters(-id        => $acc,
                                 -seq_start => $from,
                                 -seq_stop  => $to,
                                 -strand    => 1);
        my $fasta = $fetcher->get_Response->content;
        print $fasta;
    }
}

======================
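
The regular expression is the part most likely to need adjusting, so it may help to try it on its own. This stand-alone sketch uses the same pattern as the script above. The URL is a made-up example, not a real Evidence Viewer link, and parse_evidence_viewer_url is a hypothetical helper name.

```perl
use strict;
use warnings;

# Pull the contig accession and the from/to coordinates out of a URL.
# Returns a hash ref, or nothing if the URL does not have the expected form.
sub parse_evidence_viewer_url {
    my ($url) = @_;
    my ($contig, $from, $to) = $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/
        or return;
    return { contig => $contig, from => $from, to => $to };
}

# Made-up URL for illustration only.
my $example = 'http://example.org/viewer?contig=NT_000000.1&from=100&to=250';
if (my $hit = parse_evidence_viewer_url($example)) {
    print join("\t", @{$hit}{qw(contig from to)}), "\n";
}
```

If nothing is printed for one of your own URLs, the annotation probably has a different layout and the pattern needs changing before the efetch step can work.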

> -----Original Message-----
> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
> Sent: Tuesday, 8 September 2009 4:28 p.m.
> To: Smithies, Russell
> Cc: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> I actually want the nucleotide sequence of the gene. I thought the
> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
> then the seq method on that $seq_obj->seq() will give me the actual
> genomic nucleotide sequence of the gene. But this doesnt happen. I am
> able to print gene symbol using $seq_obj->display_id and able to do
> other things, but I wanted the gene nucleotide sequence.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
> Russell<Russell.Smithies at agresearch.co.nz> wrote:
> > This example code from the wiki _definitely_ works:
> >
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::Entrez
> Gene_to_get_genomic_coordinates
> > =========================================
> >
> > use strict;
> > use Bio::DB::EntrezGene;
> >
> > my $id = shift or die "Id?\n"; # use a Gene id
> >
> > my $db = new Bio::DB::EntrezGene;
> > $db->verbose(1); ###
> >
> > my $seq = $db->get_Seq_by_id($id);
> >
> > my $ac = $seq->annotation;
> >
> > for my $ann ($ac->get_Annotations('dblink')) {
> >        if ($ann->database eq "Evidence Viewer") {
> >                # get the sequence identifier, the start, and the stop
> >                my ($contig,$from,$to) = $ann->url =~
> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
> >                print "$contig\t$from\t$to\n";
> >        }
> > }
> >
> > ======================================
> >
> > So if it doesn't work for you, there are a few things you need to check:
> > * what version of BioPerl are you using?
> > * are you behind a firewall?
> > * are you using a proxy?
> > * do you need to submit username/password for either of the 2 above
> > * turn on 'verbose' messages, it may help you debug
> >
> >
> > If you're still having problems, get back to me and I'll see if I can help.
> >
> > --Russell
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> >> Sent: Monday, 7 September 2009 10:04 p.m.
> >> To: Emanuele Osimo; bioperl-l
> >> Subject: Re: [Bioperl-l] need help urgently
> >>
> >> I tried using EntrezGene instead of GenBank, as is given in the link
> >> that you sent :
> >>
> >>
> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_datab
> >> ase
> >>
> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-
> >> live/Bio/DB/EntrezGene.html
> >>
> >> use Bio::DB::EntrezGene;
> >>
> >>     my $db = Bio::DB::EntrezGene->new;
> >>
> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
> >>
> >>     # or ...
> >>
> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
> >>     while ( my $seq = $seqio->next_seq ) {
> >>           print "id is ", $seq->display_id, "\n";
> >>     }
> >>
> >> This doesnt seem to work.
> >>
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> >> > Hello,
> >> > have you tried this?
> >> >
> >>
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBan
> >> k_when_you_have_genomic_coordinates
> >> >
> >> > Emanuele
> >> >
> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com>
> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I have an input list of gene names (can get gene ids from a local db
> >> >> if required).
> >> >> I need to fetch sequences of these genes. Can someone please guide me
> >> >> as to how this can be done using perl/bioperl?
> >> >>
> >> >> Any help will be deeply appreciated.
> >> >>
> >> >> Thanks.
> >> >>
> >> >> -Neeti
> >> >> Even my blood says, B positive
> >> >> _______________________________________________
> >> >> Bioperl-l mailing list
> >> >> Bioperl-l at lists.open-bio.org
> >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >
> >> >
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > =======================================================================
> > Attention: The information contained in this message and/or attachments
> > from AgResearch Limited is intended only for the persons or entities
> > to which it is addressed and may contain confidential and/or privileged
> > material. Any review, retransmission, dissemination or other use of, or
> > taking of any action in reliance upon, this information by persons or
> > entities other than the intended recipients is prohibited by AgResearch
> > Limited. If you have received this message in error, please notify the
> > sender immediately.
> > =======================================================================
> >


From cjfields at illinois.edu  Tue Sep  8 00:50:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:50:01 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <76A4757A-80C5-400E-8D3B-C68E968FF581@illinois.edu>

Russell,

Any reason you're using "no warnings 'deprecated'" there?  The  
pseudohash warnings should no longer be showing up with EntrezGene  
stuff.  Or is it something else?

chris

On Sep 7, 2009, at 11:41 PM, Smithies, Russell wrote:

> That bit of code gave you the accession, start and end for the  
> sequence so you just needed to download it.
> Bio::DB::Eutilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
> 	if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
> 		my ($acc,$from,$to) = $ann->url =~
> 		  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
> 		print "$acc\t$from\t$to\n";
>
> 		# retrieve the sequence
> 		my $fetcher = Bio::DB::EUtilities->new(-eutil => 'efetch',
> 					   -db    => 'nucleotide',
> 					   -rettype => 'fasta');
>            $fetcher->set_parameters(-id => $acc,
> 			     			-seq_start => $from,
> 			     			-seq_stop  => $to,
> 			     			-strand    => 1);
>            my $seq = $fetcher->get_Response->content;
>            print $seq;
>
> 	}
> }
>
> ======================
>


From paola_bisignano at yahoo.it  Tue Sep  8 04:55:21 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 8 Sep 2009 08:55:21 +0000 (GMT)
Subject: [Bioperl-l] problem parsing pdb
Message-ID: <741671.67508.qm@web25705.mail.ukl.yahoo.com>

Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to extract the chain ID and residue ID. I found that in some PDB files the residue number is followed by a letter, probably because the residue was added by the crystallographers and they didn't want to change the residue numbering of the sequence. For example, I parsed 1PXX.pdb with my script below. I didn't find any useful suggestion about this in the BioPerl tutorial or in the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;

# download the PDB entry and save it locally
my $urlpdb = "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;

# write one "chain<TAB>residue" line per residue
my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc = $structio->next_structure;
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains) {
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain)) {
        my $resid = $res->id;
        my $atoms = $struc->get_atoms($res);
        print $out "$chainid\t$resid\n";
    }
}
close $out;
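
One way to get the chain, residue number and insertion letter without going
through a parsed structure is to read the fixed columns of the ATOM/HETATM
records directly. A minimal sketch, assuming standard PDB column positions
(residue name in columns 18-20, chain ID in 22, residue number in 23-26,
insertion code in 27); the helper name parse_atom_line and the sample record
are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return (chain, "RESNAME-NUMBER+ICODE") for one
# ATOM/HETATM record, or an empty list for any other line.
sub parse_atom_line {
    my ($line) = @_;
    return unless $line =~ /^(?:ATOM  |HETATM)/ && length($line) >= 27;

    # substr offsets are 0-based, PDB columns are 1-based
    my $res_name = substr($line, 17, 3);
    my $chain    = substr($line, 21, 1);
    my $res_seq  = substr($line, 22, 4);
    my $icode    = substr($line, 26, 1);
    s/\s+//g for ($res_name, $chain, $res_seq, $icode);

    return ($chain, "$res_name-$res_seq$icode");
}

# Build an illustrative record with the right column widths.
my $line = sprintf "%-6s%5d %-4s%1s%3s %1s%4d%1s",
    'ATOM', 821, ' N', ' ', 'ILE', 'A', 105, 'A';

my ($chain, $resid) = parse_atom_line($line);
print "$chain\t$resid\n";
```

Because the insertion code is read as its own column, it is appended to the
residue number with no separator, and a blank insertion code simply
disappears.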


but it gives me a file with an error at ILE 105A and ILE 2105C, because they have a letter following the residue number. Can I solve that problem without writing intermediate files?
I need the residue ID as 105A, not 105.A, so:
 A          ILE-105A
without a point between the number and the letter.
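
If the IDs are kept as the strings the script already prints, the point can
be dropped in memory before writing. A minimal sketch, assuming the ID has
the form NAME-NUMBER.LETTER as described above; the helper name
format_resid is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "ILE-105.A" into "ILE-105A".
# IDs without an insertion letter are returned unchanged.
sub format_resid {
    my ($resid) = @_;
    # name and dash, optional minus sign and digits, a dot, one letter
    (my $clean = $resid) =~ s/^([^-]+-)(-?\d+)\.([A-Za-z])$/$1$2$3/;
    return $clean;
}

for my $id ('ILE-105.A', 'ILE-2105.C', 'GLY-42') {
    print format_resid($id), "\n";
}
```

In the script this would be applied to $resid just before the print, so no
intermediate file is needed.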


Thank you all,

Paola


From lengjingmao at gmail.com  Tue Sep  8 06:13:05 2009
From: lengjingmao at gmail.com (shaohua.fan)
Date: Tue, 8 Sep 2009 12:13:05 +0200
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
Message-ID: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>

Dear all ,

After reading the documentation and source code of Bio::Tools::RepeatMasker
in the BioPerl 1.6.0 documentation, I have a question about updating this module.

The current RepeatMasker output (.out) provides more information than the
module exposes, for example query (left), repeat (left), perc div, perc del,
and perc ins. These may be useful for some users.

I think it would be better to update this module in the latest BioPerl version.
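
To show which fields are meant, here is a minimal sketch that splits one
data line of a .out file into named values. It is not part of
Bio::Tools::RepeatMasker; the helper name parse_rm_line and the hash keys
are made up, the sample line is only illustrative, and the column order is
assumed from the usual .out layout, so please check it against your own
files:

```perl
use strict;
use warnings;

# Hypothetical helper: split one RepeatMasker .out data line into fields.
# Returns a hash reference, or nothing for header and blank lines.
sub parse_rm_line {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f >= 14 && $f[0] =~ /^\d+$/;

    my %hit = (
        sw_score    => $f[0],
        perc_div    => $f[1],
        perc_del    => $f[2],
        perc_ins    => $f[3],
        query       => $f[4],
        query_begin => $f[5],
        query_end   => $f[6],
        strand      => $f[8],
        repeat      => $f[9],
        class       => $f[10],
    );
    ($hit{query_left} = $f[7]) =~ tr/()//d;

    # Of the three repeat-position columns, one is in parentheses (the
    # "left" count). Assumption: it is last on "+" hits, first on "C" hits.
    my @pos = @f[11 .. 13];
    my ($left) = grep { /^\(/ } @pos;
    ($hit{repeat_left} = $left) =~ tr/()//d if defined $left;
    $hit{repeat_pos} = [ grep { !/^\(/ } @pos ];

    return \%hit;
}

# Illustrative line only.
my $line = ' 1320 15.6  6.2  0.0  HSU08988  6563 6781 (22462) C  MER7A   DNA/MER2_type    (0)  336   103  1';
my $hit  = parse_rm_line($line);
print join("\t", @{$hit}{qw(perc_div perc_del perc_ins query_left repeat_left)}), "\n";
```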

shaohua


From maj at fortinbras.us  Tue Sep  8 07:00:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 07:00:31 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <AD2517BD451A403D9FF258B9A07569F2@NewLife>

Chris - 
I would like to vote for option #1, since working on Bio::Nexml with
Chase gave me the opp'y to patch Bio::Phylo some (including fixing
an old "fix" of mine), so (IMO) the CPAN version of Bio::Phylo 
would benefit too. Option #2 is ok, since Bio::Nexml has to be
essentially optional for the user anyway, dependent on whether
the user is willing to install Bio::Phylo, a fairly major commitment
 (nexml.t already skips if Bio::Phylo is unavailable); I think it's 
no problem if we make that dependency more stringent. We could
have nexml.t check the svn revision directly, rather than $VERSION,
as a kludge.
cheers MAJ 
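
For option #2, the awkward part is comparing a $VERSION such as '1.7_RC9'
with a plain one such as '1.6'. A minimal sketch of one way to do it,
assuming versions look like MAJOR.MINOR with an optional _RCn suffix and that
a final release should sort after all of its release candidates; the helper
names parse_version and version_cmp are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: split '1.7_RC9' into (major, minor, rc).
# A version without an _RC suffix gets a very large rc value so that
# the final release sorts after its release candidates.
sub parse_version {
    my ($v) = @_;
    my ($major, $minor, $rc) = $v =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/
        or return;
    return ($major, $minor, defined $rc ? $rc : 1e9);
}

# Returns -1, 0 or 1, like <=>.
sub version_cmp {
    my ($x, $y) = @_;
    my @a = parse_version($x) or die "unparseable version: $x\n";
    my @b = parse_version($y) or die "unparseable version: $y\n";
    for my $i (0 .. 2) {
        my $c = $a[$i] <=> $b[$i];
        return $c if $c;
    }
    return 0;
}

for my $have ('1.6', '1.7_RC9', '1.7') {
    my $ok = version_cmp($have, '1.7_RC9') >= 0 ? 'ok' : 'too old';
    print "$have: $ok\n";
}
```

A test file could call something like this on the installed module's
$VERSION and skip the Nexml tests when the result is negative.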
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 08, 2009 12:23 AM
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> All,
> 
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
> the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
> installed by default), and the latest from Bio::Phylo svn:
> 
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
> 
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so they  
> won't be installed unless the full path is used).  Even so, nothing on  
> CPAN even works; one must use the latest Bio::Phylo SVN code.
> 
> ATM I'm just not seeing how this can be released with 1.6.1 right now,  
> unless one of the following occurs:
> 
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
> 
> As for #3 above, I think it probably belongs in a larger bioperl-phylo  
> as Mark had previously proposed.  I'm open to just about any solution.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From hlapp at gmx.net  Tue Sep  8 08:16:12 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 8 Sep 2009 08:16:12 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>

I'd suspect that the latest Bio::Phylo changes have been due for CPAN  
release anyway, so unless those are unstable that seems like the  
easiest fix to me.

If the Nexml code works against not yet stable updates to Bio::Phylo,  
it shouldn't be in a BioPerl stable release, right?

	-hilmar

On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:

> All,
>
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of  
> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
> svn:
>
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so  
> they won't be installed unless the full path is used).  Even so,  
> nothing on CPAN even works; one must use the latest Bio::Phylo SVN  
> code.
>
> ATM I'm just not seeing how this can be released with 1.6.1 right  
> now, unless one of the following occurs:
>
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
>
> As for #3 above, I think it probably belongs in a larger bioperl- 
> phylo as Mark had previously proposed.  I'm open to just about any  
> solution.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep  8 08:02:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 07:02:53 -0500
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
In-Reply-To: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
References: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
Message-ID: <74B85419-6A37-46CE-AAF3-F33013F4A058@illinois.edu>

Patches are welcome for this (or you can submit an enhancement request  
via bugzilla):

http://bugzilla.open-bio.org/

This won't be in the next point release, sorry.

chris

On Sep 8, 2009, at 5:13 AM, shaohua.fan wrote:

> Dear all ,
>
> After reading the document and original code of  
> Bio::Tools::RepeatMasker on
> bioperl document 1.6.0, I have a question about this module's update.
>
> The current repeatmasker's output(  .out) provide more information
> than which have not listed in the module, for example, query(left) ,  
> repeat
> (left), perc div, perc del, perc ins. these maybe useful for some  
> users.
>
> I think it is better to update this module in the lastest Bioperl  
> version.
>
> shaohua
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Tue Sep  8 09:15:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 08:15:31 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
	<CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
Message-ID: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>

On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:

> I'd suspect that the latest Bio::Phylo changes have been due for  
> CPAN release anyway, so unless those are unstable that seems like  
> the easiest fix to me.

My thought as well, just not sure how stable that code is right now.   
Bio::Phylo has been in RC for a while now, correct?

> If the Nexml code works against not yet stable updates to  
> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?

Right.  That should be sorted out first.

I can wait a bit longer for Rutger to respond; there are a few other
odds and ends that can be worked on in the meantime.  I would like
to get the alpha out soon and 1.6.1 in the next week or so though.

chris

> 	-hilmar
>


From maj at fortinbras.us  Tue Sep  8 10:39:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 10:39:07 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
Message-ID: <1CF993D6D3AC435CA77127466D6C072A@NewLife>

I agree with Hilmar-- I have no problem keeping it in the trunk for a while
longer, as I have an addition for dealing with arbitrary non-seq
data using the Population API sitting in bioperl-dev that's nearly
ready, but prob. not before cjf wants to get the release out.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Hilmar Lapp" <hlapp at gmx.net>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" 
<rutgeraldo at gmail.com>
Sent: Tuesday, September 08, 2009 9:15 AM
Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>
>> I'd suspect that the latest Bio::Phylo changes have been due for  CPAN 
>> release anyway, so unless those are unstable that seems like  the easiest fix 
>> to me.
>
> My thought as well, just not sure how stable that code is right now. 
> Bio::Phylo has been in RC for a while now, correct?
>
>> If the Nexml code works against not yet stable updates to  Bio::Phylo, it 
>> shouldn't be in a BioPerl stable release, right?
>
> Right.  That should be sorted out first.
>
> I can wait a bit longer for Rutger to respond; there are a few other  odds and 
> ends that can been worked on in the meantime.  I would like  to get the alpha 
> out soon and 1.6.1 in the next week or so though.
>
> chris
>
>> -hilmar
>>
>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>
>>> All,
>>>
>>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  Nexml 
>>> code.  In particular, I have tried three versions of  Bio::Phylo; the 
>>> default CPAN installation (1.6), the latest CPAN RC  (1.7_RC9, not installed 
>>> by default), and the latest from Bio::Phylo  svn:
>>>
>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>
>>> At this moment only the Bio::Phylo code from svn is working with  BioPerl's 
>>> Nexml modules.  From my local tests Bio::Phylo 1.6  appears to be missing 
>>> Bio::Phylo::Factory (all Nexml tests fail),  whereas 1.7_RC9 has some kind 
>>> of versioning issue (again, all tests  fail).  The problem: CPAN will always 
>>> install 1.6 (the others are  RC, so they won't be installed unless the full 
>>> path is used).  Even  so, nothing on CPAN even works; one must use the 
>>> latest Bio::Phylo  SVN code.
>>>
>>> ATM I'm just not seeing how this can be released with 1.6.1 right  now, 
>>> unless one of the following occurs:
>>>
>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>> 2) check for the minimal working Bio::Phylo version and safely skip  any 
>>> Nexml-related tests unless proper version is present (not easy  with a 
>>> $VERSION like '1.7_RC9'),
>>> 3) push Nexml into it's own distribution (something we were  planning on 
>>> anyway with a number of modules)
>>>
>>> As for #3 above, I think it probably belongs in a larger bioperl- phylo as 
>>> Mark had previously proposed.  I'm open to just about any  solution.
>>>
>>> chris
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From lincoln.stein at gmail.com  Tue Sep  8 10:58:25 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 8 Sep 2009 10:58:25 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <6dce9a0b0909080758q7334a7b2yc69bc86b96118927@mail.gmail.com>

Will do.

Lincoln

On Mon, Sep 7, 2009 at 4:56 PM, Chris Fields <cjfields at illinois.edu> wrote:

> All,
>
> I have updated the Changes file in bioperl-live in preparation for 1.6.1.
>  The initial release will be an alpha, 1.6.0_1 (probably landing about
> mid-week), and based on CPAN tests, etc the final 1.6.1 release next week.
>  I'll start merging changes over from trunk tonight, fixing last-minute
> bugs, etc.  I'm running my work using perl 5.10.1 (64-bit) on Mac and will
> likely run these remotely on our local linux cluster.  Win tests are gladly
> welcome (this should work on Strawberry Perl now).
>
> I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase, Robert
> Buels, Jay Hannah, Heikki, Sendu come to mind) look over the file to update
> it.  There are a few weak spots in there where I didn't make the code change
> or additions, or where a particular bug was fixed but not mentioned.  In
> particular:
>
> 1) Google Summer of Code work from Chase (Mark, Chase)
> 2) GMOD-related fixes (Lincoln, Scott)
> 3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
> 4) Tiling, Restriction refactors (Mark)
>
> Also, please make changes to AUTHORS, etc as needed.
>
> Thanks!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From cjfields at illinois.edu  Tue Sep  8 11:43:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 10:43:29 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <1CF993D6D3AC435CA77127466D6C072A@NewLife>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
Message-ID: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>

Mark

We can hold it in trunk until the next point release or we start  
splitting things off (whichever is first).

I have a little more time, though, and I'm thinking it would be a good  
idea to get the Nexml code into the wild (sooner than later) for users  
to test out.  Let's see if Rutger responds.

chris

On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:

> I agree with Hilmar-- I have no problem keeping it in the trunk for  
> a while
> longer, as I have an addition for dealing with arbitrary non-seq
> data using the Population API sitting in bioperl-dev that's nearly
> ready, but prob. not before cjf wants to get the release out.
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "Hilmar Lapp" <hlapp at gmx.net>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com 
> >
> Sent: Tuesday, September 08, 2009 9:15 AM
> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>
>
>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>
>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>> CPAN release anyway, so unless those are unstable that seems like   
>>> the easiest fix to me.
>>
>> My thought as well, just not sure how stable that code is right  
>> now. Bio::Phylo has been in RC for a while now, correct?
>>
>>> If the Nexml code works against not yet stable updates to   
>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>
>> Right.  That should be sorted out first.
>>
>> I can wait a bit longer for Rutger to respond; there are a few
>> other odds and ends that can be worked on in the meantime.  I
>> would like to get the alpha out soon and 1.6.1 in the next week or
>> so though.
>>
>> chris
>>
>>> -hilmar
>>>
>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>
>>>> All,
>>>>
>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest from  
>>>> Bio::Phylo  svn:
>>>>
>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>
>>>> At this moment only the Bio::Phylo code from svn is working with   
>>>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6   
>>>> appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue (again,  
>>>> all tests  fail).  The problem: CPAN will always install 1.6 (the  
>>>> others are  RC, so they won't be installed unless the full path  
>>>> is used).  Even  so, nothing on CPAN even works; one must use the  
>>>> latest Bio::Phylo  SVN code.
>>>>
>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>> right  now, unless one of the following occurs:
>>>>
>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>> skip  any Nexml-related tests unless proper version is present  
>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>> 3) push Nexml into its own distribution (something we were
>>>> planning on anyway with a number of modules)
>>>>
>>>> As for #3 above, I think it probably belongs in a larger
>>>> bioperl-phylo as Mark had previously proposed.  I'm open to just
>>>> about any solution.
>>>>
>>>> chris
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>


From jason at bioperl.org  Tue Sep  8 15:43:39 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 8 Sep 2009 12:43:39 -0700
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
Message-ID: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>

Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
perfectly fine to write with Bio::SeqIO::fasta but not for any of the  
rich-seq writers.
Do we think this is a bug or a feature?  The solution is to write the
PrimarySeq wrapped in a Bio::Seq object.

See this gist -- I would imagine this as additional test lines in
t/LocalDB/DBFasta.t but I don't know what we really expect?
http://gist.github.com/183169

I also notice that $seq->description & $seq->display_id don't accept a
value to set - which probably makes sense since this is a read-only
object that came from the DB, but the call basically ignores the new
value silently.  I often want to do this when I pull seqs from a
DB::Fasta db and re-format the IDs or description line, so I end up
making a new object and copying the data over.  I *think* this is
really a feature not a bug, just wanted to bring it up.
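
For anyone following along, the copy-over workaround looks roughly like
this. It is only a sketch: a Bio::PrimarySeq built by hand stands in for
the object that would come back from Bio::DB::Fasta, and the IDs and
sequence are made up.

```perl
use strict;
use warnings;
use Bio::PrimarySeq;
use Bio::Seq;
use Bio::SeqIO;

# Stand-in for a sequence fetched from Bio::DB::Fasta (made-up data).
my $pseq = Bio::PrimarySeq->new(
    -seq        => 'ATGGCGTTTAAA',
    -display_id => 'contig1',
    -desc       => 'original description',
);

# Copy the data into a fresh, writable Bio::Seq with a reformatted ID.
my $seq = Bio::Seq->new(
    -seq        => $pseq->seq,
    -display_id => 'renamed_' . $pseq->display_id,
    -desc       => $pseq->desc,
);

# Write with a rich-sequence writer. An in-memory handle keeps the
# example self-contained; a real script would pass -file instead.
my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
my $out = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
$out->write_seq($seq);
$out->close;

print $buffer;
```

Any annotation or features would have to be copied across separately;
this only carries the sequence, ID and description.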

-jason
--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  8 16:20:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 15:20:32 -0500
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
In-Reply-To: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
References: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
Message-ID: <AEE10370-F2B3-4723-9B79-23A5EBF86A51@illinois.edu>

On Sep 8, 2009, at 2:43 PM, Jason Stajich wrote:

> Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
> perfectly fine to write with Bio::SeqIO::fasta but not for any of  
> the rich-seq writers.
> Do we think this is a bug or a feature?  The solution is to write the
> PrimarySeq wrapped in a Bio::Seq object.

I think SeqIO requires any SeqI but doesn't specify anything for a  
simpler PrimarySeqI.  We could add some kind of general convenience  
wrapper in Bio::SeqIO to convert any PrimarySeqI to a requested SeqI  
class and just delegate to write_seq():

   # $seq is a PrimarySeqI obtained somehow; $out is a Bio::SeqIO writer
   $out->write_PrimarySeq($seq); # proposed method name, or somesuch

> See this gist -- I would imagine this as additional test lines in
> t/LocalDB/DBFasta.t but I don't know what we really expect?
> http://gist.github.com/183169
>
> I also notice that $seq->description & $seq->display_id don't allow
> 'set' option - which probably makes sense since this is a read-only
> object that came from the DB, but it basically silently ignores
> set.  I often do this if I pull seqs from a DB::Fasta db and
> re-format the IDs or description line.  So I end up making a new object
> and copying the data over.  I *think* this is really a feature not a
> bug, just wanted to bring it up.
>
> -jason
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org

One can already cheat and do a few things.  For instance:

$seq->{id} = 'Foo';
print $seq->display_id; # should be 'Foo'

Won't work for all of them, though, such as description().   
Personally, if one made clear that such changes aren't retained in the  
database but must be redirected as output to another file then I don't  
see a problem (other PrimarySeqI are mutable, so why not these?).

Would there be any real performance hit from making those get/set  
accessors instead of ro getters?  The class is fairly small.
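
To make the difference concrete, here is a toy sketch of the two accessor
styles. The package names are hypothetical and this is not the actual
Bio::PrimarySeq::Fasta code; it only shows a getter that quietly drops its
argument next to a combined get/set accessor.

```perl
use strict;
use warnings;

package ReadOnlySeq;    # hypothetical class, not part of BioPerl

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id} }, $class;
}

# Getter only: any argument passed in is silently ignored.
sub display_id { return $_[0]{id} }

package MutableSeq;     # hypothetical class, not part of BioPerl
our @ISA = ('ReadOnlySeq');

# Get/set accessor: stores a new value when one is supplied.
sub display_id {
    my $self = shift;
    $self->{id} = shift if @_;
    return $self->{id};
}

package main;

my $ro = ReadOnlySeq->new(id => 'orig');
$ro->display_id('Foo');                  # no error, no effect
my $rw = MutableSeq->new(id => 'orig');
$rw->display_id('Foo');                  # value is replaced

print "read-only: ", $ro->display_id, "\n";
print "get/set:   ", $rw->display_id, "\n";
```

The extra work in the get/set version is one test on @_ per call, which is
why I would not expect much of a hit, though I have not benchmarked it.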

chris


From lelbourn at science.mq.edu.au  Mon Sep  7 03:52:04 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Mon, 7 Sep 2009 17:52:04 +1000
Subject: [Bioperl-l] subsection of genbank file
Message-ID: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>

Hi All,

Is there a method or methodology that will produce a fully fledged Seq  
object with all the associated metadata given a start and end  
position? To clarify, I create a sequence object from a genbank file:


****
my $io  = Bio::SeqIO->new(as per usual);

my $seqobj = $io->next_seq();
****
I now want:

my $sub_seqobj = $seqobj between 300 and 2000

where $sub_seqobj is a Seq object (which I appreciate is an  
'aggregate' of objects) too. The "trunc" method only returns a  
PrimarySeq object which lacks all the annotation etc. I've previously  
done this task by iterating through feature by feature and parsing out  
what I needed, but thought there might be a more elegant approach...
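
One candidate worth trying is Bio::SeqUtils->trunc_with_features, if the
installed BioPerl provides it. The sketch below uses a small made-up
sequence with two features instead of a GenBank file, and the coordinates
are arbitrary; as far as I understand it, only features lying entirely
inside the requested range are carried over.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqUtils;

# Made-up 40-base sequence standing in for a GenBank record.
my $seqobj = Bio::Seq->new(-seq => 'ACGT' x 10, -display_id => 'demo');

# Two features: one inside the region of interest, one outside it.
$seqobj->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start => 5, -end => 12, -primary_tag => 'gene'));
$seqobj->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start => 30, -end => 38, -primary_tag => 'gene'));

# Truncate to positions 3..20 and keep the features that fit.
my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 3, 20);

my @features = $sub_seqobj->get_SeqFeatures;
printf "%d bases, %d feature(s) kept\n",
    $sub_seqobj->length, scalar @features;
```

The kept features should be re-expressed relative to the new sequence, but
check the coordinates on real data before relying on that.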


Regards,
Liam Elbourne.


From alpapan at googlemail.com  Thu Sep 10 17:14:11 2009
From: alpapan at googlemail.com (Alexie Papanicolaou)
Date: Thu, 10 Sep 2009 22:14:11 +0100
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable
 end is float
Message-ID: <1252617251.6680.16.camel@alexie-desktop>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090910/f222c627/attachment-0001.pl>

From maj at fortinbras.us  Thu Sep 10 23:52:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 10 Sep 2009 23:52:27 -0400
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln ->
	Bio::Locatable end is float
In-Reply-To: <1252617251.6680.16.camel@alexie-desktop>
References: <1252617251.6680.16.camel@alexie-desktop>
Message-ID: <D2C2357D7A81478B965996CF6DDD4AF2@NewLife>

Hi Alexie--
I am either responsible for this weirdness, or have fixed it in
an unreleased version. Anyway,  can you please make a bug
report at http://bugzilla.bioperl.org, and include some relevant
code and real data, and I will have a look.
Thanks a lot- Mark
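
In the meantime, one generic way to drop a single repeated warning without
redirecting all of STDERR is a local __WARN__ handler. This is a plain-Perl
sketch, not a BioPerl feature: it assumes the message is emitted through
Perl's warn (not checked for this BioPerl version), the helper name is
made up, and the warn calls stand in for the get_aln() loop.

```perl
use strict;
use warnings;

# Run $code, dropping warnings that match $pattern and passing all other
# warnings on to STDERR. Returns the number of warnings dropped.
sub without_warnings_matching {
    my ($pattern, $code) = @_;
    my $dropped = 0;
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        if ($msg =~ $pattern) {
            $dropped++;
            return;
        }
        print STDERR $msg;
    };
    $code->();
    return $dropped;
}

my $dropped = without_warnings_matching(
    qr/Bio::LocatableSeq::end\(\)/,
    sub {
        # Stand-ins for the warnings raised while building alignments.
        warn "Overriding value [565] with value 565.333333333333 "
           . "for Bio::LocatableSeq::end().\n";
        warn "an unrelated warning\n";
    },
);

print "suppressed $dropped warning(s)\n";
```

Because the handler is local, normal warning behaviour returns as soon as
the helper exits.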
----- Original Message ----- 
From: "Alexie Papanicolaou" <alpapan at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 10, 2009 5:14 PM
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable end 
is float


> Hello all,
>
> I get the following warning when parsing a fasty34 HSP using Bio::Search
> and then trying to get the alignment using get_aln
>
> MSG: In sequence CONTIG residue count gives end value
> 565.333333333333.
> Overriding value [565] with value 565.333333333333 for
> Bio::LocatableSeq::end().
> MAEMFKIGDLVWAKMKGFSPWPGLVSNPTKDLKRPTSKKSAQQ/C/VFFLGTNNYAWIEEANIKPYFEYRDRLVKSNKSGAFKDALDAIEEYIKNNGAKFDDPDAEFNRLRESLAEKKESKPKQRKEKRPAHDDNSAKSPKKVRTNSVEADKESVRADSPILSNHSPRKGPASTLLERPTTIVRPLDDSQD
> STACK
> Bio::LocatableSeq::end /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:196
> STACK
> Bio::LocatableSeq::new /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:140
> STACK
> Bio::Search::HSP::FastaHSP::get_aln 
> /usr/local/share/perl/5.8.8/Bio/Search/HSP/FastaHSP.pm:174
>
> The frameshifts (/ and \ ) are causing this recalculation of length to a
> float (which is a bit weird) but is not fatal for my program. Is this
> intentional?
>
> My immediate problem is actually the warning message itself - which is
> quite annoying if you have hundreds of such sequences... any way to turn
> them off short of commenting out the line in LocatableSeq.pm?
> (redirecting STDERR wouldn't be desirable for a production script).
>
> many thanks
> alexie
>
>
> -- 
> Alexie Papanicolaou
> Richard ffrench-Constant group
> CEC-Biology
> Univ. Exeter in Cornwall
> Penryn
> TR10 9EZ
> United Kingdom


From gmodhelp at googlemail.com  Fri Sep 11 12:40:43 2009
From: gmodhelp at googlemail.com (Dave Clements, GMOD Help Desk)
Date: Fri, 11 Sep 2009 09:40:43 -0700
Subject: [Bioperl-l] CVS to SVN Conversion, 2009/09/15
In-Reply-To: <71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
References: <71ee57c70908241615w6f82abb6p25b0744e8f5fb006@mail.gmail.com>
	<71ee57c70909110935w2147628cq6e6984feb544e6b9@mail.gmail.com>
	<71ee57c70909110936g34612cf0g5a9d83aeee4e0efd@mail.gmail.com>
	<71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
Message-ID: <71ee57c70909110940y921b1dxfec278422d31be7f@mail.gmail.com>

Hello all,

This is a heads up that GMOD (in the form of Rob Buels) will be moving
its SourceForge source code repository from CVS to SVN on September
15, 2009.

If you have checked out and modified any code from that repository,
please commit your updates before 3am, Eastern US, on September 15.

Some important bits:
* All projects will be frozen in CVS and will remain available from CVS.
* No new updates will be allowed in CVS.
* All projects will be moved to Subversion.
* Inactive projects will be moved to a separate archival directory.

See http://gmod.org/wiki/CVS_to_Subversion_Conversion for full details
and a list of active and inactive projects.

Thanks,

Dave C
--
* Please keep responses on the list!
* Was this helpful? Let us know at http://gmod.org/wiki/Help_Desk_Feedback


From jayoung at fhcrc.org  Fri Sep 11 21:11:00 2009
From: jayoung at fhcrc.org (Janet Young)
Date: Fri, 11 Sep 2009 18:11:00 -0700
Subject: [Bioperl-l] tree splice remove nodes
Message-ID: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>

Hi,

I'm having a problem in a script that I'm hoping someone can help me  
figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
object, and it looks like it worked fine.

However, I'm also trying to keep a separate copy of the original  
(unpruned) tree in a different object but that second object seems to  
get pruned as well.

Here's my tree, stored in a file called testtree2.nwk:

(((A,(B,b)),C),D,E);

---------------------------------------
Here's my script:

#!/usr/bin/perl

use warnings;
use strict;
use Bio::TreeIO;

my $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
while (my $tree = $treeIO->next_tree() ) {

       print "\nfound a tree\n\n";
       my @originalleaves = $tree->get_leaf_nodes();
       foreach my $originalleaf (@originalleaves) {
              print "original tree has node with id " . $originalleaf->id() . "\n";
       }

       my $tree2 = $tree;

       my @remove = ("D","E");
       print "\nremoving nodes @remove\n\n";

       $tree2->splice(-remove_id => \@remove);
       my @leaves2 = $tree2->get_leaf_nodes();
       foreach my $leaf2 (@leaves2) {
              print "after removing tree2 has node with id " . $leaf2->id() . "\n";
       }

       print "\n";

       my @originalleavesafter = $tree->get_leaf_nodes();
       foreach my $leaf3 (@originalleavesafter) {
              print "after removing original tree has node with id " . $leaf3->id() . "\n";
       }

}

---------------------------------------


And here's my output:

found a tree

original tree has node with id A
original tree has node with id B
original tree has node with id b
original tree has node with id C
original tree has node with id D
original tree has node with id E

removing nodes D E

after removing tree2 has node with id A
after removing tree2 has node with id B
after removing tree2 has node with id b
after removing tree2 has node with id C

after removing original tree has node with id A
after removing original tree has node with id B
after removing original tree has node with id b
after removing original tree has node with id C


-------------------------

I want to splice the specified nodes out of $tree2 and leave $tree
untouched, but both $tree and $tree2 seem to be affected by the splice
operation. Am I failing to understand something about
references/dereferencing?  I'm not sure if I just haven't figured this
out right or if it's a bug.  If it looks like a bug let me know and
I'll post it to bugzilla.

thanks in advance for any advice,

Janet

-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From maj at fortinbras.us  Fri Sep 11 22:00:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 11 Sep 2009 22:00:53 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
Message-ID: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>

Hi Janet-
The trouble here is that 
$tree2 = $tree
doesn't create an independent copy of the entire
tree data structure. So, $tree2 and $tree
essentially point to the same thing. 
The easiest way to get two independent copies 
is probably to read the file twice:

$treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
$tree = $treeIO->next_tree;
$treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
$tree2 = $treeIO->next_tree;

which will create two copies. This is a little kludgy, but 
unfortunately, there doesn't seem to be any easy way to 
rewind the TreeIO object. 
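
To see the aliasing described at the top of this message in isolation,
here is a plain-Perl sketch. A nested hash stands in for the tree and the
helper subs are made up; it shows that assignment copies only the
reference, while a deep copy (Storable's dclone here) is independent.
Whether a given deep-copy module copes with a real Bio::Tree::Tree object
is a separate question.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A nested hash standing in for a tree with five leaves.
my $tree = {
    id       => 'root',
    children => [ map { +{ id => $_ } } qw(A B C D E) ],
};

# Comma-separated leaf ids, for easy comparison.
sub leaf_ids {
    my ($t) = @_;
    return join ',', map { $_->{id} } @{ $t->{children} };
}

# Remove the named leaves in place.
sub remove_ids {
    my ($t, @ids) = @_;
    my %drop = map { $_ => 1 } @ids;
    $t->{children} = [ grep { !$drop{ $_->{id} } } @{ $t->{children} } ];
    return $t;
}

my $alias = $tree;            # copies the reference, not the data
my $copy  = dclone($tree);    # independent deep copy

remove_ids($copy, qw(D E));
print "after pruning the copy:  ", leaf_ids($tree), "\n";

remove_ids($alias, 'E');
print "after pruning the alias: ", leaf_ids($tree), "\n";
```

Pruning the deep copy leaves the original alone; pruning through the alias
changes the original, which is what happened with $tree2 = $tree.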

When you want a copy of a complex object, generally
you need to "clone" it, and there are a variety of modules
you can use to create clones. [It's probably worth adding
a clone() method to TreeFunctionsI--maybe I'll do that.]
Get the module Clone from CPAN and do

use Clone qw(clone);
....
$tree2 = clone($tree);
...

hope this helps- cheers 
MAJ


From cjfields at illinois.edu  Sat Sep 12 00:12:06 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 11 Sep 2009 23:12:06 -0500
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <5BE22FC3-06F3-4D31-BB73-8F2C49D46A03@illinois.edu>

On Sep 11, 2009, at 9:00 PM, Mark A. Jensen wrote:

> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two  
> independent copies is probably to read the file twice:
>
> $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
> $tree = $treeIO->next_tree;
> $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
> $tree2 = $treeIO->next_tree;
>
> which will create two copies. This is a little kludgy, but  
> unfortunately, there doesn't seem to be any easy way to rewind the  
> TreeIO object.

You can rewind the filehandle if it's seekable:

my $fh = $treeIO->_fh;
seek($fh,0,0); # or something like that...

Don't use sysseek (doesn't work with buffered IO).
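
Here is the rewind idea on a plain filehandle, as a self-contained sketch.
An in-memory handle holding the Newick string stands in for the file, and
nothing here goes through TreeIO, so any state TreeIO keeps on top of the
handle is not covered.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# In-memory handle standing in for testtree2.nwk.
my $newick = "(((A,(B,b)),C),D,E);\n";
open my $fh, '<', \$newick or die "cannot open in-memory handle: $!";

my $first = <$fh>;    # reads the only line
my $past  = <$fh>;    # undef: nothing left to read

# Rewind to the start with the buffered seek (not sysseek).
seek($fh, 0, SEEK_SET) or die "seek failed: $!";

my $again = <$fh>;    # the same line is available again

print "first read:  $first";
print "second read: $again";
close $fh;
```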

> When you want a copy of a complex object, generally you need to
> "clone" it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone()
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do
>
> use Clone qw(clone);
> ....
> $tree2 = clone($tree);
> ...
>
> hope this helps- cheers MAJ

This normally works with bioperl objects, just not sure about Tree  
(might be worth testing out).

chris


From bix at sendu.me.uk  Sat Sep 12 04:33:22 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 12 Sep 2009 09:33:22 +0100
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <4AAB5CD2.1040903@sendu.me.uk>

Mark A. Jensen wrote:
> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two 
> independent copies is probably to read the file twice:
> 
> $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
> $tree = $treeIO->next_tree;
> $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
> $tree2 = $treeIO->next_tree;
> 
> which will create two copies. This is a little kludgy, but 
> unfortunately, there doesn't seem to be any easy way to rewind the 
> TreeIO object.
> When you want a copy of a complex object, generally you need to
> "clone" it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone()
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do

 From my comments in Bio/Tree/TreeFunctionsI.pm:

Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
faults at end of script.

TreeFunctionsI.pm already has the _clone() method. I suppose you could 
add some POD for it, rename it clone() and update the methods that call 
the private method to call the public version instead, Mark.

Janet: just clone your tree object with:
my $tree2 = $tree->_clone();


From maj at fortinbras.us  Sat Sep 12 07:37:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 07:37:37 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <4AAB5CD2.1040903@sendu.me.uk>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
	<4AAB5CD2.1040903@sendu.me.uk>
Message-ID: <1A0B867B64B347A3B23A2F19EAA2A720@NewLife>

Done-- thanks Sendu. I made _clone alias clone, to keep 
from rocking anyone's boat. 
Janet- definitely do  $tree2 = $tree->_clone.



From adlai at refenestration.com  Sat Sep 12 11:18:02 2009
From: adlai at refenestration.com (adlai burman)
Date: Sat, 12 Sep 2009 17:18:02 +0200
Subject: [Bioperl-l] Servers
Message-ID: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>

Can anyone suggest a hosting or server provider that actually has  
Bioperl installed?

Thanks,

Adlai


From maj at fortinbras.us  Sat Sep 12 12:45:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 12:45:35 -0400
Subject: [Bioperl-l] Servers
In-Reply-To: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
References: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
Message-ID: <127343EFA5EF4F7CB756586A1B0B210E@NewLife>

I have a public Amazon machine; see http://fortinbras.us/bioperl-max
cheers MAJ


From hartzell at alerce.com  Sat Sep 12 21:35:44 2009
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 12 Sep 2009 18:35:44 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
Message-ID: <19116.19568.26115.542911@already.dhcp.gene.com>


It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
functionally identical.  They seem to trickle down to the same place,
and walking through these two requests yields almost identical HTTP
requests:

  $db->get_Seq_by_version('J00522.1')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n

  $db->get_Seq_by_acc('J00522')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n

The only difference that I can see is that they index into different
sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
sections contain the same information.

I'd like a general purpose tool that does The Right Thing whether
there's a .1 on the end of an identifier or not, and am just trying to
make sure I'm not doing something troublesome.

Am I correct about the above?

While I'm at it, I think that the comment

  # note that get_Stream_by_version is not implemented

in Bio::DB::GenBank was made obsolete by whoever commented out the

  $self->throw(...)

in get_Stream_by_version in Bio::WebDBSeqI.pm.

I'll happily commit the trivial doc fix if no one shoots down the
idea. (can't help big, might as well help small...).

Thanks,

g.


From maj at fortinbras.us  Sat Sep 12 23:14:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 23:14:06 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
Message-ID: <DBDF390336FB4D8D935E2395E48207D7@NewLife>

Hi All--

[Future announcements/updates will be made on the wiki-
 http://bioperl.org/wiki/Emacs_bioperl-mode --
 put it on your watchlist...see the page for features and install
 info ]

Bioperl-mode (tar r16070) is improved:
- fancy syntax and header highlighting for pod views
- jump to .pm source from pod view (just press 'f')
- full support for multiple paths
  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
  the completion flattens the paths; if you wind up having to 
  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
  and mytweaks/Bio/Seq.pm), completion will let you choose
  the path at the prompt.
- BPMODE_PATH convenience environment 
  variable is read for the search paths
- other stuff I can't remember
- there is a unit test suite (run under Wang Liang's test.el)
  in the dev path

To do this stuff, I've backed off Emacs 21 compatibility; 
it'll bork (nicely) if you have 21. If there are "enough" complaints,
I will relent, but 22 is cool for people like me with the 
elisp disease.

Other technical issues remain; let me know and 
I'll do my best. My goal is to make this something
you can't live without. (And if you're not using
Emacs, are you really living?)

 M-x thanks

Mark


From bill at genenformics.com  Sun Sep 13 11:47:57 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Sun, 13 Sep 2009 08:47:57 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <19116.19568.26115.542911@already.dhcp.gene.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
Message-ID: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>


I would like to make a few comments about get_Seq_by_version and
get_Seq_by_acc. Although both functions use the same NCBI eUtils API, the
request is interpreted differently depending on whether the Seq_id carries
a version.

1. If the Seq_id has a version, the GenBank ID server will locate the
corresponding GI and emit the correct sequence.
2. If the Seq_id does not have a version, GBDataLoader will try to find
the latest version number for that Seq_id, which is relatively slower, and
the version number the ID server finds may NOT always be the latest.

IMHO, for both efficiency and consistency,
get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
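
For a general-purpose tool, that ordering could be applied by picking the
method from the shape of the identifier. The sketch below only chooses a
method name; the helper name and the simple patterns are my assumptions,
and real identifiers may need stricter checks.

```perl
use strict;
use warnings;

# Pick a Bio::DB::GenBank lookup method from the identifier's shape:
# all digits -> GI, trailing '.<n>' -> versioned accession, else accession.
sub genbank_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /^\d+$/;
    return 'get_Seq_by_version' if $id =~ /\.\d+$/;
    return 'get_Seq_by_acc';
}

# '12345' is a made-up GI used only to exercise the first branch.
for my $id ('J00522.1', 'J00522', '12345') {
    printf "%-9s -> %s\n", $id, genbank_method_for($id);
}

# With a live database handle the call would then be something like:
#   my $method = genbank_method_for($id);
#   my $seq    = $db->$method($id);    # $db is a Bio::DB::GenBank object
```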

Bill




From maj at fortinbras.us  Sun Sep 13 21:26:57 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 13 Sep 2009 21:26:57 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
In-Reply-To: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
References: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
Message-ID: <CCFD820881654749B1EA479B45A7EA28@NewLife>

Sorry-- just one more tweak--
the latest tar (r16073) eliminates the dependency on pod2text
entirely; source is now parsed for pod directly by an elisp function.
cheers MAJ 
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 11:14 PM
Subject: [Bioperl-l] Emacs bioperl-mode improved release


> Hi All--
> 
> [Future announcements/updates will be made on the wiki-
> http://bioperl.org/wiki/Emacs_bioperl-mode --
> put it on your watchlist...see the page for features and install
> info ]
> 
> Bioperl-mode (tar r16070) is improved:
> - fancy syntax and header highlighting for pod views
> - jump to .pm source from pod view (just press 'f')
> - full support for multiple paths
>  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
>  the completion flattens the paths; if you wind up having to 
>  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
>  and mytweaks/Bio/Seq.pm), completion will let you choose
>  the path at the prompt.
> - BPMODE_PATH convenience environment 
>  variable is read for the search paths
> - other stuff I can't remember
> - there is a unit test suite under test.el of Wang Liang
>  in the dev path
> 
> To do this stuff, I've backed off Emacs 21 compatibility; 
> it'll bork (nicely) if you have 21. If there are "enough" complaints,
> I will relent, but 22 is cool for people like me with the 
> elisp disease.
> 
> Other technical issues remain; let me know and 
> I'll do my best. My goal is to make this something
> you can't live without. (And if you're not using
> Emacs, are you really living?)
> 
> M-x thanks
> 
> Mark
> 


From neetisomaiya at gmail.com  Mon Sep 14 04:22:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 14 Sep 2009 13:52:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>

Thanks a lot. This works for me.

One more question: can you point me to where exactly, in the gene's
actual entry in Entrez Gene on the NCBI website
(http://www.ncbi.nlm.nih.gov/sites/entrez), I can find the link to the
FASTA sequence that we are retrieving here through the code?

-Neeti
Even my blood says, B positive


On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> That bit of code gave you the accession, start and end for the sequence, so you just needed to download it.
> Bio::DB::EUtilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences ?
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = Bio::DB::EntrezGene->new;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>     next unless $ann->database eq "Evidence Viewer";
>
>     # get the sequence identifier, the start, and the stop
>     my ($acc, $from, $to) = $ann->url =~
>         /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>     next unless defined $acc;   # skip links that do not match
>     print "$acc\t$from\t$to\n";
>
>     # retrieve the sequence as FASTA text
>     my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>                                            -db      => 'nucleotide',
>                                            -rettype => 'fasta');
>     $fetcher->set_parameters(-id        => $acc,
>                              -seq_start => $from,
>                              -seq_stop  => $to,
>                              -strand    => 1);
>     my $fasta = $fetcher->get_Response->content;
>     print $fasta;
> }
>
> ======================
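
The coordinate extraction in the script above can be tried on its own,
without any network access. A minimal sketch: parse_evidence_url is a
hypothetical wrapper around the same regular expression, and the URL below
is made up for illustration (it is not a real Evidence Viewer link).

```perl
use strict;
use warnings;

# Pull contig accession, start and stop out of a viewer-style URL.
# Returns an empty list when the URL does not match.
sub parse_evidence_url {
    my ($url) = @_;
    my ($contig, $from, $to) =
        $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/;
    return unless defined $contig;
    return ($contig, $from, $to);
}

# Made-up URL with the same query parameters the regex expects.
my $url = 'http://example.org/viewer?contig=NT_009237.18&gene=INS&from=1000&to=2500';
my ($contig, $from, $to) = parse_evidence_url($url);
print "$contig\t$from\t$to\n";
```

Returning an empty list on a failed match lets the caller skip such links
instead of passing undefined values on to efetch.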
>
>> -----Original Message-----
>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>> To: Smithies, Russell
>> Cc: Emanuele Osimo; bioperl-l
>> Subject: Re: [Bioperl-l] need help urgently
>>
>> I actually want the nucleotide sequence of the gene. I thought that
>> Bio::DB::EntrezGene would give me a seq_obj for an Entrez Gene id and
>> that calling $seq_obj->seq() on it would give me the actual
>> genomic nucleotide sequence of the gene. But this doesn't happen. I am
>> able to print the gene symbol using $seq_obj->display_id and to do
>> other things, but I wanted the gene nucleotide sequence.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>> > This example code from the wiki _definitely_ works:
>> >
>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>> > =========================================
>> >
>> > use strict;
>> > use Bio::DB::EntrezGene;
>> >
>> > my $id = shift or die "Id?\n"; # use a Gene id
>> >
>> > my $db = new Bio::DB::EntrezGene;
>> > $db->verbose(1); ###
>> >
>> > my $seq = $db->get_Seq_by_id($id);
>> >
>> > my $ac = $seq->annotation;
>> >
>> > for my $ann ($ac->get_Annotations('dblink')) {
>> >        if ($ann->database eq "Evidence Viewer") {
>> >                # get the sequence identifier, the start, and the stop
>> >                my ($contig,$from,$to) = $ann->url =~
>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>> >                print "$contig\t$from\t$to\n";
>> >        }
>> > }
>> >
>> > ======================================
>> >
>> > So if it doesn't work for you, there are a few things you need to check:
>> > * what version of BioPerl are you using?
>> > * are you behind a firewall?
>> > * are you using a proxy?
>> > * do you need to submit a username/password for either of the 2 above?
>> > * turn on 'verbose' messages, it may help you debug
>> >
>> >
>> > If you're still having problems, get back to me and I'll see if I can help.
>> >
>> > --Russell
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
>> >> On Behalf Of Neeti Somaiya
>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>> >> To: Emanuele Osimo; bioperl-l
>> >> Subject: Re: [Bioperl-l] need help urgently
>> >>
>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>> >> that you sent :
>> >>
>> >>
>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>> >>
>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>> >>
>> >> use Bio::DB::EntrezGene;
>> >>
>> >>     my $db = Bio::DB::EntrezGene->new;
>> >>
>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>> >>
>> >>     # or ...
>> >>
>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>> >>     while ( my $seq = $seqio->next_seq ) {
>> >>           print "id is ", $seq->display_id, "\n";
>> >>     }
>> >>
>> >> This doesn't seem to work.
>> >>
>> >>
>> >> -Neeti
>> >> Even my blood says, B positive
>> >>
>> >>
>> >>
>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>> >> > Hello,
>> >> > have you tried this?
>> >> >
>> >>
>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>> >> >
>> >> > Emanuele
>> >> >
>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I have an input list of gene names (can get gene ids from a local db
>> >> >> if required).
>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>> >> >> as to how this can be done using perl/bioperl?
>> >> >>
>> >> >> Any help will be deeply appreciated.
>> >> >>
>> >> >> Thanks.
>> >> >>
>> >> >> -Neeti
>> >> >> Even my blood says, B positive


From cavin.wardcaviness at gmail.com  Sun Sep 13 22:25:51 2009
From: cavin.wardcaviness at gmail.com (Cavin Ward-Caviness)
Date: Sun, 13 Sep 2009 22:25:51 -0400
Subject: [Bioperl-l] Beginner Script Error
Message-ID: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>

I am very new to perl and bioperl and figured I'd start learning by trying
to run a simple script to get BLAST data.  Here is the code I am trying to
run

use Bio::Perl;

$seq = get_sequence('swiss',"ROA1_HUMAN");

# uses the default database - nr in this case
$blast_result = blast_sequence($seq);

write_blast(">roa1.blast",$blast_result);

Instead of creating a file of the blast results I get the following error
message
Bio::SeqIO: swiss cannot be found.
Exception
Msg: Failed to load module Bio::SeqIO:swiss

It seems as though I may simply be missing the proper module.  I am running
bioperl 1.5.9_4 installed using the Perl Package Manager from the
instructions on the bioperl wiki page.  If I am simply missing a module
please let me know which one it is - and any other helpful modules that
someone in the bioinformatics field is likely to use.
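
One way to check whether the module named in the error is actually
installed is to try loading it directly. A minimal sketch: module_available
is a hypothetical helper written for this note, and the list of modules
tested is only a guess at what the script above needs.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # accept only plain package names, to keep the string eval safe
    return 0 unless $module =~ /^\w+(?:::\w+)*$/;
    return eval("require $module; 1") ? 1 : 0;
}

for my $m (qw(Bio::Perl Bio::SeqIO Bio::SeqIO::swiss)) {
    printf "%-20s %s\n", $m, module_available($m) ? 'found' : 'MISSING';
}
```

If Bio::SeqIO::swiss shows up as MISSING while Bio::SeqIO is found, the
installation is probably incomplete rather than the script being wrong.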

Thanks,
Cavin


From joseguillin at hotmail.com  Mon Sep 14 08:48:28 2009
From: joseguillin at hotmail.com (Jose .)
Date: Mon, 14 Sep 2009 13:48:28 +0100
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print
	$jcmatrix->print_matrix; 
Message-ID: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign.


My code is basically a modification of the code I found at http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, and is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                            -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                  -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

>A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

>B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

>C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

>D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
          print $jcmatrix;
instead of
          print $jcmatrix->print_matrix;
I get the memory reference, as normal===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.
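
Perl reports "unblessed reference" when a method is called on a plain
reference, and ARRAY(0x...) is how a plain array reference prints, so
$jcmatrix here holds an array reference rather than an object. A small
sketch, using Scalar::Util from core Perl, of checking what a variable holds
before calling a method on it. The helper describe and the class name
My::Matrix are made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Say what kind of value we were handed.
sub describe {
    my ($thing) = @_;
    return 'undef'        unless defined $thing;
    return 'plain scalar' unless ref $thing;
    return 'object of class ' . blessed($thing) if blessed($thing);
    return 'unblessed ' . ref($thing) . ' reference';
}

print describe([ 1, 2, 3 ]), "\n";               # like Jose's $jcmatrix
print describe(bless {}, 'My::Matrix'), "\n";    # a real object
print describe(undef), "\n";
```

Only the second case can take a method call such as ->print_matrix.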

Thank you very much in advance.

Jose G.



From maj at fortinbras.us  Mon Sep 14 13:00:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 14 Sep 2009 13:00:24 -0400
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
Message-ID: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ


From jason at bioperl.org  Mon Sep 14 13:54:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 14 Sep 2009 10:54:55 -0700
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
Message-ID: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>

Yeah, it seems like more of a BioPerl problem. It's possible that the
older code didn't recognize 'jukes-cantor', but you can try the
abbreviation 'jc'. Better to just upgrade, though!
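
For reference, the Jukes-Cantor distance itself is d = -(3/4) ln(1 - (4/3) p),
where p is the fraction of compared sites that differ. A minimal stand-alone
sketch of that formula follows. It is not the BioPerl implementation:
jc_distance is a hypothetical helper, and skipping every column that has a
gap in either sequence is an assumption made here.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned sequences of equal length.
# Columns where either sequence has a gap ('-') are skipped.
# Returns undef when the distance is undefined (no comparable columns,
# or p >= 0.75, where the logarithm has no real value).
sub jc_distance {
    my ($s1, $s2) = @_;
    die "sequences must be aligned to the same length\n"
        unless length($s1) == length($s2);

    my ($compared, $mismatches) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my $x = uc(substr($s1, $i, 1));
        my $y = uc(substr($s2, $i, 1));
        next if $x eq '-' or $y eq '-';
        $compared++;
        $mismatches++ if $x ne $y;
    }
    return undef unless $compared;
    return 0     unless $mismatches;

    my $p = $mismatches / $compared;
    return undef if $p >= 0.75;
    return -0.75 * log(1 - 4 * $p / 3);
}

# Toy sequences, not taken from the alignment in this thread.
printf "%.4f\n", jc_distance('ACGTACGT', 'ACGTACGA');
```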

This isn't the cause of the problem, but I would also encourage use of
Bio::Matrix::IO for printing the matrix (use the 'write_matrix'
function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

> Hi Jose--
> I don't get any problem with your script as written. You should  
> upgrade to
> BioPerl 1.6 and try again.
> The "unblessed reference" is $jcmatrix. It may be undef for some  
> reason.
> MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Mon Sep 14 15:34:52 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 14 Sep 2009 15:34:52 -0400
Subject: [Bioperl-l] Beginner Script Error
In-Reply-To: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
References: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
Message-ID: <deaa866a0909141234p55341bcbhd4f713551180fed4@mail.gmail.com>

On 9/13/09, Cavin Ward-Caviness <cavin.wardcaviness at gmail.com> wrote:

> $seq = get_sequence('swiss',"ROA1_HUMAN");

Well, I haven't looked at the documentation or the source, but the
working code I have that does something similar is:
             # database options include: Swissprot, EMBL, GenBank and RefSeq
            $seq_object = get_sequence('swissprot', $seqname);

I think the database names have to match specific strings but may not
need to be case-sensitive.  The seqnames also tend to be specific to a
database's format, so my "general" fetch function catches exceptions and
then tries other databases if, for example, the name looks like a PDB
identifier.  I'm not sure whether there is a library function that
fetches a "general" sequence based on the sequence name format.
Presumably one could do something like this with some kind of
"prioritized" list of databases to go through, e.g. GenBank, EMBL,
SwissProt, RefSeq, PDB, JDB, JGI, Broad, NCBI, C. elegans, Drosophila,
Yeast, and other organism-specific databases.

It might be nice if there were a "general" BioPerl function that would
do this based on sequence name format, locality (fetch from the nearest
database), and up-to-dateness.  Ultimately one might like to have a kind
of sequence "rsync" function of the form UpdateSequence(SeqName, prefDb,
last-update-date, update-size, update-md5sum, ...) which would perform
inexpensive network-based updates for gene sets of interest.  I'm
presuming that many sequence entries in active databases undergo
periodic updates, and thus one might be interested in weekly or monthly
"local" db updates.
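
The "catch exceptions, then try the next database" idea can be sketched
roughly as follows. This is not my actual function: fetch_first is a
hypothetical helper, and the two fetchers are stand-ins that only simulate
a failure and a success. Real ones might wrap get_sequence() calls.

```perl
use strict;
use warnings;

# Try each [label, coderef] fetcher in order; return the first success
# as (label, result), or an empty list if every fetcher fails.
sub fetch_first {
    my ($name, @fetchers) = @_;
    for my $f (@fetchers) {
        my ($label, $code) = @$f;
        my $seq = eval { $code->($name) };   # trap exceptions
        return ($label, $seq) if defined $seq;
        warn "$label failed for $name\n";    # note the failure, keep going
    }
    return;
}

# Stand-in fetchers (hypothetical), listed in priority order.
my @fetchers = (
    [ swissprot => sub { die "no such entry\n" } ],
    [ genbank   => sub { "SEQUENCE-FOR-$_[0]" } ],
);

my ($source, $seq) = fetch_first('ROA1_HUMAN', @fetchers);
print "$source: $seq\n";
```

Keeping the database list as data makes the priority order easy to change
without touching the loop.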

Robert


From robert.bradbury at gmail.com  Tue Sep 15 04:05:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Tue, 15 Sep 2009 04:05:22 -0400
Subject: [Bioperl-l] Genome scanning questions/strategies
Message-ID: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>

I have several applications which require scanning multiple genomes.  In
some cases I can get away with scanning the protein sequences; in other
cases I need to scan the mRNA or, in the worst case, the DNA sequences
themselves.  I have most of the available genomes on my hard drive, but
where they are incomplete or undergo frequent revisions I may need to
interface through the GenBank | Ensembl | JGI (or other?) databases.

Some of the applications are basic counting statistics:
1) How many proteins?
2) How many amino acids in the proteins?
3) What are the species-specific codon frequencies?
4) What fraction of the genome is ncRNA, junk DNA, etc.?
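
As an illustration of the codon-frequency count in item 3, here is a
minimal pure-Perl sketch. codon_counts is a hypothetical helper, the input
is a toy string rather than real data, and it assumes the sequence is
already a coding sequence in frame from its first base.

```perl
use strict;
use warnings;

# Count codons in a coding sequence, reading in frame from position 0.
# A trailing partial codon is ignored.
sub codon_counts {
    my ($cds) = @_;
    my %count;
    $cds = uc $cds;
    for (my $i = 0; $i + 3 <= length($cds); $i += 3) {
        $count{ substr($cds, $i, 3) }++;
    }
    return \%count;
}

my $counts = codon_counts('ATGGCCATGTAA');   # toy sequence
my $total  = 0;
$total += $_ for values %$counts;

# codon, count, relative frequency
printf "%s\t%d\t%.2f\n", $_, $counts->{$_}, $counts->{$_} / $total
    for sort keys %$counts;
```

For whole genomes the same loop would run over each CDS and add into one
shared hash.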

Other applications involve some functional analysis, e.g. find all specified
protein domains of interest (presumably some HMM matching or equivalent),
find all signal sequences (nuclear targeting, mitochondrial targeting, ER
targeting, etc.), find all mRNA restriction enzyme cut sites, etc.

Questions are:
1) Are there "remote" functions that use genome center "supercomputers"
(other than say Remote Blast) that can be used for some of these purposes
and are interfaced in some way to BioPerl?
2) Will I incur genome center wrath by running all my queries "remotely"
(i.e. I do the computing, but they handle the database retrieval & network
distribution)?  If not, what is a good "max query frequency"? [I'm on a DSL
line, so I can't push most servers very hard from an I/O standpoint.]

Finally, is there any "archive of experience" documenting the
information-system limits of various bioinformatics applications?
I.e. for I/O requirements and/or CPU requirements, is: BLAST <
HMM-domain-searching < Inter-genome-signal-scanning/matching?  This relates
to the question of when home-based bioinformaticians need to begin
considering switching from DSL to cable to FiOS, and/or whether
1/3/4/6/8-core machines or clusters can handle the workload.

Thank you,
Robert Bradbury


From neetisomaiya at gmail.com  Tue Sep 15 04:29:02 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 15 Sep 2009 13:59:02 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
	<764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
Message-ID: <764978cf0909150129s69817921j82a9ca112aefe7ae@mail.gmail.com>

When I use Bio::DB::EntrezGene and EUtilities, the accession and
sequence returned for a gene are those of the second accession
listed under "Genome Reference Consortium Human Build 37 Primary
Assembly". For example, for Entrez Gene id 3630 the code returns
accession NT_009237.18, but I actually want the sequence of
the first accession, i.e. NC_000011.9.
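
In RefSeq naming, NC_ accessions are complete genomic molecules such as
chromosomes, and NT_ accessions are contigs. A minimal sketch of preferring
the chromosome record, assuming the candidate accessions have already been
collected somehow (the code in this thread only yields the NT_ one).
prefer_chromosome is a hypothetical helper, not a BioPerl function.

```perl
use strict;
use warnings;

# Prefer a RefSeq chromosome record (NC_ prefix) when one is available;
# otherwise fall back to the first accession given.
sub prefer_chromosome {
    my @accs = @_;
    my ($nc) = grep { /^NC_\d+/ } @accs;
    return defined $nc ? $nc : $accs[0];
}

print prefer_chromosome('NT_009237.18', 'NC_000011.9'), "\n";
```

Note that positions on NT_ and NC_ records use different coordinate
systems, so a from/to pair taken from one cannot be reused on the other.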

Please let me know how I could get that. Any help will be great.

-Neeti
Even my blood says, B positive


On Mon, Sep 14, 2009 at 1:52 PM, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> Thanks a lot. This works for me.
>
> I need one more help, can you point me to where exactly can we find
> the link to this FASTA sequence, that we are retrieving here through
> the code, in its actual entry in Entrez Gene in the NCBI website
> (http://www.ncbi.nlm.nih.gov/sites/entrez)
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
>> That bit of code gave you the accession, start and end for the sequence so you just needed to download it.
>> Bio::DB::EUtilities can do that for you.
>>
>> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>>
>>
>>
>> --Russell
>>
>> ==================
>> #!perl -w
>>
>> use strict;
>> use Bio::DB::EntrezGene;
>> use Bio::DB::EUtilities;
>>
>> no warnings 'deprecated';
>>
>> my $id = shift or die "Id?\n"; # use a Gene id
>>
>> my $db = new Bio::DB::EntrezGene;
>> #$db->verbose(1);
>> my $seq = $db->get_Seq_by_id($id);
>>
>> my $ac = $seq->annotation;
>>
>> for my $ann ($ac->get_Annotations('dblink')) {
>>        if ($ann->database eq "Evidence Viewer") {
>>                # get the sequence identifier, the start, and the stop
>>                my ($acc,$from,$to) = $ann->url =~
>>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>                print "$acc\t$from\t$to\n";
>>
>>                # retrieve the sequence
>>                my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>>                                                       -db      => 'nucleotide',
>>                                                       -rettype => 'fasta');
>>                $fetcher->set_parameters(-id        => $acc,
>>                                         -seq_start => $from,
>>                                         -seq_stop  => $to,
>>                                         -strand    => 1);
>>                my $seq = $fetcher->get_Response->content;
>>                print $seq;
>>
>>        }
>> }
>>
>> ======================
>>
>>> -----Original Message-----
>>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>>> To: Smithies, Russell
>>> Cc: Emanuele Osimo; bioperl-l
>>> Subject: Re: [Bioperl-l] need help urgently
>>>
>>> I actually want the nucleotide sequence of the gene. I thought the
>>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>>> then the seq method on that $seq_obj->seq() will give me the actual
>>> genomic nucleotide sequence of the gene. But this doesn't happen. I am
>>> able to print gene symbol using $seq_obj->display_id and able to do
>>> other things, but I wanted the gene nucleotide sequence.
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>>> > This example code from the wiki _definitely_ works:
>>> >
>>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> > =========================================
>>> >
>>> > use strict;
>>> > use Bio::DB::EntrezGene;
>>> >
>>> > my $id = shift or die "Id?\n"; # use a Gene id
>>> >
>>> > my $db = new Bio::DB::EntrezGene;
>>> > $db->verbose(1); ###
>>> >
>>> > my $seq = $db->get_Seq_by_id($id);
>>> >
>>> > my $ac = $seq->annotation;
>>> >
>>> > for my $ann ($ac->get_Annotations('dblink')) {
>>> >        if ($ann->database eq "Evidence Viewer") {
>>> >                # get the sequence identifier, the start, and the stop
>>> >                my ($contig,$from,$to) = $ann->url =~
>>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>> >                print "$contig\t$from\t$to\n";
>>> >        }
>>> > }
>>> >
>>> > ======================================
>>> >
>>> > So if it doesn't work for you, there are a few things you need to check:
>>> > * what version of BioPerl are you using?
>>> > * are you behind a firewall?
>>> > * are you using a proxy?
>>> > * do you need to submit username/password for either of the 2 above
>>> > * turn on 'verbose' messages, it may help you debug
>>> >
>>> >
>>> > If you're still having problems, get back to me and I'll see if I can help.
>>> >
>>> > --Russell
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>>> >> To: Emanuele Osimo; bioperl-l
>>> >> Subject: Re: [Bioperl-l] need help urgently
>>> >>
>>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>>> >> that you sent :
>>> >>
>>> >>
>>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>>> >>
>>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>>> >>
>>> >> use Bio::DB::EntrezGene;
>>> >>
>>> >>     my $db = Bio::DB::EntrezGene->new;
>>> >>
>>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>>> >>
>>> >>     # or ...
>>> >>
>>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>>> >>     while ( my $seq = $seqio->next_seq ) {
>>> >>           print "id is ", $seq->display_id, "\n";
>>> >>     }
>>> >>
>>> >> This doesn't seem to work.
>>> >>
>>> >>
>>> >> -Neeti
>>> >> Even my blood says, B positive
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>> >> > Hello,
>>> >> > have you tried this?
>>> >> >
>>> >>
>>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>>> >> >
>>> >> > Emanuele
>>> >> >
>>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I have an input list of gene names (can get gene ids from a local db
>>> >> >> if required).
>>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>>> >> >> as to how this can be done using perl/bioperl?
>>> >> >>
>>> >> >> Any help will be deeply appreciated.
>>> >> >>
>>> >> >> Thanks.
>>> >> >>
>>> >> >> -Neeti
>>> >> >> Even my blood says, B positive
>>> >> >> _______________________________________________
>>> >> >> Bioperl-l mailing list
>>> >> >> Bioperl-l at lists.open-bio.org
>>> >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> >> >
>>> >> >
>>> > =======================================================================
>>> > Attention: The information contained in this message and/or attachments
>>> > from AgResearch Limited is intended only for the persons or entities
>>> > to which it is addressed and may contain confidential and/or privileged
>>> > material. Any review, retransmission, dissemination or other use of, or
>>> > taking of any action in reliance upon, this information by persons or
>>> > entities other than the intended recipients is prohibited by AgResearch
>>> > Limited. If you have received this message in error, please notify the
>>> > sender immediately.
>>> > =======================================================================
>>> >
>>
>


From cjfields at illinois.edu  Tue Sep 15 15:07:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 15 Sep 2009 14:07:40 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
	<4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
Message-ID: <DE7BC2E3-F983-447F-86AD-34BFEA3B232A@illinois.edu>

I don't see an update to Bio::Phylo on CPAN yet, so I'm assuming we  
will leave Nexml off the 1.6.1 alpha for now.  I'll likely be  
releasing it later today or tomorrow to CPAN.
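
For reference, the version-check option raised earlier in this thread (skip the Nexml tests unless a working Bio::Phylo is installed) could be approached roughly as below. This is only a sketch in plain Perl: the helper names parse_version and compare_versions and the minimum version '1.7' are assumptions for illustration, not anything defined by BioPerl or Bio::Phylo. It treats a string like '1.7_RC9' as a release candidate that sorts before the final '1.7'.

```perl
use strict;
use warnings;

# Sentinel RC number for final releases, so they sort after any RC
my $FINAL = 1e9;

# Split a version string such as '1.6' or '1.7_RC9' into
# (major, minor, rc). Hypothetical helper, not part of any library.
sub parse_version {
    my ($v) = @_;
    my ($major, $minor, $rc) = $v =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/
        or die "Unrecognized version string: $v\n";
    return ($major, $minor, defined $rc ? $rc : $FINAL);
}

# Return -1, 0 or 1, like the <=> operator
sub compare_versions {
    my @a = parse_version($_[0]);
    my @b = parse_version($_[1]);
    for my $i (0 .. 2) {
        my $cmp = $a[$i] <=> $b[$i];
        return $cmp if $cmp;
    }
    return 0;
}

# Assumed minimum; the thread does not say which release will work
my $minimum = '1.7';
for my $installed ('1.6', '1.7_RC9', '1.7') {
    my $ok = compare_versions($installed, $minimum) >= 0;
    printf "%-8s %s\n", $installed,
        $ok ? 'run Nexml tests' : 'skip Nexml tests';
}
```

In a test file the same comparison could decide whether to skip the Nexml tests, with the installed version read from $Bio::Phylo::VERSION.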

chris

On Sep 8, 2009, at 10:43 AM, Chris Fields wrote:

> Mark
>
> We can hold it in trunk until the next point release or we start  
> splitting things off (whichever is first).
>
> I have a little more time, though, and I'm thinking it would be a  
> good idea to get the Nexml code into the wild (sooner than later)  
> for users to test out.  Let's see if Rutger responds.
>
> chris
>
> On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:
>
>> I agree with Hilmar-- I have no problem keeping it in the trunk for  
>> a while
>> longer, as I have an addition for dealing with arbitrary non-seq
>> data using the Population API sitting in bioperl-dev that's nearly
>> ready, but prob. not before cjf wants to get the release out.
>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu>
>> To: "Hilmar Lapp" <hlapp at gmx.net>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com>
>> Sent: Tuesday, September 08, 2009 9:15 AM
>> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>>
>>
>>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>>
>>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>>> CPAN release anyway, so unless those are unstable that seems  
>>>> like  the easiest fix to me.
>>>
>>> My thought as well, just not sure how stable that code is right  
>>> now. Bio::Phylo has been in RC for a while now, correct?
>>>
>>>> If the Nexml code works against not yet stable updates to   
>>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>>
>>> Right.  That should be sorted out first.
>>>
>>> I can wait a bit longer for Rutger to respond; there are a few
>>> other odds and ends that can be worked on in the meantime. I
>>> would like to get the alpha out soon and 1.6.1 in the next week
>>> or so though.
>>>
>>> chris
>>>
>>>> -hilmar
>>>>
>>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest  
>>>>> from Bio::Phylo  svn:
>>>>>
>>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>>
>>>>> At this moment only the Bio::Phylo code from svn is working  
>>>>> with  BioPerl's Nexml modules.  From my local tests Bio::Phylo  
>>>>> 1.6  appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue  
>>>>> (again, all tests  fail).  The problem: CPAN will always install  
>>>>> 1.6 (the others are  RC, so they won't be installed unless the  
>>>>> full path is used).  Even  so, nothing on CPAN even works; one  
>>>>> must use the latest Bio::Phylo  SVN code.
>>>>>
>>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>>> right  now, unless one of the following occurs:
>>>>>
>>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>>> skip  any Nexml-related tests unless proper version is present  
>>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>>> 3) push Nexml into its own distribution (something we were
>>>>> planning on anyway with a number of modules)
>>>>>
>>>>> As for #3 above, I think it probably belongs in a larger  
>>>>> bioperl- phylo as Mark had previously proposed.  I'm open to  
>>>>> just about any  solution.
>>>>>
>>>>> chris
>>>>
>>>> -- 
>>>> ===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>>
>>>>
>>>>
>>>
>>>
>>
>


From cjfields at illinois.edu  Wed Sep 16 08:55:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 07:55:56 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
Message-ID: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>

Bill, George,

It's worth clarifying the docs on these and adding a TODO for them  
(and test cases!), but I tend to agree.  I believe, re: version, we  
can possibly use Bio::DB::SeqVersion to grab the right one, but it'll  
need further investigation.

As for generic accession w/o version, efetch does support it but it  
does have problems (pulling up more than one sequence in rare cases,  
for instance).
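
A general-purpose wrapper of the kind George asks about could start from something like the following. It is a minimal sketch, assuming that a purely numeric identifier is a GI and that a trailing ".N" marks a versioned accession; the helper name lookup_method_for is hypothetical. It only chooses among the existing get_Seq_by_gi, get_Seq_by_version and get_Seq_by_acc methods and does not touch the network.

```perl
use strict;
use warnings;

# Pick the Bio::DB::GenBank lookup method that matches the shape of
# an identifier. Hypothetical helper; the patterns are assumptions.
sub lookup_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /^\d+$/;        # bare GI number
    return 'get_Seq_by_version' if $id =~ /^\w+\.\d+$/;   # accession.version
    return 'get_Seq_by_acc';                              # unversioned accession
}

for my $id ('174375', 'J00522.1', 'J00522') {
    printf "%-10s %s\n", $id, lookup_method_for($id);
}

# With a real database handle this would be used as:
#   my $method = lookup_method_for($id);
#   my $seq    = $db->$method($id);
```

Preferring the GI or versioned form when it is available follows Bill's ordering above; the unversioned lookup stays as the fallback.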

chris

On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:

> I would like to make a few comments about get_Seq_by_version and
> get_Seq_by_acc. Although both functions use the same NCBI eUtils  
> API, they
> are interpreted differently for a Seq_id with version or without  
> version.
>
> 1. If the Seq_id has a version, GenBank ID server will locate
> corresponding GI and emit the correct sequence.
> 2. If the Seq_id does not have a version, GBDataLoader will try to find
> the latest version number for that Seq_id, which is relatively slower, and
> the version number the ID server finds may NOT always be the latest.
>
> IMHO, for both efficiency and consistency,
> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>
> Bill
>
>
>>
>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>> functionally identical.  They seem to trickle down to the same place
>> and walking through these two requests yields almost identical http
>> requests:
>>
>>  $db->get_Seq_by_version('J00522.1')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>
>>  $db->get_Seq_by_acc('J00522')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>
>> The only difference that I can see is that they index into different
>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>> sections contain the same information.
>>
>> I'd like a general purpose tool that does The Right Thing whether
>> there's a .1 on the end of an identifier or not, and am just trying  
>> to
>> make sure I'm not doing something troublesome.
>>
>> Am I correct about the above?
>>
>> While I'm at it, I think that the comment
>>
>>  # note that get_Stream_by_version is not implemented
>>
>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>
>>  $self->throw(...)
>>
>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>
>> I'll happily commit the trivial doc fix if no one shoots down the
>> idea. (can't help big, might as well help small...).
>>
>> Thanks,
>>
>> g.
>>
>
>


From cjfields at illinois.edu  Wed Sep 16 09:22:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 08:22:00 -0500
Subject: [Bioperl-l] Genome scanning questions/strategies
In-Reply-To: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
References: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
Message-ID: <8674BA8B-ACCC-4C7D-989E-3532C0659A3F@illinois.edu>

On Sep 15, 2009, at 3:05 AM, Robert Bradbury wrote:

> I have several applications which require scanning multiple genomes,  
> in some
> cases I can get away with scanning the protein sequences, in other  
> cases I
> need to scan the mRNA, or in the worst case the DNA sequences  
> themselves.  I
> have most of the available genomes on my hard drive but in cases  
> where they
> are not complete or undergo frequent revisions, I may need to  
> interface
> through the Genbank | Ensembl | JGI (or other?) databases.
>
> Some of the applications are basic counting statistics:
> 1) How many proteins?
> 2) How many amino acids in the proteins?
> 3) What are the species specific codon frequencies in the codons?
> 4) What fraction of the genome is ncRNA, junk DNA, etc.?
>
> Other applications involve some functional analysis, e.g. find all  
> specified
> protein domains of interest (presumably some HMM matching or  
> equivalent),
> find all signal sequences (nuclear targeting, mitochondrial  
> targeting, ER
> targeting, etc.), find all mRNA restriction enzyme cut sites, etc..
>
> Questions are:
> 1) Are there "remote" functions that use genome center  
> "supercomputers"
> (other than say Remote Blast) that can be used for some of these  
> purposes
> and are interfaced in some way to BioPerl?

Re: remote tasks, there are a few tools for that.  See  
Bio::Tools::Analysis modules for ones that access remote servers, or  
the HOWTO:

http://www.bioperl.org/wiki/HOWTO:Simple_web_analysis

Setting up modules for these services can be risky, though, as we have  
no control over the continued evolution of the remote servers in  
question.  For instance, we had a set of Pise modules (around 100 I  
think) for remotely accessing services at any Pise server; however,  
these are now obsolete in favor of Mobyle.  I have long thought of  
setting something up to interface with either that service or Galaxy  
(which may be a more stable alternative), just haven't had the time.

Re databases: we have access to NCBI, EMBL, UniProt, and many others.   
NCBI eutils are available via Bio::DB::EUtilities.  You can use the  
Ensembl perl API for accessing Ensembl (including Compara and others),  
and Mark Jensen added Bio::DB::HIV for accessing HIV database  
information at LANL HIV Sequence Database.  These were all working  
with bioperl 1.6 last I tried (ensembl's API is separate and available  
from their website).

We don't have much beyond that, primarily b/c most other centers are  
very particular when queried remotely and will block IPs that spam  
their servers w/o an adequate timeout.  That's completely  
understandable from a webadmin perspective (think: possible denial of  
service attack).

> 2) Will I incur genome center wrath by running all my queries  
> "remotely"
> (i.e. I do the computing, but they handle the database retrieval &
> network
> distribution)?  If not, what is a good "max query frequency"? [I'm  
> on a DSL
> line, so I can't push most servers very hard from an I/O standpoint.]

You may if you abuse a specified timeout.  UCSC and NCBI both have  
been known to block IPs, but the timeout is quite different between  
the two (NCBI just reduced theirs to three queries per second, whereas  
I last heard UCSC was once per 30 seconds).

The best thing to do is check the documentation for the site in  
question or contact the webadmin to see if there is a requested  
timeout period.
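
Whatever limit a site asks for, it is easy to enforce on the client side. The following is a minimal sketch in plain Perl, assuming a limit of three requests per second as mentioned for NCBI above; make_throttle is a hypothetical helper and the ids are only placeholders. The remote call itself is left as a comment.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Return a closure that sleeps just long enough to keep successive
# calls at least $min_interval seconds apart.
sub make_throttle {
    my ($min_interval) = @_;
    my $last_call;
    return sub {
        if (defined $last_call) {
            my $wait = $min_interval - (time() - $last_call);
            sleep($wait) if $wait > 0;
        }
        $last_call = time();
        return;
    };
}

my $throttle = make_throttle(1 / 3);    # at most three requests per second
for my $id (qw(J00522 NC_000011 NT_009237)) {    # placeholder ids
    $throttle->();
    # the remote query for $id would go here
    print "querying $id\n";
}
```

For a site with a much longer requested interval, such as the 30 seconds mentioned for UCSC, only the argument to make_throttle changes.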

> Finally, is there any "archive of experience" documenting the various
> information systems limitations on various bioinformatics  
> applications?
> I.e. for I/O requirements and/or CPU requirements, is: BLAST <
> HMM-domain-searching < Inter-genome-signal-scanning/matching?   
> Relates to
> the question of when home based bioinformaticians need to begin  
> considering
> switching from DSL to Cable to FIOS and/or 1/3/4/6/8 core machines/ 
> clusters
> can handle the workload.
>
> Thank you,
> Robert Bradbury

On that I'm not sure, but I would tend to think they don't want you  
taxing their local servers so there probably is some prioritization of  
tasks.

 From my perspective, if I were a home-based bioinformatician I would  
look seriously at cloud computing for most high-end tasks (Mark has  
even set up one for bioperl, bioperl-max).  It has a cost but it's  
very reasonable considering the cost of setting up a local cluster,  
maintenance and repairs, etc.  In fact, we have been putting serious  
thought into testing that direction instead of putting money into  
another high-cost local cluster, which is obsolete in, say, 3-4 years,  
or when we're getting Blue Waters in a couple years.

chris


From jajams at utu.fi  Wed Sep 16 06:04:18 2009
From: jajams at utu.fi ("Joonas Jämsen")
Date: Wed, 16 Sep 2009 13:04:18 +0300
Subject: [Bioperl-l] problem with a script
Message-ID: <fb44a91e1ccd0.4ab0e252@utu.fi>

Hi,

I'm trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."


#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use Bio::SearchIO
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $in->next_result ) {
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp-evalue<=10 ) {
             while ( my $hsp = $hit->next_hsp ) {
                 print $hit->accession(), "\n";
         }
     }
 }

Could someone tell me what is wrong?

Thanks.


From maj at fortinbras.us  Wed Sep 16 11:18:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 11:18:26 -0400
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <A9C32C43FB5C46FD9DC320A4DD325104@NewLife>

Hi Joonas-- 

Put a semicolon after "use Bio::SearchIO" in line 2.
If that doesn't work, then the error suggests that $searchio is undefined 
because the parser failed for some reason.
You could try
 my $searchio = Bio::SearchIO->new(-format  => 'hmmer',
                                   -file    => '/wrk/xxxx/hmm/hmmsearch_nr.out',
                                   -verbose => 1);
to get more detailed error messages, they may direct you to the issue.

cheers MAJ

----- Original Message ----- 
From: "Joonas Jämsen" <jajams at utu.fi>
To: "bioperl list" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 16, 2009 6:04 AM
Subject: [Bioperl-l] problem with a script


> Hi,
>
> Im trying to run the script below and I get an error: "Can't call method 
> "next_result" on an undefined value at parser.pl line 5."
>
>
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp-evalue<=10 ) {
>             while ( my $hsp = $hit->next_hsp ) {
>                 print $hit->accession(), "\n";
>         }
>     }
> }
>
> Could someone tell me what is wrong?
>
> Thanks.
>
>
>
> 


From Kevin.M.Brown at asu.edu  Wed Sep 16 11:16:51 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 16 Sep 2009 08:16:51 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <1A4207F8295607498283FE9E93B775B4063D4CB6@EX02.asurite.ad.asu.edu>

That's because the variable $in isn't defined, just like the error says. You are setting $searchio to be your input object, but not using it.

#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use strict; #<-- this helps to find those pesky undeclared variables
use Bio::SearchIO;
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $searchio->next_result ) { # <-- changed this line
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp = $hit->next_hsp ) {
             # keep only HSPs with an E-value of 10 or less
             next unless $hsp->evalue <= 10;
             print $hit->accession(), "\n";
         }
     }
}


Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University  

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> "Joonas Jämsen"
> Sent: Wednesday, September 16, 2009 3:04 AM
> To: bioperl list
> Subject: [Bioperl-l] problem with a script
> 
> Hi,
> 
> Im trying to run the script below and I get an error: "Can't 
> call method "next_result" on an undefined value at parser.pl line 5."
> 
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   
> => '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>      while ( my $hit = $result->next_hit ) {
>          while ( my $hsp-evalue<=10 ) {
>              while ( my $hsp = $hit->next_hsp ) {
>                  print $hit->accession(), "\n";
>          }
>      }
>  }
> 
> Could someone tell me what is wrong?
> 
> Thanks.
> 
> 
> 


From rmb32 at cornell.edu  Wed Sep 16 11:05:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 08:05:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <4AB0FEAC.50104@cornell.edu>

1.) You need to use strict.  Always have use strict at the top of your 
code.  That would have caught this error.
2.) The proximate problem here is that your searchio object is called 
$searchio, while you are calling $in->next_result.  You want 
$searchio->next_result instead.
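
To see what strict does with this kind of slip, here is a small sketch in plain Perl. It compiles a snippet with eval so that the error can be printed instead of aborting the script; the snippet is a stand-in for the real parser code, not the original script, and needs no BioPerl.

```perl
use strict;
use warnings;

# A stand-in for the original script: $searchio is declared,
# but the code then uses the undeclared variable $in.
my $snippet = q{
    use strict;
    my $searchio = 'parser object';
    my $result   = $in;
    1;
};

# eval returns undef and sets $@ when the snippet fails to compile
my $ok    = eval $snippet;
my $error = $@;

print "strict caught the typo:\n$error" unless $ok;
```

Without strict the same snippet compiles quietly, $in is simply undef, and the problem only shows up later as "Can't call method ... on an undefined value".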

Rob

Joonas Jämsen wrote:
> Hi,
> 
> Im trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."
> 
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>      while ( my $hit = $result->next_hit ) {
>          while ( my $hsp-evalue<=10 ) {
>              while ( my $hsp = $hit->next_hsp ) {
>                  print $hit->accession(), "\n";
>          }
>      }
>  }
> 
> Could someone tell me what is wrong?
> 
> Thanks.
> 
> 
> 


From bill at genenformics.com  Wed Sep 16 13:22:56 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Wed, 16 Sep 2009 10:22:56 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
Message-ID: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>


>
> As for generic accession w/o version, efetch does support it but it
> does have problems (pulling up more than one sequence in rare cases,
> for instance).
>

This is probably because NCBI ID servers are not completely synchronized
or are in the process of synchronization. get_Seq_by_acc is not as safe as
other functions.

Bill

>
> On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:
>
>> I would like to make a few comments about get_Seq_by_version and
>> get_Seq_by_acc. Although both functions use the same NCBI eUtils
>> API, they
>> are interpreted differently for a Seq_id with version or without
>> version.
>>
>> 1. If the Seq_id has a version, GenBank ID server will locate
>> corresponding GI and emit the correct sequence.
>> 2. If the Seq_id does not have a version, GBDataLoader will try to find
>> the latest version number for that Seq_id, which is relatively slower, and
>> the version number the ID server finds may NOT always be the latest.
>>
>> IMHO, for both efficiency and consistency,
>> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>>
>> Bill
>>
>>
>>>
>>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>>> functionally identical.  They seem to trickle down to the same place
>>> and walking through these two requests yields almost identical http
>>> requests:
>>>
>>>  $db->get_Seq_by_version('J00522.1')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>>
>>>  $db->get_Seq_by_acc('J00522')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>>
>>> The only difference that I can see is that they index into different
>>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>>> sections contain the same information.
>>>
>>> I'd like a general purpose tool that does The Right Thing whether
>>> there's a .1 on the end of an identifier or not, and am just trying
>>> to
>>> make sure I'm not doing something troublesome.
>>>
>>> Am I correct about the above?
>>>
>>> While I'm at it, I think that the comment
>>>
>>>  # note that get_Stream_by_version is not implemented
>>>
>>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>>
>>>  $self->throw(...)
>>>
>>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>>
>>> I'll happily commit the trivial doc fix if no one shoots down the
>>> idea. (can't help big, might as well help small...).
>>>
>>> Thanks,
>>>
>>> g.
>>>
>>
>>
>
>


From cjfields at illinois.edu  Wed Sep 16 13:29:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 12:29:40 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
	<6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
Message-ID: <B293F929-5714-4840-8FAD-7366F7C36137@illinois.edu>


On Sep 16, 2009, at 12:22 PM, bill at genenformics.com wrote:

>
>>
>> As for generic accession w/o version, efetch does support it but it
>> does have problems (pulling up more than one sequence in rare cases,
>> for instance).
>>
>
> This is probably because NCBI ID servers are not completely  
> synchronized
> or are in the process of synchronization. get_Seq_by_acc is not as  
> safe as
> other functions.
>
> Bill

Right, but unfortunately it's necessary as the default in most cases  
is to grab/display the accession, not the UID.  For instance, BLAST  
output must be specifically flagged to display the GI.

This is an instance where documentation would be a good idea to  
indicate the problem.  I think I have done that but I'll double-check.

chris


From rmb32 at cornell.edu  Wed Sep 16 15:04:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:04:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB1356D.4050307@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi>
Message-ID: <4AB136B0.6050304@cornell.edu>

You should also 'use warnings' at the top of all code.  That would have 
caught THIS error.

You are missing a comma after ....nr.out'
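
A minimal sketch of why the missing comma leads to "Could not open 0", for anyone hitting the same message. The path in $path is a placeholder, not the real one from the script. Without the comma, Perl parses the file name and -verbose as a subtraction of two non-numeric strings, so -file ends up with the value 0, and 'use warnings' reports the non-numeric operands:

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical path standing in for the real report file.
my $path = '/tmp/example_hmmsearch.out';

# Note the missing comma between $path and -verbose: Perl reads this
# as the subtraction  $path - 'verbose'.
my @args = (-format => 'hmmer', -file => $path -verbose => 1);

print "arguments: @args\n";
print "warning: $_" for @warnings;
```

With the comma restored, -file receives the path and -verbose receives 1.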

Rob

Joonas Jämsen wrote:
> Thanks. Im still getting errors. I have no idea what the error means. It 
> says:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Could not open 0: No such file or directory
> STACK: Error::throw
> STACK: Bio::Root::Root::throw 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/Root.pm:357 
> 
> STACK: Bio::Root::IO::_initialize_io 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:310 
> 
> STACK: Bio::Root::IO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:223 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:145 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:177 
> 
> STACK: parser.pl:7
> -----------------------------------------------------------
> 
> And the code im using seems ok now:
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> 
> use strict;
> use Bio::SearchIO;
> 
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out' -verbose=>1);
> while ( my $result = $searchio->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp = $hit->evalue<=10 ) {
>                 while ( my $hsp = $hit->next_hsp ) {
>                         print $hit->accession(), "\n";
>             }
>         }
>     }
> }
> 
> -J.
> 
> Robert Buels wrote:
>> 1.) You need to use strict.  Always have use strict at the top of your 
>> code.  That would have caught this error.
>> 2.) The proximate problem here is that your searchio object is call 
>> $searchio, while you are calling $in->next_result.  You want 
>> $searchio->next_result instead.
>>
>> Rob
>>
>> Joonas Jämsen wrote:
>>> Hi,
>>>
>>> Im trying to run the script below and I get an error: "Can't call 
>>> method "next_result" on an undefined value at parser.pl line 5."
>>>
>>>
>>> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
>>> use Bio::SearchIO
>>> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => 
>>> '/wrk/xxxx/hmm/hmmsearch_nr.out');
>>> while ( my $result = $in->next_result ) {
>>>      while ( my $hit = $result->next_hit ) {
>>>          while ( my $hsp-evalue<=10 ) {
>>>              while ( my $hsp = $hit->next_hsp ) {
>>>                  print $hit->accession(), "\n";
>>>          }
>>>      }
>>>  }
>>>
>>> Could someone tell me what is wrong?
>>>
>>> Thanks.
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>


-- 
Robert Buels
Bioinformatics Analyst, Sol Genomics Network
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca, NY  14853
Tel: 503-889-8539
rmb32 at cornell.edu
http://www.sgn.cornell.edu


From rmb32 at cornell.edu  Wed Sep 16 15:23:27 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:23:27 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB13864.6070707@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi> <4AB136B0.6050304@cornell.edu>
	<4AB13864.6070707@utu.fi>
Message-ID: <4AB13B2F.5060502@cornell.edu>

Your report may not have accessions, try using name() instead of 
accession().


From abhishek.vit at gmail.com  Wed Sep 16 16:13:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 16:13:33 -0400
Subject: [Bioperl-l] About FASTQ parser
Message-ID: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>

Hi Chris

I remember seeing a recent email about the new BioPerl FASTQ parser. Is it
part of the BioPerl 1.6 distribution? I installed it, and based on the doc
here (http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
I am a bit lost.

I see two modules mentioned there: Bio::SeqIO::fastq and
Bio::Seq::Quality. Do both return the same data, with the latter
being faster?

No offence meant to any developer, but small examples in the HOWTOs
help a lot.

The current example (copied below) is not working. I guess it is based
on a previous version of the code.

# grabs the FASTQ parser, specifies the Illumina variant
  my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
                           -file      => 'mydata.fq');


My basic requirement is to read each record in a FASTQ file and split it
into header, read, and quality.


Thanks,
-Abhi


From abhishek.vit at gmail.com  Wed Sep 16 17:41:50 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 17:41:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>

Hi All

I cannot think of a smart way to do sequence matching that allows a
user-defined number of mismatches.

For example:

Given the sequence AGCT, a read is considered a match to the reference if
at most one base position (1, 2, 3 or 4) is mismatched, where the
mismatched base can be any of [ACGTN]. The possible matches allowing a
mismatch at position 1 would be:
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each position.

Is there a nice regular expression for this? One way I could think of is to
generate all the possible tags for a given sequence and then do the
matching, but that would be computationally expensive for a long dataset.
Is there a neat method?

Thanks,
-Abhi


From maj at fortinbras.us  Wed Sep 16 18:33:00 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 22:33:00 +0000
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <W403682148491321253140380@webmail21>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ


>-----Original Message-----
>From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
>Sent: Wednesday, September 16, 2009 05:41 PM
>To: bioperl-l at lists.open-bio.org
>Subject: [Bioperl-l] Allowing One error in Sequence matching
>
>Hi All
>
>I am not able to think of smart way to do sequence matching allowing
>userdefined number of mismatches.
>
>For eg:
>
>Given Sequence : AGCT will be considered a match to reference if any
>one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
>the possible matches could be
>
>This is for position 1.
>AGCT
>GGCT
>CGCT
>TGCT
>NGCT
>and likewise for each position.
>
>any nice regular expression. One way that I could think was to
>generate all the possible tags for a given sequence and then do the
>matching. It will be a computationally expensive for long dataset .
>Any neat method ?
>
>Thanks,
>-Abhi
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From Russell.Smithies at agresearch.co.nz  Wed Sep 16 19:06:45 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 11:06:45 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>

How about chunk it into overlapping words, skip if >2 N, then regex?

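my $seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";

my $motif = "ACGT";
# Build "[AN][CN][GN][TN]": each position accepts its own base or N.
my $w = '';
foreach (split //, $motif) {$w .= "[${_}N]"}
my $re = qr/$w/;

# The lookahead captures every overlapping 4-base word.
foreach ($seq =~ /(?=(\w{4}))/g){
  next if tr/N// >= 2;    # skip words containing two or more Ns
  print "$_\n" if /$re/;
}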


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
> Sent: Thursday, 17 September 2009 9:42 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Allowing One error in Sequence matching
> 
> Hi All
> 
> I am not able to think of smart way to do sequence matching allowing
> userdefined number of mismatches.
> 
> For eg:
> 
> Given Sequence : AGCT will be considered a match to reference if any
> one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
> the possible matches could be
> 
> This is for position 1.
> AGCT
> GGCT
> CGCT
> TGCT
> NGCT
> and likewise for each position.
> 
> any nice regular expression. One way that I could think was to
> generate all the possible tags for a given sequence and then do the
> matching. It will be a computationally expensive for long dataset .
> Any neat method ?
> 
> Thanks,
> -Abhi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From maj at fortinbras.us  Wed Sep 16 18:30:50 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 18:30:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <1B8182A0898B452D80EA6035A178B7CE@NewLife>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ
----- Original Message ----- 
From: "Abhishek Pratap" <abhishek.vit at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 16, 2009 5:41 PM
Subject: [Bioperl-l] Allowing One error in Sequence matching


> Hi All
>
> I am not able to think of smart way to do sequence matching allowing
> userdefined number of mismatches.
>
> For eg:
>
> Given Sequence : AGCT will be considered a match to reference if any
> one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
> the possible matches could be
>
> This is for position 1.
> AGCT
> GGCT
> CGCT
> TGCT
> NGCT
> and likewise for each position.
>
> any nice regular expression. One way that I could think was to
> generate all the possible tags for a given sequence and then do the
> matching. It will be a computationally expensive for long dataset .
> Any neat method ?
>
> Thanks,
> -Abhi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From abhishek.vit at gmail.com  Wed Sep 16 21:39:13 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 21:39:13 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>

Hi Russell

Thanks for the quick reply. However, I am not following the code
or the reasoning behind it.

Will this work for matching AGCT to ACCT | ANCT | AACT? It didn't give
me the expected output when I ran it. I am more interested in
understanding the logic.

It would be great if you could expand a bit more.


Also, if I do it the brute-force way, as suggested to me by a friend, how
will that scale?

@dna1=split(//,$a);
@dna2=split(//,$b);
$x=0;
for($i=0;$i<@dna1;$i++){
        if ($dna1[$i] ne $dna2[$i]){
                        $x++;
        }
}

if($x<=1){
        print "RESULT: your sequence is true\n";
}

else { print " RESULT: your sequence is false\n";}
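
A compact variant of the same brute-force comparison, as a sketch only: it assumes both sequences have the same length, and the helper name mismatches() is made up for illustration. XOR-ing two strings gives a NUL byte wherever they agree, so counting the non-NUL bytes gives the number of mismatches without splitting into arrays:

```perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
# (Hypothetical helper, not part of BioPerl.)
sub mismatches {
    my ($s, $t) = @_;
    die "sequences must have equal length\n"
        unless length($s) == length($t);
    # String XOR yields "\0" wherever the characters agree.
    my $xor = $s ^ $t;
    # tr with /c counts every byte that is NOT "\0".
    return ($xor =~ tr/\0//c);
}

for my $read (qw(AGCT ACCT ANCT AACT TTCT)) {
    my $n = mismatches('AGCT', $read);
    printf "%s: %d mismatch(es), %s\n",
        $read, $n, ($n <= 1 ? 'match' : 'no match');
}
```

The cost is linear in the read length per comparison, so the total work grows with (number of reads) x (read length).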

Thanks,
-Abhi


On Wed, Sep 16, 2009 at 7:06 PM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> How about chunk it into overlapping words, skip if >2 N, then regex?
>
> $seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";
>
> $motif = "ACGT";
> foreach (split //, $motif) {$w .= "[${_}N]"}
>
> foreach ($seq =~ /(?=(\w{4}))/g){
>   next if tr/N/N/ >= 2;
>   print "$_\n" if  eval "/$w/" ;
> }
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
>> Sent: Thursday, 17 September 2009 9:42 a.m.
>> To: bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] Allowing One error in Sequence matching
>>
>> Hi All
>>
>> I am not able to think of smart way to do sequence matching allowing
>> userdefined number of mismatches.
>>
>> For eg:
>>
>> Given Sequence : AGCT will be considered a match to reference if any
>> one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
>> the possible matches could be
>>
>> This is for position 1.
>> AGCT
>> GGCT
>> CGCT
>> TGCT
>> NGCT
>> and likewise for each position.
>>
>> any nice regular expression. One way that I could think was to
>> generate all the possible tags for a given sequence and then do the
>> matching. It will be a computationally expensive for long dataset .
>> Any neat method ?
>>
>> Thanks,
>> -Abhi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From Russell.Smithies at agresearch.co.nz  Wed Sep 16 21:46:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 13:46:54 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>

I misread your question. My example (with the motif set to AGCT) will match NGCT, ANCT, AGNT, or AGCN with 1 mismatch (or NGNT, NGCN, ANNT, ANCN etc. with 2 mismatches).
The eval is just applying the regex built by the loop, "[AN][GN][CN][TN]", to each word.
If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.
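
One way the compiled-regex brute force could look, as a rough sketch for the one-mismatch case only. The helper names one_mismatch_regex() and find_approx() are invented for illustration. For the motif AGCT it builds the alternation .GCT|A.CT|AG.T|AGC. and wraps it in a lookahead so that overlapping windows are all reported:

```perl
use strict;
use warnings;

# Build a compiled regex matching $motif with at most one substituted
# character. (Hypothetical helper, not part of BioPerl.)
sub one_mismatch_regex {
    my ($motif) = @_;
    my @alternatives;
    for my $i (0 .. length($motif) - 1) {
        # Position $i becomes a wildcard; the rest must match literally.
        push @alternatives,
            quotemeta(substr($motif, 0, $i)) . '.'
          . quotemeta(substr($motif, $i + 1));
    }
    my $pattern = join '|', @alternatives;
    return qr/(?=($pattern))/;
}

# Return [offset, matched word] pairs for every window within one mismatch.
sub find_approx {
    my ($seq, $motif) = @_;
    my $re = one_mismatch_regex($motif);
    my @hits;
    while ($seq =~ /$re/g) {
        push @hits, [ $-[0], $1 ];   # $-[0] is the 0-based match start
    }
    return @hits;
}

for my $hit (find_approx('TTAGCTTTACCTGG', 'AGCT')) {
    print "offset $hit->[0]: $hit->[1]\n";
}
```

The number of alternatives grows quickly once more than one mismatch is allowed, so for 2 or more mismatches a direct per-window comparison is probably simpler.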


> -----Original Message-----
> From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
> Sent: Thursday, 17 September 2009 1:39 p.m.
> To: Smithies, Russell
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Allowing One error in Sequence matching
> 
> Hi Russell
> 
> Thanks for a quick reply. However I am not following the code clearly
> and the reason behind it.
> 
> Will this work for  matching AGCT  to ACCT | ANCT | AACT. It dint give
> me the expected output when I ran it. I am more interested in
> understanding the logic.
> 
> It would be great if you could expand a bit more.
> 
> 
> Also if I do it the brute force way as suggested to me by a frnd , how
> will that work in terms of scalability.
> 
> @dna1=split(//,$a);
> @dna2=split(//,$b);
> $x=0;
> for($i=0;$i<@dna1;$i++){
>         if ($dna1[$i] ne $dna2[$i]){
>                         $x++;
>         }
> }
> 
> if($x<=1){
>         print "RESULT: your sequence is true\n";
> }
> 
> else { print " RESULT: your sequence is false\n";}
> 
> Thanks,
> -Abhi
> 
> 
> On Wed, Sep 16, 2009 at 7:06 PM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
> > How about chunk it into overlapping words, skip if >2 N, then regex?
> >
> > $seq =
> "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGAT
> CTNACGCACGNATGCTACGTACG";
> >
> > $motif = "ACGT";
> > foreach (split //, $motif) {$w .= "[${_}N]"}
> >
> > foreach ($seq =~ /(?=(\w{4}))/g){
> >   next if tr/N/N/ >= 2;
> >   print "$_\n" if  eval "/$w/" ;
> > }
> >
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
> >> Sent: Thursday, 17 September 2009 9:42 a.m.
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: [Bioperl-l] Allowing One error in Sequence matching
> >>
> >> Hi All
> >>
> >> I am not able to think of smart way to do sequence matching allowing
> >> userdefined number of mismatches.
> >>
> >> For eg:
> >>
> >> Given Sequence : AGCT will be considered a match to reference if any
> >> one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
> >> the possible matches could be
> >>
> >> This is for position 1.
> >> AGCT
> >> GGCT
> >> CGCT
> >> TGCT
> >> NGCT
> >> and likewise for each position.
> >>
> >> any nice regular expression. One way that I could think was to
> >> generate all the possible tags for a given sequence and then do the
> >> matching. It will be a computationally expensive for long dataset .
> >> Any neat method ?
> >>
> >> Thanks,
> >> -Abhi
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >


From abhishek.vit at gmail.com  Wed Sep 16 23:12:20 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 23:12:20 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909162012m5b18bc78u477e15957c88a45d@mail.gmail.com>

Thanks Russell.

I think having an "approx matching" method in BioPerl would help,
especially with NGS data, where read matching with 1/2/3/4 errors is
sometimes needed.

Cheers,
-Abhi


On Wed, Sep 16, 2009 at 9:46 PM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> I misread your question, my example will match NGCT, ANCT, AGNT, or ACGN with 1 miss-match (or NGNT, NGCN, ANNT, ANCT etc with 2 miss-matches)
> The eval is just doing a regex on the match string created by the loop - "[AN][GN][CN][TN]"
> If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.
>
>
>> -----Original Message-----
>> From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
>> Sent: Thursday, 17 September 2009 1:39 p.m.
>> To: Smithies, Russell
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Allowing One error in Sequence matching
>>
>> Hi Russell
>>
>> Thanks for a quick reply. However I am not following the code clearly
>> and the reason behind it.
>>
>> Will this work for  matching AGCT  to ACCT | ANCT | AACT. It dint give
>> me the expected output when I ran it. I am more interested in
>> understanding the logic.
>>
>> It would be great if you could expand a bit more.
>>
>>
>> Also if I do it the brute force way as suggested to me by a frnd , how
>> will that work in terms of scalability.
>>
>> @dna1=split(//,$a);
>> @dna2=split(//,$b);
>> $x=0;
>> for($i=0;$i<@dna1;$i++){
>>         if ($dna1[$i] ne $dna2[$i]){
>>                         $x++;
>>         }
>> }
>>
>> if($x<=1){
>>         print "RESULT: your sequence is true\n";
>> }
>>
>> else { print " RESULT: your sequence is false\n";}
>>
>> Thanks,
>> -Abhi
>>
>>
>> On Wed, Sep 16, 2009 at 7:06 PM, Smithies, Russell
>> <Russell.Smithies at agresearch.co.nz> wrote:
>> > How about chunk it into overlapping words, skip if >2 N, then regex?
>> >
>> > $seq =
>> "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGAT
>> CTNACGCACGNATGCTACGTACG";
>> >
>> > $motif = "ACGT";
>> > foreach (split //, $motif) {$w .= "[${_}N]"}
>> >
>> > foreach ($seq =~ /(?=(\w{4}))/g){
>> >   next if tr/N/N/ >= 2;
>> >   print "$_\n" if  eval "/$w/" ;
>> > }
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
>> >> Sent: Thursday, 17 September 2009 9:42 a.m.
>> >> To: bioperl-l at lists.open-bio.org
>> >> Subject: [Bioperl-l] Allowing One error in Sequence matching
>> >>
>> >> Hi All
>> >>
>> >> I am not able to think of smart way to do sequence matching allowing
>> >> userdefined number of mismatches.
>> >>
>> >> For eg:
>> >>
>> >> Given Sequence : AGCT will be considered a match to reference if any
>> >> one base pair position #(1,2,3,4)  has a mismatch that is  [ACGTN] so
>> >> the possible matches could be
>> >>
>> >> This is for position 1.
>> >> AGCT
>> >> GGCT
>> >> CGCT
>> >> TGCT
>> >> NGCT
>> >> and likewise for each position.
>> >>
>> >> any nice regular expression. One way that I could think was to
>> >> generate all the possible tags for a given sequence and then do the
>> >> matching. It will be a computationally expensive for long dataset .
>> >> Any neat method ?
>> >>
>> >> Thanks,
>> >> -Abhi
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>


From cjfields at illinois.edu  Thu Sep 17 00:39:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 23:39:03 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
Message-ID: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>

Abhi,

The FASTQ parser hasn't been released to CPAN yet.  It is available  
via bioperl-live.  We haven't added any code yet to the HOWTO's, but  
the SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get  
you started.

Bio::Seq::Quality is the object returned via next_seq(); it can be  
queried for PHRED qual scores and other bits.  If you want to split  
things up you should call next_seq(), then generate a FASTQ output  
stream in the variant you want:

my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>fasta.file');
my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>qual.file');

while (my $seq = $in->next_seq) {
    $outfasta->write_fasta($seq);
    $outqual->write_qual($seq);
}

Note I haven't tested that yet, but it should work.  Let me know if it  
doesn't.

chris

On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I remember seeing a recent email about new bioperl fastq parser. Is it
> part of bioperl 1.6 dist. I installed one and based on the doc
> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html 
> )
> I am a bit lost.
>
> I see two methods there : using Bio::SeqIO::fastq and
> Bio::Seq::Quality. Are both same in terms of data returned and latter
> giving a scale up in speed ?
>
> This is not to offend any developer but small example/s on the HOWTO's
> helps a lot.
>
> The current example (copied below) is not working. I guess it is based
> on a previous version of code.
>
> # grabs the FASTQ parser, specifies the Illumina variant
> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>                          -file      => 'mydata.fq');
>
>
> My basic requirement is to read each read in fastq record and split it
> into header: read: quality.
>
>
> Thanks,
> -Abhi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Thu Sep 17 00:44:28 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 00:44:28 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909162144g3177f718nf239327e98bd30c2@mail.gmail.com>

Thanks for the quick info Chris.

Cheers,
-Abhi

On Thu, Sep 17, 2009 at 12:39 AM, Chris Fields <cjfields at illinois.edu> wrote:
> Abhi,
>
> The FASTQ parser hasn't been released to CPAN yet.  It is available via
> bioperl-live.  We haven't added any code yet to the HOWTO's, but the
> SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get you
> started.
>
> Bio::Seq::Quality is the object returned via next_seq(); it can be queried
> for PHRED qual scores and other bits.  If you want to split things up you
> should call next_seq(), then generate a FASTQ output stream in the variant
> you want:
>
> my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
> '>fasta.file');
> my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
> '>qual.file');
>
> while (my $seq = $in->next_seq) {
>    $outfasta->write_fasta($seq);
>    $outqual->write_qual($seq);
> }
>
> Note I haven't tested that yet, but it should work.  Let me know if it
> doesn't.
>
> chris
>
> On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:
>
>> Hi Chris
>>
>> I remember seeing a recent email about new bioperl fastq parser. Is it
>> part of bioperl 1.6 dist. I installed one and based on the doc
>>
>> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
>> I am a bit lost.
>>
>> I see two methods there : using Bio::SeqIO::fastq and
>> Bio::Seq::Quality. Are both same in terms of data returned and latter
>> giving a scale up in speed ?
>>
>> This is not to offend any developer but small example/s on the HOWTO's
>> helps a lot.
>>
>> The current example (copied below) is not working. I guess it is based
>> on a previous version of code.
>>
>> # grabs the FASTQ parser, specifies the Illumina variant
>> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>>                          -file      => 'mydata.fq');
>>
>>
>> My basic requirement is to read each read in fastq record and split it
>> into header: read: quality.
>>
>>
>> Thanks,
>> -Abhi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From amackey at virginia.edu  Thu Sep 17 06:52:31 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 17 Sep 2009 06:52:31 -0400
Subject: [Bioperl-l] Question concerning IUPAC.pm
In-Reply-To: <4AB203EF.6030107@agrar.hu-berlin.de>
References: <4AB203EF.6030107@agrar.hu-berlin.de>
Message-ID: <24c96eca0909170352h34b6a20t8648d4e097d57e1e@mail.gmail.com>

Dear Armin,

Please ask such questions on the BioPerl mailing list.

The Bio::Tools::IUPAC module does the opposite of what you want -- it takes
a sequence containing ambiguous codes (e.g. "Y") and generates all possible
combinations of unambiguous sequences (thus one sequence containing a "C"
instead of "Y", and a second sequence containing a "T" instead of "Y").

However, you can do this:

  my %lookup = Bio::Tools::IUPAC->iupac_rev_iub();

%lookup will now contain the following Perl hash:

  A    => 'A',
  T    => 'T',
  C    => 'C',
  G    => 'G',
  AC   => 'M',
  AG   => 'R',
  AT   => 'W',
  CG   => 'S',
  CT   => 'Y',
  GT   => 'K',
  ACG  => 'V',
  ACT  => 'H',
  AGT  => 'D',
  CGT  => 'B',
  ACGT => 'N',
  N    => 'N'
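
A small sketch of how such a table could be used to go from a set of bases to an ambiguity symbol. The table is typed in here from the listing in this message so that the example runs without BioPerl installed, and ambiguity_code() is a made-up helper name, not a BioPerl method. The keys are the bases in sorted order, so the input is upper-cased, de-duplicated and sorted before the lookup:

```perl
use strict;
use warnings;

# Reverse IUPAC table, copied from the listing in this message.
my %lookup = (
    A   => 'A', T   => 'T', C   => 'C', G   => 'G',
    AC  => 'M', AG  => 'R', AT  => 'W', CG  => 'S',
    CT  => 'Y', GT  => 'K',
    ACG => 'V', ACT => 'H', AGT => 'D', CGT => 'B',
    ACGT => 'N', N  => 'N',
);

# Return the ambiguity symbol for a list of bases, or undef if the
# combination is not in the table. (Hypothetical helper.)
sub ambiguity_code {
    my @bases = @_;
    my %seen;
    $seen{ uc $_ } = 1 for @bases;        # normalise case, drop duplicates
    my $key = join '', sort keys %seen;   # table keys are in sorted order
    return $lookup{$key};
}

print ambiguity_code('C', 'T'), "\n";
```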

-Aaron


On Thu, Sep 17, 2009 at 5:39 AM, Armin Schmitt <
armin.schmitt at agrar.hu-berlin.de> wrote:
>
> Dear Aaron,
>
> can I use your module IUPAC.pm to create
> ambiguity symbols?
>
> I.e. Input C,T -> output Y
>
> If yes, how can I do this? A little piece
> of code would be helpful. Otherwise,
> is there another perl module for this
> purpose?
>
> Thank you very much
>
> Armin Schmitt
>
>
> --
> Dr. Armin Schmitt
> Humboldt-Universität zu Berlin
> Department for Crop and Animal Sciences
> Invalidenstraße 42
> 10115 Berlin
> Tel.:   +49-30-2093-9074
> Fax:    +49-30-2093-6397
> E-mail: armin.schmitt at agrar.hu-berlin.de
>
>


From abhishek.vit at gmail.com  Thu Sep 17 14:16:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 14:16:33 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>

Hi Chris

I am just wondering whether the following is intentionally excluded from a
FASTQ record or is a bug.

After reading each record from a FASTQ file, the output of the
same record ( $out->write_seq($seq) ) is missing the text after
the + sign.


Eg:

@HWI-EAS397:1:1:11:252#NNNTNN/1
NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
+
DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB


PS: In our case we need the exact record to be printed out as we need
to split the fastq file into multiple fastq files based on the read
index in the @ Line. So exact output is needed to avoid conflicts with
downstream processing pipelines.
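
For what it's worth, a plain-Perl sketch of splitting records while keeping every line exactly as read. It assumes unwrapped, four-line FASTQ records, and read_fastq_record() is a made-up helper name, not part of BioPerl. The demonstration reads the record shown above from an in-memory file and writes it back unchanged:

```perl
use strict;
use warnings;

# Read one four-line FASTQ record and return its parts as a hash ref,
# or nothing at end of input. (Hypothetical helper; assumes the
# sequence and quality are not wrapped over several lines.)
sub read_fastq_record {
    my ($fh) = @_;
    my @lines;
    for (1 .. 4) {
        my $line = <$fh>;
        return unless defined $line;   # end of input or truncated record
        chomp $line;
        push @lines, $line;
    }
    my ($header, $read, $plus, $quality) = @lines;
    die "malformed record: $header\n"
        unless $header =~ /^@/ && $plus =~ /^\+/;
    return {
        header  => $header,
        read    => $read,
        plus    => $plus,      # kept verbatim, including any text after +
        quality => $quality,
    };
}

# In-memory file holding the record from this message.
my $data = join "\n",
    '@HWI-EAS397:1:1:11:252#NNNTNN/1',
    'NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT',
    '+',
    'DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB',
    '';
open my $in, '<', \$data or die "cannot open in-memory file: $!";

my $out = '';
while (my $rec = read_fastq_record($in)) {
    $out .= join("\n", @{$rec}{qw(header read plus quality)}) . "\n";
}
print $out;
```

The read index can then be taken from $rec->{header} to decide which output file each record goes to.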

Thanks,
-Abhi


On Thu, Sep 17, 2009 at 12:39 AM, Chris Fields <cjfields at illinois.edu> wrote:
> Abhi,
>
> The FASTQ parser hasn't been released to CPAN yet. ?It is available via
> bioperl-live. ?We haven't added any code yet to the HOWTO's, but the
> SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get you
> started.
>
> Bio::Seq::Quality is the object returned via next_seq(); it can be queried
> for PHRED qual scores and other bits.  If you want to split things up you
> should call next_seq(), then generate a FASTQ output stream in the variant
> you want:
>
> my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
> '>fasta.file');
> my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
> '>qual.file');
>
> while (my $seq = $in->next_seq) {
>   $outfasta->write_fasta($seq);
>   $outqual->write_qual($seq);
> }
>
> Note I haven't tested that yet, but it should work.  Let me know if it
> doesn't.
>
> chris
>
> On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:
>
>> Hi Chris
>>
>> I remember seeing a recent email about new bioperl fastq parser. Is it
>> part of bioperl 1.6 dist. I installed one and based on the doc
>>
>> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
>> I am a bit lost.
>>
>> I see two methods there : using Bio::SeqIO::fastq and
>> Bio::Seq::Quality. Are both same in terms of data returned and latter
>> giving a scale up in speed ?
>>
>> This is not to offend any developer but small example/s on the HOWTO's
>> helps a lot.
>>
>> The current example (copied below) is not working. I guess it is based
>> on a previous version of code.
>>
>> # grabs the FASTQ parser, specifies the Illumina variant
>> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>>                         -file      => 'mydata.fq');
>>
>>
>> My basic requirement is to read each read in fastq record and split it
>> into header: read: quality.
>>
>>
>> Thanks,
>> -Abhi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From cjfields at illinois.edu  Thu Sep 17 16:54:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 15:54:20 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 1 released
Message-ID: <358B9E70-84C7-42DC-A473-C2AACC18A211@illinois.edu>

All,

Just a quick note that I have released the first alpha for the 1.6.1  
point release.  I uploaded it to CPAN, so it should be migrating to  
the various servers in the next few hours or so.  In the meantime, the  
alpha can be directly downloaded using the following links (pick your  
format):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.bz2
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.gz
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.zip

If everything goes well, I'll have a more formalized release ready for
the weekend.  I will also be attempting (hopefully with some success)
to get a Windows PPM for the latest ActiveState Perl going over the
next few days.  Feedback from users trying to install BioPerl using
the latest Strawberry Perl would also be greatly appreciated.

Thanks!

chris


From cjfields at illinois.edu  Thu Sep 17 17:38:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 16:38:31 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
Message-ID: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>

After uploading the latest bioperl alpha to CPAN I noticed the size of  
the distribution archive has jumped up from ~7 MB to just over 10 MB.   
It looks like a majority of this is attributable to three data files  
for testing in t/data added after the 1.6.0 release:

gmap_f9-multiple_results.txt  (3 MB)
withrefm.906                  (2.5 MB)
1ZZ19XR301R-Alignment.tblastn (2 MB)

I'm not sure there is an easy way around the problem.  We could
attempt to reduce the file sizes, but I'm not convinced that's a
long-term solution (the test data will only get larger as more test
cases come up).

Any ideas?  Should we try to have a common biodata repo again?

chris


From rmb32 at cornell.edu  Thu Sep 17 18:04:47 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 15:04:47 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <4AB2B27F.8050800@cornell.edu>

Chris Fields wrote:
 > Any ideas?  Should we try to have a common biodata repo again?

Beyond encouraging people to keep the test data smaller (I would think 
that multiple MB in a test data file is quite excessive!), I don't think 
it's worth worrying about that much.  The stuff in bioperl needs a 
significant amount of test data, and I think that's fine.

This problem is also addressed by the ongoing effort to break things up 
into more distros, I think that will help a lot.

Rob


From hlapp at gmx.net  Thu Sep 17 18:33:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Sep 2009 18:33:34 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2B27F.8050800@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
Message-ID: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>


On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:

> I don't think it's worth worrying about that much.  The stuff in  
> bioperl needs a significant amount of test data, and I think that's  
> fine.


I'd agree with that. Storage is cheap these days. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Thu Sep 17 19:26:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:25 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>

On Sep 17, 2009, at 5:33 PM, Hilmar Lapp wrote:

> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  
>> bioperl needs a significant amount of test data, and I think that's  
>> fine.
>
> I'd agree with that. Storage is cheap these days. -hilmar

Kind of my thought as well, just a bit of a shock to see the dist.  
increase by 65% between point releases for just three test data  
files.  I may try paring those down a tad.

chris


From cjfields at illinois.edu  Thu Sep 17 19:26:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:52 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
	<be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
Message-ID: <06B0378C-312F-4F43-A99A-6F6CC1C88F61@illinois.edu>

The default format for most FASTQ parsers is to leave the extra header  
off (it increases the file size substantially).  You can add that back  
by setting quality_header():

my $out = Bio::SeqIO->new(-format => 'fastq', -file => $file,
                          -quality_header => 1);

Again, let me know if that works okay.
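
For the splitting use case in this thread, a minimal plain-Perl sketch (no
BioPerl involved) that passes each record through unchanged, including
whatever follows the + sign, and bins records by read index. It assumes
unwrapped four-line records and that the index is the text between '#' and
'/' in the @ line, as in the example header; next_fastq_record, read_index
and split_by_index are illustrative names only.

```perl
use strict;
use warnings;

# Read one four-line FASTQ record from a filehandle.
# Returns a hash ref, or undef at end of input (an incomplete trailing
# record is treated as end of input). Assumes unwrapped records.
sub next_fastq_record {
    my ($fh) = @_;
    my @lines;
    for (1 .. 4) {
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        push @lines, $line;
    }
    my ($header, $seq, $plus, $qual) = @lines;
    die "Malformed FASTQ record near '$header'\n"
        unless $header =~ /^\@/
            && $plus   =~ /^\+/
            && length($seq) == length($qual);
    return { header => $header, seq => $seq, plus => $plus, qual => $qual };
}

# Assumption: the read index sits between '#' and '/' in the @ line,
# e.g. NNNTNN in @HWI-EAS397:1:1:11:252#NNNTNN/1
sub read_index {
    my ($header) = @_;
    return $header =~ /#([^\/\s]+)/ ? $1 : 'unknown';
}

# Group the exact text of each record by read index.
# Returns a hash ref: index => array ref of record strings.
sub split_by_index {
    my ($in_fh) = @_;
    my %records_for;
    while (my $rec = next_fastq_record($in_fh)) {
        push @{ $records_for{ read_index($rec->{header}) } },
            join("\n", @{$rec}{qw(header seq plus qual)}) . "\n";
    }
    return \%records_for;
}
```

Each bin can then be printed to its own file. For large inputs it would be
better to write records out as they are read rather than hold them in memory.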

chris

On Sep 17, 2009, at 1:16 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I am just wondering whether the following is intentionally excluded from a
> FASTQ record or is a bug.
>
> After reading in each record from a FASTQ file, the output of the
> same record  (  $out->write_seq($seq)  )  is missing the line/text after
> the + sign.
>
>
>
> Eg:
>
> @HWI-EAS397:1:1:11:252#NNNTNN/1
> NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
> +
> DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB
>
>
> PS: In our case we need the exact record to be printed out as we need
> to split the fastq file into multiple fastq files based on the read
> index in the @ Line. So exact output is needed to avoid conflicts with
> downstream processing pipelines.
>
> Thanks,
> -Abhi
>
> On Thu, Sep 17, 2009 at 12:39 AM, Chris Fields  
> <cjfields at illinois.edu> wrote:
>> Abhi,
>>
>> The FASTQ parser hasn't been released to CPAN yet.  It is available  
>> via
>> bioperl-live.  We haven't added any code yet to the HOWTO's, but the
>> SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get you
>> started.
>>
>> Bio::Seq::Quality is the object returned via next_seq(); it can be  
>> queried
>> for PHRED qual scores and other bits.  If you want to split things  
>> up you
>> should call next_seq(), then generate a FASTQ output stream in the  
>> variant
>> you want:
>>
>> my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
>> '>fasta.file');
>> my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
>> '>qual.file');
>>
>> while (my $seq = $in->next_seq) {
>>   $outfasta->write_fasta($seq);
>>   $outqual->write_qual($seq);
>> }
>>
>> Note I haven't tested that yet, but it should work.  Let me know if  
>> it
>> doesn't.
>>
>> chris
>>
>> On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:
>>
>>> Hi Chris
>>>
>>> I remember seeing a recent email about new bioperl fastq parser.  
>>> Is it
>>> part of bioperl 1.6 dist. I installed one and based on the doc
>>>
>>> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
>>> I am a bit lost.
>>>
>>> I see two methods there : using Bio::SeqIO::fastq and
>>> Bio::Seq::Quality. Are both same in terms of data returned and  
>>> latter
>>> giving a scale up in speed ?
>>>
>>> This is not to offend any developer but small example/s on the  
>>> HOWTO's
>>> helps a lot.
>>>
>>> The current example (copied below) is not working. I guess it is  
>>> based
>>> on a previous version of code.
>>>
>>> # grabs the FASTQ parser, specifies the Illumina variant
>>> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>>>                         -file      => 'mydata.fq');
>>>
>>>
>>> My basic requirement is to read each read in fastq record and  
>>> split it
>>> into header: read: quality.
>>>
>>>
>>> Thanks,
>>> -Abhi
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From rmb32 at cornell.edu  Thu Sep 17 19:30:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 16:30:16 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
Message-ID: <4AB2C688.2030602@cornell.edu>

Chris Fields wrote:
> Kind of my thought as well, just a bit of a shock to see the dist. 
> increase by 65% between point releases for just three test data files.  
> I may try paring those down a tad.

Yes, those individual files are certainly excessive.


From maj at fortinbras.us  Thu Sep 17 19:36:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 19:36:09 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu><4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <EC73E6B6BD3D468E8C29D138AF64D483@NewLife>

Two of those files are my bad-- the withrefm is prob best in its entirety, since
it contains all the weird extra-site restrictions that the B:Restriction refactor
was meant to handle. The other is a tiling test file that I could probably replace
(or at least edit down)--
----- Original Message ----- 
From: "Hilmar Lapp" <hlapp at gmx.net>
To: "Robert Buels" <rmb32 at cornell.edu>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Thursday, September 17, 2009 6:33 PM
Subject: Re: [Bioperl-l] Size of BioPerl distribution


>
> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  bioperl 
>> needs a significant amount of test data, and I think that's  fine.
>
>
> I'd agree with that. Storage is cheap these days. -hilmar
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Thu Sep 17 22:13:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 22:13:37 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>

t/data compresses from 21M to 9M. We could ship with 

$ tar -czf data.tar.gz data
$ rm -rf data

and do the following in Bio::Root::Test, if we're willing to expect 
Archive::Tar and IO::Zlib :

use vars qw( $ARCHIVE );
$ARCHIVE = "data.tar.gz";
...

sub test_input_file {
    # if it's there, fine
    my $fn =  File::Spec->catfile('t', 'data', @_);
    return $fn if -e $fn;
    # if it's not, expand the archive
    my $arch = File::Spec->catfile('t', $ARCHIVE);
    Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
    my $tar = Archive::Tar->new($arch);
    Bio::Root::Root->throw ("Can't extract test data archive") unless $tar;
    $tar->extract;
    return $fn if -e $fn;
    return;
}
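
One thing to watch with the idea above: Archive::Tar's extract() writes
relative to the current directory, so an archive built inside t/ would
unpack to ./data unless the caller is already in t/. A hedged variant using
only core modules, with die in place of Bio::Root::Root->throw so that it
runs on its own; test_input_file_from_archive and its arguments are
illustrative names, not existing Bio::Root::Test API.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw(getcwd);
use File::Spec;

# Hypothetical helper: return the path to a file under $test_dir/data,
# unpacking $test_dir/$archive first if the file is not there yet.
# Assumes archive members are stored as data/..., i.e. the archive was
# created from inside $test_dir.
sub test_input_file_from_archive {
    my ($test_dir, $archive, @path) = @_;

    # if it's there, fine
    my $fn = File::Spec->catfile($test_dir, 'data', @path);
    return $fn if -e $fn;

    # if it's not, expand the archive
    my $arch = File::Spec->catfile($test_dir, $archive);
    die "Test data archive not present: $arch\n" unless -e $arch;
    my $tar = Archive::Tar->new($arch)
        or die "Can't read test data archive: " . Archive::Tar->error . "\n";

    # The archive is already read into memory, so it is safe to change
    # directory before extracting and to come back afterwards.
    my $cwd = getcwd();
    chdir $test_dir or die "Can't chdir to $test_dir: $!\n";
    my @extracted = $tar->extract;
    chdir $cwd or die "Can't chdir back to $cwd: $!\n";
    die "Can't extract test data archive\n" unless @extracted;

    return -e $fn ? $fn : undef;
}
```

Reading a gzipped archive this way still needs IO::Zlib, as noted above.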


----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 17, 2009 5:38 PM
Subject: [Bioperl-l] Size of BioPerl distribution


> After uploading the latest bioperl alpha to CPAN I noticed the size of  
> the distribution archive has jumped up from ~7 MB to just over 10 MB.   
> It looks like a majority of this is attributable to three data files  
> for testing in t/data added after the 1.6.0 release:
> 
> gmap_f9-multiple_results.txt  (3 MB)
> withrefm.906                  (2.5 MB)
> 1ZZ19XR301R-Alignment.tblastn (2 MB)
> 
> I'm not sure there is an easy way around the problem.  We could  
> attempt to reduce the file size down, but I'm not convinced that's a  
> long-term solution (the test data will only get larger as more test  
> cases come up).
> 
> Any ideas?  Should we try to have a common biodata repo again?
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From cjfields at illinois.edu  Thu Sep 17 22:53:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 21:53:09 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
Message-ID: <04BEE5E7-79C6-45DE-9EC7-D72AE9E881E5@illinois.edu>

Maybe attempt trimming them down a bit first, if that's possible.  If  
not, no worries (breaking up the distribution will help as Robert  
said).  Archive::Tar and IO::Zlib were added in core after 5.8  
(5.009003 to be exact), so I would rather not have to worry about any  
test-specific dependencies.

Anyway, we've got a little more time.  I'm getting a META.yml issue popping
up (though everything appears to pass here).  Will look into it; may
be related to a previously reported bug, but I would like to see some
CPANPLUS tests coming in first.  That's what an alpha is for!

chris

On Sep 17, 2009, at 9:13 PM, Mark A. Jensen wrote:

> t/data compresses from 21M to 9M. We could ship with
> $ tar -czf data.tar.gz data
> $ rm -rf data
>
> and do the following in Bio::Root::Test, if we're willing to expect  
> Archive::Tar and IO::Zlib :
>
> use vars qw( $ARCHIVE );
> $ARCHIVE = "data.tar.gz";
> ...
>
> sub test_input_file {
>   # if it's there, fine
>   my $fn =  File::Spec->catfile('t', 'data', @_);
>   return $fn if -e $fn;
>   # if it's not, expand the archive
>   my $arch = File::Spec->catfile('t', $ARCHIVE);
>   Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
>   my $tar = Archive::Tar->new($arch);
>   Bio::Root::Root->throw ("Can't extract test data archive") unless $tar;
>   $tar->extract;
>   return $fn if -e $fn;
>   return;
> }
>
>
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Thursday, September 17, 2009 5:38 PM
> Subject: [Bioperl-l] Size of BioPerl distribution
>
>
>> After uploading the latest bioperl alpha to CPAN I noticed the size  
>> of  the distribution archive has jumped up from ~7 MB to just over  
>> 10 MB.   It looks like a majority of this is attributable to three  
>> data files  for testing in t/data added after the 1.6.0 release:
>> gmap_f9-multiple_results.txt  (3 MB)
>> withrefm.906                  (2.5 MB)
>> 1ZZ19XR301R-Alignment.tblastn (2 MB)
>> I'm not sure there is an easy way around the problem.  We could   
>> attempt to reduce the file size down, but I'm not convinced that's  
>> a  long-term solution (the test data will only get larger as more  
>> test  cases come up).
>> Any ideas?  Should we try to have a common biodata repo again?
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Thu Sep 17 23:48:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 22:48:13 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <19123.504.682683.996798@already.dhcp.gene.com>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
	<19123.504.682683.996798@already.dhcp.gene.com>
Message-ID: <B1B941EE-8F1E-426C-82DC-D89B3A13AD3D@illinois.edu>

On Sep 17, 2009, at 10:43 PM, George Hartzell wrote:

> Robert Buels writes:
>> Chris Fields wrote:
>>> Kind of my thought as well, just a bit of a shock to see the dist.
>>> increase by 65% between point releases for just three test data  
>>> files.
>>> I may try paring those down a tad.
>>
>> Yes, those individual files are certainly excessive.
>
> Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
> side of embarrassment.  Rats.
>
> I'll see about making a smaller test for the gmap_f9 parser, while
> still using real data.
>
> Is there existing support in the searchio infrastructure for reading
> [gb]zip'ed files?
>
> Can it wait a day or three?
>
> g.

Yes, certainly.  I'll be working on a separate issue this weekend  
dealing with the META.yml that CPAN/CPANPLUS appear to be choking on,  
so I'll push back the release until early next week.

chris


From hartzell at alerce.com  Thu Sep 17 23:43:52 2009
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 17 Sep 2009 20:43:52 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2C688.2030602@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
Message-ID: <19123.504.682683.996798@already.dhcp.gene.com>

Robert Buels writes:
 > Chris Fields wrote:
 > > Kind of my thought as well, just a bit of a shock to see the dist. 
 > > increase by 65% between point releases for just three test data files.  
 > > I may try paring those down a tad.
 > 
 > Yes, those individual files are certainly excessive.

Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
side of embarrassment.  Rats.

I'll see about making a smaller test for the gmap_f9 parser, while
still using real data.

Is there existing support in the searchio infrastructure for reading
[gb]zip'ed files?

Can it wait a day or three?

g.


From roy.chaudhuri at gmail.com  Fri Sep 18 06:43:29 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 18 Sep 2009 11:43:29 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
Message-ID: <4AB36451.3030207@gmail.com>

Hi Liam,

I just discovered your message, which has not yet been replied to. What 
you require has been discussed in a recent thread:
http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html

Try using trunc_with_features from Bio::SeqUtils:

my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);

Cheers.
Roy.

Liam Elbourne wrote:
> Hi All,
> 
> Is there a method or methodology that will produce a fully fledged Seq  
> object with all the associated metadata given a start and end  
> position? To clarify, I create a sequence object from a genbank file:
> 
> 
> ****
> my $io  = Bio::SeqIO->new(as per usual);
> 
> my $seqobj = $io->next_seq();
> ****
> I now want:
> 
> my $sub_seqobj = $seqobj between 300 and 2000
> 
> where $sub_seqobj is a Seq object (which I appreciate is an  
> 'aggregate' of objects) too. The "trunc" method only returns a  
> PrimarySeq object which lacks all the annotation etc. I've previously  
> done this task by iterating through feature by feature and parsing out  
> what I needed, but thought there might be a more elegant approach...
> 
> 
> Regards,
> Liam Elbourne.

-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From maj at fortinbras.us  Fri Sep 18 08:11:11 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 08:11:11 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>

Hi Paola--
I will look at this. Stay tuned-
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to extract
the chain ID and residue ID, but I found that in some PDB files the residue number is
followed by a letter, probably because the residue was added by the crystallographers
and they didn't want to change the residue numbering in the sequence. For example,
I parsed 1PXX.pdb with my script below. I didn't find any useful
suggestion about this in the BioPerl tutorial or the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
my $pdb_file = qq{1pxx.pdb};
open my $f, ">$pdb_file" or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc    = $structio->next_structure;

# Open the output file once, in append mode, rather than once per residue
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains) {
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain)) {
        my $resid = $res->id;
        my $atoms = $struc->get_atoms($res);
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but the output is wrong for ILE 105A and ILE 2105C, because they have a
letter following the residue number. Can I solve that problem without
writing intermediate files?
I need the residue ID as 105A, not 105.A,
so
A ILE-105A
with no dot between the number and the letter.
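
One way to get IDs in that form without an intermediate file is to read the
fixed columns of the coordinate records directly. This is a minimal sketch
that bypasses Bio::Structure::IO altogether; it relies on the standard PDB
ATOM/HETATM layout (residue name in columns 18-20, chain ID in column 22,
residue number in columns 23-26, insertion letter in column 27), and
residue_ids is an illustrative name only.

```perl
use strict;
use warnings;

# Collect one "chain<TAB>RESNAME-NUMBER[letter]" string per residue from
# the ATOM/HETATM records of a PDB file, in order of first appearance.
sub residue_ids {
    my ($fh) = @_;
    my (@ids, %seen);
    while (my $line = <$fh>) {
        next unless $line =~ /^(?:ATOM  |HETATM)/;
        next if length($line) < 27;

        # substr offsets are 0-based, PDB column numbers are 1-based
        my $res_name = substr($line, 17, 3);
        my $chain    = substr($line, 21, 1);
        my $res_seq  = substr($line, 22, 4);
        my $icode    = substr($line, 26, 1);    # blank when no letter
        s/\s+//g for $res_name, $res_seq, $icode;

        my $id = "$chain\t$res_name-$res_seq$icode";
        push @ids, $id unless $seen{$id}++;
    }
    return @ids;
}
```

Called with a filehandle opened on 1pxx.pdb, this should list a residue
numbered 105 with letter A on chain A as "A", a tab, then "ILE-105A".
HETATM records (ligands, waters) are included; drop that alternative from
the pattern if only polymer residues are wanted.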


Thank you all,

Paola


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From scott at scottcain.net  Fri Sep 18 10:11:23 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 18 Sep 2009 10:11:23 -0400
Subject: [Bioperl-l] test failures in main trunk
Message-ID: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>

With Chris trying to get a release out, I wanted to report these test
failures from a fairly virgin Ubuntu Server 8.04 system.

Scott


t/SeqIO/raw.t ................................ 1/24 Can't locate  
Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 / 
usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 / 
usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
# Looks like you planned 24 tests but ran 1.
# Looks like your test exited with 2 just after 1.
t/SeqIO/raw.t ................................ Dubious, test returned  
2 (wstat 512, 0x200)

t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
@INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/gmod/ 
bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/ 
lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/ 
perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
site_perl .) at t/SeqTools/Backtranslate.t line 9.
BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
# Looks like your test exited with 2 before it could output anything.
t/SeqTools/Backtranslate.t ................... Dubious, test returned  
2 (wstat 512, 0x200)
Failed 8/8 subtests

t/SeqTools/SeqPattern.t ...................... 1/28
#   Failed test 'use Bio::Tools::SeqPattern;'
#   at t/SeqTools/SeqPattern.t line 12.
#     Tried to use 'Bio::Tools::SeqPattern'.
#     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains: t/ 
lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/ 
arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/ 
local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/ 
5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at Bio/Tools/ 
SeqPattern/Backtranslate.pm line 22.
# BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
Backtranslate.pm line 22.
# Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
# Compilation failed in require at (eval 17) line 2.
# BEGIN failed--compilation aborted at (eval 17) line 2.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 25.
#          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
#     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 31.
#          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
#     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 371.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 372.

#   Failed test at t/SeqTools/SeqPattern.t line 38.
#          got: 'A[][]H'
#     expected: 'A[EQ][DN]H'
"_reverse_translate_motif" is not exported by the  
Bio::Tools::SeqPattern::Backtranslate module
Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
# Looks like you planned 28 tests but ran 9.
# Looks like you failed 4 tests of 9 run.
# Looks like your test exited with 255 just after 9.
t/SeqTools/SeqPattern.t ...................... Dubious, test returned  
255 (wstat 65280, 0xff00)
Failed 23/28 subtests


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From dan.bolser at gmail.com  Fri Sep 18 10:11:30 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:11:30 +0100
Subject: [Bioperl-l] construct chromosome sequences from bac sequences
In-Reply-To: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
References: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
Message-ID: <2c8757af0909180711t7212f5aak9bc3c7f4e8d16120@mail.gmail.com>

Did you try loading the sequences into an alignment or an assembly object?

As far as I know BioPerl won't call a consensus for you, but you can
post process the alignment or assembly to do that.
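
If a true consensus is not needed, the gap-filling part can be sketched in
plain Perl: start from a string of Ns and paste each BAC in at its
chromosome coordinates. This is only an illustration under strong
assumptions: the FPC coordinates are treated as exact 1-based base
positions, and where BACs overlap the one with the larger start simply
overwrites the earlier one. build_chromosome and the hash keys are
made-up names.

```perl
use strict;
use warnings;

# $placements: array ref of { accession => ..., start => ..., end => ... }
#              with 1-based, inclusive chromosome coordinates
# $seq_for:    hash ref mapping accession => BAC sequence string
# Returns the chromosome sequence, with N wherever no BAC is placed.
sub build_chromosome {
    my ($placements, $seq_for) = @_;

    # The chromosome is as long as the largest end coordinate
    my $length = 0;
    for my $p (@$placements) {
        $length = $p->{end} if $p->{end} > $length;
    }
    my $chr = 'N' x $length;

    # Paste BACs in order of start; later ones overwrite overlaps
    for my $p (sort { $a->{start} <=> $b->{start} } @$placements) {
        my $seq = $seq_for->{ $p->{accession} };
        next unless defined $seq;

        # Never write past the span the FPC file assigns to this BAC
        my $span  = $p->{end} - $p->{start} + 1;
        my $piece = substr($seq, 0, $span);
        substr($chr, $p->{start} - 1, length $piece) = $piece;
    }
    return $chr;
}

my $chr1 = build_chromosome(
    [
        { accession => 'a', start => 3,  end => 6  },
        { accession => 'b', start => 5,  end => 8  },
        { accession => 'c', start => 11, end => 12 },
    ],
    { a => 'ACGT', b => 'GGCC', c => 'TT' },
);
print ">chr1\n$chr1\n";    # sequence line is NNACGGCCNNTT
```

Resolving discrepancies in the overlaps properly (for example by quality)
would need the alignment or assembly route mentioned above.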

Can an alignment hold sequences with qualities?


Sorry for the late reply, I'm just trawling the list for potential
answers to the question I'm about to post ;-)

Dan.


2008/12/30 Alper Yilmaz <alperyilmaz at gmail.com>:
> Hi,
>
> I have FPC report and BAC sequences in hand. I was wondering what is the
> most practical way to build chromosomes from these available information.
>
> I HAVE:
> FPC file:
> accession     chr    chr_start    chr_end    contig    contig_start    contig_end
> aaaaaaaaaa    1      14700        215600     ctg1      14700           215600
> bbbbbbbbbb    1      196000       362600     ctg1      196000          362600
> cccccccccc    1      352800       524300     ctg1      352800          524300
> .
> .
>
> BAC fasta file:
>>aaaaaaaaaa
> GATCGATCAGCATCGACTACGACT...
>>bbbbbbbbbb
> AGTAGCAGTAGCTAGCACTACGAC...
>>cccccccccc
> ACGATCAGCATCAGCATCGACTAC...
> .
> .
> .
>
> I WANT:
>>chr1
> GACGACTAGCTACGACTAC...
>>chr2
> AGCTGATCACGATCACGAC...
>
> In theory a sequence object called "Chr1" can be created and then according
> to start and end locations of each BAC in FPC file, subsequences of Chr1 can
> be retrieved. However, there are two facts which might prevent using
> standard sequence objects.
> 1) There will be gaps in chromosomes. Is there a function to convert
> unassigned locations to N?
> 2) There are overlaps between BAC sequences. If the overlapping sequences
> are exactly same, it won't be problem, but if there are discrepancies
> between them, a decision has to be made as to which sequence to use in final
> Chr1 sequence.
>
> thanks,
>
> Alper Yilmaz
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Fri Sep 18 10:27:27 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:27:27 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
Message-ID: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>

2009/1/6 Chris Fields <cjfields at illinois.edu>:
> Could you archive the files and attach them to a bug report (you can mark it
> as an enhancement request).  We can take a look.
>
> http://bugzilla.open-bio.org/

Out of interest, has this been added? Where is it documented?

Cheers,
Dan.


> chris
>
> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>
>> Chris et al. -
>>
>> A student and I have written code to do this - write ace files as well as
>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>> was
>> in 1.5, we ran into problems with large ace files containing many entries
>> because of file handle limit issues with the inherited implementation
>> DB_File.  Our implementation simply reads one contig at a time instead of
>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to someone,
>> could they help me get it into bioperl?  It may not be perfect either, but
>> it should be a good start.
>>
>> Josh


From bosborne11 at verizon.net  Fri Sep 18 09:48:55 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 18 Sep 2009 09:48:55 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
	<DBEE748776B74A7988A942A7BBE13AA3@NewLife>
Message-ID: <AC62DAB3-3334-44A6-8172-753519B083FF@verizon.net>

Mark,

There was an interesting exchange about StructureIO::pdb a few years  
ago:

http://portal.open-bio.org/pipermail/bioperl-l/2006-September/022990.html

I don't think anyone has actually worked on this code since then and I  
also don't know if Paola's question relates to the content of the  
thread, but it's a good overview.

Brian O.


On Sep 18, 2009, at 8:11 AM, Mark A. Jensen wrote:

> Hi Paola--
> I will look at this. Stay tuned-
> Mark
> ----- Original Message ----- From: "Paola Bisignano" <paola_bisignano at yahoo.it>
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 08, 2009 4:55 AM
> Subject: [Bioperl-l] problem parsing pdb
>
>
> Hi,
>
> I'm in a little trouble because I need to parse a pdb file exactly, to
> extract the chain id and res id, but I found that in some pdb files the
> residue number is followed by a letter, probably because the residue
> was added by the crystallographers and they didn't want to change
> the residue numbering in the sequence. For example, I parsed the pdb
> file 1PXX.pdb with my script below. I didn't find any useful suggestion
> about this in the bioperl tutorial or the online bioperl documentation.
>
> #!/usr/local/bin/perl
> use strict;
> use warnings;
> use Bio::Structure::IO;
> use LWP::Simple;
>
>
>
> my $urlpdb= "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
> my $content = get($urlpdb);
> my $pdb_file = qq{1pxx.pdb};
> open my $f, ">$pdb_file" or die $!;
> binmode $f;
> print $f $content;
> print qq{$pdb_file\n};
> close $f;
>
>
>
> my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
> my $struc=$structio->next_structure;
> for my $chain ($struc->get_chains)
> {
> my $chainid = $chain->id ;
> for my $res ($struc->get_residues($chain))
> {
> my $resid=$res-> id;
> my $atoms= $struc->get_atoms($res);
> open my $f, ">> 1pxx.parsed";
> print $f "$chainid\t$resid\n";
> close $f;
> }
> }
>
>
>
> but it gives me a file with an error at ILE 105A and ILE 2105C, because
> they have a letter following the residue number.... Can I solve
> that problem without writing intermediate files?
> I need to have the residue id as 105A, not 105.A
> so
> A ILE-105A
> without a point between the number and the letter....
>
>
>
>
> Thank you all,
>
> Paola
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Fri Sep 18 10:55:57 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:55:57 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
Message-ID: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>

Dear Perl Monkeys,

I wrote a little demo script for Bio::Assembly::IO here:

http://www.bioperl.org/wiki/Module:Bio::Assembly::IO


I would very much appreciate comments, criticisms and corrections on
that script (please just edit the wiki). For a newbie it's always the
same question, am I doing it right?

In particular, I read about the 4 possible coordinates of a read in an
assembly. My script only retrieves two (?) of the possible four. How
should it be adjusted to print all four coordinates for each read?

Additionally, I'm not sure how to distinguish between the trimmed read
vs. the full length read and/or the aligned portion of the read vs.
the full length read.

What I *really* want is the coordinates of the aligned portion of the
read in gapped read and gapped consensus space, along with the quality
trimmed range of the read.

The ACE file in question is produced by the gsMapper program, which is
part of Newbler from Roche (454), so it has some small
'peculiarities', but I don't think they are critical for the task at
hand.


Thanks very much for any help you can provide on any of the above issues.

Sincerely,
Dan.


From maj at fortinbras.us  Fri Sep 18 11:11:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:11:05 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
Message-ID: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>

Dan -- I don't know much about Assembly, so can't help there. But can I  
encourage you and perhaps one or two others (steganographic content: fangly) 
to create a HOWTO stub out of this? Would be excellent-
cheers MAJ
----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 18, 2009 10:55 AM
Subject: [Bioperl-l] Getting read position information from an ACE file?


> Dear Perl Monkeys,
> 
> I wrote a little demo script for Bio::Assembly::IO here:
> 
> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
> 
> 
> I would very much appreciate comments, criticisms and corrections on
> that script (please just edit the wiki). For a newbie it's always the
> same question, am I doing it right?
> 
> In particular, I read about the 4 possible coordinates of a read in an
> assembly. My script only retrieves two (?) of the possible four. How
> should it be adjusted to print all four coordinates for each read?
> 
> Additionally, I'm not sure how to distinguish between the trimmed read
> vs. the full length read and/or the aligned portion of the read vs.
> the full length read.
> 
> What I *really* want is the coordinates of the aligned portion of the
> read in gapped read and gapped consensus space, along with the quality
> trimmed range of the read.
> 
> The ACE file in question is produced by the gsMapper program, which is
> part of Newbler from Roche (454), so it has some small
> 'peculiarities', but I don't think they are critical for the task at
> hand.
> 
> 
> Thanks very much for any help you can provide on any of the above issues.
> 
> Sincerely,
> Dan.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From anupam.contact at gmail.com  Fri Sep 18 11:20:03 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Fri, 18 Sep 2009 20:50:03 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
Message-ID: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>

Dear all,
                 I have installed the BioPerl-1.6.0.tar.gz and
Bioperl-run-1.6.0.tar.gz on a Fedora 7 system. I am trying to run *
/usr/bin/bp_pairwise_kaks.pl* script but keep on getting this error :

*Must have bioperl-run pkg installed to run this script at
/usr/bin/bp_pairwise_kaks.pl line 69*.

I have installed the run package from BioPerl, though. Can anyone help me out?
Thanks in advance.


Regards,


Anupam Sinha


From cjfields at illinois.edu  Fri Sep 18 11:59:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 10:59:11 -0500
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
Message-ID: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>

Interesting, will look into those.  The first one is troubling (that's  
set up to skip for Algorithm::Diff), the others should be a bit more  
straightforward.

Will have to see why List::MoreUtils is being used, but if it's  
necessary it's an additional dep.

chris

On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:

> With Chris trying to get a release out, I wanted to report these  
> test failures from a fairly virgin system Ubuntu server 8.04.
>
> Scott
>
>
>
> t/SeqIO/raw.t ................................ 1/24 Can't locate  
> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
> # Looks like you planned 24 tests but ran 1.
> # Looks like your test exited with 2 just after 1.
> t/SeqIO/raw.t ................................ Dubious, test  
> returned 2 (wstat 512, 0x200)
>
> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
> site_perl .) at t/SeqTools/Backtranslate.t line 9.
> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
> 9.
> # Looks like your test exited with 2 before it could output anything.
> t/SeqTools/Backtranslate.t ................... Dubious, test  
> returned 2 (wstat 512, 0x200)
> Failed 8/8 subtests
>
> t/SeqTools/SeqPattern.t ...................... 1/28
> #   Failed test 'use Bio::Tools::SeqPattern;'
> #   at t/SeqTools/SeqPattern.t line 12.
> #     Tried to use 'Bio::Tools::SeqPattern'.
> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
> Backtranslate.pm line 22.
> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
> # Compilation failed in require at (eval 17) line 2.
> # BEGIN failed--compilation aborted at (eval 17) line 2.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 25.
> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 31.
> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 371.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 372.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 38.
> #          got: 'A[][]H'
> #     expected: 'A[EQ][DN]H'
> "_reverse_translate_motif" is not exported by the  
> Bio::Tools::SeqPattern::Backtranslate module
> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
> # Looks like you planned 28 tests but ran 9.
> # Looks like you failed 4 tests of 9 run.
> # Looks like your test exited with 255 just after 9.
> t/SeqTools/SeqPattern.t ...................... Dubious, test  
> returned 255 (wstat 65280, 0xff00)
> Failed 23/28 subtests
>
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Fri Sep 18 12:09:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 11:09:26 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
Message-ID: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>

Dan,

No, it hasn't made it in.  Currently, the problem is it doesn't have  
any tests attached, but that could be easily fixed if anyone wanted to  
donate a little time to getting them running.  My hands are a bit full  
with other stuff for the release.

We should have some ace files already to go in t/data somewhere if one  
were so inclined to do that, BTW  ;>

chris

On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:

> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>> Could you archive the files and attach them to a bug report (you  
>> can mark it
>> as an enhancement request).  We can take a look.
>>
>> http://bugzilla.open-bio.org/
>
> Out of interest, has this been added? Where is it documented?
>
> Cheers,
> Dan.
>
>
>> chris
>>
>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>
>>> Chris et al. -
>>>
>>> A student and I have written code to do this - write ace files as  
>>> well as
>>> parse them one entry at a time.  In trying to use the Assembly::IO  
>>> as it
>>> was
>>> in 1.5, we ran into problems with large ace files containing many  
>>> entries
>>> because of file handle limit issues with the inherited  
>>> implementation
>>> DB_File.  Our implementation simply reads one contig at a time  
>>> instead of
>>> first trying to slurp the whole ace into memory.  I'm happy to add  
>>> it to
>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to  
>>> someone,
>>> could they help me get it into bioperl?  It may not be perfect  
>>> either, but
>>> it should be a good start.
>>>
>>> Josh
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Fri Sep 18 12:20:22 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 12:20:22 -0400
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
Message-ID: <E019D53941DD48E4B3294E113771B711@NewLife>


> Will have to see why List::MoreUtils is being used, but if it's  
> necessary it's an additional dep.

I didn't do it, officer....



From maj at fortinbras.us  Fri Sep 18 11:55:47 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:55:47 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <72DA6CA1499D4F67909197901218A9FF@NewLife>

Hi Paola--
My researches reveal that this is a "standard kludge" in pdb format. A letter 
following a residue number is called an "insertion code" or "icode", and my 
understanding is that it does allow for the insertion of residues without 
upsetting the rest of the coordinates. (This is a feature, and not laziness, 
since people very quickly begin to refer to amino acid coordinates based on a 
reference sequence in an interesting region, and you can't easily say to the 
community,  "hey, that's 22 now, not 20...")

Since it's standard, you should expect it. Bio::Structure handles the icode by 
creating the residue id as follows:

   #my $res_name_num = $resname."-".$resseq;
   my $res_name_num = $resname."-".$resseq;
   $res_name_num .= '.'.$icode if $icode;

so you can get back the residue's 3-letter name, its numerical position, and 
insertion code by doing

 my ($name, $number, $icode) = $res->id =~ /(.*?)-([0-9]+)\.?([A-Z]?)/;

In this case, if the icode is not present, then $icode eq '' (not undef).
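
A minimal sketch of putting that together to get the "ILE-105A" form asked about. It works on plain id strings, so it runs without a structure file. The helper name compact_res_id is made up for illustration, and the pattern is a slightly stricter variant of the one above (anchored, and allowing a negative residue number); that variant is an assumption, not something Bio::Structure itself provides:

```perl
use strict;
use warnings;

# Turn a Bio::Structure-style residue id such as "ILE-105.A" into
# "ILE-105A". Returns nothing if the id is not in the expected
# NAME-NUMBER[.ICODE] form.
sub compact_res_id {
    my ($id) = @_;
    my ($name, $number, $icode) =
        $id =~ /^(.*?)-(-?[0-9]+)\.?([A-Za-z]?)$/
        or return;
    # $icode is '' (not undef) when no insertion code is present
    return "$name-$number$icode";
}

print compact_res_id($_), "\n" for qw(ILE-105.A ILE-105 GLY-2105.C);
# prints:
# ILE-105A
# ILE-105
# GLY-2105C
```

In the loop from the original script one would presumably call compact_res_id($res->id) just before printing.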
Hope this helps-
Mark

----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a pdb file exactly, to extract
the chain id and res id, but I found that in some pdb files the residue number
is followed by a letter, probably because the residue was added by the
crystallographers and they didn't want to change the residue numbering in the
sequence. For example, I parsed the pdb file 1PXX.pdb with my script below. I
didn't find any useful suggestion about this in the bioperl tutorial or the
online bioperl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
my $pdb_file = qq{1pxx.pdb};
open my $f, ">$pdb_file" or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
my $struc=$structio->next_structure;
for my $chain ($struc->get_chains)
{
my $chainid = $chain->id ;
for my $res ($struc->get_residues($chain))
{
my $resid=$res-> id;
my $atoms= $struc->get_atoms($res);
open my $f, ">> 1pxx.parsed";
print $f "$chainid\t$resid\n";
close $f;
}
}


but it gives me a file with an error at ILE 105A and ILE 2105C, because they
have a letter following the residue number.... Can I solve that problem without
writing intermediate files?
I need to have the residue id as 105A, not 105.A
so
A ILE-105A
without a point between the number and the letter....


Thank you all,

Paola


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Fri Sep 18 12:31:00 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Fri, 18 Sep 2009 12:31:00 -0400
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <be9b52410909180931w2951318eqfa01c109a032bf9d@mail.gmail.com>

I have negligible experience with ace but will be happy to do some
testing. Please let me know, though, what code and functionality needs
to be checked.

Cheers,
-Abhi

On Fri, Sep 18, 2009 at 12:09 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From vecchi.b at gmail.com  Fri Sep 18 12:44:37 2009
From: vecchi.b at gmail.com (Bruno Vecchi)
Date: Fri, 18 Sep 2009 09:44:37 -0700
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <E019D53941DD48E4B3294E113771B711@NewLife>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
	<E019D53941DD48E4B3294E113771B711@NewLife>
Message-ID: <1a0c1b750909180944p55b226cbi18e3c608f401d951@mail.gmail.com>

The second test ("Can't locate ok.pm in @INC...") can be fixed by
using use_ok('My::Module') instead of use ok 'My::Module' in the test
files.

I've had a few of those in the past, and that fix did the trick.
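
A minimal sketch of that change, assuming the test only needs to confirm that the module loads. List::Util is used here purely as a stand-in for the module under test, since it ships with Perl. use_ok() comes with Test::More, whereas "use ok" relies on ok.pm, which the error above shows was not installed:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Instead of:   use ok 'List::Util';    # needs ok.pm in @INC
# load the module through Test::More's own use_ok():
BEGIN { use_ok('List::Util') }

# A follow-up check that the load actually made the module usable.
ok( List::Util->can('first'), 'List::Util provides first()' );
```

Wrapping use_ok() in a BEGIN block keeps the load at compile time, as a plain "use" would be.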

Cheers,

Bruno.


2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>
>> Will have to see why List::MoreUtils is being used, but if it's ?necessary
>> it's an additional dep.
>
> I didn't do it, officer....
>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
>>
>>> With Chris trying to get a release out, I wanted to report these ?test
>>> failures from a fairly virgin system Ubuntu server 8.04.
>>>
>>> Scott
>>>
>>>
>>>
>>> t/SeqIO/raw.t ................................ 1/24 Can't locate
>>> ?Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-
>>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- live
>>> /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 5.8.8
>>> /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ perl/5.8
>>> /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>>> # Looks like you planned 24 tests but ran 1.
>>> # Looks like your test exited with 2 just after 1.
>>> t/SeqIO/raw.t ................................ Dubious, test ?returned 2
>>> (wstat 512, 0x200)
>>>
>>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in
>>> ?@INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/
>>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/
>>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/
>>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ site_perl
>>> .) at t/SeqTools/Backtranslate.t line 9.
>>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line ?9.
>>> # Looks like your test exited with 2 before it could output anything.
>>> t/SeqTools/Backtranslate.t ................... Dubious, test ?returned 2
>>> (wstat 512, 0x200)
>>> Failed 8/8 subtests
>>>
>>> t/SeqTools/SeqPattern.t ...................... 1/28
>>> # ? Failed test 'use Bio::Tools::SeqPattern;'
>>> # ? at t/SeqTools/SeqPattern.t line 12.
>>> # ? ? Tried to use 'Bio::Tools::SeqPattern'.
>>> # ? ? Error: ?Can't locate List/MoreUtils.pm in @INC (@INC contains:
>>> ?t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ blib/arch
>>> /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 5.8.8
>>> /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /
>>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at
>>> ?Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/
>>> Backtranslate.pm line 22.
>>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>>> # Compilation failed in require at (eval 17) line 2.
>>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> # ? Failed test at t/SeqTools/SeqPattern.t line 25.
>>> # ? ? ? ? ?got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>>> # ? ? expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> # ? Failed test at t/SeqTools/SeqPattern.t line 31.
>>> # ? ? ? ? ?got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>>> # ? ? expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 371.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 372.
>>>
>>> # ? Failed test at t/SeqTools/SeqPattern.t line 38.
>>> # ? ? ? ? ?got: 'A[][]H'
>>> # ? ? expected: 'A[EQ][DN]H'
>>> "_reverse_translate_motif" is not exported by the
>>> ?Bio::Tools::SeqPattern::Backtranslate module
>>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>>> # Looks like you planned 28 tests but ran 9.
>>> # Looks like you failed 4 tests of 9 run.
>>> # Looks like your test exited with 255 just after 9.
>>> t/SeqTools/SeqPattern.t ...................... Dubious, test ?returned
>>> 255 (wstat 65280, 0xff00)
>>> Failed 23/28 subtests
>>>
>>>
>>> -----------------------------------------------------------------------
>>> Scott Cain, Ph. D. scott at scottcain dot net
>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>> Ontario Institute for Cancer Research
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From dan.bolser at gmail.com  Fri Sep 18 12:54:36 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 17:54:36 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>

Please can you link to the bug that includes the code?


2009/9/18 Chris Fields <cjfields at illinois.edu>:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it was
>>>> in 1.5, we ran into problems with large ace files containing many entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to someone,
>>>> could they help me get it into bioperl?  It may not be perfect either, but
>>>> it should be a good start.
>>>>
>>>> Josh
>>


From dan.bolser at gmail.com  Fri Sep 18 13:09:09 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 18:09:09 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>

2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
> Dan -- I don't know much about Assembly, so can't help there. But can I
> encourage you and perhaps one or two others (steganographic content:
> fangly) to create a HOWTO stub out of this? Would be excellent-

I'd love to. ACE is pretty ubiquitous, so any additional info on how
to work with them using BioPerl should help a lot of people.

The problem is that I'm one of those people ;-)


I'm working on an 'ace2tab.plx' script that should encompass this
info. I'm finding that some 'read ids' have the .range format, i.e.
"read123455.23-239". However, some do not, e.g. "read123456". I'm not
sure where this ID comes from, but I think it's telling me something
about partially aligned reads. The problem is that the coordinates I'm
seeing don't reflect that (they are just the start and the end point
of the full read).
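
To make the two ID shapes concrete, here is a minimal sketch of how the
optional range suffix could be split off a read id. It is plain string
handling in Python rather than anything from BioPerl; the function name
`split_read_id` is hypothetical, and treating the suffix as "start-end"
within the read is only my guess at what the numbers mean.

```python
import re

# Hypothetical helper, not part of BioPerl: matches "name" or "name.START-END".
READ_ID = re.compile(r"^(?P<name>.+?)(?:\.(?P<start>\d+)-(?P<end>\d+))?$")


def split_read_id(read_id):
    """Return (name, (start, end)) or (name, None) when there is no range suffix."""
    match = READ_ID.match(read_id)
    if match is None:
        raise ValueError(f"empty or malformed read id: {read_id!r}")
    if match.group("start") is None:
        return match.group("name"), None
    # Assumption: the suffix is a START-END pair of integers.
    return match.group("name"), (int(match.group("start")), int(match.group("end")))
```

Comparing the parsed range with the coordinates reported for the read
should show quickly whether the suffix really describes a partial
alignment.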

A 'proper' ace2tab script would be very nice.
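
As a rough idea of what such a script might do, the sketch below streams
an ACE file line by line and emits one row per read. It does not use
Bio::Assembly::IO; it relies only on the ACE record layout as I
understand it (CO = contig, AF = read name, strand, padded start in the
consensus, RD = read name and padded length, QA = quality clip start/end
and align clip start/end). The function name `ace_to_rows` and the tiny
`SAMPLE` record are made up for illustration, and free text inside
CT{}/RT{} tag blocks is not handled.

```python
import io


def ace_to_rows(handle):
    """Yield (contig, read, strand, cons_start, cons_end,
    qual_start, qual_end, align_start, align_end) for each read."""
    contig = None
    placements = {}   # read name -> (strand, padded start in consensus)
    current = None    # (read name, padded length) from the latest RD line
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "CO":                      # new contig: forget the old placements
            contig, placements = fields[1], {}
        elif tag == "AF":
            placements[fields[1]] = (fields[2], int(fields[3]))
        elif tag == "RD":
            current = (fields[1], int(fields[2]))
        elif tag == "QA" and current is not None:
            name, length = current
            strand, start = placements.get(name, ("?", 0))
            qual_start, qual_end, align_start, align_end = map(int, fields[1:5])
            yield (contig, name, strand, start, start + length - 1,
                   qual_start, qual_end, align_start, align_end)
            current = None


# Made-up single-read example, only to show the expected record order.
SAMPLE = """AS 1 1

CO Contig1 10 1 1 U
ACGTACGTAC

BQ
 20 20 20 20 20 20 20 20 20 20

AF read1 U 2
BS 1 10 read1

RD read1 8 0 0
CGTACGTA

QA 1 8 2 7
"""

rows = list(ace_to_rows(io.StringIO(SAMPLE)))
```

If the QA values are 1-based offsets into the padded read, the aligned
portion in gapped consensus space would be cons_start + align_start - 1
through cons_start + align_end - 1, but that is worth checking against
real gsMapper output. Only one contig's placements are held in memory at
a time.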


> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE file?
>
>
>> Dear Perl Monkeys,
>>
>> I wrote a little demo script for Bio::Assembly::IO here:
>>
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>>
>>
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>>
>> In particular, I read about the 4 possible coordinates of a read in an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>>
>> Additionally, I'm not sure how to distinguish between the trimmed read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>>
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the quality
>> trimmed range of the read.
>>
>> The ACE file in question is produced by the gsMapper program, which is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>>
>>
>> Thanks very much for any help you can provide on any of the above issues.
>>
>> Sincerely,
>> Dan.


From cjfields at illinois.edu  Fri Sep 18 14:00:17 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:00:17 -0500
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <DCEC55AD-5B4E-42E6-9A7E-FB52E19EADA5@illinois.edu>

Agreed, and it may spur others to get involved, fix bugs, donate code,  
etc.

chris

On Sep 18, 2009, at 10:11 AM, Mark A. Jensen wrote:

> Dan -- I don't know much about Assembly, so can't help there. But  
> can I  encourage you and perhaps one or two others (steganographic  
> content: fangly) to create a HOWTO stub out of this? Would be  
> excellent-
> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE  
> file?
>
>
>> Dear Perl Monkeys,
>> I wrote a little demo script for Bio::Assembly::IO here:
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>> In particular, I read about the 4 possible coordinates of a read in  
>> an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>> Additionally, I'm not sure how to distinguish between the trimmed  
>> read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the  
>> quality
>> trimmed range of the read.
>> The ACE file in question is produced by the gsMapper program, which  
>> is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>> Thanks very much for any help you can provide on any of the above issues.
>> Sincerely,
>> Dan.


From cjfields at illinois.edu  Fri Sep 18 14:03:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:03:13 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
Message-ID: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>

Bug 2726

http://bugzilla.open-bio.org/show_bug.cgi?id=2726

chris

On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:

> Please can you link to the bug that includes the code?
>
>
> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>> Dan,
>>
>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>> have any
>> tests attached, but that could be easily fixed if anyone wanted to  
>> donate a
>> little time to getting them running.  My hands are a bit full with  
>> other
>> stuff for the release.
>>
>> We should have some ace files already to go in t/data somewhere if  
>> one were
>> so inclined to do that, BTW  ;>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>
>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Could you archive the files and attach them to a bug report (you  
>>>> can mark
>>>> it
>>>> as an enhancement request).  We can take a look.
>>>>
>>>> http://bugzilla.open-bio.org/
>>>
>>> Out of interest, has this been added? Where is it documented?
>>>
>>> Cheers,
>>> Dan.
>>>
>>>
>>>> chris
>>>>
>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>
>>>>> Chris et al. -
>>>>>
>>>>> A student and I have written code to do this - write ace files  
>>>>> as well
>>>>> as
>>>>> parse them one entry at a time.  In trying to use the  
>>>>> Assembly::IO as it
>>>>> was
>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>> entries
>>>>> because of file handle limit issues with the inherited  
>>>>> implementation
>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>> instead
>>>>> of
>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>> add it to
>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>> someone,
>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>> either,
>>>>> but
>>>>> it should be a good start.
>>>>>
>>>>> Josh
>>>


From e.osimo at gmail.com  Fri Sep 18 18:33:22 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sat, 19 Sep 2009 00:33:22 +0200
Subject: [Bioperl-l] Getting all annotations
Message-ID: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>

Hello,
I was trying to figure out how to get from the Entrez database all the
reference annotation for a given genomic zone.
For example: I want to know which genes, transcripts, microRNAs etc are
present in chr 6 from 100kbp to 200kbp.
Is there a database that is arranged as a continuum (by sequence) instead of
by feature (gene, transcript etc)?

Thanks
Emanuele


From florent.angly at gmail.com  Sat Sep 19 22:20:31 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 19 Sep 2009 19:20:31 -0700
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
Message-ID: <4AB5916F.1090104@gmail.com>

I suppose it is a good idea to wait until bioperl-live 1.6.1 is out 
before doing any significant work on the sequence assembly module.
Also, remember the assembly-related todo list: 
http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
Florent


Chris Fields wrote:
> Bug 2726
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>
> chris
>
> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>
>> Please can you link to the bug that includes the code?
>>
>>
>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>> Dan,
>>>
>>> No, it hasn't made it in.  Currently, the problem is it doesn't have 
>>> any
>>> tests attached, but that could be easily fixed if anyone wanted to 
>>> donate a
>>> little time to getting them running.  My hands are a bit full with 
>>> other
>>> stuff for the release.
>>>
>>> We should have some ace files already to go in t/data somewhere if 
>>> one were
>>> so inclined to do that, BTW  ;>
>>>
>>> chris
>>>
>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>
>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>
>>>>> Could you archive the files and attach them to a bug report (you 
>>>>> can mark
>>>>> it
>>>>> as an enhancement request).  We can take a look.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>
>>>> Out of interest, has this been added? Where is it documented?
>>>>
>>>> Cheers,
>>>> Dan.
>>>>
>>>>
>>>>> chris
>>>>>
>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>
>>>>>> Chris et al. -
>>>>>>
>>>>>> A student and I have written code to do this - write ace files as 
>>>>>> well
>>>>>> as
>>>>>> parse them one entry at a time.  In trying to use the 
>>>>>> Assembly::IO as it
>>>>>> was
>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>> entries
>>>>>> because of file handle limit issues with the inherited 
>>>>>> implementation
>>>>>> DB_File.  Our implementation simply reads one contig at a time 
>>>>>> instead
>>>>>> of
>>>>>> first trying to slurp the whole ace into memory.  I'm happy to 
>>>>>> add it to
>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>> someone,
>>>>>> could they help me get it into bioperl?  It may not be perfect 
>>>>>> either,
>>>>>> but
>>>>>> it should be a good start.
>>>>>>
>>>>>> Josh
>>>>


From dan.bolser at gmail.com  Sun Sep 20 08:26:06 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 13:26:06 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <2c8757af0909200526u3bb1766eo5d316dc5d7a2e1a5@mail.gmail.com>

2009/9/20 Florent Angly <florent.angly at gmail.com>:

...

> Also, remember the assembly-related todo list:
> http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related

Thanks for that link Florent. It's great to see the wiki being put to
such good use in the context of OSS development! I need to make a
mental note - before posting, check the mailing list archives _and_
the wiki!

Cheers,
Dan.


> Florent
>
>
> Chris Fields wrote:
>>
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't have any
>>>> tests attached, but that could be easily fixed if anyone wanted to donate
>>>> a little time to getting them running.  My hands are a bit full with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere if one
>>>> were so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report (you can
>>>>>> mark it as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files as
>>>>>>> well as parse them one entry at a time.  In trying to use the
>>>>>>> Assembly::IO as it was in 1.5, we ran into problems with large ace
>>>>>>> files containing many entries because of file handle limit issues
>>>>>>> with the inherited implementation DB_File.  Our implementation simply
>>>>>>> reads one contig at a time instead of first trying to slurp the whole
>>>>>>> ace into memory.  I'm happy to add it to Bioperl, but I am not sure
>>>>>>> how to do it.  If I sent *.pm files to someone, could they help me
>>>>>>> get it into bioperl?  It may not be perfect either, but it should be
>>>>>>> a good start.
>>>>>>>
>>>>>>> Josh
>>>>>


From cjfields at illinois.edu  Sun Sep 20 10:34:08 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 20 Sep 2009 09:34:08 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <F25C4EA4-1DB4-44F3-AB66-F58E6A90E302@illinois.edu>

Never hurts to get started, just make sure that there is a note  
indicating the status of Bio::Assembly.  In fact, the discussion page  
for it might make a good spot for Bio::Assembly design.

chris
On Sep 19, 2009, at 9:20 PM, Florent Angly wrote:

> I suppose it is a good idea to wait until bioperl-live 1.6.1 is out  
> before doing any significant work on the sequence assembly module.
> Also, remember the assembly-related todo list: http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
> Florent
>
>
> Chris Fields wrote:
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>>>> have any
>>>> tests attached, but that could be easily fixed if anyone wanted  
>>>> to donate a
>>>> little time to getting them running.  My hands are a bit full  
>>>> with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere  
>>>> if one were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report  
>>>>>> (you can mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files  
>>>>>>> as well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the  
>>>>>>> Assembly::IO as it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing  
>>>>>>> many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited  
>>>>>>> implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>>>> add it to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files  
>>>>>>> to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>


From dan.bolser at gmail.com  Sun Sep 20 11:09:19 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 16:09:19 +0100
Subject: [Bioperl-l] Getting all annotations
In-Reply-To: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
References: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
Message-ID: <2c8757af0909200809g1f6c41eeyabfc8bdaac1fc19f@mail.gmail.com>

Hi Emanuele,

I guess you were Emos in irc://irc.freenode.net/#bioperl ?


I think the answer to your question can be found here:

http://www.biodas.org
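
For a region query like yours, a DAS 1.x server takes a `features`
request with a `segment=REFERENCE:start,stop` parameter. The sketch below
only builds such a URL. The server address and data source name are
placeholders, not real endpoints, and `das_features_url` is a
hypothetical helper, so you would need to pick an actual source from a
DAS registry.

```python
def das_features_url(server, source, reference, start, stop):
    """Build a DAS 1.x 'features' request URL for one genomic segment."""
    if start > stop:
        raise ValueError("start must not exceed stop")
    # DAS segment syntax: REFERENCE:start,stop (1-based, inclusive).
    segment = f"{reference}:{start},{stop}"
    return f"{server.rstrip('/')}/das/{source}/features?segment={segment}"


# Placeholder server and source names, for illustration only.
url = das_features_url("http://das.example.org", "human_reference", "6", 100000, 200000)
```

The response is an XML document listing every feature (genes,
transcripts and so on, depending on the source) that overlaps the
segment.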


All the best,
Dan.

2009/9/18 Emanuele Osimo <e.osimo at gmail.com>:
> Hello,
> I was trying to figure out how to get from the Entrez database all the
> reference annotation for a given genomic zone.
> For example: I want to know which genes, transcripts, microRNAs etc are
> present in chr 6 from 100kbp to 200kbp.
> Is there a database that is arranged as a continuum (by sequence) instead of
> by feature (gene, transcript etc)?
>
> Thanks
> Emanuele


From maj at fortinbras.us  Mon Sep 21 00:22:54 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 00:22:54 -0400
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>

Hello all,

As Brian articulated so well for many of us, 
the wiki main page is, well, butt-ugly.
Please check out the Main Page Beta at
http://www.bioperl.org/wiki/Main_Page_Beta
and respond to this thread or on the discussion 
page. 

cheers and thanks, 
MAJ


From bix at sendu.me.uk  Mon Sep 21 02:25:04 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 21 Sep 2009 07:25:04 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <4AB71C40.10902@sendu.me.uk>

Mark A. Jensen wrote:
> Hello all,
> 
> As Brian articulated so well for many of us, 
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion 
> page. 

I never thought the main page was 'butt-ugly' (rather, what I expect 
from a wiki), but, to put it bluntly, the graphical flourishes in your 
proposal are cringe-worthy. I couldn't do any better. I think for 
graphical things you'd need a professional graphics designer or similar.

The actual content and organisation of your version is probably an 
improvement though.


From rmb32 at cornell.edu  Mon Sep 21 03:40:31 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Mon, 21 Sep 2009 00:40:31 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB71C40.10902@sendu.me.uk>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk>
Message-ID: <4AB72DEF.2010008@cornell.edu>

Sendu Bala wrote:
> from a wiki), but, to put it bluntly, the graphical flourishes in your 
> proposal are cringe-worthy. I couldn't do any better. I think for 

I think what Sendu was trying to say is that he didn't like the gradient 
section heads?  There are only two graphical things on that page, and 
the other one is an enlargement of the existing logo, so I suppose 
that's what he means.

They're not my absolute favorite either, but I certainly wouldn't 
describe them as cringe-worthy!  :-P

Rob


From biopython at maubp.freeserve.co.uk  Mon Sep 21 05:45:48 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 10:45:48 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
Message-ID: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>

On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads?  There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter


From maj at fortinbras.us  Mon Sep 21 07:20:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:20:31 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
Message-ID: <22244A89D06E4F9B8D5F70A833E1C0DE@NewLife>

Hey, if Sendu cringed, he cringed. If I had one, I'd keep my 
day job. In the meantime, the graphics are removed. 
MAJ
----- Original Message ----- 
From: "Robert Buels" <rmb32 at cornell.edu>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 3:40 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> Sendu Bala wrote:
>> from a wiki), but, to put it bluntly, the graphical flourishes in your 
>> proposal are cringe-worthy. I couldn't do any better. I think for 
> 
> I think what Sendu was trying to say is that he didn't like the gradient 
> section heads?  There are only two graphical things on that page, and 
> the other one is an enlargement of the existing logo, so I suppose 
> that's what he means.
> 
> They're not my absolute favorite either, but I certainly wouldn't 
> describe them as cringe-worthy!  :-P
> 
> Rob
> 
> 


From e.osimo at gmail.com  Mon Sep 21 07:35:00 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 21 Sep 2009 13:35:00 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <2ac05d0f0909210435k66bd0ed3x9fd13d9f4ec44634@mail.gmail.com>

I can say that, for a neophyte, the contents are a great improvement.
You can find with a lot more ease what you are searching for.

Emanuele

On Mon, Sep 21, 2009 at 06:22, Mark A. Jensen <maj at fortinbras.us> wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ


From maj at fortinbras.us  Mon Sep 21 07:32:08 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:32:08 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
Message-ID: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>

I'd appreciate those links, Peter- thanks
MAJ
----- Original Message ----- 
From: "Peter" <biopython at maubp.freeserve.co.uk>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 5:45 AM
Subject: Re: [Bioperl-l] a Main Page proposal


On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads? There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From pmiguel at purdue.edu  Mon Sep 21 08:01:03 2009
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 21 Sep 2009 08:01:03 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
	<2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
Message-ID: <4AB76AFF.7050902@purdue.edu>

Dan Bolser wrote:
> 2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>   
>> Dan -- I don't know much about Assembly, so can't help there. But can I
>>  encourage you and perhaps one or two others (steganographic content:
>> fangly) to create a HOWTO stub out of this? Would be excellent-
>>     
>
> I'd love to. ACE is pretty ubiquitous, so any additional info on how
> to work with them using BioPerl should help a lot of people.
>   
> The problem is that I'm one of those people ;-)
>
>
> I'm working on an 'ace2tab.plx' script that should encompass this
> info. I'm finding that some 'read ids' have the .range format. i.e.
> "read123455.23-239". However, some do not. i.e. "read123456". Not sure
> where this ID comes from, but I think its telling me something about
> partially aligned reads. 

I think you are right. I have heard that Newbler (the 454 assembler) 
does this insane thing, where it will rip reads apart into segments and 
cluster parts of reads in different contigs.

> The problem is that the coordinates I'm
> seeing don't reflect that (they are just the start and the end point
> of the full read).
>   

That sounds similar to how phrap/consed handle "chimeric" reads. But my 
experience is that phrap is pretty parsimonious with numbers of 
chimerics it will allow.  (That isn't entirely fair to Newbler -- I've 
never been able to get phrap to consistently assemble ESTs. Phrap seems 
tuned to assemble BAC shotgun reads. ESTs seem to drive it a little 
crazy. It will create contigs from a set of reads that have essentially 
no similarity to each other, nor to the consensus sequence phrap creates 
for them.)
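
A minimal sketch of how the two read-ID styles Dan describes could be told apart in an ace2tab-style script. It assumes the suffix always has the form ".START-END"; the sub name split_read_id and the sample IDs are made up for illustration and are not part of BioPerl or any real ACE file.

```perl
use strict;
use warnings;

# Split a read ID such as "read123455.23-239" into its base name and the
# trailing range. IDs without a ".START-END" suffix come back with undef
# for start and end. (Hypothetical helper, not part of BioPerl.)
sub split_read_id {
    my ($id) = @_;
    if ( $id =~ /^(.+)\.(\d+)-(\d+)$/ ) {
        return ( $1, $2, $3 );
    }
    return ( $id, undef, undef );
}

for my $id (qw(read123455.23-239 read123456)) {
    my ( $base, $start, $end ) = split_read_id($id);
    if ( defined $start ) {
        print "$id\tbase=$base\tsegment=$start-$end\n";
    }
    else {
        print "$id\tbase=$base\twhole read\n";
    }
}
```

Grouping on the base name would then show whether one read has segments placed in more than one contig.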

-- 
Phillip


From hlapp at gmx.net  Mon Sep 21 08:22:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 21 Sep 2009 08:22:34 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <03B93F96-E28D-45CF-BD94-AD33634476AA@gmx.net>

What's probably worth looking at as an example is the gmod.org home  
page. Stylistically, one thing you want to get out of the way is the  
auto-generated TOC.

	-hilmar

On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Mon Sep 21 08:28:28 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 13:28:28 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
Message-ID: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>

Peter wrote:
>> We had some similar discussions about the Biopython wiki
>> based homepage - although our old one was nowhere near
>> as busy as the current BioPerl main page, it was still not as
>> welcoming as our current version *tries* to be.
>> ...
>> I can dig out links to our mailing list archive if anyone is
>> interested in the discussion.

On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>
> I'd appreciate those links, Peter- thanks
> MAJ

OK, here you are - this was most of it, I'd have to dig through
my old emails to see what else I can find:
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html

Remember Biopython went from a very minimal home page, to
something aiming to be more newcomer friendly. BioPerl on the
other hand seems to want to move away from the current very
text heavy information rich page to something more focused and
newcomer friendly. To me at least the current page is too dense,
intimidating, and the important bits get lost in all the content.

[My apologies if any of this feedback comes across too blunt.]

If you haven't already looked at them, you should check out the
other OBF project pages for ideas. The BioJava homepage is
also using the wiki - in my opinion it is a bit cluttered, but is
still more accessible than the current BioPerl page. Also,
the BioRuby page is very nice - although not wiki based.

Regards,

Peter


From mwachholtz at unomaha.edu  Thu Sep 17 20:31:13 2009
From: mwachholtz at unomaha.edu (Michael UNO)
Date: Thu, 17 Sep 2009 17:31:13 -0700 (PDT)
Subject: [Bioperl-l]  Genome Scanning Question
Message-ID: <25497856.post@talk.nabble.com>


What objects & methods could be used if I wanted to determine if a gene is
located at a specific location within a genome at the Ensembl database. For
example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
method that will simply tell me "yes, there is a gene at this location". And
can it tell what gene(s) are located at this coordinate?
-- 
View this message in context: http://www.nabble.com/Genome-Scanning-Question-tp25497856p25497856.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From sdavis2 at mail.nih.gov  Mon Sep 21 09:04:36 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 21 Sep 2009 09:04:36 -0400
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <264855a00909210604o826871dr7121e3f26c0e34aa@mail.gmail.com>

On Thu, Sep 17, 2009 at 8:31 PM, Michael UNO <mwachholtz at unomaha.edu> wrote:

>
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location".
> And
> can it tell what gene(s) are located at this coordinate?
>

There are a number of ways to go about this.

If you want to go with perl, object-oriented, and ensembl, check out:

http://www.ensembl.org/info/docs/api/core/core_tutorial.html

If you want to start with tab-delimited text files, check out downloading
the text files from the UCSC genome browser.
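
For the tab-delimited route, the lookup itself is just an interval-overlap test. A rough sketch, assuming a simplified four-column table (chromosome, start, end, gene name) with 1-based inclusive coordinates. Real UCSC tables have more columns and use 0-based starts, so check the table schema before adapting this. The gene names and coordinates are invented for the example, and parse_gene_table and genes_at are hypothetical helpers.

```perl
use strict;
use warnings;

# Parse tab-delimited text into rows of [chrom, start, end, name].
sub parse_gene_table {
    my ($text) = @_;
    my @rows;
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        push @rows, [ split /\t/, $line ];
    }
    return \@rows;
}

# Return the names of all genes whose interval contains the position
# (1-based, inclusive coordinates assumed).
sub genes_at {
    my ( $genes, $chrom, $pos ) = @_;
    return map  { $_->[3] }
           grep { $_->[0] eq $chrom && $_->[1] <= $pos && $pos <= $_->[2] }
           @$genes;
}

# Invented rows standing in for a downloaded table.
my $table = "chr15\t66400000\t66600000\tgeneA\n"
          . "chr15\t66450000\t66510000\tgeneB\n"
          . "chr15\t70000000\t70100000\tgeneC\n";

my $genes = parse_gene_table($table);
my @hits  = genes_at( $genes, 'chr15', 66_500_123 );

if (@hits) {
    print "yes, gene(s) at this location: @hits\n";
}
else {
    print "no gene at this location\n";
}
```

A linear scan like this is fine for a one-off question; for many lookups you would want to sort or index the table by chromosome first.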

Sean


> --
> View this message in context:
> http://www.nabble.com/Genome-Scanning-Question-tp25497856p25497856.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at illinois.edu  Mon Sep 21 09:05:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 08:05:25 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
Message-ID: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>


On Sep 21, 2009, at 7:28 AM, Peter wrote:

> Peter wrote:
>>> We had some similar discussions about the Biopython wiki
>>> based homepage - although our old one was nowhere near
>>> as busy as the current BioPerl main page, it was still not as
>>> welcoming as our current version *tries* to be.
>>> ...
>>> I can dig out links to our mailing list archive if anyone is
>>> interested in the discussion.
>
> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>
>> I'd appreciate those links, Peter- thanks
>> MAJ
>
> OK, here you are - this was most of it, I'd have to dig though
> my old emails to see what else I can find:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>
> Remember Biopython went from a very minimal home page, to
> something aiming to be more newcomer friendly. BioPerl on the
> other hand seems to want to move away from the current very
> text heavy information rich page to something more focused and
> newcomer friendly. To me at least the current page is too dense,
> intimidating, and the important bits get lost in all the content.
>
> [My apologies if any of this feedback come accross too blunt.]

Not at all; I'm thinking the same thing.

> If you haven't already looked at them, you should checkout the
> other OBF project pages for ideas. The BioJava homepage is
> also using the wiki - in my opinion it is a bit cluttered, but is
> still more accessible than the current BioPerl page. Also,
> the BioRuby page is very nice - although not wiki based.
>
> Regards,
>
> Peter

I think the Biopython layout is very nice and focused.  Maybe a bit  
too minimal, but then again I don't like scrolling up and down the  
page to find the relevant bits, so less may be better.

Reminds me of the simplified design on the perl6 main page (just don't  
stare at the hallucinogenic butterfly too long):

http://www.perl6.org/

So, maybe a structured layout with the most important links, and  
additional links on a separate page.

chris


From maj at fortinbras.us  Mon Sep 21 09:22:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 09:22:35 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <0F980234804C4B3EA08E810E043F2537@NewLife>

Ah! I don't need a degree in design, just a dose of whatever Madame Butterfly 
was taking!
(Erdos had it right...)

----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig though
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback come accross too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should checkout the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplifed design on the perl6 main page (just don't  stare 
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From biopython at maubp.freeserve.co.uk  Mon Sep 21 09:58:21 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 14:58:21 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <320fb6e00909210658n70f96727g1eb190579a746cfa@mail.gmail.com>

On Mon, Sep 21, 2009 at 2:05 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I think the Biopython layout is very nice and focused. Maybe
> a bit too minimal, but then again I don't like scrolling up and
> down the page to find the relevant bits, so less may be better.

Yes, trying to get everything on one screen was deliberate
(and works for most screen sizes).

> Reminds me of the simplifed design on the perl6 main page
> (just don't stare at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links,
> and additional links on a separate page.

Butterflies aside, yes - that is what we tried to do on the
Biopython page - just provide an "abstract", and links to
get people to the main content.

Peter


From ak at ebi.ac.uk  Mon Sep 21 10:06:44 2009
From: ak at ebi.ac.uk (Andreas Kähäri)
Date: Mon, 21 Sep 2009 15:06:44 +0100
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <20090921140644.GB12734@qux.windows.ebi.ac.uk>

On Thu, Sep 17, 2009 at 05:31:13PM -0700, Michael UNO wrote:
> 
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location". And
> can it tell what gene(s) are located at this coordinate?

Here's a basic script to do something like what you want to do, for a
specific species, chromosome, and region:

#!/usr/bin/perl -w

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  '-host' => 'ensembldb.ensembl.org',
  '-user' => 'anonymous'
);

my $species = 'Dog';

my ( $chrname, $chrstart, $chrend ) = ( '13', 40_500_000, 41_000_000 );

my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );

my $slice =
  $slice_adaptor->fetch_by_region( 'Chromosome', $chrname, $chrstart,
  $chrend );

my @genes = @{ $slice->get_all_Genes() };

if ( !@genes ) {
  print("No genes on that interval\n");
} else {
  printf( "%d genes on the interval:\n", scalar(@genes) );
  foreach my $gene (@genes) {
    printf(
      "%s (%s) [%s,%s,%s]\n",
      $gene->stable_id(), $gene->external_name() || 'No external name',
      $gene->start(), $gene->end(), $gene->strand() );
  }
}


Are you aware of the ensembl-dev mailing list and of the ensembl
helpdesk at helpdesk at ensembl.org (or via the "he!p" button in the genome
browser itself)?


Regards,
Andreas


-- 
Andreas Kähäri, Ensembl Software Developer            ()[]()[]
European Bioinformatics Institute (EMBL-EBI)          []()[]()
Wellcome Trust Genome Campus, Hinxton                 ()[]()[]
Cambridge CB10 1SD, United Kingdom                    []()[]()


From bosborne11 at verizon.net  Mon Sep 21 09:15:03 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 21 Sep 2009 09:15:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7E8EC05A-ED60-4F70-850D-16DD7E037281@verizon.net>

Mark,

That's nice! I wonder if we can move some content up-top, on the  
right, for less scrolling. I will play with this later today...

Brian O.


On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From anupam.contact at gmail.com  Mon Sep 21 10:18:52 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Mon, 21 Sep 2009 19:48:52 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
In-Reply-To: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
References: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
Message-ID: <82ec54570909210718v180f604btc835d88f2a9ec2fd@mail.gmail.com>

On Fri, Sep 18, 2009 at 8:50 PM, anupam sinha <anupam.contact at gmail.com> wrote:

> Dear all,
>                  I have installed BioPerl-1.6.0.tar.gz and
> Bioperl-run-1.6.0.tar.gz on a Fedora 7 system. I am trying to run the
> /usr/bin/bp_pairwise_kaks.pl script but keep getting this error:
>
> Must have bioperl-run pkg installed to run this script at
> /usr/bin/bp_pairwise_kaks.pl line 69.
>
> I have installed the run package from BioPerl, though. Can anyone help me
> out? Thanks in advance.
>
>
>
> Regards,
>
>
> Anupam Sinha
>
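
One way to narrow down this kind of error is to check whether the perl that runs the script can actually load the bioperl-run modules, and from which path. A minimal sketch: the module names listed are only examples (which modules bp_pairwise_kaks.pl tests for at line 69 is not confirmed here), and module_status is a made-up helper.

```perl
use strict;
use warnings;

# Try to load a module by name; return the file it was loaded from,
# or undef if the running perl cannot find or compile it.
sub module_status {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

# Example module names only; adjust to whatever the script requires.
for my $module (
    'Bio::Root::Version',
    'Bio::Tools::Run::Alignment::Clustalw',
    'Bio::Tools::Run::Phylo::PAML::Codeml',
  )
{
    my $path = module_status($module);
    printf "%-40s %s\n", $module, defined $path ? $path : 'NOT FOUND';
}

# Show where this perl looks for modules.
print "\@INC is:\n", map { "  $_\n" } @INC;
```

If the core modules are found but the Run modules are not, the two packages may have been installed under different prefixes, or for a different perl than the one /usr/bin/bp_pairwise_kaks.pl uses.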


From maj at fortinbras.us  Mon Sep 21 10:49:25 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 10:49:25 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>

Please view the latest 
http://www.bioperl.org/wiki/Main_Page_Beta
No graphics. I incline towards more text, but you
already knew that.
MAJ
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 12:22 AM
Subject: [Bioperl-l] a Main Page proposal


> Hello all,
> 
> As Brian articulated so well for many of us, 
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion 
> page. 
> 
> cheers and thanks, 
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From David.Messina at sbc.su.se  Mon Sep 21 13:03:56 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 21 Sep 2009 19:03:56 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
Message-ID: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>

Hi Mark,
Thanks for taking on this (much needed) refresh.

I think your current version is substantially better than what we have now.
Still, I'd argue that something much more concise like the Biopython page
would make a bigger impact on visitors' ability to find what they're looking
for.

It's not that the details you have under each section shouldn't be
available, but rather that they could be clicked through to instead of being
on the front page.

The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, and that section has the most valuable real estate on the page.
Those who don't know and are curious will probably be able to find it (the
word About on the front page of a website has become an idiom for "click here
to read the details about this").


Dave


From cjfields at illinois.edu  Mon Sep 21 13:42:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 12:42:10 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <5C240DA6-6B3D-4E64-A8BC-1FBC90FFA471@illinois.edu>

On Sep 21, 2009, at 12:03 PM, Dave Messina wrote:

> Hi Mark,
> Thanks for taking on this (much needed) refresh.
>
> I think your current version is substantially better than what we  
> have now.
> Still, I'd argue that something much more concise like the Biopython  
> page
> would make a bigger impact on visitors' ability to find what they're  
> looking
> for.
>
> It's not that the details you have under each section shouldn't be
> available, but rather that they could be clicked through to instead  
> of being
> on the front page.
>
> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, and that section has the most valuable real estate on  
> the page.
> Those who don't know and are curious will probably be able to find  
> it (the
> word About on the front page of a website has become an idiom for  
> "click her
> to read the details about this").
>
>
>
> Dave

How about this version (it's on my talk page):

http://www.bioperl.org/wiki/User_talk:Cjfields

chris


From maj at fortinbras.us  Mon Sep 21 13:45:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 13:45:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <42FBB964C0EA44FABCB50364C567A009@NewLife>

A nearly completely minimal solution is at Main Page Beta
----- Original Message ----- 
From: "Dave Messina" <David.Messina at sbc.su.se>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 1:03 PM
Subject: Re: [Bioperl-l] a Main Page proposal


> Hi Mark,
> Thanks for taking on this (much needed) refresh.
> 
> I think your current version is substantially better than what we have now.
> Still, I'd argue that something much more concise like the Biopython page
> would make a bigger impact on visitors' ability to find what they're looking
> for.
> 
> It's not that the details you have under each section shouldn't be
> available, but rather that they could be clicked through to instead of being
> on the front page.
> 
> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already know what
> BioPerl is, and that section has the most valuable real estate on the page.
> Those who don't know and are curious will probably be able to find it (the
> word About on the front page of a website has become an idiom for "click her
> to read the details about this").
> 
> 
> 
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From armendarez77 at hotmail.com  Mon Sep 21 17:01:12 2009
From: armendarez77 at hotmail.com (armendarez77 at hotmail.com)
Date: Mon, 21 Sep 2009 14:01:12 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
Message-ID: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>


Hello,

Is there a function to blast one query sequence against multiple blast databases?  For example, I want to blast a sequence against all Microbial Genomes.  Currently, I can do it by placing multiple Microbial databases (eg. Microbial/100226, Microbial/101510, etc) into an array and iterate through them using a foreach loop.  Each individual database is placed in the '-data' parameter and the blast is performed.

Example Code:

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;

my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690 Microbial/1063);
my $prog  = 'blastn';    # assumption: not set in the original post; use the program that fits your query
my $e_val = '1e-3';

# Submit every query sequence to each database in turn
foreach my $db (@microbDbs) {
  my @params = ( '-prog'       => $prog,
                 '-data'       => $db,
                 '-expect'     => $e_val,
                 '-readmethod' => 'xml' );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
  my $v = 1;
  my $str = Bio::SeqIO->new( -file => 'test.fa', '-format' => 'fasta' );
  while ( my $input = $str->next_seq() ) {
    my $r = $factory->submit_blast($input);

    #Code continues...

  }
}

Is there a more efficient way to accomplish this?

If this topic has been discussed please point the way.

Thank you,

Veronica

 		 	   		  


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:10:56 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:10:56 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>

You may need to set up BLAST locally (not a big job) as I don't think you can blast against multiple databases with B:T:R:RemoteBlast. 
Or you could do it manually on NCBI's site, where you can filter results by Entrez query (e.g. 1239[taxid] for Firmicutes): http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query 
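
A rough sketch of building such an Entrez query for several organisms at once. It assumes the numeric part of names like Microbial/100226 is an NCBI taxonomy ID, which may not hold for every database name; entrez_query_for is a hypothetical helper, not a BioPerl function.

```perl
use strict;
use warnings;

# Build an Entrez query that limits a search to several organisms.
# Names without a trailing "/<digits>" part are skipped.
sub entrez_query_for {
    my @db_names = @_;
    my @taxids;
    for my $name (@db_names) {
        push @taxids, $1 if $name =~ m{/(\d+)\z};
    }
    return join ' OR ', map { sprintf 'txid%d[ORGN]', $_ } @taxids;
}

my $query = entrez_query_for(
    qw(Microbial/100226 Microbial/101510 Microbial/103690 Microbial/1063)
);
print "$query\n";
```

The RemoteBlast POD describes an ENTREZ_QUERY entry in its %HEADER hash for passing a string like this along with a search against a single broad database; check the documentation for your installed version before relying on it.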

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of armendarez77 at hotmail.com
> Sent: Tuesday, 22 September 2009 9:01 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> 
> 
> 
> 
> 
> 
> 
> Hello,
> 
> Is there a function to blast one query sequence against multiple blast
> databases?  For example, I want to blast a sequence against all Microbial
> Genomes.  Currently, I can do it by placing multiple Microbial databases (eg.
> Microbial/100226, Microbial/101510, etc) into an array and iterate through
> them using a foreach loop.  Each individual database is placed in the '-data'
> parameter and the blast is performed.
> 
> Example Code:
> 
> use strict;
> use Bio::Tools::Run::RemoteBlast;
> 
> my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690
> Microbial/1063);
> my $e_val= '1e-3';
> 
> foreach my $db(@microbDbs){
>   my @params = ( '-prog' => $prog,
>                          '-data' => $db,
>                          '-expect' => $e_val,
>                          '-readmethod' => 'xml' );
> 
>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>   my $v = 1;
>   my $str = Bio::SeqIO->new(-file=>'test.fa' , '-format' => 'fasta' );
>   while (my $input = $str->next_seq()){
>     my $r = $factory->submit_blast($input);
> 
>     #Code continues...
> 
> }
> 
> Is there a more efficient way to accomplish this?
> 
> If this topic has been discussed please point the way.
> 
> Thank you,
> 
> Veronica
> 
> 
> _________________________________________________________________
> Microsoft brings you a new way to search the web.  Try  Bing(tm) now
> http://www.bing.com?form=MFEHPG&publ=WLHMTAG&crea=TEXT_MFEHPG_Core_tagline_try
> bing_1x1
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bill at genenformics.com  Mon Sep 21 18:21:26 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Mon, 21 Sep 2009 15:21:26 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
Message-ID: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>

BLAST DBs can be concatenated into a single target (.nal or .pal) file.

Check this out:

http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
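For a local BLAST installation, such an alias file is just a short text file. Here is a minimal sketch in Perl, assuming the usual TITLE/DBLIST layout; the helper name write_blast_alias, the alias name and the database names are made up for illustration, and the listed databases would already have to exist locally as formatted BLAST databases:

```perl
use strict;
use warnings;

# Hypothetical helper: write a BLAST alias file (.nal for nucleotide,
# .pal for protein) that groups several local databases under one name.
sub write_blast_alias {
    my ( $alias, $type, $title, @dbs ) = @_;
    my $ext  = $type eq 'prot' ? 'pal' : 'nal';
    my $file = "$alias.$ext";
    open( my $fh, '>', $file ) or die "Cannot write $file: $!";
    print {$fh} "TITLE $title\n";
    print {$fh} "DBLIST @dbs\n";    # space-separated list of member databases
    close $fh or die "Cannot close $file: $!";
    return $file;
}

# Illustrative names only; substitute the databases you formatted locally.
my $file = write_blast_alias( 'microbial_all', 'nucl',
    'Selected microbial genomes', qw(100226 101510 103690) );
print "wrote $file\n";
```

Pointing a local blast run at the alias name (without the extension) should then search all listed databases in one go.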

Bill

> You may need to set up blast locally (not a big job) as I don't think you
> can blast against multiple databases with B:T:R:RemoteBlast.
> Or you could do it manually on NCBI's site, where you can filter results by
> Entrez query (e.g. 1239[taxid] for Firmicutes)
> http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
>
> --Russell
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of armendarez77 at hotmail.com
>> Sent: Tuesday, 22 September 2009 9:01 a.m.
>> To: bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
>>
>> Hello,
>>
>> Is there a function to blast one query sequence against multiple blast
>> databases?  For example, I want to blast a sequence against all Microbial
>> Genomes.  Currently, I can do it by placing multiple Microbial databases
>> (e.g. Microbial/100226, Microbial/101510, etc.) into an array and iterating
>> through them using a foreach loop.  Each individual database is placed in
>> the '-data' parameter and the blast is performed.
>>
>> Example Code:
>>
>> use strict;
>> use Bio::Tools::Run::RemoteBlast;
>>
>> my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690
>> Microbial/1063);
>> my $e_val= '1e-3';
>>
>> foreach my $db(@microbDbs){
>>   my @params = ( '-prog' => $prog,
>>                          '-data' => $db,
>>                          '-expect' => $e_val,
>>                          '-readmethod' => 'xml' );
>>
>>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>   my $v = 1;
>>   my $str = Bio::SeqIO->new(-file=>'test.fa' , '-format' => 'fasta' );
>>   while (my $input = $str->next_seq()){
>>     my $r = $factory->submit_blast($input);
>>
>>     #Code continues...
>>
>>   }
>> }
>>
>> Is there a more efficient way to accomplish this?
>>
>> If this topic has been discussed please point the way.
>>
>> Thank you,
>>
>> Veronica
>>
>>


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:48:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:48:26 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>

That doesn't work with remote databases though.
B:T:R:RemoteBlast uses the QBlast API (I think) so you're limited to the prebuilt databases NCBI offers.
http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html 

Another thing to try is space-separating your db list - I know it works with local blasts.
You could also bypass RemoteBlast and do it yourself by POSTing via URL.

This seems to work with multiple databases but you'd need to experiment:

http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microbial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&EXPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides&CMD=Put
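
A URL like the one above can also be assembled in Perl rather than typed by hand. This is only a sketch: it reuses the parameters from that URL, the helper names percent_encode and build_put_url are invented here, and whether NCBI honours a space-separated DATABASE list is still something to experiment with:

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved characters and '/',
# so the database list comes out in the same form as the URL above.
sub percent_encode {
    my ($text) = @_;
    $text =~ s{([^A-Za-z0-9\-._~/])}{sprintf( '%%%02X', ord($1) )}ge;
    return $text;
}

# Hypothetical helper: build a QBlast "Put" URL for one query and
# a space-separated, double-quoted list of databases.
sub build_put_url {
    my ( $query, @dbs ) = @_;
    my @params = (
        QUERY        => $query,
        DATABASE     => '"' . join( ' ', @dbs ) . '"',
        HITLIST_SIZE => 10,
        FILTER       => 'L',
        EXPECT       => 10,
        FORMAT_TYPE  => 'HTML',
        PROGRAM      => 'blastn',
        CLIENT       => 'web',
        SERVICE      => 'plain',
        NCBI_GI      => 'on',
        PAGE         => 'Nucleotides',
        CMD          => 'Put',
    );
    my @pairs;
    while ( my ( $key, $value ) = splice( @params, 0, 2 ) ) {
        push @pairs, $key . '=' . percent_encode($value);
    }
    return 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?' . join( '&', @pairs );
}

print build_put_url( '257700677',
    qw(Microbial/100226 Microbial/101510 Microbial/103690) ), "\n";
```

Keeping the parameters in an ordered list rather than a hash preserves the order used in the hand-written URL, which makes the two easy to compare.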


--Russell




From Russell.Smithies at agresearch.co.nz  Mon Sep 21 19:04:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 11:04:54 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA19@exchsth.agresearch.co.nz>

If you want to "manually" use Perl and QBlast, here's some example code.
I don't remember where it came from but it works well  :-)

**Ignore the UserAgent stuff, our firewall is fairly well tied down.

--Russell

============================

#!perl -w
use strict;
use warnings;

$| = 1;    # unbuffered output so progress messages appear immediately

use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
push @{ $ua->requests_redirectable }, 'POST';   # LWP doesn't follow redirects for POST by default
$ua->agent('Mozilla/5.0');

#$ua->proxy( [ 'http', 'ftp' ] => 'http://username:password at your.proxy.if.required:8080' );

my $verbose = 1;
my $seq     = getSequence();
my ( $blast, $taxonomy ) = queryQBlast($seq);
$verbose && print "saving result\n";
saveToFile( $blast,    "blast.txt" );
saveToFile( $taxonomy, "taxonomy.html" );
$verbose && print "Done.\n";

sub queryQBlast {
  my ($seq) = @_;
  $seq =~ s/[\d\n\W]//g;
  my $sleepTime          = 0;
  my $sleepTimeIncrement = 5;
  my $totalSleepTime     = 0;
  my $maxSleepTime       = 600;    # 10 min
  my ( $rid, $rtoe ) = startQBlast($seq);
  my ( $blast, $taxonomy );

  while ( !$blast ) {
    $verbose && printf "wait %3d seconds\n", $sleepTime;
    sleep $sleepTime;
    $totalSleepTime += $sleepTime;    # count the time actually slept
    ( $blast, $taxonomy ) = retrieveQBlastResult($rid);
    $sleepTime += $sleepTimeIncrement unless ( $sleepTime > 100 );
    last if ( $totalSleepTime > $maxSleepTime );    # give up after about 10 min
  }
  return ( $blast, $taxonomy );
}

sub startQBlast {
  my ($sequence) = @_;
  my ( $expect, $wsize, $filter, $mega );
  my $hitList = 100;
  if ( length($sequence) <= 20 ) {
    $expect = 1000;
    $wsize  = 7;
    $mega   = "on";
    $filter = "";
  }
  else {
    $expect = 10;
    $wsize  = 28;
    $mega   = "on";
    $filter = "L";    # Low complexity
  }
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL . "QUERY=$sequence";
  $url .=
"&DATABASE=nr&HITLIST_SIZE=${hitList}&FILTER=${filter}&EXPECT=${expect}&FORMAT_TYPE=Text";
  $url .=
    "&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides";
  $url .= "&SHOW_OVERVIEW=&WORD_SIZE=${wsize}&MEGABLAST=${mega}&CMD=Put";
  my $req = HTTP::Request->new( GET => $url );
  my $content = $ua->request($req)->content;
  $content =~ s/\s+/ /g;
  my ( $rid, $rtoe ) = $content =~
    /QBlastInfoBegin RID = ([\d\-\.\w]+) RTOE = (\d+) QBlastInfoEnd/;
  if ( !$rid ) { die "ERROR: missing RID in QBlast response\n"; }
  $verbose && print "RID $rid\n";
  return ( $rid, $rtoe );
}

sub retrieveQBlastResult {
  my ($rid)     = @_;
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL
    . "RID=$rid&CMD=Get&SHOW_OVERVIEW=&SHOW_LINKOUT=&FORMAT_TYPE=Text";
  my ( $blast, $taxonomy, $req );
  $req = HTTP::Request->new( GET => $url );
  $blast = $ua->request($req)->content;
  if ( $blast =~ /\s+Status=WAITING/ ) {
    $blast = "";
  }
  elsif ( $blast =~ /\s+Status=UNKNOWN/ ) {
    print "Error in processing\nRID $rid\n";
    exit;
  }
  else {
    $verbose && print "got blast result\n";
    $verbose && print "retrieving taxonomy data\n";
    $url = $qblastURL . "CMD=Get&RID=$rid&FORMAT_OBJECT=TaxBlast&NCBI_GI=on";
    $req = HTTP::Request->new( GET => $url );
    $taxonomy = $ua->request($req)->content;
    $taxonomy = "" if ( $taxonomy =~ /No valid taxids found in the alignment/ );
  }
  return ( $blast, $taxonomy );
}

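sub saveToFile {
  my ( $data, $file ) = @_;
  $data = '' unless defined $data;    # nothing retrieved, e.g. after a timeout
  open( my $out, '>', $file ) or die "Cannot write $file: $!";
  print {$out} $data;
  close $out or die "Cannot close $file: $!";
}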

sub getSequence {
  return qq{
AAAGGATTTATTGACGATGCGAACTACTCCGTTGGCCTGTTGGATGAAGGAACAAA
CCTTGGAAATGTTATTGATAACTATGTTTATGAACATACCCTGACAGGAAAAAATGCAT
TTTTTGTGGGGGATCTTGGGAAGATCGTGAAGAAGCACAGTCAGTGGCAGACCGTGGTG
GCTCAGATAAAGCCGTTTTACACGGTGAAGTGCAACTCCACTCCAGCCGTGCTTGAGAT
CTTGGCAGCTCTTGGAACTGGGTTTGCTTGTTCCAGCAAAAATGAAATGGCTTTAGTGC
AAGAATTGGGTGTATCTCCAGAAAACATCATTTTCACAAGTCCTTGTAAGCAAGTGTCT
CAGATAAAGTATGCAGCAAAAGTTGGAGTAAATATTATGACATGTGACAATGAGATTGA
ATTAAAGAAAATTGCAAGGAATCACCCAAATGCCAAGGTCTTACTACATATTGCAACAG
AAGATAATATTGGAGGTGAAGATGGTAACATGAAGTTTGGCACTACACTGAAGAATTGT
AGGCATCTTTTGGAATGTGCCAAGGAACTTGATGTCCAAATAATTGGGGTTAAATTTCA
TGTTTCAAGTGCTTGCAAAGAATATCAAGTATATGTACATGCCCTGTCTGATGCTCGAT
GTGTGTTTGACATGGCTGGAGAGTTTGGCTTTACAATGAACATGTTAGACATCGGTGGA
GGCTTCACAGGAACTGAAATTCAGTTGGAAGAGGTTAATCATGTTATCAGTCCTCTGTT
GGATATTTACTTCCCTGAAGGATCTGGCATTCAGATAATTTCAGAACCTGGAAGCTACT
ATGTATCTTCTGCGTTTACACTTGCAGTCAATATTATTGCTAAGAAAGTTGTTGAAAAT
GATAAATTTTCCTCTGGAGTAGAAAAAAATGGGAGTGATGAGCCAGCCTTCGTGTATTA
CATGAATGATGGTGTTTATGGTTCTTTTGCGAGTAAGCTTTCTGAGGACTTAAATACCA
TTCCAGAGGTTCACAAGAAATACAAGGAAGATGAGCCTCTGTTTACAAGCAGCCTTTGG
GGTCCATCCTGTGATGAGCTTGATCAAATTGTGGAAAGCTGTCTTCTTCCTGAGCTGAA
TGTGGGAGATTGGCTTATCTTTGATAACATGGGAGCAGATTCTTTCCACGAACCATCTG
CTTTTAATGATTTTCAGAGGCCAGCTATTTATTTCATGATGTCATTCAGTGATTGGTAT
GAGATGCAAGATGCTGGAATTACTTCAGATGCAATGATGAAAAACTTCTTCTTTGCACC
CTCTTGTATTCAGCTGAGCCAAGAAGACAGCTTTTCCACTGAAGCT};
}

================================
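
The RID/RTOE extraction in startQBlast can be tried out without contacting NCBI. A small sketch, assuming the response carries a QBlastInfo block of the shape the script's regex expects; the sample text and the helper name parse_qblast_info are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the request ID (RID) and the estimated time to
# completion in seconds (RTOE) out of the text returned by a "Put" request.
sub parse_qblast_info {
    my ($content) = @_;
    $content =~ s/\s+/ /g;    # collapse newlines so the block sits on one line
    my ( $rid, $rtoe ) = $content =~
      /QBlastInfoBegin RID = ([\d\-\.\w]+) RTOE = (\d+) QBlastInfoEnd/;
    return ( $rid, $rtoe );   # both undef if no block was found
}

# Made-up response fragment, not real server output.
my $sample = <<'END';
<!--QBlastInfoBegin
    RID = ABC123XYZ
    RTOE = 18
QBlastInfoEnd
-->
END

my ( $rid, $rtoe ) = parse_qblast_info($sample);
print "RID $rid, results expected in about $rtoe seconds\n";
```

The script above never uses the RTOE value; sleeping for roughly that long before the first poll might save a few pointless requests.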



From Russell.Smithies at agresearch.co.nz  Mon Sep 21 16:51:51 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 08:51:51 +1200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>

Here's a few comments to ignore at will :-)

How about using a different default skin so it doesn't look like all the other installations of MediaWiki?
I've attached a screenshot of one of my wikis using the "Daddio" skin but a bit of crafty CSS can do wonders.
Also, there's a lot of duplication with most of the links on Mediawiki:Sidebar also appearing on the main page content.
The "Treeview" is a nice extension as well for tidying up complex menus http://semeb.com/dpldemo/index.php?title=Treeview_extension 

I've got a bit of experience with wikis and extensions (we use LOTS of extensions) so let me know if there's anything you need.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Monday, 21 September 2009 4:23 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] a Main Page proposal
> 
> Hello all,
> 
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
> 
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-------------- next part --------------
A non-text attachment was scrubbed...
Name: daddio.png
Type: image/png
Size: 51263 bytes
Desc: daddio.png
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090922/643d7f79/attachment-0002.png>

From cjfields at illinois.edu  Mon Sep 21 23:38:18 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:38:18 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
Message-ID: <B9C1E8A4-BDE0-45E7-858B-8BFABA1D2480@illinois.edu>

Russell, Mark,

It would be nice to change the background, just don't want it to be  
too distracting.

Also (I mentioned this to Mark off-list), I think the sidebar could be  
cleaned up considerably, but not until this becomes the default.  I  
also like the use of the TreeView extension, very nice!  Does anyone have  
privs for the wiki to test it out?

chris



From cjfields at illinois.edu  Mon Sep 21 23:56:58 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:56:58 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
Message-ID: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>

Just a note that the second alpha is out and propagating its way  
around the intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This should address the bugs reported by Scott from the last release.   
Just a note, but I am seeing a warning popping up with 64-bit perl  
5.10.1 on Mac with PopGen tests (I think it's a floating point  
addition issue).  Let me know if this is popping up elsewhere.

Enjoy!

chris


From jcline at ieee.org  Mon Sep 21 23:59:09 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 21 Sep 2009 22:59:09 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
Message-ID: <4AB84B8D.5080005@ieee.org>

Throwing this out there:

- there should be a screenshot section (whatever that means for bioperl)

- the grammar of the beta page should be more correct.

"Welcome to BioPerl, a community effort to produce Perl code which is
useful in biology. "
==> "Welcome to BioPerl, a community effort to produce Perl code serving
as a useful tool in the field of biology."

>>The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, ...  Dave<<


Most good software front pages say, in a couple sentences, "what it is
and what it's for", including pictures (as screenshots).

I would bet a ton of visitors don't know what bioperl is, what it is
used for, or how it can benefit them.  There is likely a metric for this (web
stats): the ratio of new page visits that bounce away vs. new
click-throughs from the front page to the download or docs section, i.e. a
visitor found the page and didn't continue reading.  I don't really know
all the things bioperl is good for and I've been reading about it here &
there for a while.

I like the following from the About and I believe it fits well on a
front page, expanding "toolkit" to "software library":

"What is Bioperl? It is an open source bioinformatics software library
used by researchers all over the world. If you're looking for a script
built to fit your exact needs you probably won't find it in Bioperl.
What you will find is a diverse set of Perl modules that will enable you
to write your own script, and a community of people who are willing to
help you. "

The old school definition of software library is something like: "useful
routines which can be used by an application (& not itself an
application)" which is basically the description above.

I also like the intro from wikipedia, which I found more informative
about bioperl, and would be good for a front page:

"BioPerl [1] is a collection of Perl modules that facilitate the
development of Perl scripts for bioinformatics applications. It has
played an integral role in the Human Genome Project.[2]  It is an active
open source software project supported by the Open Bioinformatics
Foundation.  In order to take advantage of BioPerl, the user needs a
basic understanding of the Perl programming language including an
understanding of how to use Perl references, modules, objects and methods."

The screenshots could also include pics of books on bioperl or perl+bio,
that would be neat.  (Tisdall's book comes to mind here)


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From lelbourn at science.mq.edu.au  Tue Sep 22 01:05:28 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 15:05:28 +1000
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB36451.3030207@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
Message-ID: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>

Hi Roy,

Thanks for that, works well, but there are no _gsf_tag_hash values?  
I'm particularly interested in the locus id. Obviously the translation  
could be problematic if the whole gene is not included after  
truncation, but things like the note, product, protein_id would be  
good. I had a look at the code for the method and couldn't see any  
obvious reason why those values didn't make it across. Should I submit  
this as a bug, or is there something I'm missing?


Regards,
Liam.


On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:

> Hi Liam,
>
> I just discovered your message, which has not yet been replied to.  
> What you require has been discussed in a recent thread:
> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>
> Try using trunc_with_features from Bio::SeqUtils:
>
> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
> Cheers.
> Roy.
>
> Liam Elbourne wrote:
>> Hi All,
>> Is there a method or methodology that will produce a fully fledged  
>> Seq  object with all the associated metadata given a start and end   
>> position? To clarify, I create a sequence object from a genbank file:
>> ****
>> my $io  = Bio::SeqIO->new(as per usual);
>> my $seqobj = $io->next_seq();
>> ****
>> I now want:
>> my $sub_seqobj = $seqobj between 300 and 2000
>> where $sub_seqobj is a Seq object (which I appreciate is an   
>> 'aggregate' of objects) too. The "trunc" method only returns a   
>> PrimarySeq object which lacks all the annotation etc. I've  
>> previously  done this task by iterating through feature by feature  
>> and parsing out  what I needed, but thought there might be a more  
>> elegant approach...
>> Regards,
>> Liam Elbourne.
>
> -- 
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.

______________________________

Dr Liam Elbourne
Research Fellow (Bioinformatics)
Paulsen Laboratory
Macquarie University
Sydney
Australia.

http://www2.oxfam.org.au/trailwalker/Sydney/team/228


From roy.chaudhuri at gmail.com  Tue Sep 22 03:17:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 08:17:26 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
Message-ID: <4AB87A06.4000209@gmail.com>

Hi Liam,

Yes, that is a bug - I think it is to do with the Feature Annotation 
rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
wrote don't check for the presence of tags, just the coordinates of the 
feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
take a look when I get a chance.

Cheers.
Roy.

Liam Elbourne wrote:
> Hi Roy,
> 
> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
> particularly interested in the locus id, obviously the translation could 
> be problematic if the whole gene is not included after truncation, but 
> things like the note, product, protein_id would be good. I had a look at 
> the code for the method and couldn't see any obvious reason why those values 
> didn't make it across. Should I submit this as a bug, or is there 
> something I'm missing?
> 
> 
> Regards,
> Liam.
> 
> 
> 
> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
> 
>> Hi Liam,
>>
>> I just discovered your message, which has not yet been replied to. 
>> What you require has been discussed in a recent thread:
>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>
>> Try using trunc_with_features from Bio::SeqUtils:
>>
>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>> Cheers.
>> Roy.
>>
>> Liam Elbourne wrote:
>>> Hi All,
>>> Is there a method or methodology that will produce a fully fledged 
>>> Seq  object with all the associated metadata given a start and end 
>>>  position? To clarify, I create a sequence object from a genbank file:
>>> ****
>>> my $io  = Bio::SeqIO->new(as per usual);
>>> my $seqobj = $io->next_seq();
>>> ****
>>> I now want:
>>> my $sub_seqobj = $seqobj between 300 and 2000
>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>> previously  done this task by iterating through feature by feature 
>>> and parsing out  what I needed, but thought there might be a more 
>>> elegant approach...
>>> Regards,
>>> Liam Elbourne.
>>
>> -- 
>> Dr. Roy Chaudhuri
>> Department of Veterinary Medicine
>> University of Cambridge, U.K.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> ______________________________
> 
> Dr Liam Elbourne
> Research Fellow (Bioinformatics)
> Paulsen Laboratory
> Macquarie University
> Sydney
> Australia.
> 
> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
> 
> 
> 


From lelbourn at science.mq.edu.au  Tue Sep 22 03:14:44 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 17:14:44 +1000
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
Message-ID: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>

So I also had no problem running the code as written by Jose (Bioperl  
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.  
Work is underway to correct them, but do not expect this code to give  
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it  
does produce different numbers to $stats->distance(-).
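
For anyone wanting to check either set of numbers by hand, here is a minimal sketch of the textbook Jukes-Cantor distance in plain Perl. It is not the DNAStatistics or dnadist implementation; the sub name jc_distance and the two short test sequences are made up, and skipping every column that contains a gap is an assumption of this sketch (the two programs may well treat gaps differently, which could be one source of disagreement).

```perl
use strict;
use warnings;

# Textbook Jukes-Cantor distance between two aligned DNA strings:
#   d = -3/4 * ln(1 - 4/3 * p)
# where p is the proportion of compared columns that differ.
# Assumption of this sketch: columns with a gap ('-') in either
# sequence are skipped entirely.
sub jc_distance {
    my ($seq_a, $seq_b) = @_;
    die "aligned sequences must have equal length\n"
        unless length($seq_a) == length($seq_b);

    my ($compared, $differences) = (0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $x = uc substr($seq_a, $i, 1);
        my $y = uc substr($seq_b, $i, 1);
        next if $x eq '-' || $y eq '-';    # ignore gapped columns
        $compared++;
        $differences++ if $x ne $y;
    }
    return undef unless $compared;         # nothing to compare

    my $p = $differences / $compared;      # observed proportion of differences
    return undef if $p >= 0.75;            # correction is undefined here
    return -0.75 * log(1 - 4 * $p / 3);
}

# Made-up example: one difference in ten compared columns.
printf "%.4f\n", jc_distance('ACGTACGTAC', 'ACGTACGTAA');
```

Running this on a pair of sequences from the alignment and comparing with both programs should at least show which one follows the plain formula.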

I tried write_matrix from Bio::Matrix::IO - got a message saying it  
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change  
your life....
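
A small sketch of what I mean, in plain Perl with made-up data (the class name My::Matrix and the describe sub are hypothetical, nothing here is BioPerl): it reports whether a variable holds a blessed object or a plain reference, which is exactly the distinction behind the "unblessed reference" error, and then dumps the structure.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(blessed);

# Say what a variable holds: a blessed object, a plain (unblessed)
# reference, an ordinary scalar, or undef.
sub describe {
    my ($thing) = @_;
    return 'object of class ' . blessed($thing) if blessed($thing);
    return 'unblessed ' . ref($thing) . ' reference' if ref($thing);
    return defined $thing ? 'plain scalar' : 'undef';
}

my $plain  = [ [ 0, 0.1 ], [ 0.1, 0 ] ];                # made-up values
my $object = bless { values => $plain }, 'My::Matrix';  # hypothetical class

print describe($plain),  "\n";
print describe($object), "\n";

# Dumper shows the full nested structure, not just ARRAY(0x...).
$Data::Dumper::Indent = 1;
print Dumper($plain);
```

Calling a method on $plain would die with the same message Jose saw, while $object would at least get as far as method lookup.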

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  
> older code didn't recognize 'jukes-cantor' but you can try the  
> abbreviation 'jc' -- better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  
> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  
> function) rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  
>> upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  
>> reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  
>> following message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found at
>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html
>> and is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________


From maj at fortinbras.us  Tue Sep 22 07:12:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 07:12:38 -0400
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>

Thanks Liam-- I think the discrepancy between dnadist and the
module is worth making a bug report for- can you do that and
include the data (or part of it) you were using?
Jason, is that work really underway, or should someone pick up
that ball?
----- Original Message ----- 
From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>; "Jose ." 
<joseguillin at hotmail.com>
Sent: Tuesday, September 22, 2009 3:14 AM
Subject: [Bioperl-l] dnastatistics


So I also had no problem running the code as written by Jose (Bioperl
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.
Work is underway to correct them, but do not expect this code to give
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change
your life....

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  older code 
> didn't recognize 'jukes-cantor' but you can try the  abbreviation 'jc' --  
> better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  of 
> Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  function) 
> rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  following 
>> message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found at
>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html
>> and is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From dan.bolser at gmail.com  Tue Sep 22 09:09:50 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 14:09:50 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
Message-ID: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>

Hi all,

I'm reading in a blasttable format blast result, filtering, and hoping
to write out a similarly formatted result. Based on experience with
SeqIO, I expected to do something like the following:

use Bio::SearchIO;

## Open the sequence search report
my $seqI = Bio::SearchIO->
  new( -file   => $file,
       -format => $format,
     );

## Open the output report
my $seqO = Bio::SearchIO->
  new( -file   => ">OUTPUT",
       -format => $format,
     );

while( my $result = $seqI->next_result ) {
  ## Do some filtering...

  $seqO->write_result( $result );
}


However, the above method does not work here. Is this for some deep
reason, or could the above method (based on the way SeqIO works) be
made to work? I'm guessing that the SearchIO object conversion is
simply harder to do than with SeqIO?

So now I'm trying to use the correct method, via
Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
1 to 1 correspondence between the fields in the blasttable and the
columns provided by the writer. So far I have something like this:

blasttable ->		HSPTableWriter

(result) query_name	query_name
(hit) name		hit_name
(hsp) frac_identical	frac_identical_query?
			frac_identical_hit?
(hsp) hsp_length	length_aln_query?
			length_aln_hit?
(?) mismatches		?
(hsp) gaps		?
			gaps_query?
			gaps_hit?
			gaps_total?
(hsp) start('query')	start_query
(hsp) end('query')	end_query
(hsp) start('hit')	start_hit
(hsp) end('hit')	end_hit
(hsp) significance	expect
(hsp) bits		bits


For (hsp) frac_identical, it seems as if the (undocumented)
frac_identical_total column is giving the right value; however, it's
hard to be certain because the format of the value is different
(the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
change the output format of HSPTableWriter?

Is there any improvement on the above mapping? It seems strange that I
can read in a blasttable, but I can't write one out (using a generic
object interface). For example, where do I get the hsp length from
(which column)?

I'm sure this has come up before, so apologies for not being able to
track down the appropriate docs.


Thanks for any help,
Dan.

P.S. when dumping a blasttable from a blasttable using HSP methods,
how should I calculate the number of mismatches? Currently I'm trying:

      my $len = $hsp->length;
      my $match = $len * $hsp->frac_identical;
      my $mismatch = $len - $match;

but the resulting values differ from those in the original blasttable.
I have the feeling this is a FAQ ...


From cjfields at illinois.edu  Tue Sep 22 10:00:44 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:00:44 -0500
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
Message-ID: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>

On Sep 22, 2009, at 8:09 AM, Dan Bolser wrote:

> Hi all,
>
> I'm reading in a blasttable format blast result, filtering, and hoping
> to write out a similarly formatted result. Based on experience with
> SeqIO, I expected to do something like the following:
>
> use Bio::SearchIO;
>
> ## Open the sequence search report
> my $seqI = Bio::SearchIO->
> new( -file   => $file,
>      -format => $format,
>    );
>
> ## Open the output report
> my $seqO = Bio::SearchIO->
> new( -file   => ">OUTPUT",
>      -format => $format,
>    );
>
> while( my $result = $seqI->next_result ) {
> ## Do some filtering...
>
> $seqO->write_result( $result );
> }
>
>
> However, the above method does not work here. Is this for some deep
> reason, or could the above method (based on the way SeqIO works) be
> made to work? I'm guessing that the SearchIO object conversion is
> simply harder to do than with SeqIO?

This is something Jason could probably speak up on, but from my  
perspective it comes down to 'why?'.  This opens up a very hard-to- 
implement door (converting to and from, for instance, BLAST to HMMER),  
which doesn't make sense from the end-user perspective.  What most  
users want out of those formats is getting at the data in an easily  
accessible way, to further process them (filter, to GFF, etc), or to  
have them summarized.  the Writer classes take care of the latter.

There is a very generic, all-purpose write_result in Bio::SearchIO  
that just calls a ResultWriter object (and dies if it isn't  
present).  Note that this expects a ResultWriter, not a Hit/HSPWriter;  
it is write_result() after all. I think this kind of goes against the  
well-established API that exists with the other write_foo  
implementations for the IO classes, where the input/output format  
should match, but there you have it.

> So now I'm trying to use the correct method, via
> Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
> 1 to 1 correspondence between the fields in the blasttable and the
> columns provided by the writer. So far I have something like this:
>
> blasttable ->		HSPTableWriter
>
> (result) query_name	query_name
> (hit) name		hit_name
> (hsp) frac_identical	frac_identical_query?
> 			frac_identical_hit?
> (hsp) hsp_length	length_aln_query?
> 			length_aln_hit?
> (?) mismatches		?
> (hsp) gaps		?
> 			gaps_query?
> 			gaps_hit?
> 			gaps_total?
> (hsp) start('query')	start_query
> (hsp) end('query')	end_query
> (hsp) start('hit')	start_hit
> (hsp) end('hit')	end_hit
> (hsp) significance	expect
> (hsp) bits		bits
>
>
> For (hsp) frac_identical, it seems as if the (undocumented)
> frac_identical_total column is giving the right value; however, it's
> hard to be certain because the format of the value is different
> (the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
> change the output format of HSPTableWriter?

Not sure but it appears hard-coded.  This could probably be rewritten  
to spit out certain data attributes by name (e.g. you could ask for  
percent_identity), but I'm not sure.
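
On 93.51 versus 0.94: those look like the same quantity on two scales, a percentage to two decimals versus a fraction rounded to two decimals. A minimal sketch in plain Perl, assuming you pull the number of identical columns and the alignment length out of the HSP yourself (the sub name percent_identity and the counts 72 and 77 are invented for illustration, not taken from your report):

```perl
use strict;
use warnings;

# Percent identity to two decimals, in the style of the blasttable
# column, from a count of identical columns and the alignment length.
sub percent_identity {
    my ($identical, $aln_length) = @_;
    die "alignment length must be positive\n" unless $aln_length > 0;
    return sprintf '%.2f', 100 * $identical / $aln_length;
}

# Made-up counts: 72 identical columns in an alignment of 77 columns.
print percent_identity(72, 77), "\n";   # percentage, two decimals
printf "%.2f\n", 72 / 77;               # same value as a rounded fraction
```

Since the rounded fraction has already lost precision, the percentage would need to be computed from the unrounded value rather than from the table writer's output.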

> Is there any improvement on the above mapping? It seems strange that I
> can read in a blasttable, but I can't write one out (using a generic
> object interface). For example, where do I get the hsp length from
> (which column)?
>
> I'm sure this has come up before, so apologies for not being able to
> track down the appropriate docs.

 From the POD:

'Here are the columns that can be specified in the -columns
parameter when creating a HSPTableWriter object.  If a -columns
parameter is not specified, this list, in this order, will be used
as the default.'

In other words, you keep track of the columns (which appear 1-based).

> Thanks for any help,
> Dan.
> P.S. when dumping a blasttable from a blasttable using HSP methods,
> how should I calculate the number of mismatches? Currently I'm trying:
>
>     my $len = $hsp->length;
>     my $match = $len * $hsp->frac_identical;
>     my $mismatch = $len - $match;
>
> but the resulting values differ from those in the original blasttable.
> I have the feeling this is a FAQ ...

Maybe use seq_inds instead?

BTW, HSP length() defaults to the 'total' length (includes gaps).  The  
above calculation doesn't account for that.

With seq_inds, 'mismatch' are residue-only (no gaps); 'no_match' is  
mismatched residues + gaps (you have to also indicate whether this is  
based on the query or hit).

Also note that seq_inds deals with (1) mapping differences, e.g. any  
query that requires translation, and (2) frameshifts, such as from  
FASTX/Y output (again translated sequence output).  If you are dealing  
with a translated sequence you will want to account for those bits as  
well.
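
To illustrate the length point, here is a rough sketch in plain Perl that classifies every column of a pairwise alignment as identical, mismatch, or gap. It is not BioPerl code: the sub name count_columns and the two aligned strings are made up, and it assumes untranslated sequences, so none of the frameshift caveats above are handled.

```perl
use strict;
use warnings;

# Classify each alignment column as identical, mismatch, or gap.
# Both arguments are aligned strings of equal length with '-' for gaps.
sub count_columns {
    my ($query, $hit) = @_;
    die "aligned strings must have equal length\n"
        unless length($query) == length($hit);

    my %n = (identical => 0, mismatch => 0, gap => 0, length => length($query));
    for my $i (0 .. length($query) - 1) {
        my $q = uc substr($query, $i, 1);
        my $h = uc substr($hit,   $i, 1);
        if    ($q eq '-' || $h eq '-') { $n{gap}++ }
        elsif ($q eq $h)               { $n{identical}++ }
        else                           { $n{mismatch}++ }
    }
    return \%n;
}

# Made-up alignment: 10 columns, one gap and one true mismatch.
my $c = count_columns('ACGT-ACGTA', 'ACGTTACCTA');
printf "length=%d identical=%d mismatch=%d gap=%d\n",
    @{$c}{qw(length identical mismatch gap)};

# Subtracting identities from the total length also counts the gap column.
printf "length - identical = %d\n", $c->{length} - $c->{identical};
```

Note that, as far as I recall, the gap column in the tabular BLAST output counts gap openings rather than gapped positions, so the gap count here would not necessarily match that column either.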

chris


From cjfields at illinois.edu  Tue Sep 22 10:20:47 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:20:47 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB84B8D.5080005@ieee.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
Message-ID: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>

On Sep 21, 2009, at 10:59 PM, Jonathan Cline wrote:

> Throwing this out there:
>
> - there should be a screenshot section (whatever that means for bioperl)

The only area that would apply is for Gbrowse/Bio::Graphics.  For much  
of the rest that's a bit trickier, but it's possible.

> - the grammar of the beta page should be more correct.
>
> "Welcome to BioPerl, a community effort to produce Perl code which is
> useful in biology. "
> ==> "Welcome to BioPerl, a community effort to produce Perl code serving
> as a useful tool in the field of Biology."
>
>>> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already know
> what BioPerl is, ...  Dave<<
>
> Most good software front pages say, in a couple sentences, "what it is
> and what it's for", including pictures (as screenshots).

Right.

> I would bet a ton of visitors don't know what bioperl is, or what it is
> used for, or how it can benefit.  There is likely a metric for this (web
> stats) as the ratio of new page visits that bounce away vs. new
> clickthrus from the front page to the download or docs section, i.e. a
> visitor found the page and didn't continue reading.  I don't really know
> all the things bioperl is good for and I've been reading about it here &
> there for a while.
>
> I like the following from the About and I believe it fits well on a
> front page, expanding "toolkit" to "software library":
>
> "What is Bioperl? It is an open source bioinformatics software library
> used by researchers all over the world. If you're looking for a script
> built to fit your exact needs you probably won't find it in Bioperl.
> What you will find is a diverse set of Perl modules that will enable you
> to write your own script, and a community of people who are willing to
> help you."
>
> The old school definition of software library is something like: "useful
> routines which can be used by an application (& not itself an
> application)" which is basically the description above.
>
> I also like the intro from wikipedia, which I found more informative
> about bioperl, and would be good for a front page:
>
> "BioPerl [1] is a collection of Perl modules that facilitate the
> development of Perl scripts for bioinformatics applications. It has
> played an integral role in the Human Genome Project.[2]  It is an active
> open source software project supported by the Open Bioinformatics
> Foundation.  In order to take advantage of BioPerl, the user needs a
> basic understanding of the Perl programming language including an
> understanding of how to use Perl references, modules, objects and
> methods."
>
> The screenshots could also include pics of books on bioperl or perl+bio,
> that would be neat.  (Tisdall's book comes to mind here)

I tend to agree here, but Tisdall only discusses BioPerl in detail in  
the second book (Mastering Perl for Bioinformatics).  I think we're  
safe as long as we indicate that, just don't want to run into a  
situation like the recent issue that some users had with Gentleman's  
'R for Bioinformatics' book released last year.

I don't think it was intentional, but a lot of users purchased it  
thinking it would be a BioConductor book, mainly b/c it was advertised  
on the BioConductor website.  Unfortunately it had very little to do  
with BioC (or bioinformatics, really), and the reviews of the book  
reflect that.  It's unfortunate, as I found it to be a pretty good  
book on R.

-c

> ## Jonathan Cline
> ## jcline at ieee.org
> ## Mobile: +1-805-617-0223
> ########################


From cjfields at illinois.edu  Tue Sep 22 11:53:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 10:53:13 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
In-Reply-To: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
References: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
Message-ID: <2ED641E3-F69E-4513-B261-0949FDE35EBB@illinois.edu>

And just as quickly, I'm getting back lots of reports from CPAN Testers  
indicating more problems.  Some can be ignored (they appear to be due to  
the tester's local perl environment).  The following are the most  
significant; it appears a hard-coded SeqFeature_SQLite.t got bundled in  
somehow, so I'll drop an alpha 3 shortly.

chris

#   Failed test 'use Bio::SeqFeature::Annotated;'
#   at t/Annotation/Annotation.t line 23.
#     Tried to use 'Bio::SeqFeature::Annotated'.
#     Error:  Can't locate URI/Escape.pm in @INC (@INC contains: t/lib . /Users/david/cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n/blib/lib /Users/david/cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n/blib/arch /Users/david/cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n /sw/lib/perl5 /sw/lib/perl5/darwin /Users/david/cpantesting/perl-5.10.1/lib/5.10.1/darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/lib/5.10.1 /Users/david/cpantesting/perl-5.10.1/lib/site_perl/5.10.1/darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/lib/site_perl/5.10.1) at Bio/SeqFeature/Annotated.pm line 100.
# BEGIN failed--compilation aborted at Bio/SeqFeature/Annotated.pm line 100.
# Compilation failed in require at (eval 60) line 2.
# BEGIN failed--compilation aborted at (eval 60) line 2.
# Looks like you failed 1 test of 159.
t/Annotation/Annotation.t ....................
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/159 subtests
	(less 12 skipped subtests: 146 okay)


t/LocalDB/SeqFeature.t ....................... ok
DBD::SQLite::db prepare_cached failed: near "INDEXED": syntax error(1) at dbdimp.c line 271 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1678.

-------------------- EXCEPTION --------------------
MSG: near "INDEXED": syntax error(1) at dbdimp.c line 271
STACK Bio::DB::SeqFeature::Store::DBI::mysql::_prepare Bio/DB/SeqFeature/Store/DBI/mysql.pm:1678
STACK Bio::DB::SeqFeature::Store::DBI::SQLite::_features Bio/DB/SeqFeature/Store/DBI/SQLite.pm:665
STACK Bio::DB::SeqFeature::Store::get_features_by_attribute Bio/DB/SeqFeature/Store.pm:961
STACK toplevel t/LocalDB/SeqFeature.t:135
-------------------------------------------
# Looks like you planned 69 tests but only ran 40.
# Looks like your test died just after 40.
t/LocalDB/SeqFeature_SQLite.t ................
Failed 29/69 subtests


On Sep 21, 2009, at 10:56 PM, Chris Fields wrote:

> Just a note that the second alpha is out and propagating its way  
> around the intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This should address the bugs reported by Scott from the last  
> release.  Just a note, but I am seeing a warning popping up with 64- 
> bit perl 5.10.1 on Mac with PopGen tests (I think it's a floating  
> point addition issue).  Let me know if this is popping up elsewhere.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Tue Sep 22 12:01:51 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:01:51 -0700
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
	<39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
Message-ID: <1027EFFB-18B5-446B-A5B0-9DA628EEEF08@bioperl.org>

someone should pick up the ball.

On Sep 22, 2009, at 4:12 AM, Mark A. Jensen wrote:

> Thanks Liam-- I think the discrepancy between dnadist and the
> module is worth making a bug report for- can you do that and
> include the data (or part of it) you were using?
> Jason, is that work really underway, or should someone pick up
> that ball?
> ----- Original Message ----- From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
> To: "Jason Stajich" <jason at bioperl.org>
> Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>;  
> "Jose ." <joseguillin at hotmail.com>
> Sent: Tuesday, September 22, 2009 3:14 AM
> Subject: [Bioperl-l] dnastatistics
>
>
> So I also had no problem running the code as written by Jose (Bioperl
> 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
>
> "The routines are not well tested and do contain errors at this point.
> Work is underway to correct them, but do not expect this code to give
> you the right answer currently!"!
>
> So I'm using dnadist (as I think the documentation recommends), and it
> does produce different numbers to $stats->distance(-).
>
> I tried write_matrix from Bio::Matrix::IO - got a message saying it
> hasn't been implemented yet?
>
> And if Jose hasn't already found it, try Data::Dumper; it will change
> your life....
>
> Regards,
> Liam.
>
> On 15/09/2009, at 3:54 AM, Jason Stajich wrote:
>
>> Yeah it seems like more of a bioperl problem -- possible that the   
>> older code didn't recognize 'jukes-cantor' but you can try the   
>> abbreviation 'jc' --  better to just upgrade tho!
>>
>> This isn't the cause of the problem but I would also encourage use   
>> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'   
>> function) rather than print_matrix on the matrix itself.
>>
>> -jason
>> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>>
>>> Hi Jose--
>>> I don't get any problem with your script as written. You should   
>>> upgrade to
>>> BioPerl 1.6 and try again.
>>> The "unblessed reference" is $jcmatrix. It may be undef for some   
>>> reason.
>>> MAJ
>>> ----- Original Message ----- From: "Jose ."  
>>> <joseguillin at hotmail.com>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Monday, September 14, 2009 8:48 AM
>>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;
>>>
>>>
>>>
>>>
>>>
>>> Hello,
>>>
>>> I'm trying to use Bio::Align::DNAStatistics, but I get the   
>>> following message:
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Other modules do work, such as Bio::SimpleAlign.
>>>
>>>
>>>
>>>
>>> My code is basically a modification of the code I found at
>>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html
>>> and is as follows:
>>>
>>> use strict;
>>> use Bio::AlignIO;
>>> use Bio::Align::DNAStatistics;
>>>
>>>
>>> my $stats = Bio::Align::DNAStatistics->new();
>>>
>>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>>                         -format => 'fasta');
>>> my $aln = $alignin->next_aln;
>>>
>>> my $jcmatrix = $stats-> distance (-align => $aln,
>>>               -method => 'Jukes-Cantor');
>>>
>>> print $jcmatrix->print_matrix;
>>>
>>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>>
>>> >A
>>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>>> GTCAGCGTAGGCCTAGACGGCT
>>>
>>> >B
>>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>>> ATAAGAGTAGGTCGGGATGGCA
>>>
>>> >C
>>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>>> GTATGTGCAGGTCGAAACGAGT
>>>
>>> >D
>>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>>> CTAAGAGTAGCTCGACACGGCT
>>>
>>>
>>>
>>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>>
>>> Moreover, if I write
>>>       print $jcmatrix;
>>> instead of
>>>       print $jcmatrix->print_matrix;
>>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>>
>>> So my question is:
>>>
>>> Why do I have an unblessed reference?
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Thank you very much in advance.
>>>
>>> Jose G.
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>
> ______________________________
>
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
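
For anyone following this thread: the "unblessed reference" error means the value returned by distance() was a plain array reference rather than a matrix object, and the Jukes-Cantor figure itself can be cross-checked by hand against dnadist. A minimal sketch in core Perl only, assuming the standard Jukes-Cantor correction d = -3/4 * ln(1 - 4p/3), where p is the proportion of differing sites and columns with a gap in either sequence are skipped. The sub names jc_distance and describe_result are made up for illustration; this is not the code in Bio::Align::DNAStatistics, and gap handling there or in dnadist may differ.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Jukes-Cantor distance between two aligned sequences of equal length.
# Columns with a gap ('-') in either sequence are ignored (an assumption).
sub jc_distance {
    my ($s1, $s2) = @_;
    die "sequences differ in length\n" unless length($s1) == length($s2);
    my ($sites, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my $x = uc(substr($s1, $i, 1));
        my $y = uc(substr($s2, $i, 1));
        next if $x eq '-' or $y eq '-';
        $sites++;
        $diffs++ if $x ne $y;
    }
    return undef unless $sites;          # nothing to compare
    my $p = $diffs / $sites;
    return undef if $p >= 0.75;          # correction is undefined at saturation
    return 0 if $p == 0;
    return -0.75 * log(1 - 4 * $p / 3);
}

# Report what kind of value a call returned before invoking methods on it.
sub describe_result {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return 'unblessed ' . ref($value) . ' reference' if ref($value);
    return 'plain scalar';
}

printf "JC distance: %.4f\n", jc_distance('ACGTACGTAC', 'ACGTACGTTC');
print describe_result([1, 2, 3]), "\n";
```

Calling something like describe_result($jcmatrix) before $jcmatrix->print_matrix shows at once whether an object came back, which is the situation Jose hit with the older release.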


From jason at bioperl.org  Tue Sep 22 12:07:14 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:07:14 -0700
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
Message-ID: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>

>>
>>
>> However, the above method does not work here. Is this for some deep
>> reason, or could the above method (based on the way SeqIO works) be
>> made to work? I'm guessing that the SearchIO object conversion is
>> simply harder to do than with SeqIO?
>
> This is something Jason could probably speak up on, but from my  
> perspective it comes down to 'why?'.  This opens up a very hard-to- 
> implement door (converting to and from, for instance, BLAST to  
> HMMER), which doesn't make sense from the end-user perspective.   
> What most users want out of those formats is getting at the data in  
> an easily accessible way, to further process them (filter, to GFF,  
> etc), or to have them summarized.  the Writer classes take care of  
> the latter.
>


> There is a very generic, all-purpose write_result in Bio::SearchIO  
> that just calls a ResultWriter object (and dies if it isn't  
> present).  Note that this expects a ResultWriter, not a Hit/ 
> HSPWriter; it is write_result() after all. I think this kind of goes  
> against the well-established API that exists with the other  
> write_foo implementations for the IO classes, where the input/output  
> format should match, but there you have it.
>

Dan -
I'm confused about what you are trying to do or what is broken - are  
you just annoyed that the API isn't the same style as Bio::SeqIO?


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From shalabh.sharma7 at gmail.com  Tue Sep 22 12:48:39 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 12:48:39 -0400
Subject: [Bioperl-l] Stockholm to fasta
Message-ID: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>

Hi All,

I am trying to convert Stockholm to FASTA format. I am using
"sreformat" for this purpose. I get a FASTA file, but the problem is that I
want header information from the Stockholm file in my FASTA file.
Like:
# STOCKHOLM 1.0

#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
- - - - - - - - - -  - - - -
- - - - - - - - - - - -- -
- - - - - - -- - - - - -
#=GF RL   J Biol Chem 2001;276:21476-21481.
#=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
#=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
#=GF CC   base pairs with the 5' splice junction, thus defining the 5'
#=GF CC   donor site of an intron.
#=GF CC   There are significant differences in sequence and secondary
#=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
#=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
#=GF CC   human). Nevertheless, secondary structure predictions suggest
#=GF CC   that all U1 snRNAs share a 'common core' consisting of helices I,
#=GF CC   II, the proximal region of III, and IV [1].
#=GF CC   This family does not contain the larger yeast sequences.
#=GF SQ   100


X63783.1/2024-2186
UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/1394-1556
UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161
..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/596-756
UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
M29062.1/238-387
UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.

As output I just get a FASTA file with headers like
"X63783.1/2024-2186", but I want the headers to include some
information from the Stockholm header, such as U1 or U1 spliceosomal RNA.

I would really appreciate it if anyone could help me out.

Thanks
Shalabh


From roy.chaudhuri at gmail.com  Tue Sep 22 12:44:47 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 17:44:47 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB87A06.4000209@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com>
Message-ID: <4AB8FEFF.6060408@gmail.com>

Hi Liam,

My mistake, it looks like the bug had already been reported and fixed, 
which means I get to go home earlier. I've marked your bug as a 
duplicate of bug 2810.

You can get the patched version by installing bioperl-live (just 
downloading the bioperl-live SeqUtils.pm and putting it in the correct 
place on your system would probably also work).

Cheers.
Roy.

Roy Chaudhuri wrote:
> Hi Liam,
> 
> Yes, that is a bug - I think it is to do with the Feature Annotation 
> rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
> wrote don't check for the presence of tags, just the coordinates of the 
> feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
> take a look when I get a chance.
> 
> Cheers.
> Roy.
> 
> Liam Elbourne wrote:
>> Hi Roy,
>>
>> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
>> particularly interested in the locus id, obviously the translation could 
>> be problematic if the whole gene is not included after truncation, but 
>> things like the note, product, protein_id would be good. I had a look at 
>> the code for the method and couldn't see any obvious reason why those values 
>> didn't make it across. Should I submit this as a bug, or is there 
>> something I'm missing?
>>
>>
>> Regards,
>> Liam.
>>
>>
>>
>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>
>>> Hi Liam,
>>>
>>> I just discovered your message, which has not yet been replied to. 
>>> What you require has been discussed in a recent thread:
>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>
>>> Try using trunc_with_features from Bio::SeqUtils:
>>>
>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>>> Cheers.
>>> Roy.
>>>
>>> Liam Elbourne wrote:
>>>> Hi All,
>>>> Is there a method or methodology that will produce a fully fledged 
>>>> Seq  object with all the associated metadata given a start and end 
>>>>  position? To clarify, I create a sequence object from a genbank file:
>>>> ****
>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>> my $seqobj = $io->next_seq();
>>>> ****
>>>> I now want:
>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>>> previously  done this task by iterating through feature by feature 
>>>> and parsing out  what I needed, but thought there might be a more 
>>>> elegant approach...
>>>> Regards,
>>>> Liam Elbourne.
>>> -- 
>>> Dr. Roy Chaudhuri
>>> Department of Veterinary Medicine
>>> University of Cambridge, U.K.
>> ______________________________
>>
>> Dr Liam Elbourne
>> Research Fellow (Bioinformatics)
>> Paulsen Laboratory
>> Macquarie University
>> Sydney
>> Australia.
>>
>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>
>>
>>
> 


From cjfields at illinois.edu  Tue Sep 22 13:12:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:12:10 -0500
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB8FEFF.6060408@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com> <4AB8FEFF.6060408@gmail.com>
Message-ID: <1F043B63-3DD1-49DD-86F3-B2FB9AD34725@illinois.edu>

That should be out in the latest alpha on CPAN as well (the final  
1.6.1 should be out this week).

chris

On Sep 22, 2009, at 11:44 AM, Roy Chaudhuri wrote:

> Hi Liam,
>
> My mistake, it looks like the bug had already been reported and  
> fixed, which means I get to go home earlier. I've marked your bug as  
> a duplicate of bug 2810.
>
> You can get the patched version by installing bioperl-live (just  
> downloading the bioperl-live SeqUtils.pm and putting it in the  
> correct place on your system would probably also work).
>
> Cheers.
> Roy.
>
> Roy Chaudhuri wrote:
>> Hi Liam,
>> Yes, that is a bug - I think it is to do with the Feature  
>> Annotation rollback from 1.6, it works fine with 1.5.2. Looks like  
>> the tests I wrote don't check for the presence of tags, just the  
>> coordinates of the feature, so this hasn't been picked up. Submit  
>> it to Bugzilla, and I'll take a look when I get a chance.
>> Cheers.
>> Roy.
>> Liam Elbourne wrote:
>>> Hi Roy,
>>>
>>> Thanks for that, works well, but there are no _gsf_tag_hash  
>>> values? I'm particularly interested in the locus id, obviously the  
>>> translation could be problematic if the whole gene is not included  
>>> after truncation, but things like the note, product, protein_id  
>>> would be good. I had a look at the code for the method and  
>>> couldn't see any obvious reason why those values didn't make it across.  
>>> Should I submit this as a bug, or is there something I'm missing?
>>>
>>>
>>> Regards,
>>> Liam.
>>>
>>>
>>>
>>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>>
>>>> Hi Liam,
>>>>
>>>> I just discovered your message, which has not yet been replied  
>>>> to. What you require has been discussed in a recent thread:
>>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>>
>>>> Try using trunc_with_features from Bio::SeqUtils:
>>>>
>>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300,  
>>>> 2000);
>>>> Cheers.
>>>> Roy.
>>>>
>>>> Liam Elbourne wrote:
>>>>> Hi All,
>>>>> Is there a method or methodology that will produce a fully  
>>>>> fledged Seq  object with all the associated metadata given a  
>>>>> start and end  position? To clarify, I create a sequence object  
>>>>> from a genbank file:
>>>>> ****
>>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>>> my $seqobj = $io->next_seq();
>>>>> ****
>>>>> I now want:
>>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>>> where $sub_seqobj is a Seq object (which I appreciate is an   
>>>>> 'aggregate' of objects) too. The "trunc" method only returns a   
>>>>> PrimarySeq object which lacks all the annotation etc. I've  
>>>>> previously  done this task by iterating through feature by  
>>>>> feature and parsing out  what I needed, but thought there might  
>>>>> be a more elegant approach...
>>>>> Regards,
>>>>> Liam Elbourne.
>>>> -- 
>>>> Dr. Roy Chaudhuri
>>>> Department of Veterinary Medicine
>>>> University of Cambridge, U.K.
>>> ______________________________
>>>
>>> Dr Liam Elbourne
>>> Research Fellow (Bioinformatics)
>>> Paulsen Laboratory
>>> Macquarie University
>>> Sydney
>>> Australia.
>>>
>>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>>
>>>
>>>


From cjfields at illinois.edu  Tue Sep 22 13:13:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:13:53 -0500
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
Message-ID: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>

The POD for Bio::AlignIO::stockholm indicates where the various bits  
of information are stored.  Everything from the header should be in  
there in the latest bioperl; in many cases it's not ideally stored,  
but it's accessible.

You'll need to preprocess your seqs in the SimpleAlign returned  
(iterate through them and change the relevant bits like desc(),  
displayname(), seq_id, etc) and may need to do other modifications,  
but it should work.

chris
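
To illustrate the kind of header rewriting described above without depending on a particular BioPerl version, here is a minimal sketch in core Perl only. It is not the Bio::AlignIO::stockholm route: it assumes an unwrapped Stockholm block with one "name sequence" pair per line, copies the #=GF ID and #=GF DE values into every FASTA header, and leaves gap characters untouched. The sub name stockholm_to_fasta is made up for illustration.

```perl
use strict;
use warnings;

# Convert one Stockholm alignment (as a string) to FASTA text, appending the
# family-level #=GF ID and #=GF DE values to each sequence header.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, @order, %seq);
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;
        next if $line eq '' or $line =~ /^# STOCKHOLM/;
        last if $line eq '//';                      # end of alignment
        if ($line =~ /^#=GF\s+(\S+)\s+(.*)$/) {
            # Repeated tags (e.g. CC) are joined with a space.
            $gf{$1} = defined $gf{$1} ? "$gf{$1} $2" : $2;
        }
        elsif ($line =~ /^#/) {
            next;                                   # other markup (#=GS, #=GR, #=GC)
        }
        elsif ($line =~ /^(\S+)\s+(\S+)$/) {
            my ($name, $residues) = ($1, $2);
            push @order, $name unless exists $seq{$name};
            $seq{$name} .= $residues;               # interleaved blocks are concatenated
        }
    }
    my $info  = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        $fasta .= ">$name" . ($info ne '' ? " $info" : '') . "\n$seq{$name}\n";
    }
    return $fasta;
}

my $sto = <<'END';
# STOCKHOLM 1.0
#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
X58845.1/1-161      ..ACUUACCUGGCUGG.AGUUU.GCUA
M29062.1/238-387    UUACUUACCUGGCAUG.AGUUU..CUG
//
END

print stockholm_to_fasta($sto);
```

With BioPerl itself the same idea applies: read the alignment, set the description on each sequence from the alignment-level annotation, then write FASTA. The POD mentioned above is the place to confirm which accessor holds which header field.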

On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:

> Hi All,      I am trying to convert stockholm to fasta format. I am  
> using
> "sreformat" for this purpose. I am getting a fasta file but the  
> problem is i
> want header information from stockholm in my fasta file.
> Like:
> # STOCKHOLM 1.0
>
> #=GF AC   RF00003
> #=GF ID   U1
> #=GF DE   U1 spliceosomal RNA
> - - - - - - - - - -  - - - -
> - - - - - - - - - - - -- -
> - - - - - - -- - - - - -
> #=GF RL   J Biol Chem 2001;276:21476-21481.
> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the  
> spliceosome
> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms  
> complementary
> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
> #=GF CC   donor site of an intron.
> #=GF CC   There are significant differences in sequence and secondary
> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter  
> being
> #=GF CC   much longer (568 nucleotides as compared to 164  
> nucleotides in
> #=GF CC   human). Nevertheless, secondary structure predictions  
> suggest
> #=GF CC   that all U1 snRNAs share a 'common core' consisting of  
> helices I,
> #=GF CC   II, the proximal region of III, and IV [1].
> #=GF CC   This family does not contain the larger yeast sequences.
> #=GF SQ   100
>
>
> X63783.1/2024-2186
> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/1394-1556
> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
> X58845.1/1-161
> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/596-756
> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
> M29062.1/238-387
> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>
> As a output i am just getting a fasta file with the headers like
> "X63783.1/2024-2186" but what i want is that it should include some
> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>
> I would really appreciate if anyone can help me out.
>
> Thanks
> Shalabh


From shalabh.sharma7 at gmail.com  Tue Sep 22 16:17:11 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 16:17:11 -0400
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
	<EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
Message-ID: <9fcc48c70909221317i509a45cbm19783c1210f7c69b@mail.gmail.com>

Hi Chris,

Thanks a lot, it was really helpful.

Thanks
Shalabh


On Tue, Sep 22, 2009 at 1:13 PM, Chris Fields <cjfields at illinois.edu> wrote:

> The POD for Bio::AlignIO::stockholm indicates where the various bits of
> information are stored.  Everything from the header should be in there in
> the latest bioperl; in many cases it's not ideally stored, but it's
> accessible.
>
> You'll need to preprocess your seqs in the SimpleAlign returned (iterate
> through them and change the relevant bits like desc(), displayname(),
> seq_id, etc) and may need to do other modifications, but it should work.
>
> chris
>
>
> On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:
>
>  Hi All,      I am trying to convert stockholm to fasta format. I am using
>> "sreformat" for this purpose. I am getting a fasta file but the problem is
>> i
>> want header information from stockholm in my fasta file.
>> Like:
>> # STOCKHOLM 1.0
>>
>> #=GF AC   RF00003
>> #=GF ID   U1
>> #=GF DE   U1 spliceosomal RNA
>> - - - - - - - - - -  - - - -
>> - - - - - - - - - - - -- -
>> - - - - - - -- - - - - -
>> #=GF RL   J Biol Chem 2001;276:21476-21481.
>> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
>> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
>> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
>> #=GF CC   donor site of an intron.
>> #=GF CC   There are significant differences in sequence and secondary
>> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
>> #=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
>> #=GF CC   human). Nevertheless, secondary structure predictions suggest
>> #=GF CC   that all U1 snRNAs share a 'common core' consisting of helices
>> I,
>> #=GF CC   II, the proximal region of III, and IV [1].
>> #=GF CC   This family does not contain the larger yeast sequences.
>> #=GF SQ   100
>>
>>
>> X63783.1/2024-2186
>> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/1394-1556
>> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X58845.1/1-161
>> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/596-756
>> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
>> M29062.1/238-387
>> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>>
>> As a output i am just getting a fasta file with the headers like
>> "X63783.1/2024-2186" but what i want is that it should include some
>> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>>
>> I would really appreciate if anyone can help me out.
>>
>> Thanks
>> Shalabh
>>
>
>


From cjfields at illinois.edu  Tue Sep 22 16:29:28 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 15:29:28 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
Message-ID: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>

The third alpha is now out and propagating its way around the  
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file  
autogenerated with Build.PL tests that somehow made its way into the  
last alpha release.  This is now properly cleaned up along with its  
test database using './Build clean'.  BTW, very nice SQLite  
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris


From dan.bolser at gmail.com  Tue Sep 22 17:33:13 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 22:33:13 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
	<CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
Message-ID: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>

2009/9/22 Jason Stajich <jason at bioperl.org>

>
>
> However, the above method does not work here. Is this for some deep
>
> reason, or could the above method (based on the way SeqIO works) be
>
> made to work? I'm guessing that the SearchIO object conversion is
>
> simply harder to do than with SeqIO?
>
>
> This is something Jason could probably speak up on, but from my perspective
> it comes down to 'why?'.  This opens up a very hard-to-implement door
> (converting to and from, for instance, BLAST to HMMER), which doesn't make
> sense from the end-user perspective.  What most users want out of those
> formats is getting at the data in an easily accessible way, to further
> process them (filter, to GFF, etc), or to have them summarized.  the Writer
> classes take care of the latter.
>
>
> There is a very generic, all-purpose write_result in Bio::SearchIO that
> just calls a ResultWriter object (and dies if it isn't present).  Note
> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
> after all. I think this kind of goes against the well-established API that
> exists with the other write_foo implementations for the IO classes, where
> the input/output format should match, but there you have it.
>
> Dan -
> I'm confused about what you are trying to do or what is broken - are you
> just annoyed that the API isn't the same style as Bio::SeqIO?
>

No, I'm not annoyed. I was just confused initially because it didn't work as
'expected', and then I was wondering why (I was just curious). I take Chris's
point that this could be a lot of work to implement for a very marginal use
case.

Very simply, what I am trying to do is this: a) read in a blasttable, b)
filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
in blasttable format.

I was stuck at step c, but I'm not saying anything is broken (just my
understanding of how to use SearchIO::Writer::HSPTableWriter).

I'll look again at Chris's suggestions to see if I can get code to just
'round trip' the blasttable format. From there I think I should be able to
do what I want.


Cheers,
Dan.
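
The read, filter and write steps described above can also be sketched without SearchIO, since the tabular format holds one HSP per tab-separated line. A minimal sketch in core Perl, assuming the standard 12-column layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). The sub name filter_hsps and the cutoff values are illustrative assumptions, not part of BioPerl, and this does not use SearchIO::Writer::HSPTableWriter.

```perl
use strict;
use warnings;

# For each query keep at most $max_per_query HSPs with e-value <= $max_evalue,
# best bit score first. Input is an array ref of tab-separated lines in the
# 12-column BLAST tabular layout; comment lines starting with '#' are dropped.
# Returns the surviving lines (without newlines) grouped by query in input order.
sub filter_hsps {
    my ($lines, $max_evalue, $max_per_query) = @_;
    my (@queries, %by_query);
    for my $line (@$lines) {
        chomp(my $copy = $line);
        next if $copy eq '' or $copy =~ /^#/;
        my @f = split /\t/, $copy;
        next unless @f >= 12;                  # skip malformed rows
        next if $f[10] > $max_evalue;          # column 11 is the e-value
        push @queries, $f[0] unless $by_query{ $f[0] };
        push @{ $by_query{ $f[0] } }, \@f;
    }
    my @out;
    for my $query (@queries) {
        # column 12 is the bit score
        my @hsps = sort { $b->[11] <=> $a->[11] } @{ $by_query{$query} };
        splice @hsps, $max_per_query if @hsps > $max_per_query;
        push @out, map { join("\t", @$_) } @hsps;
    }
    return @out;
}

# Made-up rows for demonstration only.
my @table = (
    join("\t", qw(q1 s1 98.0 100 2 0 1 100 1 100 1e-50 180)),
    join("\t", qw(q1 s2 80.0 50 10 0 1 50 1 50 1e-5 40)),
    join("\t", qw(q1 s3 99.0 100 1 0 1 100 1 100 1e-52 190)),
    join("\t", qw(q2 s1 70.0 30 9 0 1 30 1 30 0.5 20)),
);
print "$_\n" for filter_hsps(\@table, 1e-3, 2);
```

Because the rows are written back exactly as they were read, this round-trips the tabular format by construction; any filter more involved than an e-value cutoff can replace the two "next if" and "splice" steps.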


--
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>


From maj at fortinbras.us  Tue Sep 22 18:32:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 18:32:15 -0400
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com><B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu><CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
	<2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
Message-ID: <9C7D7F02BFBD4F2AA16E151B52125C93@NewLife>

Apropos this, here's something I ran across the other day:

"Just remember when using BioPerl that it was never designed
to 'round trip' your favorite formats. Rather, it was designed to
store sequence data from many widely different formats into a
common object framework and make that framework available
to other sequence manipulation tasks in a programmatic fashion."

from HOWTO:SeqIO#Caveats

Food for thought, anyway--- MAJ

----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 22, 2009 5:33 PM
Subject: Re: [Bioperl-l] Converting between allowed SearchIO formats?


> 2009/9/22 Jason Stajich <jason at bioperl.org>
>
>>
>>
>> However, the above method does not work here. Is this for some deep
>>
>> reason, or could the above method (based on the way SeqIO works) be
>>
>> made to work? I'm guessing that the SearchIO object conversion is
>>
>> simply harder to do than with SeqIO?
>>
>>
>> This is something Jason could probably speak up on, but from my perspective
>> it comes down to 'why?'.  This opens up a very hard-to-implement door
>> (converting to and from, for instance, BLAST to HMMER), which doesn't make
>> sense from the end-user perspective.  What most users want out of those
>> formats is getting at the data in an easily accessible way, to further
>> process them (filter, to GFF, etc), or to have them summarized.  the Writer
>> classes take care of the latter.
>>
>>
>> There is a very generic, all-purpose write_result in Bio::SearchIO that
>> just calls the a ResultWriter object (and dies if it isn't present).  Note
>> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
>> after all. I think this kind of goes against the well-established API that
>> exists with the other write_foo implementations for the IO classes, where
>> the input/output format should match, but there you have it.
>>
>> Dan -
>> I'm confused about what you are trying to do or what is broken - are you
>> just annoyed that the API isn't the same style as Bio::SeqIO.
>>
>
> No, I'm not annoyed. I was just confused initially because it didn't work as
> 'expected', and then I was wondering why (I was just curious). I take Chris's
> point that this could be a lot of work to implement for a very marginal use
> case.
>
> Very simply, what I am trying to do is this: a) read in a blasttable, b)
> filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
> in blasttable format.
>
> I was stuck at step c, but I'm not saying anything is broken (just my
> understanding of how to use SearchIO::Writer::HSPTableWriter).
>
> I'll look again at Chris's suggestions to see if I can get code to just
> 'round trip' the blasttable format. From there I think I should be able to
> do what I want.
>
>
> Cheers,
> Dan.
>
>
> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 
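
For the three steps Dan describes (read a blasttable, filter the HSPs, write the same format back out), a minimal sketch in plain Perl could look like the following. It deliberately does not use Bio::SearchIO or HSPTableWriter; it just treats each line as 12 tab-separated columns in the usual BLAST tabular order. The field names, the filter_blasttable() helper and the sample rows are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Assumed column order of BLAST tabular (-m 8) output; names are illustrative.
my @FIELDS = qw(query subject pct_id length mismatches gap_opens
                q_start q_end s_start s_end evalue bit_score);

# Return the input lines (unchanged) whose parsed HSP passes the $keep callback.
sub filter_blasttable {
    my ($lines, $keep) = @_;
    my @out;
    for my $line (@$lines) {
        next if $line =~ /^#/ || $line !~ /\S/;    # skip comments and blank lines
        (my $copy = $line) =~ s/\r?\n\z//;         # strip the line ending on a copy
        my %hsp;
        @hsp{@FIELDS} = split /\t/, $copy;
        push @out, $line if $keep->(\%hsp);
    }
    return @out;
}

# Made-up example rows, not real BLAST output.
my @rows = (
    join("\t", qw(q1 s1 98.50 200 3 0 1 200 5 204 1e-100 380)) . "\n",
    join("\t", qw(q1 s2 75.00 80 20 0 10 89 1 80 0.002 45.1)) . "\n",
    join("\t", qw(q2 s1 90.00 150 15 0 1 150 1 150 1e-50 210)) . "\n",
);

# Keep only HSPs with an e-value of 1e-10 or better.
my @kept = filter_blasttable(\@rows, sub { $_[0]{evalue} <= 1e-10 });
print @kept;
```

Because the kept lines are passed through untouched, the output is still a valid blasttable. Grouping by query first would be needed if the filter has to look at all HSPs of one result at once.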


From clements at nescent.org  Tue Sep 22 19:15:50 2009
From: clements at nescent.org (Dave Clements)
Date: Tue, 22 Sep 2009 16:15:50 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
Message-ID: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>

Hello all,

For open source project wikis, it's nice if the home page
1) Lets new users know that this is an active project with a lot going on.
2) Encourages people to contribute to the project and the wiki.

Both the BioPython.org and GMOD.org sites include a list of links to news
items on the home page.  This is done in both sites with a MediaWiki
extension.

The GMOD.org home page also includes a list of new and recently updated wiki
pages.  This achieves both goals, by showing what's happening, and by giving
people a slight reward for updating the wiki by placing a link to the page
on the wiki.  This is also done with MediaWiki extensions.

My 2¢,

Dave C

-- 
GMOD News: http://gmod.org/wiki/GMOD_News


From David.Messina at sbc.su.se  Wed Sep 23 07:37:02 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 23 Sep 2009 13:37:02 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
Message-ID: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>

I think either Chris' version or Mark's earlier, slightly more verbose
version would work well and fulfill the goals of reducing clutter and making
it easier to find what you're looking for for visitors new and old.

I do like the idea of a newsfeed, which summarizes what's been going on
lately and lets new users know the project is active. Embedding the BioPerl
twitter feed would be an easy solution.


> The GMOD.org home page also includes a list of new and recently updated
> wiki pages.  This achieves both goals, by showing what's happening, and by
> giving people a slight reward for updating the wiki by placing a link to the
> page on the wiki.
>

I like this idea too.


Dave


From maj at fortinbras.us  Wed Sep 23 07:47:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 07:47:24 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <0AD07A69C66B4B5BB8599BA5483145D7@NewLife>

Johnathan, Dave and Dave -- thanks for these helpful comments-
I'm beginning to think there is a happy medium for this medium.
MAJ
  ----- Original Message ----- 
  From: Dave Messina 
  To: Dave Clements 
  Cc: bioperl-l at lists.open-bio.org ; Mark A. Jensen ; Chris Fields 
  Sent: Wednesday, September 23, 2009 7:37 AM
  Subject: Re: [Bioperl-l] a Main Page proposal


  I think either Chris' version or Mark's earlier, slightly more verbose version would work well and fulfill the goals of reducing clutter and making it easier to find what you're looking for for visitors new and old.


  I do like the idea of a newsfeed, which summarizes what's been going on lately and let's new users know the project is active. Embedding the BioPerl twitter feed would be an easy solution.


    The GMOD.org home page also includes a list of new and recently updated wiki pages.  This achieves both goals, by showing what's happening, and by giving people a slight reward for updating the wiki by placing a link to the page on the wiki.


  I like this idea too.


  Dave 


From biopython at maubp.freeserve.co.uk  Wed Sep 23 08:12:56 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Sep 2009 13:12:56 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>

On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> I think either Chris' version or Mark's earlier, slightly more verbose
> version would work well and fulfill the goals of reducing clutter and making
> it easier to find what you're looking for for visitors new and old.
>
> I do like the idea of a newsfeed, which summarizes what's been going on
> lately and let's new users know the project is active. Embedding the BioPerl
> twitter feed would be an easy solution.

Embedding your news feed would be just as easy:

http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom

Which (news server vs twitter feed) is preferable is down to you guys,
although for 2009 at least there has been more activity on twitter.
I'm not sure if you have the news posts re-tweeted or not (the last
news server post was back in Feb), but Biopython and the OBF
twitter accounts are doing this via twitterfeed.

Peter


From maj at fortinbras.us  Wed Sep 23 08:51:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 08:51:15 -0400
Subject: [Bioperl-l] Protein Sequence QSARs
In-Reply-To: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
References: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
Message-ID: <3B9AACAB654F4F4DBB6CE00A9B26FBF6@NewLife>

Hi Brett--
I doubt if anything this specialized exists in BioPerl.
I'd say go for it, but R may be better suited for the calculations you
want to do. For dealing with matrices, you may want to check out
the Bio::Matrix namespace.
cheers Mark
----- Original Message ----- 
From: "Brett Bowman" <bnbowman at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 07, 2009 4:17 AM
Subject: [Bioperl-l] Protein Sequence QSARs


I've been working on a script for my personal edification for annotating
protein sequence for QSARs, as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially converting a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculating the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full blown module, but I wanted to ask if A) it had been done before and I
had just missed it, and B) whether anyone other than me would find such a
module useful.

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Wed Sep 23 09:04:48 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 09:04:48 -0400
Subject: [Bioperl-l] Fw:  problem parsing msf file
Message-ID: <4851B51372DE4761B8CC26D685B57344@NewLife>

neglected the list
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Paola Bisignano" <paola.bisignano at gmail.com>
Sent: Wednesday, September 23, 2009 9:04 AM
Subject: Re: [Bioperl-l] problem parsing msf file


> Hi Paola--
> I think you need column_from_residue_number() off the SimpleAlign object,
> and location_from_column off the LocatableSeq object. For your example, 
> try
> 
> $alnio = Bio::AlignIO->new( -file=>"my.msf");
> $aln = $alnio->next_aln;
> 
> $s1 = $aln->get_seq_by_pos(1);
> $s2 = $aln->get_seq_by_pos(2);
> 
> $col = $aln->column_from_residue_number( $s1->id, 28);
> $s2coord = $s2->location_from_column( $col - 1);
> 
> Now, $s2coord should equal 4 (the coordinate of the R before the I
> that aligns with the V in sequence 1).
> MAJ
> 
> 
> ----- Original Message ----- 
> From: "Paola Bisignano" <paola.bisignano at gmail.com>
> To: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 04, 2009 8:28 AM
> Subject: [Bioperl-l] problem parsing msf file
> 
> 
>>I have a problem with the parsing of msf file...I can't find the exact
>> object of Bio::SimpleAlign for my case...
>> I have to identify residues (from a list) in aligned sequences...but
>> when I parse the alignment from fasta file, I save as msf file, where
>> I have to identify my residue (from the list, numbering as the pdb
>> file) and the residue aligned in the aligned sequences...
>> 
>> this is a piece of the file...
>> 
>> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>> 
>> Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>> Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>> 
>> //
>> 
>> 
>>                      1                                                   50
>> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
>> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>> 
>> 
>>                      51                                                 100
>> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
>> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>> 
>> 
>>                      101                                                150
>> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
>> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>> 
>> 
>>                      151                                                200
>> Sequence/23-178       QQPDML
>> 2zhz:A/1-148          GGADVL
>> 
>> for example in this I have to identify the residue that is in front of
>> Val 28 (that is in Sequence) in 2zhz:A (that, counting manually, is Ile
>> 5)....
>> Tyr4-> has no residue in front of it because the alignment starts from
>> N23 of Sequence...
>> how can I find the way to enter the residue of my sequence, and extract
>> the residue from the other????
>> 
>> 
>> I wish you all well, dear friends..and I'm actually in trouble with this..
>> Thanks for suggestions
>> 
>> Paola
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>>
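
The mapping Paola asks about (residue number in one row, to alignment column, to residue number in the other row) can also be sketched in plain Perl. This is only an illustration of the idea behind column_from_residue_number() and location_from_column(), not the BioPerl implementation. The two helper subs are hypothetical, and the start numbers (43 and 21) assume that the /23-178 and /1-148 suffixes in the MSF names give the number of the first residue in each row. The rows are columns 21-30 of the alignment quoted above.

```perl
use strict;
use warnings;

# Column (1-based) holding residue number $resnum in a gapped row whose first
# residue is numbered $start. Returns undef if the residue is not in the row.
sub residue_to_column {
    my ($row, $start, $resnum) = @_;
    my $n = $start - 1;
    for my $col (1 .. length($row)) {
        next if substr($row, $col - 1, 1) =~ /[-.]/;    # gaps carry no number
        return $col if ++$n == $resnum;
    }
    return undef;
}

# Residue number found at column $col of a gapped row, or undef for a gap.
sub column_to_residue {
    my ($row, $start, $col) = @_;
    return undef if $col < 1 || $col > length($row);
    return undef if substr($row, $col - 1, 1) =~ /[-.]/;
    my $before = substr($row, 0, $col - 1);
    my $gaps   = ($before =~ tr/-.//);                   # count gaps left of $col
    return $start + ($col - 1) - $gaps;
}

# Columns 21-30 of the quoted alignment; start numbers are assumptions (see above).
my ($row1, $start1) = ('TKSLINSHTQ', 43);    # Sequence/23-178
my ($row2, $start2) = ('L--LAEPLPD', 21);    # 2zhz:A/1-148

my $col     = residue_to_column($row1, $start1, 46);
my $partner = column_to_residue($row2, $start2, $col);
print "column $col, partner residue ",
      (defined $partner ? $partner : 'none (gap)'), "\n";
```

A residue that faces a gap in the other row has no partner, which is why column_to_residue() returns undef there rather than a number.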


From cjfields at illinois.edu  Wed Sep 23 10:41:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 09:41:14 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
Message-ID: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>

On Sep 23, 2009, at 7:12 AM, Peter wrote:

> On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se 
> > wrote:
>> I think either Chris' version or Mark's earlier, slightly more  
>> verbose
>> version would work well and fulfill the goals of reducing clutter  
>> and making
>> it easier to find what you're looking for for visitors new and old.
>>
>> I do like the idea of a newsfeed, which summarizes what's been  
>> going on
>> lately and let's new users know the project is active. Embedding  
>> the BioPerl
>> twitter feed would be an easy solution.
>
> Embedding your news feed would be just as easy:
>
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom
>
> Which (news server vs twitter feed) is preferable is down to you guys,
> although for 2009 at least there has been more activity on twitter.
> I'm not sure if you have the news posts re-tweeted or not (the last
> news server post was back in Feb), but Biopython and the OBF
> twitter accounts are doing this via twitterfeed.
>
> Peter

Not to add yet more to the list, but I also think a concise list of  
projects using (or 'powered by') bioperl should be front-and-center;  
not a lot of users know when/where bioperl is used.  This applies to  
the other bio* as well, particularly biopython (seeing it popping up  
more and more).

For an example, see the biomart homepage:

http://www.biomart.org/

chris


From adlai at refenestration.com  Wed Sep 23 10:38:32 2009
From: adlai at refenestration.com (adlai burman)
Date: Wed, 23 Sep 2009 16:38:32 +0200
Subject: [Bioperl-l] Newbie: Format GenBank
Message-ID: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>

I have finally got past two major hurdles (for me) only to get stumped:
1. I have written a perl script that can take a genbank formatted text  
file as a filehandle and do all sorts of nifty (for me) things with it.
2. I have gotten my BioPerl installation working on a web hosting  
service so my advisor can use this through a browser.

BUT the code I have to fetch a GB record can only print it as a single HTML  
line, and what I need is for it to assign the retrieved file to a  
scalar variable. I am going blind trying to figure out how to access  
(not write) the gb file from a SeqIO object and assign it to a  
variable.

Here's an example of the code I have going on the server:

#!/usr/bin/perl
print "Content-type: text/html\n\n";
use Bio::SeqIO;
use Bio::DB::GenBank;

$genBank = new Bio::DB::GenBank;  # This object knows how to talk to GenBank

my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession

my $seqOut = new Bio::SeqIO(-format => 'genbank');

$seqOut->write_seq($seq);


exit;

where 'DQ897681' will be replaced by a CGI post.

I know that write_seq is not what I need, and I assume that this is a  
simple problem but can anyone tell me how to assign the retrieved gb  
file to a scalar?

Thanks,
Adlai


From joseguillin at hotmail.com  Tue Sep 22 10:39:52 2009
From: joseguillin at hotmail.com (Jose .)
Date: Tue, 22 Sep 2009 15:39:52 +0100
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <BLU104-W475752FF9D5EADD0269E7A0DC0@phx.gbl>


Hi Liam,
I've tried analyzing the same alignment with both programs (DNAStatistics and dnadist), using the same analysis method (Jukes-Cantor), and I got pretty much the same results:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
my $stats = Bio::Align::DNAStatistics->new();
my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                         -format => 'fasta');
my $aln = $alignin->next_aln;
my $jcmatrix = $stats-> distance (-align => $aln,
               -method => 'Jukes-Cantor');
print $jcmatrix->print_matrix;
RESULT:
A              0.00000  0.40900  0.41834  0.38044
B              0.40900  0.00000  0.41358  0.37240
C              0.41834  0.41358  0.00000  0.37809
D              0.38044  0.37240  0.37809  0.00000

I used the web-based dnadist ( http://mobyle.pasteur.fr/cgi-bin/portal.py?form=dnadist ), which is mentioned in the CPAN dnadist documentation ( http://search.cpan.org/~birney/bioperl-run-1.4/Bio/Tools/Run/PiseApplication/dnadist.pm ), setting Jukes-Cantor as Distance (D), and these are the results:
    4
A          0.000000 0.408996 0.418335 0.380436
B          0.408996 0.000000 0.413575 0.372400
C          0.418335 0.413575 0.000000 0.378086
D          0.380436 0.372400 0.378086 0.000000

The difference is because of rounding off.
Could it be by any chance that your analyses were made using two different methods, by default? (I think dnadist uses F84 instead of Jukes-Cantor by default).

Using F84 instead of Jukes-Cantor in dnadist gives:
    4
A          0.000000 0.470013 0.479477 0.435071
B          0.470013 0.000000 0.468730 0.417669
C          0.479477 0.468730 0.000000 0.421582
D          0.435071 0.417669 0.421582 0.000000

On the other hand, the DNAStatistics documentation offers the possibility of using F84, but it's not yet implemented:

MSG: Abstract method "Bio::Align::DNAStatistics::D_F84" is not implemented by package Bio::Align::DNAStatistics.
This is not your fault - author of Bio::Align::DNAStatistics should be blamed!


So, I think Jukes-Cantor works the same in Bio::Align::DNAStatistics and web-based dnadist; but other methods maybe not.

I want to thank you for letting me know about Data::Dumper; I've read the documentation and it seems very handy. I think it could help me sooner or later. I'll try it out!!

As I'm using DNAStatistics for a project, please let me know if you find what is wrong, or if I can help you further somehow.
Regards,
Jose G.
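
For anyone who wants to cross-check a single cell of these matrices by hand, the Jukes-Cantor distance is d = -(3/4) * ln(1 - (4/3) * p), where p is the proportion of differing sites. Below is a minimal sketch in plain Perl. The jc_distance() helper is hypothetical, and skipping every column that contains a gap or a non-ACGT character is an assumption; DNAStatistics and dnadist may each treat such columns differently.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned DNA strings of equal length.
# Assumption: columns with a gap or ambiguous base in either row are skipped.
sub jc_distance {
    my ($s1, $s2) = @_;
    die "sequences differ in length\n" unless length($s1) == length($s2);
    my ($compared, $diff) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my $x = uc(substr($s1, $i, 1));
        my $y = uc(substr($s2, $i, 1));
        next unless $x =~ /[ACGT]/ && $y =~ /[ACGT]/;
        $compared++;
        $diff++ if $x ne $y;
    }
    die "no comparable columns\n" unless $compared;
    my $p = $diff / $compared;                  # proportion of differing sites
    die "p >= 0.75, distance undefined\n" if $p >= 0.75;
    return -0.75 * log(1 - 4 * $p / 3);         # log() is the natural logarithm
}

# One difference in ten compared sites (toy strings, not the sequences above).
printf "%.5f\n", jc_distance('ACGTACGTAC', 'ACGTACGTAA');
```

If a hand-computed value agrees with one program but not the other, the method setting (for example F84 versus Jukes-Cantor) or the gap handling is the first thing to check.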


Subject: dnastatistics
From: lelbourn at science.mq.edu.au
Date: Tue, 22 Sep 2009 17:14:44 +1000
CC: maj at fortinbras.us; bioperl-l at bioperl.org; joseguillin at hotmail.com
To: jason at bioperl.org


So I also had no problem running the code as written by Jose (Bioperl 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
"The routines are not well tested and do contain errors at this point. Work is underway to correct them, but do not expect this code to give you the right answer currently!"!
So I'm using dnadist (as I think the documentation recommends), and it does produce different numbers to $stats->distance(-).
I tried write_matrix from Bio::Matrix::IO - got a message saying it hasn't been implemented yet?
And if Jose hasn't already found it, try Data::Dumper; it will change your life....
Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

Yeah it seems like more of a bioperl problem -- possible that the older code didn't recognize 'jukes-cantor' but you can try the abbreviation 'jc' -- better to just upgrade tho!

This isn't the cause of the problem but I would also encourage use of Bio::Matrix::IO for printing the matrix (use the 'write_matrix' function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ
----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Monday, September 14, 2009 8:48 AM
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such us Bio::SimpleAlign;


My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, as it is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                          -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
        print $jcmatrix;
instead of
        print $jcmatrix->print_matrix;
I get the memory reference, as normal===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l



--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org






From A.J.Pemberton at bham.ac.uk  Tue Sep 22 13:06:04 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Tue, 22 Sep 2009 18:06:04 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>

Folks,

I am experiencing problems installing bioperl-db. I followed the instructions on the website, both installing via CPAN and downloading the source tarball, and I get the same error either way. I think I have missing prerequisites; the first error I get is:

Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/local/BioPerl-db-1.6.0/blib/lib 
/usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 
/usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 
/usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.

Can anyone help?

Regards,

Tony P.


**************************************************************
Mr. A. Pemberton			Tel:+44 121 414 3388
School of Biosciences,			Fax:+44 121 414 5925
The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
Birmingham B15 2TT U.K.
**************************************************************


From joseguillin at hotmail.com  Wed Sep 23 11:08:04 2009
From: joseguillin at hotmail.com (Jose .)
Date: Wed, 23 Sep 2009 16:08:04 +0100
Subject: [Bioperl-l] Bio::Matrix::IO
Message-ID: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma missing, 
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also on the CPAN website: http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And the BioPerl website: http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl beginners (like me) or absentminded advanced BioPerl users who rely on the SYNOPSIS code.
Thank you! :)


From maj at fortinbras.us  Wed Sep 23 11:36:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:36:59 -0400
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
Message-ID: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>

hi Tony- missing prereqs are the issue with this message, yes-
the brute force approach would be to install each of these
as they come up; you can do

$ cpan
cpan> install Array::Compare

etc., then attempt the bioperl-db install again; lather, rinse, repeat.
MAJ
----- Original Message ----- 
From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 22, 2009 1:06 PM
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)


> Folks,
>
> I am experiencing problems installing bioperl-db. I followed the instructions 
> on the website both installing via CPAN and downloading the source tarball. 
> Get the same error. I think I have missing prerequisites, the first error I get 
> is:
>
> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t 
> /usr/local/BioPerl-db-1.6.0/blib/lib
> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 
> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/5.8.5 
> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/site_perl/5.8.5
> /usr/lib/perl5/site_perl 
> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl/5.8.5
> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>
> Can anyone help?
>
> Regards,
>
> Tony P.
>
>
> **************************************************************
> Mr. A. Pemberton Tel:+44 121 414 3388
> School of Biosciences, Fax:+44 121 414 5925
> The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
> Birmingham B15 2TT U.K.
> **************************************************************
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Wed Sep 23 11:46:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:46:03 -0400
Subject: [Bioperl-l] Bio::Matrix::IO
In-Reply-To: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
References: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
Message-ID: <E37AFAC689C84477817EFF38511B5709@NewLife>

thanks Jose - fixed it
MAJ
----- Original Message ----- 
From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Wednesday, September 23, 2009 11:08 AM
Subject: [Bioperl-l] Bio::Matrix::IO


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma 
missing,
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma 
missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also on the CPAN website:
http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
and on the BioPerl website:
http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl beginners (like me) or absent-minded advanced BioPerl
users who rely on the SYNOPSIS code.
Thank you! :)
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Wed Sep 23 12:27:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 23 Sep 2009 17:27:26 +0100
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <4ABA4C6E.60609@gmail.com>

Hi Adlai,

In Perl you can open a string as if it was a file:

my $string;
open my $fh, '>', \$string or die $!;
my $seqOut = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');

$seqOut->write_seq($seq) should now write to the string.
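
To see the in-memory filehandle trick on its own, here is a minimal sketch that uses only core Perl. The record lines are made-up stand-ins; with BioPerl the text would come from write_seq rather than print. Note that the handle should be closed (or flushed) before the string is read.

```perl
use strict;
use warnings;

# Stand-in for a formatted record (hypothetical content); with BioPerl this
# text would be produced by $seqOut->write_seq($seq) instead of print.
my @lines = ("LOCUS       EXAMPLE", "ORIGIN", "//");

# Open a write handle whose "file" is the scalar $string.
my $string = '';
open my $fh, '>', \$string or die "Cannot open in-memory filehandle: $!";
print {$fh} "$_\n" for @lines;
close $fh or die "Cannot close in-memory filehandle: $!";

# $string now holds everything that was written to $fh.
print $string;
```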

However, are you sure this is your problem? Printing to STDOUT (which is 
what SeqIO does if you don't specify a file) should work fine with a CGI 
script. Your sequence is being displayed as one line because HTML 
ignores newline characters, but you can get around that by using a <pre> 
tag to specify pre-formatted text:

my $seqOut = Bio::SeqIO->new(-format => 'genbank');
print "<pre>\n";
$seqOut->write_seq($seq);
print "</pre>\n";

Hope this helps.
Roy.

adlai burman wrote:
> I have finally got past two major hurdles (for me) only to get stumped:
> 1. I have written a perl script that can take a genbank formated text  
> file as a filehandle and do all sorts of nifty (for me) things with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
> 
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
> 
> Here's an example of the code I have going on the server:
> 
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
> 
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
> 
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
> 
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
> 
> $seqOut->write_seq($seq);
> 
> 
> exit;
> 
> where 'DQ897861' will be replaced by a CGI post.
> 
> I know that write_seq is not what I need, and I assume that this is a  
> simple problem but can anyone tell me how to assign the retrieved gb  
> file to a scaler?
> 
> Thanks,
> Adlai
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 13:47:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:51 -0500
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <16121E7E-7619-4F02-82CC-20C6F5F6B230@illinois.edu>

On Sep 23, 2009, at 9:38 AM, adlai burman wrote:

> I have finally got past two major hurdles (for me) only to get  
> stumped:
> 1. I have written a perl script that can take a genbank formated  
> text file as a filehandle and do all sorts of nifty (for me) things  
> with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
>
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
>
> Here's an example of the code I have going on the server:
>
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
>
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
>
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
>
> $seqOut->write_seq($seq);
>
> exit;
>
> where 'DQ897861' will be replaced by a CGI post.
>
> I know that write_seq is not what I need, and I assume that this is  
> a simple problem but can anyone tell me how to assign the retrieved  
> gb file to a scaler?
>
> Thanks,
> Adlai

Actually, there are two ways you can do this, one involving write_seq.

(1) The first is to just grab the raw data using Bio::DB::EUtilities:

use Bio::DB::EUtilities;

my $eutil = Bio::DB::EUtilities->new(-eutil     => 'efetch',
                                      -db        => 'nuccore',
                                      -id        => 'DQ897681',
                                      -rettype   => 'gb');

my $var = $eutil->get_Response->content;

(2) Use IO::String (see the SeqIO HOWTO), or Roy's example code.  That  
would 'filter' everything through SeqIO via next_seq/write_seq, so the  
output is what BioPerl spits out and may not be exactly the same.

chris


From cjfields at illinois.edu  Wed Sep 23 13:47:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:56 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
Message-ID: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>

Appears Array::Compare is used for Test::Warn, so it isn't a true  
requirement (probably a test_requires or somesuch).

chris
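
A quick way to see which modules are visible to a given perl before re-running the install is sketched below. The helper name module_available is made up for illustration, and the module list is just the two names from this thread.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from @INC, 0 otherwise.
# (module_available is a hypothetical helper, not part of BioPerl.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Array::Compare -> Array/Compare.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Module names taken from the error discussed in this thread.
for my $module (qw(Array::Compare Test::Warn)) {
    printf "%-16s %s\n", $module,
        module_available($module) ? 'installed' : 'MISSING';
}
```

Anything reported as MISSING can then be installed from the cpan shell as Mark describes.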

On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:

> hi Tony- missing prereqs are the issue with this message,yes-
> the brute force approach would be to install each of these
> as they come up; you can do
>
> $ cpan
> cpan> install Array::Compare
>
> etc., then attempt the bioperl-db install again; lather, rinse,  
> repeat.
> MAJ
> ----- Original Message ----- From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk 
> >
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 22, 2009 1:06 PM
> Subject: [Bioperl-l] Problems installing latest stable bioperl-db  
> (1.6)
>
>
>> Folks,
>>
>> I am experiencing problems installing bioperl-db. I followed the  
>> instructions on the website both installing via CPAN and  
>> downloading the source tarball. Get the same error. I think I have  
>> missing prerequistes, the first error I get is:
>>
>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/ 
>> local/BioPerl-db-1.6.0/blib/lib
>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 / 
>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux- 
>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64- 
>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/ 
>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>
>> Can anyone help?
>>
>> Regards,
>>
>> Tony P.
>>
>>
>> **************************************************************
>> Mr. A. Pemberton Tel:+44 121 414 3388
>> School of Biosciences, Fax:+44 121 414 5925
>> The University of Birmingham                     
>> Email:a.j.pemberton at bham.ac.uk
>> Birmingham B15 2TT U.K.
>> **************************************************************
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 16:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 15:58:37 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
Message-ID: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating it's way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made it's way into the
> last alpha release.  This is now properly cleaned up along with it's
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 16:52:09 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 16:52:09 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>

Hi Chris,

We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot regressions and noticed a small problem.  The fasta validation check for '>' in SeqIO::fasta (line 127) throws when used with Index::Fasta on Windows because the position after '>' is being indexed.  It looks like you already fixed the same problem for Linux (comment in line 190 of Index::Fasta).  Do you want me to put this into bugzilla?  Let me know if you have any questions.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Tuesday, September 22, 2009 1:29 PM
To: BioPerl List
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

The third alpha is now out and propagating it's way around the  
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed is are the SeqFeature_SQLite.t failures; this is a file  
autogenerated with Build.PL tests that somehow made it's way into the  
last alpha release.  This is now properly cleaned up along with it's  
test database using './Build clean'.  BTW, very nice SQLite  
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 18:40:10 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 18:40:10 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>

Hi Chris,

ActivePerl.  I'll open a bug.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: Chris Fields [mailto:cjfields at illinois.edu] 
Sent: Wednesday, September 23, 2009 1:59 PM
To: Kristine Briedis
Cc: BioPerl List
Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating it's way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made it's way into the
> last alpha release.  This is now properly cleaned up along with it's
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 18:49:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 17:49:45 -0500
Subject: [Bioperl-l] BioPerl.pm and 1.6.1
References: <1253727169.18486.1336281841@webmail.messagingengine.com>
Message-ID: <1AF393BC-2352-4ADA-A4E3-3EF13B99CAE8@illinois.edu>

All,

I've recently noticed that CPAN is not grabbing the correct  
descriptive information from Build.PL.  The current description is  
coming from Bio::LiveSeq::IO::BioPerl, which is the first module found
whose name matches 'BioPerl':

http://search.cpan.org/search?query=bioperl&mode=dist

Therefore we need something that acts as the description and main page  
for the distributions.  We have a bioperl.pod already, just need to  
update it and add it to trunk, and maybe release another alpha with it  
included to make sure it's working.  I also want to fix the recent  
Windows issue reported by Kristine.

Therefore, I will be adding this for core and the other
distributions per Curtis Jewell's suggestion (below).  Please let me  
know if there are any disagreements with this; I'll probably push  
another alpha out with this in the next few days (also hopefully  
containing the bug fix mentioned above).
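
For reference, a placeholder module of the kind Curtis describes below might look roughly like this. It is only a sketch: the version string and the POD text are illustrative assumptions, not what will actually be committed.

```perl
package BioPerl;

use strict;
use warnings;

# Placeholder version for illustration only; the real value would follow
# the distribution's release numbering.
our $VERSION = '1.006001';

=head1 NAME

BioPerl - placeholder abstract for the distribution (assumed wording)

=head1 DESCRIPTION

This module does nothing. It only carries the version number and the
main documentation page for the distribution.

=cut

1;
```

With a file like this in the distribution root's lib path, 'cpan BioPerl' has a real module name to resolve, and the NAME section gives the indexer a distribution-level abstract to pick up.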

chris

Begin forwarded message:

> From: "Curtis Jewell" <lists.perl.module-authors at csjewell.fastmail.us>
> Date: September 23, 2009 12:32:49 PM CDT
> To: "Chris Fields" <cjfields at illinois.edu>
> Subject: Re: distribution description
>
> Chris, I'd make it a BioPerl.pm that just declares a package and  
> version
> and does nothing else other than being a holder for Pod - because the
> first thing I wanted to do when I heard about it and wanted to check
> whether it worked in Strawberry is to do 'cpan BioPerl', which of
> course, blows up.
>
> --Curtis
>
> On Tue, 22 Sep 2009 22:23 -0500, "Chris Fields"  
> <cjfields at illinois.edu>
> wrote:
>> I've noticed in the last number of CPAN releases of BioPerl that the
>> description for the distribution is being pulled from one of our
>> modules (Bio::LiveSeq::IO::BioPerl).  I'm guessing this is b/c it's
>> the first match to the distribution name.
>>
>> Is there any way to make sure the description is pulled from the
>> abstract?  We're using a subclass of Module::Build and have defined
>> dist_abstract (I'm thinking of adding a BioPerl.pod to the root
>> directory just to catch this).
>>
>> chris
> --
> Curtis Jewell
> swordsman at csjewell.fastmail.us
>
> %DCL-E-MEM-BAD, bad memory
> -VMS-F-PDGERS, pudding between the ears
>
> [I use PC-Alpine, which deliberately does not display colors and  
> pictures in HTML mail]
>


From cjfields at illinois.edu  Wed Sep 23 19:00:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 18:00:55 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
Message-ID: <D704BD1B-C44B-4AB5-9C14-9F4F63A46FEE@illinois.edu>

Kristine,

I have been planning on installing a temp WinXP VM using VirtualBox,  
so this'll give me an excuse to set that up ;>

chris

On Sep 23, 2009, at 5:40 PM, Kristine Briedis wrote:

> Hi Chris,
>
> ActivePerl.  I'll open a bug.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: Wednesday, September 23, 2009 1:59 PM
> To: Kristine Briedis
> Cc: BioPerl List
> Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> Yes, that would be good.  I don't have immediate access to anything
> running WinXP/vista/7 but I can probably look into this sometime
> tomorrow or Monday.
>
> Just to make sure, is this with ActivePerl or Strawberry Perl?
>
> chris
>
> On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:
>
>> Hi Chris,
>>
>> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot
>> regressions and noticed a small problem.  The fasta validation check
>> for '>' in SeqIO::fasta (line 127) throws when used with
>> Index::Fasta on Windows because the position after '>' is being
>> indexed.  It looks like you already fixed the same problem for Linux
>> (comment in line 190 of Index::Fasta).  Do you want me to put this
>> into bugzilla?  Let me know if you have any questions.  Thanks!
>>
>> Cheers,
>> Kristine
>>
>>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Chris Fields
>> Sent: Tuesday, September 22, 2009 1:29 PM
>> To: BioPerl List
>> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>>
>> The third alpha is now out and propagating it's way around the
>> intertubes:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>>
>> Pick your favorite archive here:
>>
>> http://bioperl.org/DIST/RC/
>>
>> This includes some unmerged changes from 1.6.0.  Test failures from
>> the last alpha indicated these somehow were missed, so I basically  
>> ran
>> a global diff against main trunk to check for missing commits (all
>> located in t/ as it turned out).
>>
>> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
>> autogenerated with Build.PL tests that somehow made it's way into the
>> last alpha release.  This is now properly cleaned up along with it's
>> test database using './Build clean'.  BTW, very nice SQLite
>> implementation; I may be using it!
>>
>> Please let me know if anything pops up; I'm hoping to release 1.6.1  
>> by
>> this Thursday-Friday.
>>
>> Enjoy!
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From David.Messina at sbc.su.se  Thu Sep 24 05:38:19 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 24 Sep 2009 11:38:19 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
Message-ID: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>

>
> Not to add yet more to the list, but I also think a concise list of
> projects using (or 'powered by') bioperl should be front-and-center; not a
> lot of users know when/where bioperl is used.  This applies to the other
> bio* as well, particularly biopython (seeing it popping up more and more).
>


Along these lines, it'd be great to publicize not only
BioPerl-*powered* projects, but ones which interface with it, too.

Just this week, for example, there is this, which could go both on a static
page and in the newsfeed:
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1

MOODS: fast search for position weight matrix matches in DNA sequences.

Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
Department of Computer Science and Helsinki Institute for Information
Technology,
University of Helsinki, Helsinki, Finland.

SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
matching position weight matrices against DNA sequences. MOODS implements
state-of-the-art on-line matching algorithms, achieving considerably faster
scanning speed than with a simple brute-force search. MOODS is written in C++,
with bindings for the popular BioPerl and Biopython toolkits. It can easily be
adapted for different purposes and integrated into existing workflows. It can
also be used as a C++ library. AVAILABILITY: The package with documentation and
examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
source code is also available under the terms of a GNU General Public License
(GPL). CONTACT: janne.h.korhonen at helsinki.fi.

PMID: 19773334 [PubMed - as supplied by publisher]


From maj at fortinbras.us  Thu Sep 24 10:17:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 10:17:26 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
Message-ID: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>

Gurus of a db stripe:
 
ActiveState 5.10 has such a problem with BDB that it
disables their ppm build of the DB_File module. I know
what the *ultimate* solution is...however...

I did a quick grep of 'use DB_File' across the trunk, and 
it seems there are two categories of dependency--

(1) use of BDB is an option among other dbms
      (e.g., among the  Bio::DB::GFF::Adaptor::)

(2) BDB is the developer's personal choice
    (e.g., possibly Bio::DB::FileCache)

In Bio::DB::Fasta, AnyDBM_File is used to allow the 
user a choice. Are there fundamental reasons not to 
convert the type (2) dependencies to AnyDBM_File?
I will try to do this (on a branch) if there are no technical
objections. General derision, however, will only goad
me into action-

Thanks,
MAJ
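
To make the proposal concrete, here is a rough sketch of the AnyDBM_File pattern using only core Perl. The preference list, the file name and the stored key are assumptions for illustration. One caveat to weigh: SDBM_File, the fallback that ships with Perl, limits the size of each key/value pair (to roughly 1 KB), so it may not suit modules that cache large records.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use File::Temp qw(tempdir);

# Preference order: DB_File when it loads, otherwise SDBM_File, which is
# bundled with Perl. This must be set before AnyDBM_File is loaded.
BEGIN { @AnyDBM_File::ISA = qw(DB_File SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/cache";        # hypothetical cache file name

# Store one small value through whichever backend was selected.
my %cache;
tie %cache, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "Cannot tie $file: $!";
$cache{DQ897681} = 'cached record text';
untie %cache;

# Reopen the file to show that the value was persisted.
tie %cache, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "Cannot re-tie $file: $!";
my $value = $cache{DQ897681};
untie %cache;

# After loading, AnyDBM_File narrows @ISA to the backend it picked.
print "$value via $AnyDBM_File::ISA[0]\n";
```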


From A.J.Pemberton at bham.ac.uk  Thu Sep 24 11:08:06 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Thu, 24 Sep 2009 16:08:06 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>

Chris, Mark,

Thank you, I have made significant progress with the install. I had to do a

cpan> force install Array::Compare

to get the module properly installed.

However, I now have a new error. When I do

cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz

I get the following error (now only 1 of the 16 tests fails):

t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
Exception
------------- EXCEPTION -------------
MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.


Can you help with this one?

Regards,

Tony Pemberton


> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: 23 September 2009 18:48
> To: Mark A. Jensen
> Cc: Anthony Pemberton; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
> (1.6)
> 
> Appears Array::Compare is used for Test::Warn, so it isn't a true
> requirement (probably a test_requires or somesuch).
> 
> chris
> 
> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
> 
> > hi Tony- missing prereqs are the issue with this message,yes-
> > the brute force approach would be to install each of these
> > as they come up; you can do
> >
> > $ cpan
> > cpan> install Array::Compare
> >
> > etc., then attempt the bioperl-db install again; lather, rinse,
> > repeat.
> > MAJ
> > ----- Original Message ----- From: "Anthony Pemberton"
> <A.J.Pemberton at bham.ac.uk
> > >
> > To: <bioperl-l at bioperl.org>
> > Sent: Tuesday, September 22, 2009 1:06 PM
> > Subject: [Bioperl-l] Problems installing latest stable bioperl-db
> > (1.6)
> >
> >
> >> Folks,
> >>
> >> I am experiencing problems installing bioperl-db. I followed the
> >> instructions on the website both installing via CPAN and
> >> downloading the source tarball. Get the same error. I think I have
> >> missing prerequistes, the first error I get is:
> >>
> >> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/
> >> local/BioPerl-db-1.6.0/blib/lib
> >> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /
> >> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> >> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-
> >> thread-multi /usr/lib/perl5/site_perl/5.8.5
> >> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-
> >> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
> >> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/
> >> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
> >>
> >> Can anyone help?
> >>
> >> Regards,
> >>
> >> Tony P.
> >>
> >>
> >> **************************************************************
> >> Mr. A. Pemberton Tel:+44 121 414 3388
> >> School of Biosciences, Fax:+44 121 414 5925
> >> The University of Birmingham
> >> Email:a.j.pemberton at bham.ac.uk
> >> Birmingham B15 2TT U.K.
> >> **************************************************************
> >>
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Thu Sep 24 12:23:44 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Sep 2009 09:23:44 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
Message-ID: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>

If someone also wants to volunteer to keep up the publications page -
this is where I *had* been curating a list, built from citations and Google
Scholar searches for 'bioperl' and things that reference the 2002 paper.

Seems like this is where the static copy of that information should go
- but highlighting things on a page with a circulating list, or
something that just listed recent additions to the list, could be done
by the web dev gurus and could be kewl.
The current issue is that the list is large, so I think the PubMed plugin
rendering can be slow (or gets broken, as it seems to be now).
http://bioperl.org/wiki/BioPerl_publications
http://bioperl.org/wiki/BioPerl_publications/2008
http://bioperl.org/wiki/BioPerl_publications/2007
etc....

-jason
On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:

>>
>> Not to add yet more to the list, but I also think a concise list of
>> projects using (or 'powered by') bioperl should be front-and-center; not a
>> lot of users know when/where bioperl is used.  This applies to the other
>> bio* as well, particularly biopython (seeing it popping up more and more).
>>
>
>
> Along these lines, it'd be great to publicize not only BioPerl-*powered*
> projects, but ones which interface with it, too.
>
> Just this week, for example, there is this, which could go both on a static
> page and in the newsfeed:
> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>
> MOODS: fast search for position weight matrix matches in DNA sequences.
>
> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
> Department of Computer Science and Helsinki Institute for Information
> Technology, University of Helsinki, Helsinki, Finland.
>
> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
> matching position weight matrices against DNA sequences. MOODS implements
> state-of-the-art on-line matching algorithms, achieving considerably faster
> scanning speed than with a simple brute-force search. MOODS is written in C++,
> with bindings for the popular BioPerl and Biopython toolkits. It can easily be
> adapted for different purposes and integrated into existing workflows. It can
> also be used as a C++ library. AVAILABILITY: The package with documentation and
> examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
> source code is also available under the terms of a GNU General Public License
> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>
> PMID: 19773334 [PubMed - as supplied by publisher]
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From rmb32 at cornell.edu  Thu Sep 24 12:28:08 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 24 Sep 2009 09:28:08 -0700
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <4ABB9E18.3060003@cornell.edu>

Sounds like a good idea to me.

Rob


From cjfields at illinois.edu  Thu Sep 24 12:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 11:58:32 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
	<3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
Message-ID: <2BDD197A-3DEF-44CE-9F98-6B3F117084EE@illinois.edu>

Tony,

The error should point out the problem: install Graph::Directed via  
CPAN.

Saying that, we need to add that as a 'recommends' for the db package  
and skip those tests if Graph::Directed isn't present.  Will do that  
now.
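
A minimal sketch of that kind of guard, assuming the test file uses
Test::More. This is not the actual change made to bioperl-db; the helper
name module_available() is made up for illustration.

```perl
#!/usr/bin/perl
# Sketch only: skip tests that need an optional module when it is absent.
use strict;
use warnings;
use Test::More;

# Hypothetical helper: true if the named module can be loaded.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Graph::Directed -> Graph/Directed.pm
    return eval { require $file; 1 } ? 1 : 0;
}

SKIP: {
    # Skip the one test in this block instead of dying on a missing prereq.
    skip 'Graph::Directed not installed', 1
        unless module_available('Graph::Directed');

    ok( Graph::Directed->can('new'), 'Graph::Directed is usable' );
}

done_testing();
```

With a guard like this the file reports a skip rather than an exception
when Graph::Directed is missing, so the rest of the suite still runs.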

chris

On Sep 24, 2009, at 10:08 AM, Anthony Pemberton wrote:

> Chris, Mark,
>
> Thank you, I have made significant progress with the install. I had  
> to do a
>
> Cpan> force install Array::Compare
>
> to get the module properly installed.
>
> However, I now have a new error. When I do
>
> Cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz
>
> I get the following error (now only 1 of the 16 tests fails):
>
> t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate  
> Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/ 
> BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0- 
> xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/ 
> lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 / 
> usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/ 
> perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/ 
> vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/ 
> vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux- 
> thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/ 
> site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ 
> Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
>
>
> Can you help with this one?
>
> Regards,
>
> Tony Pemberton
>
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at illinois.edu]
>> Sent: 23 September 2009 18:48
>> To: Mark A. Jensen
>> Cc: Anthony Pemberton; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
>> (1.6)
>>
>> Appears Array::Compare is used for Test::Warn, so it isn't a true
>> requirement (probably a test_requires or somesuch).
>>
>> chris
>>
>> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
>>
>>> hi Tony- missing prereqs are the issue with this message,yes-
>>> the brute force approach would be to install each of these
>>> as they come up; you can do
>>>
>>> $ cpan
>>> cpan> install Array::Compare
>>>
>>> etc., then attempt the bioperl-db install again; lather, rinse,
>>> repeat.
>>> MAJ
>>> ----- Original Message ----- From: "Anthony Pemberton"
>> <A.J.Pemberton at bham.ac.uk
>>>>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Tuesday, September 22, 2009 1:06 PM
>>> Subject: [Bioperl-l] Problems installing latest stable bioperl-db
>>> (1.6)
>>>
>>>
>>>> Folks,
>>>>
>>>> I am experiencing problems installing bioperl-db. I followed the
>>>> instructions on the website both installing via CPAN and
>>>> downloading the source tarball. Get the same error. I think I have
>>>> missing prerequisites; the first error I get is:
>>>>
>>>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/
>>>> local/BioPerl-db-1.6.0/blib/lib
>>>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /
>>>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-
>>>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>>>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-
>>>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>>>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/
>>>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>>>
>>>> Can anyone help?
>>>>
>>>> Regards,
>>>>
>>>> Tony P.
>>>>
>>>>
>>>> **************************************************************
>>>> Mr. A. Pemberton Tel:+44 121 414 3388
>>>> School of Biosciences, Fax:+44 121 414 5925
>>>> The University of Birmingham
>>>> Email:a.j.pemberton at bham.ac.uk
>>>> Birmingham B15 2TT U.K.
>>>> **************************************************************
>>>>
>>>>
>>>>
>


From cjfields at illinois.edu  Thu Sep 24 13:50:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 12:50:34 -0500
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>

I do support doing this for sheer flexibility, but it's not an  
absolute showstopper for ActivePerl.  There is a working DB_File PPM  
available for ActivePerl 5.10.1 in the Trouchelle PPM repo:

http://trouchelle.com/ppm10/

That repo is listed in the 'Suggested' list in the latest PPM4
Preferences (Repositories tab). I had to install it to fix that WinXP
Bio::Index bug.

(Based on that, the Bio::Index modules also have this requirement; at
least, their tests were being skipped for lack of DB_File.)
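
For reference, the AnyDBM_File approach raised in the quoted message below
can be sketched roughly as follows. This is an illustration only, not code
from the BioPerl trunk: the backend preference order, the temporary
directory, and the file name 'cache' are assumptions. GDBM_File is left out
of the list on purpose, because its tie() takes GDBM-specific flags rather
than the Fcntl O_* constants.

```perl
#!/usr/bin/perl
# Sketch only: let AnyDBM_File pick whichever DBM backend is installed.
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);

# Assumed preference order: DB_File first, SDBM_File (shipped with Perl)
# as the last resort. Must be set before AnyDBM_File is loaded.
BEGIN { @AnyDBM_File::ISA = qw(DB_File NDBM_File SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/cache";                 # hypothetical cache file name

tie my %cache, 'AnyDBM_File', $path, O_RDWR | O_CREAT, 0644
    or die "Cannot open DBM file $path: $!";

$cache{'seq1'} = 'ACGT';

# After loading, AnyDBM_File narrows @ISA to the backend it found.
print "backend: $AnyDBM_File::ISA[0]\n";
print "seq1 => $cache{'seq1'}\n";

untie %cache;
```

Whether a given module can switch this way depends on what it stores:
backends differ in capability (SDBM_File, for example, limits the combined
key and value size to roughly 1 KB), and code that relies on
DB_File-specific features such as BTREE ordering could not use it.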

chris

On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:

> Gurus of a db stripe:
>
> ActiveState 5.10 has such a problem with BDB that it
> disables their ppm build of the DB_File module. I know
> what the *ultimate* solution is...however...
>
> I did a quick grep of 'use DB_File' across the trunk, and
> it seems there are two categories of dependency--
>
> (1) use of BDB is an option among other dbms
>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>
> (2) BDB is the developer's personal choice
>    (e.g., possibly Bio::DB::FileCache)
>
> In Bio::DB::Fasta, AnyDBM_File is used to allow the
> user a choice. Are there fundamental reasons not to
> convert the type (2) dependencies to AnyDBM_File?
> I will try to do this (on a branch) if there are no technical
> objections. General derision, however, will only goad
> me into action-
>
> Thanks,
> MAJ
>


From cjfields at illinois.edu  Thu Sep 24 14:03:48 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 13:03:48 -0500
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
Message-ID: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>

Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
for Windows using the last alpha or bioperl-live?  I'm getting a  
pretty significant fail with the last alpha release (I've managed to  
fix the others) via my remote desktop setup (haven't set up virtualbox  
yet).  I just want to confirm this is occurring elsewhere and plan  
accordingly, namely by indicating that the module doesn't work on Windows
for the time being.

Build test --test-files t/SeqIO/scf.t --verbose

chris


From maj at fortinbras.us  Thu Sep 24 14:39:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:39:38 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
	<759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
Message-ID: <3715F68607084E4684A4B54E542468E4@NewLife>

All righty. I did find the trouchelle repo, but my ppm
didn't believe that DB_File was in it.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 1:50 PM
Subject: Re: [Bioperl-l] DB_File dependency and ActiveState 5.10


>I do support doing this for sheer flexibility, but it's not an  
> absolute showstopper for ActivePerl.  There is a working DB_File PPM  
> available for ActivePerl 5.10.1 in the Trouchelle PPM repo:
> 
> http://trouchelle.com/ppm10/
> 
> That repo is listed in the 'Suggested' list in the latest PPM4  
> Preferences (Repositories tag). I had to install it to fix that WinXP  
> Bio::Index bug.
> 
> (Based on that Bio::Index modules also have this requirement, at least  
> tests were being skipped based on lack of DB_File)
> 
> chris
> 
> On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:
> 
>> Gurus of a db stripe:
>>
>> ActiveState 5.10 has such a problem with BDB that it
>> disables their ppm build of the DB_File module. I know
>> what the *ultimate* solution is...however...
>>
>> I did a quick grep of 'use DB_File' across the trunk, and
>> it seems there are two categories of dependency--
>>
>> (1) use of BDB is an option among other dbms
>>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>>
>> (2) BDB is the developer's personal choice
>>    (e.g., possibly Bio::DB::FileCache)
>>
>> In Bio::DB::Fasta, AnyDBM_File is used to allow the
>> user a choice. Are there fundamental reasons not to
>> convert the type (2) dependencies to AnyDBM_File?
>> I will try to do this (on a branch) if there are no technical
>> objections. General derision, however, will only goad
>> me into action-
>>
>> Thanks,
>> MAJ
>>
> 
> 
>


From maj at fortinbras.us  Thu Sep 24 14:40:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:40:03 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
In-Reply-To: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
References: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
Message-ID: <791B5C5CB3C34A8AAC348DC59E934198@NewLife>

aye-aye
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 2:03 PM
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?


> Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
> for Windows using the last alpha or bioperl-live?  I'm getting a  
> pretty significant fail with the last alpha release (I've managed to  
> fix the others) via my remote desktop setup (haven't set up virtualbox  
> yet).  I just want to confirm this is occurring elsewhere and plan  
> accordingly, namely indicating the module doesn't work with windows  
> for the time being.
> 
> Build test --test-files t/SeqIO/scf.t --verbose
> 
> chris
> 
>


From e.osimo at gmail.com  Fri Sep 25 03:59:10 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 25 Sep 2009 09:59:10 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com> 
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
Message-ID: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>

Dear Jason,
I have been trying to connect to
http://bioperl.org/wiki/BioPerl_publications for more than 24 hours, but it
won't load.
Emanuele


On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:

> If someone also wants to volunteer to keep up the publications page - this
> is where I *had* been curating a list up by citations and google scholar
> searches for 'bioperl' and things that reference 2002 paper.
>
> Seems like this is where the static copy of that information should go -
> but highlighting things on the a page with a circulating list or something
> that just listed recent additions to the list could be done by the web dev
> gurus and could be kewl.
> The current issue is that a) it is large so I think pubmed plugin rendering
> can be slow (or gets broken as it seems to be now).
> http://bioperl.org/wiki/BioPerl_publications
> http://bioperl.org/wiki/BioPerl_publications/2008
> http://bioperl.org/wiki/BioPerl_publications/2007
> etc....
>
> -jason
>
> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>
>
>>> Not to add yet more to the list, but I also think a concise list of
>>> projects using (or 'powered by') bioperl should be front-and-center; not
>>> a
>>> lot of users know when/where bioperl is used.  This applies to the other
>>> bio* as well, particularly biopython (seeing it popping up more and
>>> more).
>>>
>>>
>>
>> Along these lines, it'd be great to publicize not only
>> BioPerl-*powered*projects, but ones which interface with it, too.
>>
>> Just this week, for example, there is this, which could go both on a
>> static
>> page and in the newsfeed:
>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>
>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>
>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>> Department of Computer Science and Helsinki Institute for Information
>> Technology,
>> University of Helsinki, Helsinki, Finland.
>>
>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package
>> for
>> matching position weight matrices against DNA sequences. MOODS implements
>> state-of-the-art on-line matching algorithms, achieving considerably
>> faster
>> scanning speed than with a simple brute-force search. MOODS is written in
>> C++,
>> with bindings for the popular BioPerl and Biopython toolkits. It can
>> easily be
>> adapted for different purposes and integrated into existing workflows. It
>> can
>> also be used as a C++ library. AVAILABILITY: The package with
>> documentation and
>> examples of usage is available at
>> http://www.cs.helsinki.fi/group/pssmfind. The
>> source code is also available under the terms of a GNU General Public
>> License
>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>
>> PMID: 19773334 [PubMed - as supplied by publisher]
>>
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>
>


From hlapp at gmx.net  Fri Sep 25 07:26:37 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 25 Sep 2009 07:26:37 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
Message-ID: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>

Odd. Something's going on in the page that upsets MediaWiki. I can  
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages  
look odd.

	-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> it's more than 24 hours that I try connecting to
> http://bioperl.org/wiki/BioPerl_publications, but it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org>  
> wrote:
>
>> If someone also wants to volunteer to keep up the publications page  
>> - this
>> is where I *had* been curating a list up by citations and google  
>> scholar
>> searches for 'bioperl' and things that reference 2002 paper.
>>
>> Seems like this is where the static copy of that information should  
>> go -
>> but highlighting things on the a page with a circulating list or  
>> something
>> that just listed recent additions to the list could be done by the  
>> web dev
>> gurus and could be kewl.
>> The current issue is that a) it is large so I think pubmed plugin  
>> rendering
>> can be slow (or gets broken as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and- 
>>>> center; not
>>>> a
>>>> lot of users know when/where bioperl is used.  This applies to  
>>>> the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered*projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static
>>> page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA  
>>> sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for  
>>> Information
>>> Technology,
>>> University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software  
>>> package
>>> for
>>> matching position weight matrices against DNA sequences. MOODS  
>>> implements
>>> state-of-the-art on-line matching algorithms, achieving considerably
>>> faster
>>> scanning speed than with a simple brute-force search. MOODS is  
>>> written in
>>> C++,
>>> with bindings for the popular BioPerl and Biopython toolkits. It can
>>> easily be
>>> adapted for different purposes and integrated into existing  
>>> workflows. It
>>> can
>>> also be used as a C++ library. AVAILABILITY: The package with
>>> documentation and
>>> examples of usage is available at
>>> http://www.cs.helsinki.fi/group/pssmfind. The
>>> source code is also available under the terms of a GNU General  
>>> Public
>>> License
>>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Fri Sep 25 07:40:33 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 25 Sep 2009 12:40:33 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <320fb6e00909250440i18ee4216o80cedd418feed842@mail.gmail.com>

On Fri, Sep 25, 2009 at 12:26 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
> Odd. Something's going on in the page that upsets MediaWiki. I can actually
> pull up the page in edit mode.
>
> Is the citation extension working correctly? The year-by-year pages look
> odd.

It is working on the Biopython and BioJava pages (which use the same
server and MediaWiki installation, right?):

http://biopython.org/wiki/Documentation#Papers
http://biopython.org/wiki/Publications
http://biojava.org/wiki/BioJava:BioJavaInside

[I know there are references with a funny character in them, the extension
doesn't like accents. I normally redo those references by hand but it is
a hassle and just giving a PMID is much easier]

Peter


From maj at fortinbras.us  Fri Sep 25 08:50:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 08:50:26 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org><4AB84B8D.5080005@ieee.org><2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu><f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com><628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com><320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com><9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu><628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com><3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org><2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <4E35933353E14BB98975BCCAF79F5E0B@NewLife>

I've been playing with this. I think it's either a numbers problem (>230 
references => bork) or a timeout problem. Attempting to isolate a single 
"BioPerl publications/200x" page for the error gives inconsistent
results, but including enough of these pages to give more than about 230 
references gives the error (using preview).
----- Original Message ----- 
From: "Hilmar Lapp" <hlapp at gmx.net>
To: "Emanuele Osimo" <e.osimo at gmail.com>
Cc: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 7:26 AM
Subject: Re: [Bioperl-l] a Main Page proposal


Odd. Something's going on in the page that upsets MediaWiki. I can
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages
look odd.

-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> it's more than 24 hours that I try connecting to
> http://bioperl.org/wiki/BioPerl_publications, but it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org>  wrote:
>
>> If someone also wants to volunteer to keep up the publications page  - this
>> is where I *had* been curating a list up by citations and google  scholar
>> searches for 'bioperl' and things that reference 2002 paper.
>>
>> Seems like this is where the static copy of that information should  go -
>> but highlighting things on the a page with a circulating list or  something
>> that just listed recent additions to the list could be done by the  web dev
>> gurus and could be kewl.
>> The current issue is that a) it is large so I think pubmed plugin  rendering
>> can be slow (or gets broken as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and- center; not
>>>> a
>>>> lot of users know when/where bioperl is used.  This applies to  the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered*projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static
>>> page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA  sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for  Information
>>> Technology,
>>> University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software  package
>>> for
>>> matching position weight matrices against DNA sequences. MOODS  implements
>>> state-of-the-art on-line matching algorithms, achieving considerably
>>> faster
>>> scanning speed than with a simple brute-force search. MOODS is  written in
>>> C++,
>>> with bindings for the popular BioPerl and Biopython toolkits. It can
>>> easily be
>>> adapted for different purposes and integrated into existing  workflows. It
>>> can
>>> also be used as a C++ library. AVAILABILITY: The package with
>>> documentation and
>>> examples of usage is available at
>>> http://www.cs.helsinki.fi/group/pssmfind. The
>>> source code is also available under the terms of a GNU General  Public
>>> License
>>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================




From maj at fortinbras.us  Fri Sep 25 09:08:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:08:10 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <3327C0C1167C4889A980809FD642A0A2@NewLife>

The idea I now have is that <biblio> is hitting the server too rapidly and 
getting bounced after a while.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig though
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across as too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit too
> minimal, but then again I don't like scrolling up and down the page to find
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't stare
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and additional
> links on a separate page.
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Fri Sep 25 09:30:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:30:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <A06AF115F63B4C558D368B730BFB441D@NewLife>

It's ugly, but it works now.


From jason at bioperl.org  Fri Sep 25 11:47:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 08:47:55 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <A06AF115F63B4C558D368B730BFB441D@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
	<A06AF115F63B4C558D368B730BFB441D@NewLife>
Message-ID: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>

thanks - yeah I had separated it by year to make it easier to update  
them since the main file was too large, but I liked having them all  
pulled in onto one page in order to see the total number of cites.  
Brian's graphic is nice but a little out of date, and only reflects a  
pubmed query.

Basically that system doesn't work well enough with biblio since it  
isn't caching the lookups very well.   We can probably do better  
somehow, but someone would have to really be dedicated to it, so I can  
kind of see now why we could use something like this to generate the  
citations so they'd be static.
http://sumsearch.uthscsa.edu/cite/

I had used Biblio extension as it was so easy but maybe it just can't  
scale for that number of needed refs as it doesn't do very good local  
caching AFAIK.
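
For illustration only (this is not code from the Biblio extension or from
anyone on the list): a minimal Python sketch of the local caching idea, so
that each reference is looked up remotely at most once and remote calls are
spaced out. The `fetch` callable, the cache file path, and the one-second
interval are all assumptions made for the example.

```python
import json
import time
from pathlib import Path


class CachedLookup:
    """Cache lookup results on disk and space out the remote calls."""

    def __init__(self, fetch, cache_path, min_interval=1.0):
        # fetch: hypothetical callable taking a key (e.g. a PubMed ID string)
        # and returning a JSON-serialisable record.
        self.fetch = fetch
        self.cache_path = Path(cache_path)
        # Minimum number of seconds between remote calls (assumed value).
        self.min_interval = min_interval
        self._last_call = None
        if self.cache_path.exists():
            self.cache = json.loads(self.cache_path.read_text())
        else:
            self.cache = {}

    def get(self, key):
        key = str(key)
        if key in self.cache:
            # Cache hit: no remote call at all.
            return self.cache[key]
        if self._last_call is not None:
            # Wait until min_interval has passed since the last remote call.
            wait = self.min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        record = self.fetch(key)
        self._last_call = time.monotonic()
        self.cache[key] = record
        # Persist the whole cache so later runs reuse it.
        self.cache_path.write_text(json.dumps(self.cache))
        return record
```

With something like this, a page listing hundreds of references would only
touch the remote server for IDs it has never seen, which is roughly the same
effect as generating the citations statically.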

-jason
On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:

> It's ugly, but it works now.

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Fri Sep 25 12:54:36 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 09:54:36 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
	<3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
Message-ID: <7275015E-45FC-4E2A-9379-89F7447DEB32@bioperl.org>

cheers- any efforts are appreciated.  I am not sure what the best
way is to provide this info to folks without spending a ton of time
curating.  What would be ideal is if the software worked well enough
that a volunteer only spent time adding the info, not debugging the
display or code.  It might be that something better exists -- online
reference management like citeulike or mendeley -- that could then be
linked in via an API.  .... Webservices, etc will save us all, right?
Okay not really, but at least we can try and keep this organized until
it is clear what the alternative solutions are.  Martin has stopped working
on Biblio as far as I know and php-hacking is not my favorite pastime.

-jason
On Sep 25, 2009, at 9:38 AM, Mark A. Jensen wrote:

> I figured you really wanted the 'hundreds-o-cites' effect-- I'm just
> thinking of this as a workaround until the issues are resolved. Not sure
> I can devote too much time to playing with it now (procrastinating using
> other projects at the mo') but I can put it in the todo list on the
> Documentation Project page....
> cheers MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Fri Sep 25 12:38:40 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 12:38:40 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
Message-ID: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>

I figured you really wanted the 'hundreds-o-cites' effect-- I'm just thinking
of this as a workaround until the issues are resolved. Not sure I can devote
too much time to playing with it now (procrastinating using other projects at
the mo') but I can put it in the todo list on the Documentation Project page....
cheers MAJ
----- Original Message ----- 
From: "Jason Stajich" <jason at bioperl.org>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>; "Peter" <biopython at maubp.freeserve.co.uk>
Sent: Friday, September 25, 2009 11:47 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> thanks - yeah I had separated it by year to make it easier to update  them 
> since the main file was too large, but I liked having them all  pulled in onto 
> one page in order to see the total number of cites.  Brian's graphic is nice 
> but a little out of date, and only reflects a  pubmed query.
>
> Basically that system doesn't work well enough with biblio since it  isn't 
> caching the lookups very well.   We can probably do better  somehow, but 
> someone would have to really be dedicated to it, so I can  kind of see now why 
> we could use something like this to generate the  citations so they'd be 
> static.
> http://sumsearch.uthscsa.edu/cite/
>
> I had used Biblio extension as it was so easy but maybe it just can't  scale 
> for that number of needed refs as it doesn't do very good local  caching 
> AFAIK.
>
> -jason


From jcline at ieee.org  Fri Sep 25 15:11:20 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Fri, 25 Sep 2009 14:11:20 -0500
Subject: [Bioperl-l] LIMS::Controller and LIMS::Web
Message-ID: <4ABD15D8.9020304@ieee.org>

Does anyone using the CPAN LIMS::Web or associated modules have a web site
that demonstrates the functionality?  The links in the .pod are out of date.

From CPAN:

DESCRIPTION

LIMS::Controller is a versatile object-oriented Perl module designed to
control a LIMS database and its web interface. Inheriting from the
LIMS::Web::Interface and LIMS::Database::Util classes, the module
provides automation for many core and advanced functions required of a
web/database object layer, enabling rapid development of Perl CGI scripts.

-- 

## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From bosborne11 at verizon.net  Fri Sep 25 22:13:16 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:13:16 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <42FBB964C0EA44FABCB50364C567A009@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
	<42FBB964C0EA44FABCB50364C567A009@NewLife>
Message-ID: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>

Mark,

Really nice, and a significant improvement over the existing page.

You've gotten good feedback, you've considered these thoughts and  
incorporated them - is it time to move the beta to Main? Yes. In my  
opinion your 'beta' is far superior - just do it.

Brian O.


On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

> A nearly completely minimal solution is at Main Page Beta
> ----- Original Message -----
> From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 21, 2009 1:03 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Hi Mark,
>> Thanks for taking on this (much needed) refresh.
>> I think your current version is substantially better than what we have now.
>> Still, I'd argue that something much more concise like the Biopython page
>> would make a bigger impact on visitors' ability to find what they're
>> looking for.
>> It's not that the details you have under each section shouldn't be
>> available, but rather that they could be clicked through to instead of
>> being on the front page.
>> The About section is a good example. I would bet most visitors to the
>> BioPerl website skip over the About section because they already know what
>> BioPerl is, and that section has the most valuable real estate on the page.
>> Those who don't know and are curious will probably be able to find it (the
>> word About on the front page of a website has become an idiom for "click
>> here to read the details about this").
>> Dave


From maj at fortinbras.us  Fri Sep 25 22:22:49 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:22:49 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
Message-ID: <ACA5C04C052442259262125A5F0B8E74@NewLife>

Cheers, Brian-- I am becoming swayed now by Chris' whack 
at it, on his talk page. My thought is that we'll hammer out the 
final version after the release, then pull the trigger-- Your thoughts?
MAJ
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 10:13 PM
Subject: Re: [Bioperl-l] a Main Page proposal


> Mark,
> 
> Really nice, and a significant improvement over the existing.
> 
> You've gotten good feedback, you've considered these thoughts and  
> incorporated them - is it time to move the beta to Main? Yes. In my  
> opinion your 'beta' is far superior - just do it.
> 
> Brian O.
> 
> 


From maj at fortinbras.us  Fri Sep 25 22:45:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:45:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <E53214D989154E8184BA97573C925DF9@NewLife>

sounds good-- I can make the changes (soon) and we'll tweak it from the echte page
(unless I hear diff'rnt)
cheers MAJ
  ----- Original Message ----- 
  From: Brian Osborne 
  To: Mark A. Jensen 
  Cc: BioPerl List 
  Sent: Friday, September 25, 2009 10:42 PM
  Subject: Re: [Bioperl-l] a Main Page proposal


  Mark,


  I don't love the italics in the version that Chris made but that's just personal preference. He's right in thinking that putting more in the top of the page is good: less scrolling.


  One could color the backgrounds of his tables, that might look nice.


  Either way, or a combination of both, is preferable to what we have. There really is no need to wait since the current page is abysmal. I can say that freely since I'm probably one of its authors!


  One thought though: move the "search" up to a center-left location, below "main links". The Wiki search is pretty good at finding pages so if someone doesn't find what they're looking for in the main section they might be drawn to search for it.


  Brian O.




From bosborne11 at verizon.net  Fri Sep 25 22:42:38 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:42:38 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <ACA5C04C052442259262125A5F0B8E74@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
Message-ID: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>

Mark,

I don't love the italics in the version that Chris made but that's  
just personal preference. He's right in thinking that putting more in  
the top of the page is good: less scrolling.

One could color the backgrounds of his tables, that might look nice.

Either way, or a combination of both, is preferable to what we have.  
There really is no need to wait since the current page is abysmal. I  
can say that freely since I'm probably one of its authors!

One thought though: move the "search" up to a center-left location,  
below "main links". The Wiki search is pretty good at finding pages so  
if someone doesn't find what they're looking for in the main section  
they might be drawn to search for it.

Brian O.


On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:

> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
> his talk page. My thought is that we'll hammer out the final version  
> after the release, then pull the trigger-- Your thoughts?
> MAJ
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 25, 2009 10:13 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Mark,
>> Really nice, and a significant improvement over the existing.
>> You've gotten good feedback, you've considered these thoughts and   
>> incorporated them - is it time to move the beta to Main? Yes. In  
>> my  opinion your 'beta' is far superior - just do it.
>> Brian O.
>> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
>>> A nearly completely minimal solution is at Main Page Beta
>>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>> Sent: Monday, September 21, 2009 1:03 PM
>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>
>>>
>>>> Hi Mark,
>>>> Thanks for taking on this (much needed) refresh.
>>>> I think your current version is substantially better than what  
>>>> we  have now.
>>>> Still, I'd argue that something much more concise like the   
>>>> Biopython page
>>>> would make a bigger impact on visitors' ability to find what   
>>>> they're looking
>>>> for.
>>>> It's not that the details you have under each section shouldn't be
>>>> available, but rather that they could be clicked through to  
>>>> instead  of being
>>>> on the front page.
>>>> The About section is a good example. I would bet most visitors to  
>>>> the
>>>> BioPerl website skip over the About section because they already   
>>>> know what
>>>> BioPerl is, and that section has the most valuable real estate  
>>>> on  the page.
>>>> Those who don't know and are curious will probably be able to  
>>>> find  it (the
>>>> word About on the front page of a website has become an idiom  
>>>> for  "click here
>>>> to read the details about this").
>>>> Dave


From cjfields at illinois.edu  Sat Sep 26 00:04:57 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:04:57 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <68A162A4-45F1-4ADC-87C9-57E388DF2666@illinois.edu>

Brian, Mark,

Agreed about the italics; there's a lot more that can be done with  
tables if needed:

http://meta.wikimedia.org/wiki/Help:Table

I say go ahead and pull the trigger.  No need to wait 'til 1.6.1 on  
this, the sooner it's fixed the better.  We can tweak the rest (add  
News updates, etc) along the way.

chris

On Sep 25, 2009, at 9:42 PM, Brian Osborne wrote:

> Mark,
>
> I don't love the italics in the version that Chris made but that's  
> just personal preference. He's right in thinking that putting more  
> in the top of the page is good: less scrolling.
>
> One could color the backgrounds of his tables, that might look nice.
>
> Either way, or a combination of both, is preferable to what we have.  
> There really is no need to wait since the current page is abysmal. I  
> can say that freely since I'm probably one of its authors!
>
> One thought though: move the "search" up to a center-left location,  
> below "main links". The Wiki search is pretty good at finding pages  
> so if someone doesn't find what they're looking for in the main  
> section they might be drawn to search for it.
>
> Brian O.
>
>
> On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:
>
>> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
>> his talk page. My thought is that we'll hammer out the final  
>> version after the release, then pull the trigger-- Your thoughts?
>> MAJ
>> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>> Sent: Friday, September 25, 2009 10:13 PM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>> Mark,
>>> Really nice, and a significant improvement over the existing.
>>> You've gotten good feedback, you've considered these thoughts and   
>>> incorporated them - is it time to move the beta to Main? Yes. In  
>>> my  opinion your 'beta' is far superior - just do it.
>>> Brian O.
>>> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
>>>> A nearly completely minimal solution is at Main Page Beta
>>>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>>> Sent: Monday, September 21, 2009 1:03 PM
>>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>>
>>>>
>>>>> Hi Mark,
>>>>> Thanks for taking on this (much needed) refresh.
>>>>> I think your current version is substantially better than what  
>>>>> we  have now.
>>>>> Still, I'd argue that something much more concise like the   
>>>>> Biopython page
>>>>> would make a bigger impact on visitors' ability to find what   
>>>>> they're looking
>>>>> for.
>>>>> It's not that the details you have under each section shouldn't be
>>>>> available, but rather that they could be clicked through to  
>>>>> instead  of being
>>>>> on the front page.
>>>>> The About section is a good example. I would bet most visitors  
>>>>> to the
>>>>> BioPerl website skip over the About section because they  
>>>>> already  know what
>>>>> BioPerl is, and that section has the most valuable real estate  
>>>>> on  the page.
>>>>> Those who don't know and are curious will probably be able to  
>>>>> find  it (the
>>>>> word About on the front page of a website has become an idiom  
>>>>> for  "click here
>>>>> to read the details about this").
>>>>> Dave


From cjfields at illinois.edu  Sat Sep 26 00:52:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:52:35 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 4 released
Message-ID: <2EDBBBF5-2109-456A-B768-178B012A8192@illinois.edu>

All,

Core 1.6.0 alpha 4 is now floating about on the intertubes and CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_4/

http://bioperl.org/DIST/RC/

So far this is passing all tests for ActivePerl on WinXP once DB_File  
is installed.  I'll try running some tests for Strawberry Perl, but no  
promises.

At this late stage any additional updates will only be doc tweaks and  
dealing with small bug fixes prior to 1.6.1.  The only renaming issue  
is that I need to rename BioPerl.pod to BioPerl.pm and add a simple  
VERSION to it (per Curtis Jewell's suggestion).  I may post a very  
short alpha 5 to test that, with 1.6.1 posted by Sunday.

Enjoy!

chris


From e.osimo at gmail.com  Sun Sep 27 05:00:17 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sun, 27 Sep 2009 11:00:17 +0200
Subject: [Bioperl-l] setting a strand in Bio::Graphics
Message-ID: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>

Hello,
I've tried all the arrows suggested in
http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
but I can't figure out how to tell in the options of $panel->add_track the
strand of the feature I'm adding.
I'm drawing DNA elements from a local DB, and I have a field "strand" which
can be + or -.
Please help!
Emanuele


From maj at fortinbras.us  Sun Sep 27 20:54:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 27 Sep 2009 20:54:04 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
Message-ID: <6CF05E74FEAE45679CDEDF48B7E15856@NewLife>

Emos- Without the code, I can only guess, but you might not be providing
the options correctly. Have a look at
http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
for something that may help.
MAJ
----- Original Message ----- 
From: "Emanuele Osimo" <e.osimo at gmail.com>
To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Sunday, September 27, 2009 5:00 AM
Subject: [Bioperl-l] setting a strand in Bio::Graphics


> Hello,
> I've tried all the arrows suggested in
> http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
> but I can't figure out how to tell in the options of $panel->add_track the
> strand of the feature I'm adding.
> I'm drawing DNA elements from a local DB, and I have a field "strand" which
> can be + or -.
> Please help!
> Emanuele


From cjfields at illinois.edu  Mon Sep 28 00:34:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 27 Sep 2009 23:34:01 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 5 released
Message-ID: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>

All,

The last alpha for the 1.6.1 release is out and should be propagating  
around CPAN now.  This should be a quick one (it has a few last-minute  
bug fixes for some problems that popped up on CPAN RT and fixes one  
mistake I made in the last alpha).

You can currently get it here (.tar.gz only for now):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz

The final 1.6.1 release should drop in the next day or two.

chris


From adsj at novozymes.com  Mon Sep 28 03:51:15 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Mon, 28 Sep 2009 09:51:15 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
Message-ID: <87hbunv764.fsf@topper.koldfront.dk>

  Hi.


I am wondering whether this is a buglet or just a case of "Don't do
that":

If I set a very long /label on a feature and output the sequence in EMBL
format, the qualifier value gets wrapped, but not quoted.

When BioPerl reads such a file, an exception is thrown.

I probably shouldn't be setting very long labels... But oughtn't BioPerl
throw an exception when a too long label is set, or automatically quote
the value when it is long enough to be wrapped, or know how to read a
wrapped yet unquoted value?

I will be happy to try and provide a patch for whichever solution is
preferred.

Here is an example script:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use IO::String;

  use Bio::Seq;
  use Bio::SeqFeature::Generic;
  use Bio::SeqIO;

  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";

  my $seq=Bio::Seq->new(-seq=>'ATG');
  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
  $seq->add_SeqFeature($feature);

  my $out_string=out($seq);
  print $out_string;

  my $fh=IO::String->new($out_string);
  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
  my $in_seq=$in->next_seq;

  print "Done\n";

  sub out {
      my ($seq)=@_;

      my $string='';
      my $fh=IO::String->new($string);
      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
      $out->write_seq($seq);

      return $string;
  }

Which gives this output when run:

  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  FT                   youthink
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //

  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: Can't see new qualifier in: youthink
  from:
  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  youthink

  STACK: Error::throw
  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
  -----------------------------------------------------------

If I change the value to include "-quotes ("simulating" that embl.pm
quotes the value), BioPerl can read the EMBL string it produces fine:

  -----------------------------------------------------------
  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl 
  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
  FT                   ntyouthink""
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //
  Done
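
One of the options listed above is to quote the value automatically when it is long enough to be wrapped. A minimal sketch of that idea follows. It is not BioPerl's actual embl.pm code: format_qualifier is a hypothetical helper, and the 59-character width is an assumption taken from the wrap point visible in the output above (80 columns minus the 21-character FT prefix).

```perl
use strict;
use warnings;

# Assumed layout: an EMBL "FT" qualifier line is "FT" plus 19 spaces,
# leaving 59 columns for the qualifier text itself.
my $QUAL_WIDTH = 59;
my $FT_PREFIX  = 'FT' . ( ' ' x 19 );

# Hypothetical helper, not part of BioPerl: format one qualifier and
# quote the value whenever the text is too long to fit on one line.
sub format_qualifier {
    my ( $tag, $value ) = @_;
    my $text = "/$tag=$value";
    if ( length($text) > $QUAL_WIDTH && $value !~ /^".*"$/s ) {
        $text = qq{/$tag="$value"};
    }
    # Break the text into fixed-width chunks, one per FT line.
    my @chunks = $text =~ /(.{1,$QUAL_WIDTH})/gs;
    return map { $FT_PREFIX . $_ } @chunks;
}

print "$_\n" for format_qualifier(
    label => 'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink' );
print "$_\n" for format_qualifier( label => 'short' );
```

Short values are left unquoted, so only values that would otherwise produce an unreadable continuation line are changed.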


  Best regards,

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From paola_bisignano at yahoo.it  Mon Sep 28 06:00:07 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Mon, 28 Sep 2009 10:00:07 +0000 (GMT)
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
Message-ID: <504748.72296.qm@web25704.mail.ukl.yahoo.com>

Hi dear friends,

I used Bio::AlignIO to parse the msf file and, as you suggested, the
column_from_residue_number method to obtain the alignment position of the
residues of interest (those in contact with my ligand). Now I have to
check the residue itself: I want to extract the residue type. I ask my
question using the residue number from the PDB, and I want the script to
return the residue as well. So if I want to know the position of Ala21,
I do:

my $alnio = Bio::AlignIO->new( -file => "my file.msf" );
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21 );

This returns the position (e.g. 5), but I want to check whether position
5 of the alignment holds an A (for Ala). I looked in the documentation,
but I couldn't find anything for that.

Thank you all for the help you have given and will give me,

best regards,

paola


From David.Messina at sbc.su.se  Mon Sep 28 07:28:27 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 13:28:27 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
References: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com>

Hi Paola,

> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
>
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
>
> my $col = $aln->column_from_residue_number( $s1->id, 21 );


  # extract sequences and check values for the alignment column $col
  foreach my $seq ($aln->each_seq) {
      my $res = $seq->subseq($col, $col);
      if ($res eq 'A') {
          # do something
      }
  }


Please try the above code. I haven't tested it, but I think it will do what
you want.
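
For anyone curious about what the residue-to-column mapping involves, here is a minimal pure-Perl sketch of the idea. It is not BioPerl's implementation; column_of_residue and residue_at_column are hypothetical helpers, and the two-row alignment is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map a residue number to a 1-based alignment column
# by walking the gapped row and counting only non-gap characters.
# $first_residue is the number of the first residue in the row
# (e.g. 9 for a row named "Sequence/9-273").
sub column_of_residue {
    my ( $gapped, $first_residue, $resnum ) = @_;
    my $current = $first_residue - 1;
    for my $col ( 1 .. length($gapped) ) {
        my $char = substr( $gapped, $col - 1, 1 );
        next if $char =~ /[-.]/;    # gaps do not consume a residue number
        $current++;
        return $col if $current == $resnum;
    }
    return;    # residue not present in this row
}

# Hypothetical helper: residue (or gap character) at a 1-based column.
sub residue_at_column {
    my ( $gapped, $col ) = @_;
    return substr( $gapped, $col - 1, 1 );
}

# Made-up alignment rows; ids, start numbers, and sequences are assumptions.
my %rows = (
    seqA => { start => 19, seq => 'DKA-MER' },
    seqB => { start => 6,  seq => 'DEWEVPR' },
);

my $col = column_of_residue( $rows{seqA}{seq}, $rows{seqA}{start}, 21 );
for my $id ( sort keys %rows ) {
    my $res = residue_at_column( $rows{$id}{seq}, $col );
    print "$id column $col: $res\n";
}
```

Checking the returned character against the expected one-letter code (here 'A' for Ala21 in seqA) is then a plain string comparison.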

Best,
Dave

PS - I found that code in the documentation for Bio::Align::AlignI. Right
now there is an effort to improve the BioPerl documentation, and it would be
helpful if you could let us know where you looked for the answer to your
question so we can try to make it easier to find.

Did you look in Bio::AlignIO? Did you also look anywhere else?

Thanks for your help!


From David.Messina at sbc.su.se  Mon Sep 28 08:05:58 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 14:05:58 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <678730.88068.qm@web25708.mail.ukl.yahoo.com>
References: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com> 
	<678730.88068.qm@web25708.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280505l2c5f02b7k8387d5dfd3643575@mail.gmail.com>

On Mon, Sep 28, 2009 at 13:56, Paola Bisignano <paola_bisignano at yahoo.it> wrote:

> Yes, I had a look at
> http://doc.bioperl.org/releases/bioperl-1.0/Bio/AlignIO.html
>
> but I didn't find your suggestion


> Thanks,
> I'll try it in a while.
> Sorry, I did not search in AlignI.


No problem, Paola -- thanks for letting us know.

Dave


From maj at fortinbras.us  Mon Sep 28 10:32:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 10:32:39 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
	<6CF05E74FEAE45679CDEDF48B7E15856@NewLife>
	<2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
Message-ID: <A45CF4D6E34B405B86E5F2DF651B8964@NewLife>

Now that's what I call user-friendly.
  ----- Original Message ----- 
  From: Emanuele Osimo 
  To: Mark A. Jensen 
  Sent: Monday, September 28, 2009 10:28 AM
  Subject: Re: [Bioperl-l] setting a strand in Bio::Graphics


  Hello everyone,
  thank you, I found what I needed. You have to add                           

  -strand_arrow => 1

  in $panel->add_track, and 

  -strand        => +/-1,

  in $feature = Bio::SeqFeature::Generic->new options.

  Thanks
  Emanuele
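
  Since the database field holds "+" or "-" while -strand takes +1 or -1, a small conversion step is needed. The sketch below shows one way to do it. strand_to_int and the row's column names are hypothetical, and the two hashes only illustrate the arguments that could be passed to Bio::SeqFeature::Generic->new and $panel->add_track; no BioPerl module is loaded here.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a database "strand" field ('+' or '-')
# into the numeric value (+1 or -1) used by the -strand option.
sub strand_to_int {
    my ($field) = @_;
    return  1 if defined $field && $field eq '+';
    return -1 if defined $field && $field eq '-';
    return  0;    # unknown or unstranded
}

# Made-up database row; the column names are assumptions.
my %row = ( name => 'element1', start => 100, end => 250, strand => '-' );

# Arguments as they could be passed to Bio::SeqFeature::Generic->new(...)
my %feature_args = (
    '-display_name' => $row{name},
    '-start'        => $row{start},
    '-end'          => $row{end},
    '-strand'       => strand_to_int( $row{strand} ),
);

# Arguments as they could be passed to $panel->add_track(...)
my %track_args = (
    '-glyph'        => 'generic',
    '-strand_arrow' => 1,
);

print "strand for $row{name}: " . $feature_args{'-strand'} . "\n";
```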


  On Mon, Sep 28, 2009 at 02:54, Mark A. Jensen <maj at fortinbras.us> wrote:

    Emos- Without the code, I can only guess, but you might not be providing
    the options correctly. Have a look at
    http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
    for something that may help.
    MAJ
    ----- Original Message ----- From: "Emanuele Osimo" <e.osimo at gmail.com>
    To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
    Sent: Sunday, September 27, 2009 5:00 AM
    Subject: [Bioperl-l] setting a strand in Bio::Graphics


      Hello,
      I've tried all the arrows suggested in
      http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
      but I can't figure out how to tell in the options of $panel->add_track the
      strand of the feature I'm adding.
      I'm drawing DNA elements from a local DB, and I have a field "strand" which
      can be + or -.
      Please help!
      Emanuele



From paolo.pavan at gmail.com  Mon Sep 28 11:51:52 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 28 Sep 2009 17:51:52 +0200
Subject: [Bioperl-l] BioPerl object deep copy
Message-ID: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>

Hi all,
I would like to have just a programming hint: is there a way in
bioperl (or just in perl) to get a deep copy or a clone of an object?
That is, I get a new object with all the fields copied one by one.

At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?

Thank you,
Paolo


From s.denaxas at gmail.com  Mon Sep 28 11:56:09 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Mon, 28 Sep 2009 16:56:09 +0100
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <bba689ec0909280856q3fa3c8b1pf5b5dd48bc493eb4@mail.gmail.com>

Hi Paolo,

You can use Clone [1]. Blindly cloning blessed objects though is not a
good idea so make sure you know what each one instantiates.

Spiros

[1] http://perldoc.net/Clone.pm

On Mon, Sep 28, 2009 at 4:51 PM, Paolo Pavan <paolo.pavan at gmail.com> wrote:
> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
>
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
>
> Thank you,
> Paolo


From maj at fortinbras.us  Mon Sep 28 12:05:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 12:05:42 -0400
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <5A61641A14AE4D80A495047A56659894@NewLife>

For some relatively careful examples of cloning code, 
you can look at the source for 
Bio::Tree::TreeFunctionsI::clone
and 
Bio::Restriction::Enzyme::clone (not clone_depr)
MAJ

----- Original Message ----- 
From: "Paolo Pavan" <paolo.pavan at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 28, 2009 11:51 AM
Subject: [Bioperl-l] BioPerl object deep copy


> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
> 
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
> 
> Thank you,
> Paolo


From cjfields at illinois.edu  Mon Sep 28 12:29:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 11:29:14 -0500
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <5A61641A14AE4D80A495047A56659894@NewLife>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
	<5A61641A14AE4D80A495047A56659894@NewLife>
Message-ID: <05BB0DB4-6017-40A1-92B2-6F441CCACDC6@illinois.edu>

As Spiros points out, Clone works in almost all cases and is very fast  
(XS-based I think).  IIRC the only time it borks out is if there is a  
code ref, as with Bio::Tree::Tree, but if it doesn't work you should  
get an error indicating the problem.
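
For a "just in perl" route that needs no extra install, the core Storable module offers dclone. This is a different module from Clone, shown here only as a minimal sketch; My::Seq is a made-up class standing in for a real BioPerl object, which may hold more than simple data.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A plain blessed hash standing in for a BioPerl object
# (hypothetical class name and fields).
my $orig = bless {
    id       => 'seq1',
    seq      => 'ATGC',
    features => [ { start => 1, end => 3 } ],
}, 'My::Seq';

my $copy = dclone($orig);

# The copy is independent: changing nested data leaves the original alone.
$copy->{features}[0]{end} = 4;
print "original end: $orig->{features}[0]{end}\n";
print "copy end:     $copy->{features}[0]{end}\n";
print "copy class:   ", ref($copy), "\n";

# By default Storable refuses to serialize code references, so guard the
# call when an object might contain one.
my $with_code = bless { callback => sub { 1 } }, 'My::Seq';
my $clone = eval { dclone($with_code) };
print "clone failed: ", ( defined $clone ? 'no' : 'yes' ), "\n";
```

Wrapping the call in eval means an object holding a code ref gives an undefined result to check for, rather than ending the script.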

chris

On Sep 28, 2009, at 11:05 AM, Mark A. Jensen wrote:

> For some relatively careful examples of cloning code, you can look  
> at the source for Bio::Tree::TreeFunctionsI::clone
> and Bio::Restriction::Enzyme::clone (not clone_depr)
> MAJ
>
> ----- Original Message ----- From: "Paolo Pavan" <paolo.pavan at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 28, 2009 11:51 AM
> Subject: [Bioperl-l] BioPerl object deep copy
>
>
>> Hi all,
>> I would like to have just a programming hint, there is a way in
>> bioperl (or just in perl) to get an deep copy or a clone of an  
>> object?
>> That is, I get a new object with all the fields copied one by one.
>> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant  
>> object?
>> Thank you,
>> Paolo


From cjfields at illinois.edu  Mon Sep 28 13:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:00:09 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 6 released (?!?)
In-Reply-To: <20090928063013.GB1081@kunpuu.plessy.org>
References: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>
	<20090928063013.GB1081@kunpuu.plessy.org>
Message-ID: <CFD37E37-2B74-402F-BA0F-898A1642FFE8@illinois.edu>

Charles (and everyone else),

This bug was a bit sneaky.  The tests were skipped on pretty much every  
system because they required both DB_File and BerkeleyDB (i.e. unless  
both were installed, the tests were skipped).  I committed a fix  
for it; unfortunately that means I need to set up another alpha for  
testing, so...

The final final alpha has just been uploaded to CPAN and is now  
available here:

http://bioperl.org/DIST/RC/BioPerl-1.6.0_6.tar.gz

The final 1.6.1 release should still be in the next day or two, just  
awaiting some test reports via CPAN...

chris

On Sep 28, 2009, at 1:30 AM, Charles Plessy wrote:

> On Sun, Sep 27, 2009 at 11:34:01PM -0500, Chris Fields wrote:
>>
>> http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz
>>
>
> Hi Chris,
>
> I have the following errors when building bioperl with perl 5.10.1  
> on Debian:
>
> Test Summary Report
> -------------------
> t/LocalDB/Registry.t                       (Wstat: 2304 Tests: 13 Failed: 1)
>  Failed test:  13
>  Non-zero exit status: 9
>  Parse errors: Bad plan.  You planned 14 tests but ran 13.
> t/RemoteDB/EUtilities.t                    (Wstat: 256 Tests: 309 Failed: 1)
>  Failed test:  309
>  Non-zero exit status: 1
> t/Tools/Run/RemoteBlast.t                  (Wstat: 65280 Tests: 13 Failed: 0)
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 16 tests but ran 13.
> Files=329, Tests=20766, 434 wallclock secs ( 2.64 usr  0.51 sys + 100.55 cusr  6.24 csys = 109.94 CPU)
> Result: FAIL
>
>
> t/Align/AlignStats.t ......................... ok
> t/Align/AlignUtil.t .......................... ok
> t/Align/SimpleAlign.t ........................ ok
> t/Align/TreeBuild.t .......................... ok
> t/Align/Utilities.t .......................... ok
> t/AlignIO/AlignIO.t .......................... ok
> t/AlignIO/arp.t .............................. ok
> t/AlignIO/bl2seq.t ........................... ok
> t/AlignIO/clustalw.t ......................... ok
> t/AlignIO/emboss.t ........................... ok
> t/AlignIO/fasta.t ............................ ok
> t/AlignIO/largemultifasta.t .................. ok
> t/AlignIO/maf.t .............................. ok
> t/AlignIO/mase.t ............................. ok
> t/AlignIO/mega.t ............................. ok
> t/AlignIO/meme.t ............................. ok
> t/AlignIO/metafasta.t ........................ ok
> t/AlignIO/msf.t .............................. ok
> t/AlignIO/nexus.t ............................ ok
> t/AlignIO/pfam.t ............................. ok
> t/AlignIO/phylip.t ........................... ok
> t/AlignIO/po.t ............................... ok
> t/AlignIO/prodom.t ........................... ok
> t/AlignIO/psi.t .............................. ok
> t/AlignIO/selex.t ............................ ok
> t/AlignIO/stockholm.t ........................ ok
> t/AlignIO/xmfa.t ............................. ok
> t/Alphabet.t ................................. ok
> t/Annotation/Annotation.t .................... ok
> t/Annotation/AnnotationAdaptor.t ............. ok
> t/Assembly/Assembly.t ........................ ok
> t/Assembly/ContigSpectrum.t .................. ok
> t/Biblio/Biblio.t ............................ ok
> t/Biblio/References.t ........................ ok
> t/Biblio/biofetch.t .......................... ok
> t/Biblio/eutils.t ............................ ok
> t/ClusterIO/ClusterIO.t ...................... ok
> t/ClusterIO/SequenceFamily.t ................. ok
> t/ClusterIO/unigene.t ........................ ok
> t/Coordinate/CoordinateGraph.t ............... ok
> t/Coordinate/CoordinateMapper.t .............. ok
> t/Coordinate/GeneCoordinateMapper.t .......... ok
> t/LiveSeq/Chain.t ............................ ok
> t/LiveSeq/LiveSeq.t .......................... ok
> t/LiveSeq/Mutation.t ......................... ok
> t/LiveSeq/Mutator.t .......................... ok
> t/LocalDB/BioDBGFF.t ......................... ok
> t/LocalDB/BlastIndex.t ....................... ok
> t/LocalDB/DBFasta.t .......................... ok
> t/LocalDB/DBQual.t ........................... ok
> t/LocalDB/Flat.t ............................. ok
> t/LocalDB/Index.t ............................ ok
> t/LocalDB/Registry.t ......................... 1/14
> --------------------- WARNING ---------------------
> MSG:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
> STACK: Bio::SeqIO::fasta::next_seq Bio/SeqIO/fasta.pm:127
> STACK: Bio::DB::Flat::BDB::get_Seq_by_id Bio/DB/Flat/BDB.pm:143
> STACK: Bio::DB::Failover::get_Seq_by_id Bio/DB/Failover.pm:122
> STACK: t/LocalDB/Registry.t:69
> -----------------------------------------------------------
>
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: No sequence retrieved by database Bio::DB::Flat::BDB::fasta
> ---------------------------------------------------
>
> #   Failed test at t/LocalDB/Registry.t line 70.
> Can't call method "seq" on an undefined value at t/LocalDB/ 
> Registry.t line 71, <GEN17> line 1.
> # Looks like you planned 14 tests but ran 13.
> # Looks like you failed 1 test of 13 run.
> # Looks like your test exited with 9 just after 13.
> t/LocalDB/Registry.t ......................... Dubious, test  
> returned 9 (wstat 2304, 0x900)
> Failed 2/14 subtests
> t/LocalDB/SeqFeature.t ....................... ok
> t/LocalDB/transfac_pro.t ..................... ok
> t/Map/Cyto.t ................................. ok
> t/Map/Linkage.t .............................. ok
> t/Map/Map.t .................................. ok
> t/Map/MapIO.t ................................ ok
> t/Map/MicrosatelliteMarker.t ................. ok
> t/Map/Physical.t ............................. ok
> t/Matrix/IO/masta.t .......................... ok
> t/Matrix/IO/psm.t ............................ ok
> t/Matrix/InstanceSite.t ...................... ok
> t/Matrix/Matrix.t ............................ ok
> t/Matrix/ProtMatrix.t ........................ ok
> t/Matrix/ProtPsm.t ........................... ok
> t/Matrix/SiteMatrix.t ........................ ok
> t/Ontology/GOterm.t .......................... ok
> t/Ontology/GraphAdaptor.t .................... ok
> t/Ontology/IO/go.t ........................... ok
> t/Ontology/IO/interpro.t ..................... ok
> t/Ontology/IO/obo.t .......................... ok
> t/Ontology/Ontology.t ........................ ok
> t/Ontology/OntologyEngine.t .................. ok
> t/Ontology/OntologyStore.t ................... ok
> t/Ontology/Relationship.t .................... ok
> t/Ontology/RelationshipType.t ................ ok
> t/Ontology/Term.t ............................ ok
> t/Perl.t ..................................... ok
> t/Phenotype/Correlate.t ...................... ok
> t/Phenotype/MeSH.t ........................... ok
> t/Phenotype/Measure.t ........................ ok
> t/Phenotype/MiniMIMentry.t ................... ok
> t/Phenotype/OMIMentry.t ...................... ok
> t/Phenotype/OMIMentryAllelicVariant.t ........ ok
> t/Phenotype/OMIMparser.t ..................... ok
> t/Phenotype/Phenotype.t ...................... ok
> t/PodSyntax.t ................................ ok
> t/PopGen/Coalescent.t ........................ ok
> t/PopGen/HtSNP.t ............................. ok
> t/PopGen/MK.t ................................ ok
> t/PopGen/PopGen.t ............................ ok
> t/PopGen/PopGenSims.t ........................ ok
> t/PopGen/TagHaplotype.t ...................... ok
> t/RemoteDB/BioFetch.t ........................ ok
> t/RemoteDB/CUTG.t ............................ ok
> t/RemoteDB/EMBL.t ............................ ok
> t/RemoteDB/EUtilities.t ...................... 309/309
> #   Failed test 'EPost to EFetch'
> #   at t/RemoteDB/EUtilities.t line 159.
> #          got: '0'
> #     expected: '5'
> # Looks like you failed 1 test of 309.
> t/RemoteDB/EUtilities.t ...................... Dubious, test  
> returned 1 (wstat 256, 0x100)
> Failed 1/309 subtests
> t/RemoteDB/EntrezGene.t ...................... ok
> t/RemoteDB/GenBank.t ......................... ok
> t/RemoteDB/GenPept.t ......................... ok
> t/RemoteDB/HIV/HIV.t ......................... ok
> t/RemoteDB/HIV/HIVAnnotProcessor.t ........... ok
> t/RemoteDB/HIV/HIVQuery.t .................... 22/41 Use of  
> uninitialized value $rest[0] in join or string at (eval 68) line 15.
> t/RemoteDB/HIV/HIVQuery.t .................... ok
> t/RemoteDB/HIV/HIVQueryHelper.t .............. ok
> t/RemoteDB/MeSH.t ............................ ok
> t/RemoteDB/Query/GenBank.t ................... ok
> t/RemoteDB/RefSeq.t .......................... ok
> t/RemoteDB/SeqHound.t ........................ ok
> t/RemoteDB/SeqRead_fail.t .................... ok
> t/RemoteDB/SeqVersion.t ...................... ok
> t/RemoteDB/SwissProt.t ....................... ok
> t/RemoteDB/Taxonomy.t ........................ ok
> t/Restriction/Analysis-refac.t ............... ok
> t/Restriction/Analysis.t ..................... ok
> t/Restriction/Gel.t .......................... ok
> t/Restriction/IO.t ........................... ok
> t/Root/Exception.t ........................... ok
> t/Root/RootI.t ............................... ok
> t/Root/RootIO.t .............................. ok
> t/Root/Storable.t ............................ ok
> t/Root/Tempfile.t ............................ ok
> t/Root/Utilities.t ........................... ok
> t/SearchDist.t ............................... skipped: The optional  
> module Bio::Ext::Align (or dependencies thereof) was not installed
> t/SearchIO/CigarString.t ..................... ok
> t/SearchIO/SearchIO.t ........................ ok
> t/SearchIO/SimilarityPair.t .................. ok
> t/SearchIO/Tiling.t .......................... ok
> t/SearchIO/Writer/GbrowseGFF.t ............... ok
> t/SearchIO/Writer/HSPTableWriter.t ........... ok
> t/SearchIO/Writer/HTMLWriter.t ............... ok
> t/SearchIO/Writer/HitTableWriter.t ........... ok
> t/SearchIO/blast.t ........................... ok
> t/SearchIO/blast_pull.t ...................... ok
> t/SearchIO/blasttable.t ...................... ok
> t/SearchIO/blastxml.t ........................ ok
> t/SearchIO/cross_match.t ..................... ok
> t/SearchIO/erpin.t ........................... ok
> t/SearchIO/exonerate.t ....................... ok
> t/SearchIO/fasta.t ........................... ok
> t/SearchIO/gmap_f9.t ......................... ok
> t/SearchIO/hmmer.t ........................... ok
> t/SearchIO/hmmer_pull.t ...................... ok
> t/SearchIO/infernal.t ........................ ok
> t/SearchIO/megablast.t ....................... ok
> t/SearchIO/psl.t ............................. ok
> t/SearchIO/rnamotif.t ........................ ok
> t/SearchIO/sim4.t ............................ ok
> t/SearchIO/waba.t ............................ ok
> t/SearchIO/wise.t ............................ ok
> t/Seq/DBLink.t ............................... ok
> t/Seq/EncodedSeq.t ........................... ok
> t/Seq/LargeLocatableSeq.t .................... ok
> t/Seq/LargePSeq.t ............................ ok
> t/Seq/LocatableSeq.t ......................... ok
> t/Seq/MetaSeq.t .............................. ok
> t/Seq/PrimaryQual.t .......................... ok
> t/Seq/PrimarySeq.t ........................... ok
> t/Seq/PrimedSeq.t ............................ ok
> t/Seq/Quality.t .............................. ok
> t/Seq/Seq.t .................................. ok
> t/Seq/WithQuality.t .......................... ok
> t/SeqEvolution.t ............................. ok
> t/SeqFeature/FeatureIO.t ..................... ok
> t/SeqFeature/Location.t ...................... ok
> t/SeqFeature/LocationFactory.t ............... ok
> t/SeqFeature/Primer.t ........................ ok
> t/SeqFeature/Range.t ......................... ok
> t/SeqFeature/RangeI.t ........................ ok
> t/SeqFeature/SeqAnalysisParser.t ............. ok
> t/SeqFeature/SeqFeatAnnotated.t .............. ok
> t/SeqFeature/SeqFeatCollection.t ............. ok
> t/SeqFeature/SeqFeature.t .................... ok
> t/SeqFeature/SeqFeaturePrimer.t .............. ok
> t/SeqFeature/Unflattener.t ................... ok
> t/SeqFeature/Unflattener2.t .................. ok
> t/SeqIO.t .................................... ok
> t/SeqIO/Handler.t ............................ ok
> t/SeqIO/MultiFile.t .......................... ok
> t/SeqIO/Multiple_fasta.t ..................... ok
> t/SeqIO/SeqBuilder.t ......................... ok
> t/SeqIO/Splicedseq.t ......................... ok
> t/SeqIO/abi.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/ace.t ................................ ok
> t/SeqIO/agave.t .............................. ok
> t/SeqIO/alf.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/asciitree.t .......................... ok
> t/SeqIO/bsml.t ............................... ok
> t/SeqIO/bsml_sax.t ........................... ok
> t/SeqIO/chadoxml.t ........................... ok
> t/SeqIO/chaos.t .............................. ok
> t/SeqIO/chaosxml.t ........................... ok
> t/SeqIO/ctf.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/embl.t ............................... ok
> t/SeqIO/entrezgene.t ......................... ok
> t/SeqIO/excel.t .............................. ok
> t/SeqIO/exp.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/fasta.t .............................. ok
> t/SeqIO/fastq.t .............................. ok
> t/SeqIO/flybase_chadoxml.t ................... ok
> t/SeqIO/game.t ............................... ok
> t/SeqIO/gcg.t ................................ ok
> t/SeqIO/genbank.t ............................ ok
> t/SeqIO/interpro.t ........................... ok
> t/SeqIO/kegg.t ............................... ok
> t/SeqIO/largefasta.t ......................... ok
> t/SeqIO/lasergene.t .......................... ok
> t/SeqIO/locuslink.t .......................... ok
> t/SeqIO/metafasta.t .......................... ok
> t/SeqIO/phd.t ................................ ok
> t/SeqIO/pir.t ................................ ok
> t/SeqIO/pln.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/qual.t ............................... ok
> t/SeqIO/raw.t ................................ ok
> t/SeqIO/scf.t ................................ ok
> t/SeqIO/strider.t ............................ ok
> t/SeqIO/swiss.t .............................. ok
> t/SeqIO/tab.t ................................ ok
> t/SeqIO/table.t .............................. ok
> t/SeqIO/tigr.t ............................... ok
> t/SeqIO/tigrxml.t ............................ ok
> t/SeqIO/tinyseq.t ............................ ok
> t/SeqIO/ztr.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqTools/Backtranslate.t ................... ok
> t/SeqTools/CodonTable.t ...................... ok
> t/SeqTools/ECnumber.t ........................ ok
> t/SeqTools/GuessSeqFormat.t .................. ok
> t/SeqTools/OddCodes.t ........................ ok
> t/SeqTools/SeqPattern.t ...................... ok
> t/SeqTools/SeqStats.t ........................ ok
> t/SeqTools/SeqUtils.t ........................ ok
> t/SeqTools/SeqWords.t ........................ ok
> t/Species.t .................................. ok
> t/Structure/IO.t ............................. ok
> t/Structure/Structure.t ...................... ok
> t/Symbol.t ................................... ok
> t/TaxonTree.t ................................ skipped: All tests  
> are being skipped, probably because the module(s) being tested here  
> are now deprecated
> t/Tools/Alignment/Consed.t ................... ok
> t/Tools/Analysis/DNA/ESEfinder.t ............. ok
> t/Tools/Analysis/Protein/Domcut.t ............ ok
> t/Tools/Analysis/Protein/ELM.t ............... ok
> t/Tools/Analysis/Protein/GOR4.t .............. ok
> t/Tools/Analysis/Protein/HNN.t ............... ok
> t/Tools/Analysis/Protein/Mitoprot.t .......... ok
> t/Tools/Analysis/Protein/NetPhos.t ........... ok
> t/Tools/Analysis/Protein/Scansite.t .......... ok
> t/Tools/Analysis/Protein/Sopma.t ............. ok
> t/Tools/EMBOSS/Palindrome.t .................. ok
> t/Tools/EUtilities/EUtilParameters.t ......... ok
> t/Tools/EUtilities/egquery.t ................. ok
> t/Tools/EUtilities/einfo.t ................... ok
> t/Tools/EUtilities/elink_acheck.t ............ ok
> t/Tools/EUtilities/elink_lcheck.t ............ ok
> t/Tools/EUtilities/elink_llinks.t ............ ok
> t/Tools/EUtilities/elink_ncheck.t ............ ok
> t/Tools/EUtilities/elink_neighbor.t .......... ok
> t/Tools/EUtilities/elink_neighbor_history.t .. ok
> t/Tools/EUtilities/elink_scores.t ............ ok
> t/Tools/EUtilities/epost.t ................... ok
> t/Tools/EUtilities/esearch.t ................. ok
> t/Tools/EUtilities/espell.t .................. ok
> t/Tools/EUtilities/esummary.t ................ ok
> t/Tools/Est2Genome.t ......................... ok
> t/Tools/FootPrinter.t ........................ ok
> t/Tools/GFF.t ................................ ok
> t/Tools/Geneid.t ............................. ok
> t/Tools/Genewise.t ........................... ok
> t/Tools/Genomewise.t ......................... ok
> t/Tools/Genpred.t ............................ ok
> t/Tools/Hmmer.t .............................. ok
> t/Tools/IUPAC.t .............................. ok
> t/Tools/Lucy.t ............................... ok
> t/Tools/Match.t .............................. ok
> t/Tools/Phylo/Gerp.t ......................... ok
> t/Tools/Phylo/Molphy.t ....................... ok
> t/Tools/Phylo/PAML.t ......................... ok
> t/Tools/Phylo/Phylip/ProtDist.t .............. ok
> t/Tools/Primer3.t ............................ ok
> t/Tools/Promoterwise.t ....................... ok
> t/Tools/Pseudowise.t ......................... ok
> t/Tools/QRNA.t ............................... ok
> t/Tools/RandDistFunctions.t .................. ok
> t/Tools/RepeatMasker.t ....................... ok
> t/Tools/Run/RemoteBlast.t .................... 13/16
> --------------------- WARNING ---------------------
> MSG: Server failed to return any data
> ---------------------------------------------------
> # Looks like you planned 16 tests but ran 13.
> t/Tools/Run/RemoteBlast.t .................... Dubious, test  
> returned 255 (wstat 65280, 0xff00)
> Failed 3/16 subtests
> t/Tools/Run/RemoteBlast_rpsblast.t ........... ok
> t/Tools/Run/StandAloneBlast.t ................ ok
> t/Tools/Run/WrapperBase.t .................... ok
> t/Tools/Seg.t ................................ ok
> t/Tools/SiRNA.t .............................. ok
> t/Tools/Sigcleave.t .......................... ok
> t/Tools/Signalp.t ............................ ok
> t/Tools/Signalp/ExtendedSignalp.t ............ ok
> t/Tools/Sim4.t ............................... ok
> t/Tools/Spidey/Spidey.t ...................... ok
> t/Tools/TandemRepeatsFinder.t ................ ok
> t/Tools/TargetP.t ............................ ok
> t/Tools/Tmhmm.t .............................. ok
> t/Tools/ePCR.t ............................... ok
> t/Tools/pICalculator.t ....................... ok
> t/Tools/rnamotif.t ........................... skipped: All tests  
> are being skipped, probably because the module(s) being tested here  
> are now deprecated
> t/Tools/tRNAscanSE.t ......................... ok
> t/Tree/Compatible.t .......................... ok
> t/Tree/Node.t ................................ ok
> t/Tree/PhyloNetwork/Factory.t ................ ok
> t/Tree/PhyloNetwork/GraphViz.t ............... ok
> t/Tree/PhyloNetwork/MuVector.t ............... ok
> t/Tree/PhyloNetwork/PhyloNetwork.t ........... ok
> t/Tree/PhyloNetwork/RandomFactory.t .......... skipped: The optional  
> module Math::Random (or dependencies thereof) was not installed
> t/Tree/PhyloNetwork/TreeFactory.t ............ ok
> t/Tree/RandomTreeFactory.t ................... ok
> t/Tree/Tree.t ................................ ok
> t/Tree/TreeIO.t .............................. ok
> t/Tree/TreeIO/lintree.t ...................... ok
> t/Tree/TreeIO/newick.t ....................... ok
> t/Tree/TreeIO/nexus.t ........................ ok
> t/Tree/TreeIO/nhx.t .......................... ok
> t/Tree/TreeIO/phyloxml.t ..................... ok
> t/Tree/TreeIO/svggraph.t ..................... 1/4 Use of  
> uninitialized value $txt[0] in join or string at /usr/share/perl5/ 
> SVG/Element.pm line 1195, <GEN0> line 1.
> t/Tree/TreeIO/svggraph.t ..................... ok
> t/Tree/TreeIO/tabtree.t ...................... ok
> t/Tree/TreeStatistics.t ...................... ok
> t/Variation/AAChange.t ....................... ok
> t/Variation/AAReverseMutate.t ................ ok
> t/Variation/Allele.t ......................... ok
> t/Variation/DNAMutation.t .................... ok
> t/Variation/RNAChange.t ...................... ok
> t/Variation/SNP.t ............................ ok
> t/Variation/SeqDiff.t ........................ ok
> t/Variation/Variation_IO.t ................... ok
>
>
> Cheers,
>
> -- 
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Mon Sep 28 13:28:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:28:29 -0500
Subject: [Bioperl-l] Policy on tests
Message-ID: <00F31D5F-D531-4A5E-A11E-F7B67283FA8B@illinois.edu>

All,

This is a bit of a rant related to the spate of alphas I've had to  
release over the last few weeks.  We have a fairly loose policy on  
testing; for instance, most CPAN installations should not run network-  
or DB-dependent tests or other developer-dependent tests by default  
(POD formatting, for instance), or tests for a 'recommended' module  
should be skipped.  That is currently in place.

However, I do think all tests that are skipped need to be reported  
somehow, and optional tests should NOT skip if they are off by default  
and are specifically requested.  This is not currently the behavior.   
So far I have been bitten twice by this.

The last instance was with the latest alpha, where OBDA-related tests  
were mistakenly skipped when BerkeleyDB wasn't installed.  As it turns  
out, BerkeleyDB isn't required, but (according to standard test  
harness output) t/LocalDB/Registry.t passed w/o reporting any problems  
when in reality it silently skipped over 90% of the tests (this is  
only seen with --verbose output).  In the past I have also run into  
network tests silently passing when the remote server was not in  
service anymore (IIRC this was with XEMBL modules, which are no longer  
in the distribution).

 From my point of view, speaking as both a user and developer, I need  
to know when these tests are skipped or fail.  In instances where I  
specifically request a set of tests to be run and a test fails, they  
*should* fail quite loudly and catastrophically (i.e. if there is a  
server-side issue, a problem with DB connection, etc).  They shouldn't  
be skipped over if a problem arises; otherwise, if it is a legitimate  
bug, it silently passes.  If it is something I haven't set up correctly (a  
DB connection, for instance) I would like to know about it via the  
test failures.
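
As a rough sketch of the behaviour I am arguing for (the test_action
helper and its return values are hypothetical, not part of BioPerl or
Test::More), the decision could look like this:

```perl
use strict;
use warnings;

# Decide what a test file should do, given whether the optional tests were
# explicitly requested and whether their prerequisite (server, DB, module)
# is available. Returns an action plus a reason for the harness to report.
sub test_action {
    my ($requested, $available) = @_;
    return ('skip', 'optional tests not requested') unless $requested;
    return ('run',  'prerequisite available')       if $available;
    return ('fail', 'tests requested but prerequisite unavailable');
}

# Show the decision for every combination of the two inputs.
for my $case ([0, 0], [0, 1], [1, 1], [1, 0]) {
    my ($action, $why) = test_action(@$case);
    printf "requested=%d available=%d -> %-4s (%s)\n", @$case, $action, $why;
}
```

In a Test::More script, 'skip' would map onto plan skip_all => $why,
whose reason the harness prints even without --verbose, and 'fail' onto
BAIL_OUT($why) or a failing ok().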

Am I the only one thinking along these lines?  Should we come up with  
a simple policy on how we're setting up and running tests?

chris


From paola.bisignano at gmail.com  Mon Sep 28 05:50:52 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Mon, 28 Sep 2009 11:50:52 +0200
Subject: [Bioperl-l] parsing msf file
Message-ID: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>

Hi dear friends,

I used Bio::AlignIO to parse an msf file and, as you suggested, the
column_from_residue_number method to obtain the alignment position of
the residues of interest (those in contact with my ligand). Now I have
to check the residue itself: I want to extract the residue type. I ask
my question using the residue number from the PDB file, and I want the
script to return the residue as well. So if I want to know the position
of Ala21, I do:

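use Bio::AlignIO;

my $alnio = Bio::AlignIO->new( -file => "my file.msf", -format => "msf" );
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

# alignment column that holds residue 21 of the first sequence
my $col = $aln->column_from_residue_number( $s1->id, 21 );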

and it will return the position (e.g. 5), but I want to check whether
position 5 of the alignment holds an A (for Ala). I looked in the
documentation, but I couldn't find anything for that.


Thank you all for the help you have given and will give me,

best regards,

paola


From maj at fortinbras.us  Mon Sep 28 21:25:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 21:25:33 -0400
Subject: [Bioperl-l] parsing msf file
In-Reply-To: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
References: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
Message-ID: <1C5008B41F6D4BFF9F5160633D284442@NewLife>

Hi Paola--
I think you're saying you want to see if A is present in other 
sequences in the alignment at alignment column 5. Here's
where you use location_from_column, which is a method 
on the sequence objects themselves. The idea is to do 

# $col is obtained as in your script...
for my $seq ($aln->each_seq) {
  if ( $seq->subseq( $seq->location_from_column($col) ) eq 'A') {
     print "si!";
  }
  else {
     print "no!";
  }
}
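
If all you need is the character in that column, another option is to
index the gapped string directly. The sketch below is plain Perl on toy
rows (the helper names column_of_residue and residue_at_column and the
three sequences are invented for illustration; this is not BioPerl's
implementation). With BioPerl you would take $col from
column_from_residue_number and the gapped string from $seq->seq:

```perl
use strict;
use warnings;

# 1-based alignment column of residue $resnum in a gapped row whose first
# residue is numbered $start; returns undef if the row lacks that residue.
sub column_of_residue {
    my ($gapped, $start, $resnum) = @_;
    my $n = $start - 1;
    for my $i (0 .. length($gapped) - 1) {
        next if substr($gapped, $i, 1) =~ /[-.]/;   # gaps carry no residue
        return $i + 1 if ++$n == $resnum;
    }
    return;
}

# Character (residue or gap) at a 1-based alignment column.
sub residue_at_column {
    my ($gapped, $col) = @_;
    return substr($gapped, $col - 1, 1);
}

# Toy alignment: [ id, number of first residue, gapped sequence ]
my @rows = (
    [ 'seq1', 17, 'AC-DEA' ],
    [ 'seq2',  1, 'ACWD-A' ],
    [ 'seq3',  1, 'ACWDEG' ],
);

# Column of residue 21 in the first row, then the character every row has there.
my $col = column_of_residue($rows[0][2], $rows[0][1], 21);
for my $row (@rows) {
    my ($id, undef, $gapped) = @$row;
    my $res = residue_at_column($gapped, $col);
    print "$id column $col: $res ", ($res eq 'A' ? 'si!' : 'no!'), "\n";
}
```

A gap in that column comes back as '-' rather than a residue, so the
comparison with 'A' simply fails for that row.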

You might find the code at 
http://www.bioperl.org/wiki/Site_entropy_in_an_alignment
helpful since it uses these principles. 
Mark


From martin.senger at gmail.com  Tue Sep 29 01:31:41 2009
From: martin.senger at gmail.com (Martin Senger)
Date: Tue, 29 Sep 2009 13:31:41 +0800
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>

> Martin has stopped working on Biblio as far as I know and php-hacking is
> not my favorite pastime.


That's true. I can still revive the code - but the question is (always has
been) where to host the server (of the web services providing the biblio
data). It was hosted, and maintained, at EBI. But I do not know if EBI is
still maintaining it, or willing to do so.

Cheers,
Martin

-- 
Martin Senger
email: martin.senger at gmail.com,m.senger at cgiar.org
skype: martinsenger


From jason at bioperl.org  Tue Sep 29 01:43:30 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 28 Sep 2009 22:43:30 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
References: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
Message-ID: <E9D67D22-ABC3-4199-B8D9-E0675197B9BF@bioperl.org>

hah! I actually meant the Biblio.php MediaWiki plugin by Martin Jambon  
-- but hey the Bio::Biblio db stuff should be discussed too.

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep 29 14:01:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 13:01:29 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
Message-ID: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>

We are pleased to announce the availability of BioPerl 1.6.1, the  
latest release of BioPerl core code.  You can grab it here:

Via CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Via the BioPerl website:

http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip

The PPM for Windows should also finally be available this week,  
ActivePerl problems permitting (we will post more information when it  
becomes available).

Tons of bug fixes and changes have been incorporated into this  
release.  For a more complete change list please see the 'Changes'  
file included with the distribution.

A few highlights:

* FASTQ parsing and interconversion of the three FASTQ variants  
(Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
* Significant refactoring of Bio::Restriction methods
* Complete refactoring of Bio::Search-related tiling code, including  
HOWTO documentation
* GBrowse-related fixes
    - berkeleydb database now autoindexes wig files and locks correctly
    - add Pg, SQLite, and faster BerkeleyDB implementations
* Infernal 1.0 output is now parsed
* New SearchIO-based parser for gmap -f9 output
* BLAST XML parsing essentially complete
* Installation via CPANPLUS should now work
* For those using Strawberry Perl on Windows, the latest build is  
expected to pass all tests.
* 'raw' sequence format now parsed by line or optionally as a single  
sequence
* SCF parsing/writing now round-trips
* Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
* Bio::Tools::SeqPattern now has a backtranslate() method
* Bio::Tree::Statistics now has methods to calculate Fitch-based  
score, internal trait values, statratio(), sum of leaf distances  
[heikki]
* scripts
    - update to bp_seqfeature_load for SQLite [lstein]
    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
    - fastam9_to_table - fix for MPI output [jason]
    - gccalc - total stats [jason]
    - einfo  - simple script to find up-to-date NCBI database list,  
list field and link values for a specific database
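
For anyone unsure how the three FASTQ variants differ, here is a minimal
sketch of the score arithmetic, assuming the usual conventions (Sanger:
Phred scores, ASCII offset 33; Illumina 1.3+: Phred scores, offset 64;
Solexa: Solexa scores, offset 64). The helper names are made up for
illustration, and this is not the Bio::SeqIO::fastq code:

```perl
use strict;
use warnings;

# ASCII offset used by each variant's quality string.
my %offset = (sanger => 33, illumina => 64, solexa => 64);

# Convert a Solexa-scaled score to the Phred scale.
sub solexa_to_phred {
    my ($q) = @_;
    return 10 * log(10 ** ($q / 10) + 1) / log(10);
}

# Decode a quality string of the given variant into integer Phred scores.
sub decode_phred {
    my ($qual, $variant) = @_;
    die "unknown variant '$variant'" unless exists $offset{$variant};
    my @q = map { ord($_) - $offset{$variant} } split //, $qual;
    @q = map { solexa_to_phred($_) } @q if $variant eq 'solexa';
    return map { int($_ + 0.5) } @q;   # round to the nearest integer
}

# Encode Phred scores as a Sanger (Phred+33) quality string.
sub encode_sanger {
    return join '', map { chr($_ + 33) } @_;
}

# Re-encode an Illumina and a Solexa quality string as Sanger.
print encode_sanger(decode_phred('hhhh', 'illumina')), "\n";
print encode_sanger(decode_phred('hh@;', 'solexa')), "\n";
```

The Solexa scale diverges from Phred only at low scores, which is why
the conversion matters mostly for the poor-quality tail of a read.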

We will shortly release updates for BioPerl-db, BioPerl-run, and  
BioPerl-network.  Enjoy!

chris


From rmb32 at cornell.edu  Tue Sep 29 14:22:03 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 29 Sep 2009 11:22:03 -0700
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC2504B.1000707@cornell.edu>

Chris Fields wrote:
 > We are pleased to announce the availability of BioPerl 1.6.1, the
 > latest release of BioPerl core code.

Hooray!  You rock Chris!  Tremendous thanks for your many hours of work 
to get it out the door!

Rob


From scott at scottcain.net  Tue Sep 29 14:23:08 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 29 Sep 2009 14:23:08 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <536f21b00909291123h12a7c941tdd3edb7fadbb1149@mail.gmail.com>

Chris,

Congratulations and thanks so much for the time and effort that went into this.

Scott



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From hlapp at gmx.net  Tue Sep 29 15:56:58 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 15:56:58 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>

Congrats from me too - awesome Chris, and thanks on behalf of the  
project!

	-hilmar

On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:

> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
>
> Via CPAN:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>
> Via the BioPerl website:
>
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when  
> it becomes available).
>
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
>
> A few highlights:
>
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>   - berkeleydb database now autoindexes wig files and locks correctly
>   - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>   - update to bp_seqfeature_load for SQLite [lstein]
>   - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>   - fastam9_to_table - fix for MPI output [jason]
>   - gccalc - total stats [jason]
>   - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
>
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep 29 16:38:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 15:38:04 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
Message-ID: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>

No prob.  Next up is db, run, and network!

chris

On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:

> Congrats from me too - awesome Chris, and thanks on behalf of the  
> project!
>
> 	-hilmar
>
> On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
>
>> We are pleased to announce the availability of BioPerl 1.6.1, the  
>> latest release of BioPerl core code.  You can grab it here:
>>
>> Via CPAN:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>>
>> Via the BioPerl website:
>>
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
>> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>>
>> The PPM for Windows should also finally be available this week,  
>> ActivePerl problems permitting (we will post more information when  
>> it becomes available).
>>
>> Tons of bug fixes and changes have been incorporated into this  
>> release.  For a more complete change list please see the 'Changes'  
>> file included with the distribution.
>>
>> A few highlights:
>>
>> * FASTQ parsing and interconversion of the three FASTQ variants  
>> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
>> * Significant refactoring of Bio::Restriction methods
>> * Complete refactoring of Bio::Search-related tiling code,  
>> including HOWTO documentation
>> * GBrowse-related fixes
>>  - berkeleydb database now autoindexes wig files and locks correctly
>>  - add Pg, SQLite, and faster BerkeleyDB implementations
>> * Infernal 1.0 output is now parsed
>> * New SearchIO-based parser for gmap -f9 output
>> * BLAST XML parsing essentially complete
>> * Installation via CPANPLUS should now work
>> * For those using Strawberry Perl on Windows, the latest build is  
>> expected to pass all tests.
>> * 'raw' sequence format now parsed by line or optionally as a  
>> single sequence
>> * SCF parsing/writing now round-trips
>> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
>> * Bio::Tools::SeqPattern now has a backtranslate() method
>> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
>> score, internal trait values, statratio(), sum of leaf distances  
>> [heikki]
>> * scripts
>>  - update to bp_seqfeature_load for SQLite [lstein]
>>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>>  - fastam9_to_table - fix for MPI output [jason]
>>  - gccalc - total stats [jason]
>>  - einfo  - simple script to find up-to-date NCBI database list,  
>> list field and link values for a specific database
>>
>> We will shortly release updates for BioPerl-db, BioPerl-run, and  
>> BioPerl-network.  Enjoy!
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Tue Sep 29 17:11:33 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 16:11:33 -0500
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
Message-ID: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>

Right now all our subdistributions have a naming scheme like
BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the prior
method of naming doesn't quite match the syntax of other distributions:

Win32-Console
Win32-EventLog
MooseX-Aliases
etc etc

I'll go ahead and make these changes unless there is rabid dissent ;>

chris


From bix at sendu.me.uk  Tue Sep 29 15:06:17 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Sep 2009 20:06:17 +0100
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC25AA9.5080803@sendu.me.uk>

Chris Fields wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> release of BioPerl core code.  You can grab it here:

Great job Chris. *cheers*


From hlapp at gmx.net  Tue Sep 29 17:49:07 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 17:49:07 -0400
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
In-Reply-To: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
References: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
Message-ID: <6C5CBE0E-EDA5-4079-BFD7-DEE95E8C749C@gmx.net>

Fine with me :-)

	-hilmar

On Sep 29, 2009, at 5:11 PM, Chris Fields wrote:

> Right now all our subdistributions have a naming scheme like
> BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
> BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the
> prior method of naming doesn't quite match the syntax of other
> distributions:
>
> Win32-Console
> Win32-EventLog
> MooseX-Aliases
> etc etc
>
> I'll go ahead and make these changes unless there is rabid dissent ;>
>
> chris
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From maj at fortinbras.us  Tue Sep 29 18:33:23 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 29 Sep 2009 18:33:23 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <5D35D16E84554CA687C6CA4758806884@NewLife>

Gnarly, dude.
MAJ
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 29, 2009 2:01 PM
Subject: [Bioperl-l] BioPerl 1.6.1 released


> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
> 
> Via CPAN:
> 
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> 
> Via the BioPerl website:
> 
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> 
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when it  
> becomes available).
> 
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
> 
> A few highlights:
> 
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>    - berkeleydb database now autoindexes wig files and locks correctly
>    - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>    - update to bp_seqfeature_load for SQLite [lstein]
>    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>    - fastam9_to_table - fix for MPI output [jason]
>    - gccalc - total stats [jason]
>    - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
> 
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From cjfields at illinois.edu  Tue Sep 29 23:54:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 22:54:04 -0500
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <87hbunv764.fsf@topper.koldfront.dk>
References: <87hbunv764.fsf@topper.koldfront.dk>
Message-ID: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>

Adam,

Not sure, but this could be a case of 'both'.  Labels that are quoted
and those that aren't are currently distinguished via a global hash lookup
(%FTQUAL_NO_QUOTE) due to the way the parser works; there is some
logic behind this, I just can't quite recall at the moment why it is
this way.  You could set a hash key for the label in cases where it
isn't quoted, that should work.  You can also test out the
Bio::SeqIO::embldriver version (-format => 'embldriver').

If the above doesn't work out it's worth filing a bug for this
behavior, though I'm not sure how easy it will be to fix.
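
A minimal sketch of the second suggestion, assuming the EMBL text is already held in a string. The helper name `read_embl_record` is made up for illustration; only the `-format => 'embldriver'` argument comes from the message above, and BioPerl is loaded lazily so the helper can be defined without it.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one EMBL record from a string with the parser
# named in $format ('embl' or 'embldriver').
sub read_embl_record {
    my ($embl_text, $format) = @_;
    require Bio::SeqIO;    # loaded lazily; needs BioPerl installed

    # Core Perl can open a read-only filehandle on an in-memory string.
    open my $fh, '<', \$embl_text or die "Cannot open string: $!";
    my $in = Bio::SeqIO->new(-fh => $fh, -format => $format);
    return $in->next_seq;
}

# Usage with the string produced by out() in the example script:
# my $seq = read_embl_record($out_string, 'embldriver');
```

Running the same string through both 'embl' and 'embldriver' is a quick way to see which parser accepts the wrapped, unquoted qualifier.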

chris

On Sep 28, 2009, at 2:51 AM, Adam Sjøgren wrote:

>  Hi.
>
>
> I am wondering whether this is a buglet or just a case of "Don't do
> that":
>
> If I set a very long /label on a feature and output the sequence in EMBL
> format, the qualifier value gets wrapped, but not quoted.
>
> When BioPerl reads such a file, an exception is thrown.
>
> I probably shouldn't be setting very long labels... But oughtn't BioPerl
> throw an exception when a too long label is set, or automatically quote
> the value when it is long enough to be wrapped, or know how to read a
> wrapped yet unquoted value?
>
> I will be happy to try and provide a patch for whichever solution is
> preferred.
>
> Here is an example script:
>
>  #!/usr/bin/perl
>
>  use strict;
>  use warnings;
>
>  use IO::String;
>
>  use Bio::Seq;
>  use Bio::SeqFeature::Generic;
>  use Bio::SeqIO;
>
>  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";
>
>  my $seq=Bio::Seq->new(-seq=>'ATG');
>  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
>  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
>  $seq->add_SeqFeature($feature);
>
>  my $out_string=out($seq);
>  print $out_string;
>
>  my $fh=IO::String->new($out_string);
>  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>  my $in_seq=$in->next_seq;
>
>  print "Done\n";
>
>  sub out {
>      my ($seq)=@_;
>
>      my $string='';
>      my $fh=IO::String->new($string);
>      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>      $out->write_seq($seq);
>
>      return $string;
>  }
>
> Which gives this output when run:
>
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  FT                   youthink
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>
>  ------------- EXCEPTION: Bio::Root::Exception -------------
>  MSG: Can't see new qualifier in: youthink
>  from:
>  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  youthink
>
>  STACK: Error::throw
>  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
>  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
>  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
>  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
>  -----------------------------------------------------------
>
> If I change the value to include "-quotes ("simulating" that embl.pm
> quotes the value), BioPerl can read the EMBL string it produces fine:
>
>  -----------------------------------------------------------
>  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
>  FT                   ntyouthink""
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>  Done
>
>
>  Best regards,
>
>     Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From adsj at novozymes.com  Wed Sep 30 05:50:36 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Wed, 30 Sep 2009 11:50:36 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu> (Chris
	Fields's message of "Tue, 29 Sep 2009 22:54:04 -0500")
References: <87hbunv764.fsf@topper.koldfront.dk>
	<86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>
Message-ID: <87vdj0g3rn.fsf@topper.koldfront.dk>

On Tue, 29 Sep 2009 22:54:04 -0500, Chris wrote:

> Not sure, but this could be a case of 'both'. Labels that are quoted
> and aren't are currently distinguished via a global hash lookup
> (%FTQUAL_NO_QUOTE) due to the way the parser works; there is some
> logic behind this, just can't quite recall at the moment why it is
> this way.

Yes, I saw that there is a number of qualifiers that aren't quoted
automatically.

The very easy "fix" for me would be to simply remove "label" from
%FTQUAL_NO_QUOTE, but I'm not really sure what the reason for not
quoting all values is, so I was hesitant to just propose that.

> You could set a hash key for the label in cases where it isn't quoted,
> that should work. You can also test out the Bio::SeqIO::embldriver
> version (-format => 'embldriver').

Ah, embldriver reads the wrapped qualifier when it isn't quoted without
problem. Nice! I hadn't noticed embldriver.

I wonder which one is correct in this case?

And should I switch to using embldriver to read, or does it make sense
to try and concoct a patch that changes embl?


  Thanks for the feedback!

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From sidd.basu at gmail.com  Wed Sep 30 13:24:53 2009
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 30 Sep 2009 12:24:53 -0500
Subject: [Bioperl-l]  Re: BioPerl 1.6.1 released
In-Reply-To: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
	<5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
Message-ID: <4ac39469.0637560a.5a63.1fee@mx.google.com>

Congrats chris,  really appreciate your time and effort.

-siddhartha

On Tue, 29 Sep 2009, Chris Fields wrote:

> No prob.  Next up is db, run, and network!
>
> chris
>
> On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:
>
> > Congrats from me too - awesome Chris, and thanks on behalf of the project!
> >
> > 	-hilmar
> >
> > On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
> >
> >> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> >> release of BioPerl core code.  You can grab it here:
> >>
> >> Via CPAN:
> >>
> >> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> >>
> >> Via the BioPerl website:
> >>
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> >> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> >>
> >> The PPM for Windows should also finally be available this week, 
> >> ActivePerl problems permitting (we will post more information when it 
> >> becomes available).
> >>
> >> Tons of bug fixes and changes have been incorporated into this release.  
> >> For a more complete change list please see the 'Changes' file included 
> >> with the distribution.
> >>
> >> A few highlights:
> >>
> >> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger, 
> >> Illumina, Solexa) now works (a concerted OBF effort!)
> >> * Significant refactoring of Bio::Restriction methods
> >> * Complete refactoring of Bio::Search-related tiling code, including 
> >> HOWTO documentation
> >> * GBrowse-related fixes
> >>  - berkeleydb database now autoindexes wig files and locks correctly
> >>  - add Pg, SQLite, and faster BerkeleyDB implementations
> >> * Infernal 1.0 output is now parsed
> >> * New SearchIO-based parser for gmap -f9 output
> >> * BLAST XML parsing essentially complete
> >> * Installation via CPANPLUS should now work
> >> * For those using Strawberry Perl on Windows, the latest build is 
> >> expected to pass all tests.
> >> * 'raw' sequence format now parsed by line or optionally as a single 
> >> sequence
> >> * SCF parsing/writing now round-trips
> >> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> >> * Bio::Tools::SeqPattern now has a backtranslate() method
> >> * Bio::Tree::Statistics now has methods to calculate Fitch-based score, 
> >> internal trait values, statratio(), sum of leaf distances [heikki]
> >> * scripts
> >>  - update to bp_seqfeature_load for SQLite [lstein]
> >>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
> >>  - fastam9_to_table - fix for MPI output [jason]
> >>  - gccalc - total stats [jason]
> >>  - einfo  - simple script to find up-to-date NCBI database list, list 
> >> field and link values for a specific database
> >>
> >> We will shortly release updates for BioPerl-db, BioPerl-run, and 
> >> BioPerl-network.  Enjoy!
> >>
> >> chris
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From antonina.iagovitina at epfl.ch  Wed Sep 30 14:09:17 2009
From: antonina.iagovitina at epfl.ch (Antonina Iagovitina)
Date: Wed, 30 Sep 2009 20:09:17 +0200
Subject: [Bioperl-l] assistance with bioperl
Message-ID: <4AC39ECD.6060405@epfl.ch>

Here is the error message I get when I try to align a sequence to an existing
alignment. Please help.
I am using Windows XP and ClustalW version 1.83.

 MSG:
 ERROR: Could not open sequence file (-profile) 
 No. of seqs. read = -1. No alignment!
 
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::Tools::Run::Alignment::Clustalw;

my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
$aln = $str->next_aln();
$str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
$seq = $str1->next_seq();
$aln = $factory->profile_align($aln,$seq);
end


From maj at fortinbras.us  Wed Sep 30 14:24:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 30 Sep 2009 14:24:59 -0400
Subject: [Bioperl-l] assistance with bioperl
In-Reply-To: <4AC39ECD.6060405@epfl.ch>
References: <4AC39ECD.6060405@epfl.ch>
Message-ID: <569E83EDBFE044638187504E5E7A8C11@NewLife>

Antonina--
Try the following:
Make sure that cysprot1a.msf and cysprot1b.fa are in the current directory, 
or use full path names for the files. 
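
One way to act on this, sketched with core Perl modules only: resolve each input name to an absolute path and fail early with a message that shows where the script is actually running. The helper name `require_input_file` is hypothetical.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Hypothetical helper: return the absolute path of an input file, or die
# with a message that names the directory the script is running in.
sub require_input_file {
    my ($name) = @_;
    my $path = File::Spec->rel2abs($name);
    -r $path
        or die "Cannot read '$name' (looked for $path; cwd is " . getcwd() . ")\n";
    return $path;
}

# Usage with the file names from the question:
# my $msf   = require_input_file('cysprot1a.msf');
# my $fasta = require_input_file('cysprot1b.fa');
```

Passing the returned absolute paths to Bio::AlignIO and Bio::SeqIO removes any dependence on the current directory.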
MAJ
----- Original Message ----- 
From: "Antonina Iagovitina" <antonina.iagovitina at epfl.ch>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 30, 2009 2:09 PM
Subject: [Bioperl-l] assistance with bioperl


> Here is the error message I get when I try to align a sequence to an existing
> alignment. Please help
> I am using Windows XP and Clustalw version1.83
> 
> MSG:
> ERROR: Could not open sequence file (-profile) 
> No. of seqs. read = -1. No alignment!
> 
> use Bio::AlignIO;
> use Bio::SeqIO;
> use Bio::Seq;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
> $aln = $str->next_aln();
> $str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
> $seq = $str1->next_seq();
> $aln = $factory->profile_align($aln,$seq);
> end
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From me at miguel.weapps.com  Wed Sep 30 18:16:38 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 17:16:38 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}

I would like to remove the symbols because they are not compatible with
MrBayes v3.1.2 ("Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From jason at bioperl.org  Wed Sep 30 18:40:33 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 30 Sep 2009 15:40:33 -0700
Subject: [Bioperl-l] Nexus symbols
In-Reply-To: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
References: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
Message-ID: <483DB389-9332-4573-84C7-3AF09AC2BACA@bioperl.org>

-show_symbols => 0

If you use bp_sreformat.pl script specify --special="mrbayes" it will  
set both of the endblock and show_symbols values to 0.


perldoc Bio::AlignIO::nexus

        new

         Title   : new
         Usage   : $alignio = Bio::AlignIO->new(-format => 'nexus', -file => 'filename');
         Function: returns a new Bio::AlignIO object to handle clustalw files
         Returns : Bio::AlignIO::clustalw object
         Args    : -verbose => verbosity setting (-1,0,1,2)
                   -file    => name of file to read in or with ">" - writeout
                   -fh      => alternative to -file param - provide a filehandle
                               to read from/write to
                   -format  => type of Alignment Format to process or produce

                   Customization of nexus flavor output

                   -show_symbols => print the symbols="ATGC" in the data definition
                                    (MrBayes does not like this)
                                    boolean [default is 1]
                   -show_endblock => print an 'endblock;' at the end of the data
                                    (MrBayes does not like this)
                                    boolean [default is 1]
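
Applied to the snippet from the question, the two flags might be used as in the sketch below. The sub names are made up for illustration; the `-show_symbols` and `-show_endblock` arguments are the ones listed in the documentation above, and BioPerl is loaded lazily inside the wrapper.

```perl
use strict;
use warnings;

# Constructor arguments for a MrBayes-friendly NEXUS writer.
sub mrbayes_nexus_args {
    my ($path) = @_;
    return (
        -file          => ">$path",
        -format        => 'nexus',
        -show_symbols  => 0,    # omit symbols="..." from the format line
        -show_endblock => 0,    # omit the trailing endblock;
    );
}

# Hypothetical wrapper: convert an aligned FASTA file to NEXUS.
sub fasta_to_mrbayes_nexus {
    my ($fasta_path, $nexus_path) = @_;
    require Bio::AlignIO;    # loaded lazily; needs BioPerl installed

    my $in  = Bio::AlignIO->new(-file => $fasta_path, -format => 'fasta');
    my $out = Bio::AlignIO->new(mrbayes_nexus_args($nexus_path));
    while (my $aln = $in->next_aln) {
        $out->write_aln($aln);
    }
    return;
}

# Usage: fasta_to_mrbayes_nexus("$outfile.aln.fasta", "$outfile.aln.nex");
```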

On Sep 30, 2009, at 3:16 PM, Luis M Rodriguez-R wrote:

> Dear all,
>
> Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
> from the "format" line in the Nexus output of Bio::AlignIO?
>
> My code (snippet) is:
>
> my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
> my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
> while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}
>
> And I would like to remove the symbols (is not compatible with  
> MrBayes v3.1.2: "Could not find parameter "symbols"").
>
> Also, it would be nice to be able to change the TITLE comment.
>
> Thanks all!
> Regards,
>
> Luis M. Rodriguez-R
> [http://bioinf.uniandes.edu.co/~miguel/]
> ---------------------------------
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Colombia
> [http://bioinf.uniandes.edu.co]
>
> + 57 1 3394949 ext 2619
> luisrodr at uniandes.edu.co
> me at miguel.weapps.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From me at miguel.weapps.com  Wed Sep 30 16:51:04 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 15:51:04 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <788222E4-FCCC-4D4D-880B-1F5156945DB8@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}

I would like to remove the symbols because they are not compatible with
MrBayes v3.1.2 ("Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From paola_bisignano at yahoo.it  Tue Sep  1 08:20:25 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 1 Sep 2009 12:20:25 +0000 (GMT)
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
Message-ID: <154614.75143.qm@web25706.mail.ukl.yahoo.com>

Hi, 

I'm trying to parse FASTA files that each contain a pair of aligned sequences, and I need to locate specific residues within the alignment. The residues of interest are in separate lists derived from parsing LIGPLOT output files, so I have to manipulate the sequence strings, but I don't know how to start; it seems complicated.
I used Bio::AlignIO to parse the FASTA file, so I can write the alignment out in MSF or ClustalW format.

Here is an example:
CLUSTAL W(1.81) multiple sequence alignment


Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
                       *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:


Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
                       ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*

Suppose I choose two residues, for example. How can I extract them from the alignment, starting from their positions in the PDB file?
I need to walk along my sequence to find them.

I don't know if this is clear, because I can't explain the question well in English. Are there any Italians here?
Could anyone help me?
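
One possible way to do the walk, as a plain-Perl sketch on the first alignment block. It assumes that the start value in the name "2pl0:A/6-268" is the PDB number of the first residue and that numbering is contiguous (no insertion codes or chain breaks); `residue_to_column` is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: given a gapped sequence, the residue number of its
# first residue, and a residue number, return the 0-based alignment column.
# Assumes residue numbers are contiguous (no PDB insertion codes or breaks).
sub residue_to_column {
    my ($gapped, $first_resnum, $resnum) = @_;
    my $current = $first_resnum - 1;
    for my $col (0 .. length($gapped) - 1) {
        next if substr($gapped, $col, 1) eq '-';    # gaps carry no residue number
        $current++;
        return $col if $current == $resnum;
    }
    return;    # residue not present in this sequence
}

# First alignment block from the example above.
my $query = 'DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE';
my $pdb   = 'DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ';

# Look up residue 40 of 2pl0 chain A and report what it aligns with.
my $col = residue_to_column($pdb, 6, 40);
if (defined $col) {
    printf "column %d: 2pl0 %s aligns with query %s\n",
        $col + 1, substr($pdb, $col, 1), substr($query, $col, 1);
}
```

When the alignment is already loaded with Bio::AlignIO, the column_from_residue_number method of Bio::SimpleAlign is probably the more direct route to the same column.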


From scott at scottcain.net  Tue Sep  1 09:21:25 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 1 Sep 2009 09:21:25 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
Message-ID: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>

Hello all,

I just wanted to send out a general announcement about a change that  
is coming for perl modules that are distributed with the gmod/chado  
package.  There are some modules, notably Class::DBI classes that are  
automatically generated, that are currently in the Chado namespace.   
This move has been requested by the CPAN maintainers.  So any  
Chado::*  modules will become Bio::Chado::*, except for the Class::DBI  
classes, which will become Bio::Chado::CDBI::*.

This will probably affect relatively few users, though ModWare in its  
current incarnation will need to be updated.
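
For code that refers to the old names, the rename could be scripted along these lines. This is only a sketch: `rename_chado_packages` is a hypothetical helper, and which classes count as Class::DBI classes is not stated here, so the caller has to supply that list.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite old Chado:: package names in a line of code.
# %$is_cdbi lists the Class::DBI class names (without the Chado:: prefix);
# which classes belong there is an assumption the caller must supply.
sub rename_chado_packages {
    my ($line, $is_cdbi) = @_;
    # The lookbehind skips names that are already prefixed (Bio::Chado::...).
    $line =~ s{(?<![\w:])Chado::(\w+)}
              { $is_cdbi->{$1} ? "Bio::Chado::CDBI::$1" : "Bio::Chado::$1" }ge;
    return $line;
}

# Usage (class names are illustrative only):
# print rename_chado_packages('use Chado::Feature;', { Feature => 1 }), "\n";
```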

Scott

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From biopython at maubp.freeserve.co.uk  Tue Sep  1 11:33:13 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Sep 2009 16:33:13 +0100
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
Message-ID: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>> The two conversions to solexa are still failing.  I'm not sure but I think
>> it's something fairly simple, but I can't work on it until Friday (got too
>> many other things on my plate ATM).  If I get stumped I'll post a message.
>
> ...
>
> This should narrow it down - the bug is in mapping PHRED
> scores (from either Sanger or Illumina 1.3+ files) to the
> Solexa encoding.
>
> Peter

Hi Chris,

I've just noticed BioPerl is treating invalid characters in the quality
string as a warning condition (not an error):
http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html

It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
(character "!" or "@" respectively) which is reasonable. For fastq-solexa
to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
used - a bug?
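
The offsets above can be checked with a small library-independent sketch. It is written in Python and uses neither BioPerl nor Biopython. The function names are made up for illustration, and the PHRED-to-Solexa formula is the commonly quoted one, not code taken from either project:

```python
import math

# ASCII offsets for the three FASTQ variants discussed in this thread.
OFFSETS = {"sanger": 33, "solexa": 64, "illumina": 64}

# Lowest score each variant can represent.
MIN_SCORE = {"sanger": 0, "solexa": -5, "illumina": 0}


def min_quality_char(variant):
    """Return the character that encodes the lowest score of a variant."""
    return chr(OFFSETS[variant] + MIN_SCORE[variant])


def phred_to_solexa(phred):
    """Map a PHRED score to the nearest Solexa score (hypothetical helper)."""
    if phred <= 0:
        # PHRED 0 has no finite Solexa equivalent, so clamp to the minimum.
        return MIN_SCORE["solexa"]
    solexa = 10 * math.log10(10 ** (phred / 10.0) - 1)
    return max(MIN_SCORE["solexa"], int(round(solexa)))
```

Under these assumptions an invalid character would be replaced by "!" (sanger), "@" (illumina) or ";" (solexa), which is the behaviour I was expecting for the solexa case.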

Also, in all these cases there is currently a spurious "data loss" warning:

$ ./bioperl_sanger2sanger.pl < error_qual_null.fastq

--------------------- WARNING ---------------------
MSG: Unknown symbol with ASCII value 0 outside of quality range,
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Data loss for sanger: following values exceed max 93

---------------------------------------------------
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYY!YYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_397_389
GGTTTGAGAAAGAGAAATGAGATAA
+
YYYYYYYYYWYYYYWWYYYWYWYWW
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYYYYYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_362_549
GGAAACAAAGTTTTTCTCAACATAG
+
YYYYYYYYYYYYYYYYYYWWWWYWY
@SLXA-B3_649_FC8437_R1_1_1_183_714
GTATTATTTAATGGCATACACTCAA
+
YYYYYYYYYYWYYYYWYWWUWWWQQ

Regards,

Peter


From jason at bioperl.org  Tue Sep  1 11:49:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 1 Sep 2009 08:49:00 -0700
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
In-Reply-To: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
References: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
Message-ID: <90DACEE3-BC71-4D82-A8FF-6441A720BC76@bioperl.org>

I think you might want to use the column_from_residue_number method  
that is part of Bio::SimpleAlign - it lets you get the column from an  
alignment based on the sequence residue, doing some math along the way  
to deal with gaps. That is the residue -> alignment direction.  If you  
are starting at the alignment and want to get the residue's position  
you will use the location_from_column on a particular sequence so

     # select somehow a sequence from the alignment, e.g.
     my $seq = $aln->get_seq_by_pos(1);
     #$loc is undef or Bio::LocationI object
     my $loc = $seq->location_from_column(5);
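
To make the gap arithmetic concrete, here is a rough stand-alone sketch in Python. It is not the BioPerl implementation; the function names only mirror the idea, and it assumes the residue numbering of the row is contiguous (no PDB insertion codes or missing residues):

```python
def column_from_residue_number(aligned, start, resnum, gap_chars="-."):
    """Return the 1-based alignment column holding residue `resnum`.

    `aligned` is one gapped row of the alignment and `start` is the
    residue number of its first non-gap character.
    """
    current = start - 1
    for column, char in enumerate(aligned, start=1):
        if char in gap_chars:
            continue  # a gap occupies a column but no residue number
        current += 1
        if current == resnum:
            return column
    raise ValueError("residue %d is not in this row" % resnum)


def residue_number_from_column(aligned, start, column, gap_chars="-."):
    """Return the residue number at a 1-based column, or None for a gap."""
    if aligned[column - 1] in gap_chars:
        return None
    gaps_before = sum(1 for c in aligned[:column - 1] if c in gap_chars)
    return start + (column - 1) - gaps_before


# First block of the 2pl0:A/6-268 row from the alignment in this thread.
row = "DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ"
```

With this row and start 6, residue 38 (the T just before the gap) falls in column 33 and residue 39 in column 35; column 34 is the gap and has no residue number.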

-jason

On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:

> Hi,
>
> I'm trying to parse fasta files, where I have couple of  
> alignments....I need to identify my residue in my alignment......I  
> have separate lists that derived from ligplot parsing files.. so I  
> have to manipulate string...but I don't now how to start..it seems  
> complicated..
> I used Bio::AlignIO to parse the fasta file, so I can have a parsed  
> file in msf or clustalW forma
>
> here an example:
> CLUSTAL W(1.81) multiple sequence alignment
>
>
> Sequence/9-273          
> DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT- 
> KVAVKSLKQGSMSPDAFLAEANLMKQ
>                        *:**: *  :.: .:**.**:***:  
> * :: :: .****:**:.:*. : ** ** :**:
>
>
> Sequence/9-273          
> IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP- 
> IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>                        ::* .**:* .* *:** :*****:*   
> *.*:*:*:  .  :::   ** **:**:..*
>
> I  choose two residue for example...how can I extract  
> them...starting from their position in the pdb file?
> I need to walk...to my sequence
>
> I don't know if it is clear because I cannot explain the question  
> correctly in english...are there any Italians?
> could anyone help me?
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  1 12:05:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 11:05:14 -0500
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
	<320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
Message-ID: <FB130819-94C6-419F-AD3D-BAEEDDE77737@illinois.edu>


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but  
>>> I think
>>> it's something fairly simple, but I can't work on it until Friday  
>>> (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a  
>>> message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the  
> quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq- 
> solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does  
> not get
> used - a bug?
>
> Also, in all these cases there is currently a spurious "data loss"  
> warning:
>
> $ ./bioperl_sanger2sanger.pl < error_qual_null.fastq
>
> --------------------- WARNING ---------------------
> MSG: Unknown symbol with ASCII value 0 outside of quality range,
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: Data loss for sanger: following values exceed max 93
>
> ---------------------------------------------------
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYY!YYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_397_389
> GGTTTGAGAAAGAGAAATGAGATAA
> +
> YYYYYYYYYWYYYYWWYYYWYWYWW
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYYYYYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_362_549
> GGAAACAAAGTTTTTCTCAACATAG
> +
> YYYYYYYYYYYYYYYYYYWWWWYWY
> @SLXA-B3_649_FC8437_R1_1_1_183_714
> GTATTATTTAATGGCATACACTCAA
> +
> YYYYYYYYYYWYYYYWYWWUWWWQQ
>
> Regards,
>
> Peter

Right, per off-list discussion this can be changed (I would rather it  
die there anyway).

chris


From marcelo011982 at gmail.com  Tue Sep  1 13:33:51 2009
From: marcelo011982 at gmail.com (Marcelo Iwata)
Date: Tue, 1 Sep 2009 14:33:51 -0300
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
Message-ID: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>

Hi

I ran blastn with these arguments:

../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt -e 0.00001 -o Out2Blast.txt -a 8

and I want a script that removes overlapping sequences from the results.
For example, if unigene A has hit->start and hit->end of 1 and 4, and
unigene B has 2 and 3, respectively, the script should remove the second one.
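
To show what I mean, here is a minimal sketch of the filtering in plain Python. It does not parse BLAST output; the (name, start, end) tuples and the function name are assumptions for illustration, and it only handles the case where one hit lies completely inside another, as in the A/B example:

```python
def remove_contained_hits(hits):
    """Drop hits whose range lies entirely inside another hit's range.

    `hits` is a list of (name, start, end) tuples with start <= end.
    """
    # Sort by start, then by decreasing end, so that a containing hit is
    # always visited before the hits it contains.
    ordered = sorted(hits, key=lambda h: (h[1], -h[2]))
    kept = []
    for name, start, end in ordered:
        if any(k_start <= start and end <= k_end for _, k_start, k_end in kept):
            continue  # fully covered by a hit we already kept
        kept.append((name, start, end))
    return kept
```

Hits that overlap only partially are both kept by this sketch.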

I want to know whether such a script already exists and, if not, whether
there is a library that handles this.

I know that Bio::DB::GFF has overlapping_features. But if something
exists that works directly with the BLAST format, that would be better for me.

thanks in advance


From cjfields at illinois.edu  Tue Sep  1 14:10:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 13:10:30 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
Message-ID: <7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>

Marcelo,

Do you mean tiling?  See:

http://www.bioperl.org/wiki/HOWTO:Tiling

chris

On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:

> Hi
>
> I've made a blastn with such arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001  
> -o
> Out2Blast.txt -a 8
>
> and i want a script that removes overlapped sequences from the  
> results..
> For example, if a unigene A has the hit->start  and hit-end as 1 and  
> 4, and
> the B is at 2 and 3, respectively, the script remove second one.
>
> I want to know if it already exist, and if not, is there a library  
> that
> works with such issue.
>
> I know that at Bio::DB::gff we have overlapping_features. But , if  
> something
> directly exist (works with blast format), is better for me.
>
> thanks in advance


From cain.cshl at gmail.com  Tue Sep  1 15:47:50 2009
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 1 Sep 2009 15:47:50 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
In-Reply-To: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
References: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
Message-ID: <0CA5287E-BE85-4E7F-8ED3-B453092FACB1@gmail.com>

Hi Don,

I just wanted to let you know that I also updated the code in  
GMODTools, but I don't have a simple way to test it; perhaps you  
should take a look at the cvs diff to make sure what I did makes sense.

Thanks,
Scott

On Sep 1, 2009, at 9:21 AM, Scott Cain wrote:

> Hello all,
>
> I just wanted to send out a general announcement about a change that  
> is coming for perl modules that are distributed with the gmod/chado  
> package.  There are some modules, notably Class::DBI classes that  
> are automatically generated, that are currently in the Chado  
> namespace.  This move has been requested by the CPAN maintainers.   
> So any Chado::*  modules will become Bio::Chado::*, except for the  
> Class::DBI classes, which will become Bio::Chado::CDBI::*.
>
> This will probably affect relatively few users, though ModWare in  
> its current incarnation will need to be updated.
>
> Scott
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>
>
>
>

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From maj at fortinbras.us  Wed Sep  2 00:19:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 00:19:30 -0400
Subject: [Bioperl-l] bioperl invades emacs
Message-ID: <56DB0DEEB22645DE94DE0E912A889409@NewLife>

Hi All, 

As part of the Documentation Project, I've written a full-fledged
minor mode for emacs, bioperl-mode. It allows 
the user to access BP pod while coding, using keyboard
shortcuts or menus. Pod pops up in a new view buffer,
which is itself active for quick pod searching. You can 
get the whole pod, pieces of pod, or even the pod headers
of individual methods. 

The best feature (IMHO) is the completion facility. This
not only saves typing, but allows browsing and follow-your-nose
programming (exactly the technique I used to make bioperl-mode,
thanks to the Extensible Self-Documenting Editor).

It's very easy to install, requires only one additional line 
in your .emacs file, and directly infects perl-mode 
(if you so choose) so it's available whenever you
open .pl or .pm files.

For details, screenshots, download and install info,
and soporific design details, see
http://www.bioperl.org/wiki/Emacs_bioperl-mode

Send me the bugs!
cheers, 
MAJ


From rmb32 at cornell.edu  Wed Sep  2 00:31:15 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 01 Sep 2009 21:31:15 -0700
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <4A9DF513.1020607@cornell.edu>

Wow.  Bravo!

Rob


From cjfields at illinois.edu  Wed Sep  2 00:31:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 23:31:46 -0500
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <2A49147F-17B4-42EB-A170-52DA009D7E1C@illinois.edu>

Very cool!  Thanks Mark!

chris

On Sep 1, 2009, at 11:19 PM, Mark A. Jensen wrote:

> Hi All,
>
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
>
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
>
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
>
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>
> Send me the bugs!
> cheers,
> MAJ


From Russell.Smithies at agresearch.co.nz  Wed Sep  2 01:01:34 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 2 Sep 2009 17:01:34 +1200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>

emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Wednesday, 2 September 2009 4:20 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] bioperl invades emacs
> 
> Hi All,
> 
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers,
> MAJ
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From maj at fortinbras.us  Wed Sep  2 08:28:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:28:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <4A9E2638.8020203@pasteur.fr>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<4A9E2638.8020203@pasteur.fr>
Message-ID: <AC0A7CC6F808466CB15D267CC86AEEE3@NewLife>

Hi Emmanuel-- I'll look into this and report back- thanks!
MAJ
----- Original Message ----- 
From: "Emmanuel Quevillon" <tuco at pasteur.fr>
To: "Mark A. Jensen" <maj at fortinbras.us>
Sent: Wednesday, September 02, 2009 4:00 AM
Subject: Re: [Bioperl-l] bioperl invades emacs


> Mark A. Jensen wrote:
>> Hi All, 
>> 
>> As part of the Documentation Project, I've written a full-
>> fledged minor mode for emacs, bioperl-mode. It allows 
>> the user to access BP pod while coding, using keyboard
>> shortcuts or menus. Pod pops up in a new view buffer,
>> which it itself active for quick pod searching. You can 
>> get the whole pod, pieces of pod, or even the pod headers
>> of individual methods. 
>> 
>> The best feature (IMHO) is the completion facility. This
>> not only saves typing, but allows browsing and follow-your-nose
>> programming (exactly the technique I used to make bioperl-mode,
>> thanks to the Extensible Self-Documenting Editor).
>> 
>> It's very easy to install, requires only one additional line 
>> in your .emacs file, and directly infects perl-mode 
>> (if you so choose) so its available whenever you
>> open .pl or .pm files.
>> 
>> For details, screenshots, download and install info,
>> and soporific design details, see
>> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>> 
>> Send me the bugs!
>> cheers, 
>> MAJ
> 
> Hi Mark,
> 
> Great great job.
> But I am using XEmacs and there is no .emacs file in my home
> directory. So is there a trick to make your bioperl-mode work
> under XEmacs?
> 
> Thanks for your help
> 
> Regards
> 
> Emmanuel
> -- 
> -------------------------
> Emmanuel Quevillon
> Biological Software and Databases Group
> Institut Pasteur
> +33 1 44 38 95 98
> tuco at_ pasteur dot fr
> -------------------------
> 
>


From maj at fortinbras.us  Wed Sep  2 08:07:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:07:14 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
Message-ID: <B9B317F95CA44F0C9335450D3FDDEC73@NewLife>

I only know one command in vi --- :q
MAJ
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'BioPerl List'" 
<bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 1:01 AM
Subject: RE: [Bioperl-l] bioperl invades emacs


emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Wednesday, 2 September 2009 4:20 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] bioperl invades emacs
>
> Hi All,
>
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
>
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
>
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
>
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>
> Send me the bugs!
> cheers,
> MAJ


From hlapp at gmx.net  Wed Sep  2 11:51:18 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 2 Sep 2009 11:51:18 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <73A8B147-7605-4E2E-98AF-F3B09AD6046F@gmx.net>

Very nice!! -hilmar

On Sep 2, 2009, at 12:19 AM, Mark A. Jensen wrote:

> Hi All,
>
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
>
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
>
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
>
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>
> Send me the bugs!
> cheers,
> MAJ

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Wed Sep  2 16:23:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 2 Sep 2009 15:23:01 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
	<7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>
	<1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
Message-ID: <E39D878B-A6F1-441A-A511-7CA0FF0D1319@illinois.edu>

Marcelo,

(Make sure to keep responses on the main list)

The new Tiling stuff is in bioperl-live (subversion code); it hasn't  
been released yet but should appear in BioPerl 1.6.1 (an alpha will be  
out this week).

chris

On Sep 2, 2009, at 3:20 PM, Marcelo Iwata wrote:

> Thanks, Chris.
> I searched CPAN for Bio::Search::Tiling, and it pointed me to the
> BioPerl core distribution:
> BioPerl-1.6.0.tar.gz
> at http://search.cpan.org/~cjfields/BioPerl-1.6.0/Bio/Search/BlastStatistics.pm
>
> I downloaded it and upgraded my BioPerl version, but I still cannot find
> MapTiling.pm.
>
> Could this be the result of some kind of error during the upgrade?
> Thanks.
>
>
> On Tue, Sep 1, 2009 at 3:10 PM, Chris Fields <cjfields at illinois.edu>  
> wrote:
> Marcelo,
>
> Do you mean tiling?  See:
>
> http://www.bioperl.org/wiki/HOWTO:Tiling
>
> chris
>
>
> On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:
>
> Hi
>
> I've made a blastn with such arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001  
> -o
> Out2Blast.txt -a 8
>
> and i want a script that removes overlapped sequences from the  
> results..
> For example, if a unigene A has the hit->start  and hit-end as 1 and  
> 4, and
> the B is at 2 and 3, respectively, the script remove second one.
>
> I want to know if it already exist, and if not, is there a library  
> that
> works with such issue.
>
> I know that at Bio::DB::gff we have overlapping_features. But , if  
> something
> directly exist (works with blast format), is better for me.
>
> thanks in advance
>
>


From maj at fortinbras.us  Wed Sep  2 21:04:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 21:04:06 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>

Thanks everyone for your comments so far, on and off-list. 
(You're a terrific audience. I also code for weddings and 
bar mitzvahs. Tip your servers.)
The howto page now has a "Known Issues" section, and
I will be working to eliminate those in the next couple of 
days. 

cheers Mark
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 12:19 AM
Subject: [Bioperl-l] bioperl invades emacs


> Hi All, 
> 
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows 
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can 
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods. 
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line 
> in your .emacs file, and directly infects perl-mode 
> (if you so choose) so its available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers, 
> MAJ
> 
>


From jessica.sun at gmail.com  Tue Sep  1 11:25:36 2009
From: jessica.sun at gmail.com (jsun529)
Date: Tue, 1 Sep 2009 08:25:36 -0700 (PDT)
Subject: [Bioperl-l]  covert CDS coordinates with Gene coordinates
Message-ID: <25242395.post@talk.nabble.com>


Dear all,
  I would like to know how to convert CDS coordinates to gene coordinates
using Bio::Coordinate::GeneMapper.
The documentation is not very clear, and a working example would help a lot with
using the objects returned by the BioPerl functions and getting the values out in a
readable format.

Thanks,

-- 
View this message in context: http://www.nabble.com/covert-CDS-coordinates-with-Gene-coordinates-tp25242395p25242395.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From pg4 at sanger.ac.uk  Wed Sep  2 19:35:07 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 00:35:07 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
Message-ID: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>


Hello Mark,

It sounds fantastic,

Unfortunately, I was unable to use it:

It does not find pod2text on Mac OS X, and it fails to find my BioPerl paths 
on Linux (probably due to a bug in the PERL5LIB parsing, but I am a Lisp 
novice so I could be wrong).

==  macosX ==

On my MacBook (Mac OS X 10.5, Emacs 22.3) it does not find pod2text:
GNU Emacs 22.3.1 (i386-apple-darwin9.6.0, X toolkit)

   - I installed your modules in my local-lisp and added the require line, 
and now Emacs fails with the error:

   File error: Searching for program, invalid argument, pod2text

   - I have pod2text in /usr/bin, which is in my $PATH (I use the Fink 
Emacs in non-window mode), but the same happens with Carbon Emacs.

==  debian etch with an old emacs 21 ==

GNU Emacs 21.4.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 
2007-06-19 on ninsei, modified by Debian

It loads OK, but when I ask for the pods

[pod] Namespace: Bio::

it does not autocomplete from there, and if I put the cursor over a 'use 
Bio::xxx' and select [BP Docs] 'view methods' or 'view pod', it says 'no 
match':

# [pod mth] Namespace: Bio::PrimarySeq [No match]

Reading bioperl-mode.el and bioperl-init.el, I see that the variable 
that stores the path to BioPerl contains no paths apart from the 
current directory:

# c-h v bioperl-module-path [ret] => bioperl-module-path's value is "."


== bug when parsing perl5lib? ==

Please correct me if I am wrong, but in bioperl-init.el the extraction of the 
BioPerl paths from PERL5LIB is not working for me on Linux.

While debugging bioperl-init.el:
# (setq pth (getenv "PERL5LIB"))
#  "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
# (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
# nil

No file is found because it is looking for all the paths 
concatenated together with a '/Bio' at the end:

   libpaht1:libpath2:libpath3/Bio

'concat' adds /Bio to the pth that is a string with all the 
PERL5LIB paths. Should this concat rather be applied to the splited perl5lib by ':' in unix or 
';' in windows and then tested for the existence of files?

For example, on Unix:

--- code --
(defun addbio (bio_path)
   "Append /Bio to BIO_PATH."
   (concat bio_path "/" "Bio"))

(mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
-- end code ---

This returns a list of t and nil values, one per PERL5LIB entry, showing 
which of the paths (BioPerl and Ensembl here) contain a Bio directory:
(t t nil t t t t t t nil nil nil ...)
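
The same per-directory check can be sketched in Python, used here only as a neutral illustration since bioperl-init.el itself is Emacs Lisp. The function name find_bioperl_roots is hypothetical and not part of bioperl-mode. The sketch assumes that PERL5LIB is a list of directories joined by the platform's path-list separator (os.pathsep, ':' on Unix and ';' on Windows) and that a usable root is any entry with a Bio subdirectory.

```python
import os


def find_bioperl_roots(perl5lib, sep=os.pathsep):
    """Return the PERL5LIB entries that contain a Bio/ subdirectory.

    Hypothetical helper: it splits the variable first and tests each
    directory on its own, instead of appending "/Bio" to the whole
    unsplit string.
    """
    roots = []
    for entry in perl5lib.split(sep):
        # Skip empty entries left by leading, trailing or doubled separators.
        if entry and os.path.isdir(os.path.join(entry, "Bio")):
            roots.append(entry)
    return roots


if __name__ == "__main__":
    # Prints the candidate directories found in the current environment.
    print(find_bioperl_roots(os.environ.get("PERL5LIB", "")))
```

Each returned entry is the directory above Bio, which matches what bioperl-module-path is described as holding elsewhere in this thread.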


Regards, and thanks for the modules; they will be very useful.

    -Pablo

=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From maj at fortinbras.us  Wed Sep  2 22:34:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 22:34:59 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2669F98293CC4473ADAB8B80F93351FF@NewLife>

Thanks for all this work, Pablo. Am working hard on 21
back-compat. Will attempt some mac-friendly paths
and look at the perl5lib issue-

"No matches" are seeming to stem from failure to
find the Bio tree-- there's a workaround for this on
the wiki page as of right now. This will probably
not help the 21 problems, but the next commit
(tomorrow) will likely solve these. I will post to this
thread when that happens.
cheers Mark


From maj at fortinbras.us  Thu Sep  3 00:21:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 00:21:14 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <203092FB050648AA9F256788068F0A16@NewLife>

Hi Pablo and all-
Try the latest revision (>=16081) with your debian/Emacs 21. Set
the variable bioperl-module-path to the directory above the
Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
again there. Tomorrow, MacOS
cheers,
Mark


From tuco at pasteur.fr  Thu Sep  3 05:56:45 2009
From: tuco at pasteur.fr (Emmanuel Quevillon)
Date: Thu, 03 Sep 2009 11:56:45 +0200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<5009BD4ADDC94A03866AC4D4813907EB@NewLife>
Message-ID: <4A9F92DD.2010701@pasteur.fr>

Mark A. Jensen wrote:
> Thanks everyone for your comments so far, on and off-list. (You're a
> terrific audience. I also code for weddings and bar mitzvahs. Tip your
> servers.)
> The howto page now has a "Known Issues" section, and
> I will be working to eliminate those in the next couple of days.
> cheers Mark

Hi Mark,

Thanks for your help. I decided to remove XEmacs :) and replace it
with Emacs. In fact, as I am running Ubuntu, it was a mess to work out
where to put files.el etc. and how to make it work.
So I removed everything (a bit rude) and reinstalled emacs-22.

Here is what I did after that:

$ cd /usr/share/emacs
$ cd 22.2
$ cp BIOPERL-MODE/etc/* etc/
$ cd site-lisp (which is a symlink to /usr/share/emacs22/site-lisp)
$ sudo mkdir bioperl-mode
$ cp BIOPERL-MODE/site-lisp/* bioperl-mode
$ cd ~
$ touch .emacs
$ cat .xemacs/init.el (with require 'bioperl-mode) > .emacs
$ cat .xemacs/custom.el >> .emacs (The file with my other emacs
stuff, e.g. Template Toolkit mode)

And it is all done and working perfectly!!

Thanks for this great file Mark

Regards

Emmanuel

-- 
-------------------------
Emmanuel Quevillon
Biological Software and Databases Group
Institut Pasteur
+33 1 44 38 95 98
tuco at_ pasteur dot fr
-------------------------


From maj at fortinbras.us  Thu Sep  3 07:22:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 07:22:31 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2465B400494242AEAB5F578BD6BB5301@NewLife>

I get it now-- you're right. I'll take care of that-
cheers
MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> After setting bioperl-module-path manually, your module works OK in Linux 
> Emacs 21.4 with the latest revision.
>
> About the PERL5LIB issue, sorry for not reporting the platform: the report 
> was on Linux, not on Mac OS X. In the wiki you have a comment about the 
> Mac OS X separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multi-path 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. My message was probably misleading. I wrongly appended '/Bio' 
> to the path itself instead of to a temp variable for testing with 
> file-exists-p, and probably gave you the impression that the point was to 
> have /Bio added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> Wouldn't this leave pth with a *string* like "lib/path1:lib/path2:lib/path3" 
> on Linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' at the end of the whole string 
> 'lib/path1:lib/path2:lib/path3', and this path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, after first splitting the pth string on ':' in Linux/OS X or the 
> equivalent in Windows?
>
> Sorry for not being very clear in my first report.
>
>
>    -Pablo
> 


From maj at fortinbras.us  Thu Sep  3 08:34:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 08:34:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <736B3399B3754D4C9B1BB66414160D95@NewLife>

Hi All, 

The following bioperl-mode issues are resolved in r16020:

- compatibility with Emacs 21
- correct parsing of PERL5LIB
- Bio module search now includes PATH components 
  (after PERL5LIB search)
- Now get informative error if completion is attempted
  without a valid bioperl-module-path

Thanks for your patience and your bug reports-
cheers
MAJ

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 12:19 AM
Subject: [Bioperl-l] bioperl invades emacs


> Hi All, 
> 
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows 
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which is itself active for quick pod searching. You can 
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods. 
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line 
> in your .emacs file, and directly infects perl-mode 
> (if you so choose) so it's available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers, 
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From neetisomaiya at gmail.com  Fri Sep  4 02:49:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 12:19:58 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>

Hi,

I have an input list of gene names (can get gene ids from a local db
if required).
I need to fetch sequences of these genes. Can someone please guide me
as to how this can be done using perl/bioperl?

Any help will be deeply appreciated.

Thanks.

-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Fri Sep  4 05:17:17 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 14:47:17 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
Message-ID: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>

Thanks for the link.
So I need only the following lines of code to get the sequence?

use Bio::DB::GenBank;
$db_obj = Bio::DB::GenBank->new;
$seq_obj = $db_obj->get_Seq_by_id(2);

How do I print the sequence?
$seq_obj->seq ??

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
>
> Retrieving a sequence from a database : BioPerl HOWTO
> http://bit.ly/RWIot
>
> Trust this helps,
> Khader Shameer
> NCBS - TIFR


From neetisomaiya at gmail.com  Fri Sep  4 06:13:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 15:43:58 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
Message-ID: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>

Thanks for the replies.

So the get seq by accession/GI worked for me. Now can anyone tell me
the easiest way to get the GI /Accession of a gene from the gene
id/gene name?

-Neeti
Even my blood says, B positive




From e.osimo at gmail.com  Fri Sep  4 08:05:48 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 4 Sep 2009 14:05:48 +0200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com> 
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com> 
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
Message-ID: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>

Try this:
http://david.abcc.ncifcrf.gov/conversion.jsp

Emanuele




From neetisomaiya at gmail.com  Fri Sep  4 08:21:19 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 17:51:19 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>

Thanks. It's an interesting tool.

But I want to do this programmatically.

I have gene ids to start with. I can't find a method that gets a
sequence directly with a gene id as input, so I am using the method that
gets a sequence with an accession as input, for which I first need to
know the accessions for my gene ids. Is this the right approach? My
main aim is to get the nucleotide sequence of a gene from its Entrez
gene id or gene name. Please guide me. I am confused.

-Neeti
Even my blood says, B positive




From paola_bisignano at yahoo.it  Fri Sep  4 08:32:02 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Fri, 4 Sep 2009 12:32:02 +0000 (GMT)
Subject: [Bioperl-l] problem parsing msf....:second part...I cannot solve
	sorry sorry
Message-ID: <330845.85818.qm@web25704.mail.ukl.yahoo.com>

I have a problem parsing an MSF file: I can't find the right
Bio::SimpleAlign method for my case.

I have to identify residues (from a list) in aligned sequences. I parse
the alignment from a FASTA file and save it as an MSF file. In that file
I have to find each residue from my list (numbered as in the PDB file)
and the residue aligned with it in the other sequence.

This is a piece of the file:

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//

                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL

                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL

                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA

                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 of Sequence (counting manually, it is Ile 5).

Tyr 4 has no aligned residue because the alignment starts from N23 of
Sequence.

How can I enter a residue of my sequence and extract the aligned residue
from the other one?


I wish you all dear friends..and I'm actually in atrouble with this..


Thanks for suggestions


Paola
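
The residue-to-residue lookup asked about above can be sketched in
plain Perl, without BioPerl, by walking the alignment columns and
counting non-gap characters. This is only an illustration under stated
assumptions: the two rows are the first 50 columns of the excerpt, the
start numbers (23 and 1) are taken from the ranges in the sequence
names, and the helper names column_of_residue and residue_at_column are
made up for this sketch. Under that numbering the valine in column 5 is
residue 27; if the PDB numbering differs from the range in the name,
the start value needs adjusting.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# First alignment block of the MSF excerpt, with the spaces removed.
# 'start' is the residue number of the first residue (assumed from the name).
my %row = (
    'Sequence' => {
        start => 23,
        seq   => 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQVLSNELEEIQQLLFDCGHDL',
    },
    '2zhz:A' => {
        start => 1,
        seq   => 'DDARIAAIGDVDELNSQIGVL--LAEPLPDDVRAALSAIQHDLFDLGGEL',
    },
);

# Return the 1-based alignment column that holds residue number $resnum,
# or nothing if that residue is not in the aligned region.
sub column_of_residue {
    my ($r, $resnum) = @_;
    my $n     = $r->{start} - 1;
    my @chars = split //, $r->{seq};
    for my $col (1 .. scalar @chars) {
        next if $chars[$col - 1] =~ /[-.]/;    # a gap consumes no residue
        $n++;
        return $col if $n == $resnum;
    }
    return;
}

# Return (letter, residue number) found at column $col,
# or an empty list if the sequence has a gap there.
sub residue_at_column {
    my ($r, $col) = @_;
    my $prefix = substr($r->{seq}, 0, $col);
    my $char   = substr($prefix, -1);
    return if $char =~ /[-.]/;
    my $residues = () = $prefix =~ /[^-.]/g;   # non-gap characters so far
    return ($char, $r->{start} + $residues - 1);
}

my $col = column_of_residue($row{'Sequence'}, 27);
my ($aa, $num) = residue_at_column($row{'2zhz:A'}, $col);
print defined $aa ? "column $col: $aa$num\n" : "column $col: gap\n";
# prints: column 5: I5
```

A residue that lies before the aligned region (such as residue 4 of
Sequence) makes column_of_residue return nothing, which the caller has
to check before asking for the partner residue.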


From neetisomaiya at gmail.com  Fri Sep  4 08:40:10 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 18:10:10 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
Message-ID: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>

Hi,

Thanks for your reply. I saw this before and wanted to try it, but I
am unable to install the EUtilities module. When I search on CPAN,
the download option for this module gives me the entire BioPerl
package. Can I not get a tar.gz file of this module alone, which I can
gunzip, untar, and then build and install with make? I don't want
to install all of BioPerl again, as I am using an older version. Any
suggestions?

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Something like this?
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>
> chris
>
> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>
>> Thanks. Its an interesting tool.
>>
>> But I want to do this programatically.
>>
>> I have gene ids to start with. Cant find a method to directly get
>> sequence with gene id as input. So using the method of getting
>> sequence with accession as input, for which I need to know accessions
>> for my gene ids first. Is this a right approach? Please guide me. My
>> main aim is to get the nucleotide sequence of a gene from ids entrez
>> gene id/gene name. PLease guide me. I am confused.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>>
>>> Try this:
>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>
>>> Emanuele
>>>
>>>
>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com>
>>> wrote:
>>>>
>>>> Thanks for the replies.
>>>>
>>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>> id/gene name?
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
>>>> wrote:
>>>>>
>>>>> Thanks for the link.
>>>>> So I need only the following lines of code to get the sequence?
>>>>>
>>>>> use Bio::DB::GenBank;
>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>
>>>>> How do I print the sequence?
>>>>> $seq_obj->seq ??
>>>>>
>>>>> -Neeti
>>>>> Even my blood says, B positive
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
>>>>>>
>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>> http://bit.ly/RWIot
>>>>>>
>>>>>> Trust this helps,
>>>>>> Khader Shameer
>>>>>> NCBS - TIFR
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have an input list of gene names (can get gene ids from a local db
>>>>>>> if required).
>>>>>>> I need to fetch sequences of these genes. Can someone please guide me
>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>
>>>>>>> Any help will be deeply appreciated.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> -Neeti
>>>>>>> Even my blood says, B positive
>>>>>>> _______________________________________________
>>>>>>> Bioperl-l mailing list
>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>
>


From cjfields at illinois.edu  Fri Sep  4 08:30:42 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:30:42 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>

Neeti,

Something like this?

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

chris

On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:

> Thanks. Its an interesting tool.
>
> But I want to do this programatically.
>
> I have gene ids to start with. Cant find a method to directly get
> sequence with gene id as input. So using the method of getting
> sequence with accession as input, for which I need to know accessions
> for my gene ids first. Is this a right approach? Please guide me. My
> main aim is to get the nucleotide sequence of a gene from ids entrez
> gene id/gene name. PLease guide me. I am confused.
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com>  
> wrote:
>> Try this:
>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>
>> Emanuele
>>
>>
>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya  
>> <neetisomaiya at gmail.com> wrote:
>>>
>>> Thanks for the replies.
>>>
>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>> the easiest way to get the GI /Accession of a gene from the gene
>>> id/gene name?
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti  
>>> Somaiya<neetisomaiya at gmail.com>
>>> wrote:
>>>> Thanks for the link.
>>>> So I need only the following lines of code to get the sequence?
>>>>
>>>> use Bio::DB::GenBank;
>>>> $db_obj = Bio::DB::GenBank->new;
>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>
>>>> How do I print the sequence?
>>>> $seq_obj->seq ??
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>  
>>>> wrote:
>>>>>
>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>> http://bit.ly/RWIot
>>>>>
>>>>> Trust this helps,
>>>>> Khader Shameer
>>>>> NCBS - TIFR
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have an input list of gene names (can get gene ids from a  
>>>>>> local db
>>>>>> if required).
>>>>>> I need to fetch sequences of these genes. Can someone please  
>>>>>> guide me
>>>>>> as to how this can be done using perl/bioperl?
>>>>>>
>>>>>> Any help will be deeply appreciated.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>
>>


From cjfields at illinois.edu  Fri Sep  4 08:49:19 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:49:19 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
Message-ID: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>

Neeti,

Sorry, it's a package deal (and Bio::DB::EUtilities relies on several
other modules).  I am planning on spinning it out into its own
package at some point, but for now the easiest way to install it is
via 1.6 off CPAN or by downloading the nightly build:

http://www.bioperl.org/DIST/nightly_builds/

chris

On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:

> Hi,
>
> Thanks for your reply. I saw this before and wanted to try this, but I
> am unable to install this module of EUtilities. When I search on CPAN,
> it gives me the entire bioperl package in the download option of this
> module. Can I not get a tar.gz file of this module alone, which I can
> gzip, untar and then run the make and all to install it? I dont want
> to install entire bioperl again as I am using an older version. Any
> suggestions?
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu>  
> wrote:
>> Neeti,
>>
>> Something like this?
>>
>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>>
>> chris
>>
>> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>>
>>> Thanks. Its an interesting tool.
>>>
>>> But I want to do this programatically.
>>>
>>> I have gene ids to start with. Cant find a method to directly get
>>> sequence with gene id as input. So using the method of getting
>>> sequence with accession as input, for which I need to know  
>>> accessions
>>> for my gene ids first. Is this a right approach? Please guide me. My
>>> main aim is to get the nucleotide sequence of a gene from ids entrez
>>> gene id/gene name. PLease guide me. I am confused.
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com>  
>>> wrote:
>>>>
>>>> Try this:
>>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>>
>>>> Emanuele
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya  
>>>> <neetisomaiya at gmail.com>
>>>> wrote:
>>>>>
>>>>> Thanks for the replies.
>>>>>
>>>>> So the get seq by accession/GI worked for me. Now can anyone  
>>>>> tell me
>>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>>> id/gene name?
>>>>>
>>>>> -Neeti
>>>>> Even my blood says, B positive
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com 
>>>>> >
>>>>> wrote:
>>>>>>
>>>>>> Thanks for the link.
>>>>>> So I need only the following lines of code to get the sequence?
>>>>>>
>>>>>> use Bio::DB::GenBank;
>>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>>
>>>>>> How do I print the sequence?
>>>>>> $seq_obj->seq ??
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>  
>>>>>> wrote:
>>>>>>>
>>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>>> http://bit.ly/RWIot
>>>>>>>
>>>>>>> Trust this helps,
>>>>>>> Khader Shameer
>>>>>>> NCBS - TIFR
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have an input list of gene names (can get gene ids from a  
>>>>>>>> local db
>>>>>>>> if required).
>>>>>>>> I need to fetch sequences of these genes. Can someone please  
>>>>>>>> guide me
>>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>>
>>>>>>>> Any help will be deeply appreciated.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> -Neeti
>>>>>>>> Even my blood says, B positive
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>


From pg4 at sanger.ac.uk  Thu Sep  3 04:01:26 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 09:01:26 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <203092FB050648AA9F256788068F0A16@NewLife>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
Message-ID: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>

On Thu, 3 Sep 2009, Mark A. Jensen wrote:

> Hi Pablo and all-
> Try the latest revision (>=16081) with your debian/Emacs 21. Set
> the variable bioperl-module-path to the directory above the
> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
> again there. Tomorrow, MacOS
> cheers,
> Mark

Hello Mark,

After setting bioperl-module-path manually, your module works OK in
Linux Emacs 21.4 with the latest revision.

About the PERL5LIB issue, sorry for not reporting the platform: the
report was on Linux, not on Mac OS X. In the wiki you have a comment
about the Mac OS X separator:

[wiki] The problem Pablo was running into is definitely the Mac OS X path 
[wiki] separator issue.

Here I was referring to ':' as the 'path separator' for Linux multi-path
environment variables, not the system's directory separator [:/\].

Also from the wiki

[wiki] I think this is ok as it is, since bioperl-module-path is meant to 
[wiki] point to the directory above Bio

This is right. My message was probably misleading. I wrongly appended
'/Bio' to the path itself instead of to a temp variable for testing with
file-exists-p, which probably gave you the impression that the point was
to have /Bio added to the path. Sorry about that.

Instead, my main point was about the line where you capture PERL5LIB:

[code] (if (setq pth (getenv "PERL5LIB"))

Wouldn't this leave pth with a *string* like
"lib/path1:lib/path2:lib/path3" on Linux?

Then, when you test:

[code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))

it would append '/Bio' at the end of the whole string
'lib/path1:lib/path2:lib/path3', and that path obviously does not
exist.

Am I missing something? Shouldn't the 'concat /Bio' be applied to *each*
lib/path, after first splitting the pth string on ':' on Linux/OS X or
the equivalent on Windows?
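
The per-directory check described above can be illustrated in Perl
rather than Emacs Lisp. This is only a sketch of the logic (split the
variable on the platform's path separator, then test each directory for
a Bio/ subdirectory), not a patch for bioperl-init.el; the function
name bioperl_dirs is made up for the example.

```perl
use strict;
use warnings;
use Config;    # core module; $Config{path_sep} is ':' on Unix, ';' on Windows

# Split a PERL5LIB-style string and keep only the directories that
# directly contain a Bio/ subdirectory.
sub bioperl_dirs {
    my ($perl5lib) = @_;
    return () unless defined $perl5lib && length $perl5lib;
    my $sep = quotemeta $Config{path_sep};
    return grep { -d "$_/Bio" } split /$sep/, $perl5lib;
}

# List the BioPerl-looking directories in the current environment.
print "$_\n" for bioperl_dirs($ENV{PERL5LIB});
```

Testing the unsplit string, as in the [code] line quoted above, is the
same as calling -d on "path1:path2:path3/Bio", which is why the test
never succeeds when PERL5LIB holds more than one directory.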

Sorry for not being very clear in my first report.


    -Pablo


>> == bug when parsing perl5lib? ==
>> 
>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>> Bioperl paths from PERL5LIB this is not working for me in linux.
>> 
>> While debugging bioperl-init.el:
>> # (setq pth (getenv "PERL5LIB"))
>> # 
>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>> # nil
>> 
>> No file is found because it is looking for all the paths concatenated 
>> together with a '/Bio' at the end:
>>
>>   libpaht1:libpath2:libpath3/Bio
>> 
>> 'concat' adds /Bio to the pth that is a string with all the PERL5LIB paths. 
>> Should this concat rather be applied to the splited perl5lib by ':' in unix 
>> or ';' in windows and then tested for the existence of files?
>> 
>> for example in unix:
>> 
>> --- code --
>> (defun addbio (bio_path)
>>   "apend /Bio to each path"
>>   (concat bio_path "/" "Bio"))
>> 
>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>> -- end code ---
>> 
>> This would result in the list of T and F bioperl (and ensembl) paths
>> (t t nil t t t t t t nil nil nil ...)
>> 
>> 
>> Regards and thanks for the modules they would be very useful.
>>
>>    -Pablo
>> 
>> =====================================================================
>>                      Pablo Marin-Garcia, PhD
>>
>>                     \\//          (Argiope bruennichi
>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>                     //\\
>> 
>> Sanger Institute                |  PostDoc / Computer Biologist
>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>> Hinxton, Cambridge CB10 1HH     |  room : N333
>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>> ====================================================================
>> 
>> -- 
>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>> a charity registered in England with number 1021457 and a company 
>> registered in England with number 2742969, whose registered office is 215 
>> Euston Road, London, NW1 2BE. 
>> 
>> 
>
>


=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From paola.bisignano at gmail.com  Fri Sep  4 08:28:03 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Fri, 4 Sep 2009 14:28:03 +0200
Subject: [Bioperl-l] problem parsing msf file
Message-ID: <e9cf89740909040528j69e5f8e6ka9d550840a4e0f9a@mail.gmail.com>

I have a problem parsing an MSF file: I can't find the right
Bio::SimpleAlign method for my case.
I have to identify residues (from a list) in aligned sequences. I parse
the alignment from a FASTA file and save it as an MSF file, in which I
have to find each residue from my list (numbered as in the PDB file)
and the residue aligned with it in the other sequence.

This is a piece of the file:

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//


                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL


                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL


                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA


                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 of Sequence (counting manually, it is Ile 5).
Tyr4 has no residue aligned with it, because the alignment starts from
N23 of Sequence.
How can I enter a residue of my sequence and extract the aligned
residue from the other one?


Best wishes to you all, dear friends. I'm really in trouble with this.
Thanks for any suggestions.

Paola


From jason at bioperl.org  Fri Sep  4 12:04:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 4 Sep 2009 09:04:05 -0700
Subject: [Bioperl-l] Fwd:  help parsing msf file or clustalW file reports
References: <369662.74237.qm@web25701.mail.ukl.yahoo.com>
Message-ID: <B5AEEBAD-22D3-40B6-AD06-17E268DFAFDD@bioperl.org>

Paola - it is important to continue to email the mailing list for your  
help.  I'm hoping another person on the list can help as I am swamped  
right now.
-jason

Begin forwarded message:

> From: Paola Bisignano <paola_bisignano at yahoo.it>
> Date: September 4, 2009 5:48:22 AM PDT
> To: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
>
> Hi Jason, thanks for your answer. For two days I have been re-studying
> the BioPerl synopses and object-oriented programming. I understand
> what you mean, but I still have some problems: I don't actually know
> how to start parsing this kind of file. I generated this MSF (or
> ClustalW) file by parsing a FASTA file of multiple paired sequences
> and extracting only the pairs I want, that is, homologous proteins
> that have the same ligand in structures published in the PDB.
>
> I have a problem parsing an MSF file: I can't find the right
> Bio::SimpleAlign method for my case.
> I have to identify residues (from a list) in aligned sequences. I parse
> the alignment from a FASTA file and save it as an MSF file, in which I
> have to find each residue from my list (numbered as in the PDB file)
> and the residue aligned with it in the other sequence.
>
> This is a piece of the file:
>
> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>
>  Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>  Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>
> //
>
>                       1                                                   50
> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>
>                       51                                                 100
> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>
>                       101                                                150
> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>
>                       151                                                200
> Sequence/23-178       QQPDML
> 2zhz:A/1-148          GGADVL
>
> For example, here I have to identify the residue in 2zhz:A that is
> aligned with Val 28 of Sequence (counting manually, it is Ile 5).
> Tyr4 has no residue aligned with it, because the alignment starts from
> N23 of Sequence.
> How can I enter a residue of my sequence and extract the aligned
> residue from the other one?
>
> Best wishes to you all, dear friends. I'm really in trouble with this.
> Thanks for any suggestions.
>
>
> --- Tue 1/9/09, Jason Stajich <jason at bioperl.org> wrote:
>
> From: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file
> reports
> To: "Paola Bisignano" <paola_bisignano at yahoo.it>
> Cc: bioperl-l at lists.open-bio.org
> Date: Tuesday 1 September 2009, 17:49
>
> I think you might want to use the column_from_residue_number method  
> that is part of Bio::SimpleAlign - it lets you get the column from  
> an alignment based on the sequence residue, doing some math along  
> the way to deal with gaps. That is the residue -> alignment  
> direction.  If you are starting at the alignment and want to get the  
> residue's position you will use the location_from_column on a  
> particular sequence so
>
>     # select somehow a sequence from the alignment, e.g.
>     my $seq = $aln->get_seq_by_pos(1);
>     #$loc is undef or Bio::LocationI object
>     my $loc = $seq->location_from_column(5);
>
> -jason
>
> On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:
>
>> Hi,
>>
>> I'm trying to parse fasta files, where I have couple of  
>> alignments....I need to identify my residue in my alignment......I  
>> have separate lists that derived from ligplot parsing files.. so I  
>> have to manipulate string...but I don't now how to start..it seems  
>> complicated..
>> I used Bio::AlignIO to parse the fasta file, so I can have a parsed  
>> file in msf or clustalW forma
>>
>> here an example:
>> CLUSTAL W(1.81) multiple sequence alignment
>>
>>
>> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
>> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>>
>>
>> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
>> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>>
>> I  choose two residue for example...how can I extract  
>> them...starting from their position in the pdb file?
>> I need to walk...to my sequence
>>
>> I don't know if it is clear because I cannot explain the question  
>> correctly in english...are there any Italians?
>> could anyone help me?
>>
>>
>>
>>
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
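
The two methods named in the quoted reply (column_from_residue_number
and location_from_column) can be combined roughly as follows. This is
a sketch, not a tested recipe: it assumes the alignment holds exactly
two sequences, that the MSF parser picks up the start and end numbers
from names such as Sequence/23-178 (worth checking with $seq->start),
and the function name aligned_residue is made up for the example.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# Given a two-sequence alignment and a residue number in the first
# sequence, return (letter, residue number) of the aligned residue in
# the second sequence, or an empty list if the second has a gap there.
sub aligned_residue {
    my ($aln, $resnum) = @_;
    my $query  = $aln->get_seq_by_pos(1);
    my $target = $aln->get_seq_by_pos(2);

    # residue number -> alignment column (throws if out of range)
    my $col = $query->column_from_residue_number($resnum);

    # alignment column -> location in the other sequence
    my $loc = $target->location_from_column($col);
    return unless defined $loc && $loc->location_type eq 'EXACT';

    return (substr($target->seq, $col - 1, 1), $loc->start);
}

# Usage: perl aligned.pl alignment.msf 27
if (@ARGV == 2) {
    my ($file, $resnum) = @ARGV;
    my $aln = Bio::AlignIO->new(-file => $file, -format => 'msf')->next_aln
        or die "no alignment found in $file\n";
    my ($aa, $num) = aligned_residue($aln, $resnum);
    print defined $aa ? "$aa$num\n" : "gap\n";
}
```

For a gap in the middle of the target sequence, location_from_column
returns an IN-BETWEEN location rather than an exact one, which is why
the location type is checked before the residue is reported.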


From robert.bradbury at gmail.com  Fri Sep  4 16:15:09 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 4 Sep 2009 16:15:09 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <deaa866a0909041315y4282d811g3047ab153812014d@mail.gmail.com>

On 9/4/09, Emanuele Osimo <e.osimo at gmail.com> wrote:
> Try this:
> http://david.abcc.ncifcrf.gov/conversion.jsp
>

It may be just me, but I've tried this in both Firefox and Opera and
it will not work without Javascript enabled.  Most "intelligent" sites
now tell you that Javascript must be enabled if they require it to
work properly.  More intelligent sites (such as Google's gmail) allow
you to toggle back and forth between Javascript & non-Javascript
implementations.

Note that, IMO, running with Javascript enabled for all sites all the
time is a bad idea: potentially for security reasons, clearly for
sleep / suspend / power consumption reasons, and finally because, do
you *really* trust that the Javascript, your DNS provider, and the
sites hosting the scripts are 100% secure?  The only options that seem
generally available at this time are to run Firefox with NoScript and
enable Javascript selectively per site, or to run two browser
instances, one with Javascript enabled and one with it disabled, and to
use the Javascript-enabled browser only on sites with a high
probability of being secure.


From lsbrath at gmail.com  Fri Sep  4 18:12:34 2009
From: lsbrath at gmail.com (Mgavi Brathwaite)
Date: Fri, 4 Sep 2009 18:12:34 -0400
Subject: [Bioperl-l] bio:graphics
Message-ID: <69367b8f0909041512l77b2431aqb89f57f82adae1@mail.gmail.com>

Hello,

I need to grab features(source, gene, cds, primer_bind) from a genbank file
and add features(5' and 3' UTR, misc_feature) to generate an image. The
images are on two tracks and with each track having multiple features. How
do I display different colors for the different features on the same track?
In my case 5'UTR, CDS, and 3'UTR are on the same track. I want the UTRs to
have one color and the CDS another.

I also need to grab the start and end info from the primer_bind feature
based on the /note tag values. In my case 'HUF' and 'HDF'. Code:

if( $feat->primary_tag eq 'primer_bind' ) {
            $feat->get_tag_values("note") if ($feat_object->has_tag("note")
&&
                tag_values("note") eq 'HDF');
            $pb_start = $feat->start;
            $pb_end = $feat->end;


I want to make sure that I am moving in the right direction.  Can someone
help me out?

M
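
Both parts of the question above can be sketched as follows. This is
a hedged illustration, not a confirmed answer from the list: it assumes
the /note value is exactly 'HUF' or 'HDF' (a longer note would need a
regex match instead of eq), the names primer_positions and $color_for
are made up, and the colors are arbitrary. Bio::Graphics track options
such as -bgcolor accept a code reference that is called with the
feature, which is one way to color features on the same track
differently.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Return [note, start, end] for every primer_bind feature whose /note
# value is one of the wanted labels.
sub primer_positions {
    my ($seq, @wanted) = @_;
    my %wanted = map { $_ => 1 } @wanted;
    my @hits;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'primer_bind';
        next unless $feat->has_tag('note');
        for my $note ($feat->get_tag_values('note')) {
            push @hits, [$note, $feat->start, $feat->end] if $wanted{$note};
        }
    }
    return @hits;
}

# Color callback: CDS features in one color, everything else (for
# example the UTRs on the same track) in another.
my $color_for = sub {
    my $feature = shift;
    return $feature->primary_tag eq 'CDS' ? 'blue' : 'orange';
};

# Usage: perl primers.pl record.gb
if (my $file = shift @ARGV) {
    my $seq = Bio::SeqIO->new(-file => $file, -format => 'genbank')->next_seq
        or die "no sequence found in $file\n";
    printf "%s: %d..%d\n", @$_ for primer_positions($seq, 'HUF', 'HDF');
}
```

The callback would be passed when the track is created, along the lines
of $panel->add_track(\@features, -glyph => 'generic', -bgcolor =>
$color_for), where $panel is a Bio::Graphics::Panel and @features holds
the UTR and CDS features for that track.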


From neetisomaiya at gmail.com  Sat Sep  5 00:52:11 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sat, 5 Sep 2009 10:22:11 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
Message-ID: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>

Ok, so I reinstalled bioperl and was able to run the EUtilities code
for my gene id.
But I am facing two issues :-

1) When I give multiple gene ids, it still returns data of only the
first gene id

2) The script returns the entire entry, and I am not able to figure
out how to just fetch the sequence, and if possible, in FASTA format.
I could not figure it out from the documentation.

Thanks.

-Neeti
Even my blood says, B positive
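
A possible approach to both issues, following the pattern of the
EUtilities Cookbook linked in this thread: use elink to go from gene
IDs to linked nucleotide IDs, then efetch with -rettype => 'fasta'.
This is a sketch under assumptions and was not run against NCBI here:
the e-mail address is a placeholder, the function names are made up,
and -correspondence => 1 is used so that each submitted gene ID gets
its own link set instead of one merged result.

```perl
use strict;
use warnings;
use Bio::DB::EUtilities;

my $EMAIL = 'you@example.org';    # placeholder contact address for NCBI

# Map each Entrez Gene ID to the nucleotide (nuccore) IDs linked to it.
sub gene_to_nuccore {
    my @gene_ids = @_;
    my $elink = Bio::DB::EUtilities->new(
        -eutil          => 'elink',
        -email          => $EMAIL,
        -dbfrom         => 'gene',
        -db             => 'nuccore',
        -correspondence => 1,           # one link set per submitted ID
        -id             => \@gene_ids,
    );
    my %links;
    while (my $set = $elink->next_LinkSet) {
        my ($gene) = $set->get_submitted_ids;
        push @{ $links{$gene} }, $set->get_ids;
    }
    return %links;
}

# Fetch the given nucleotide IDs and return them as FASTA text.
sub fetch_fasta {
    my @ids = @_;
    my $efetch = Bio::DB::EUtilities->new(
        -eutil   => 'efetch',
        -email   => $EMAIL,
        -db      => 'nuccore',
        -rettype => 'fasta',
        -id      => \@ids,
    );
    return $efetch->get_Response->content;
}

# Usage: perl gene2fasta.pl GENE_ID [GENE_ID ...]
if (@ARGV) {
    my %links = gene_to_nuccore(@ARGV);
    for my $gene (sort keys %links) {
        print fetch_fasta(@{ $links{$gene} });
    }
}
```

A gene usually links to several nucleotide records (genomic as well as
mRNA), some of them very large, so in practice the ID list from elink
needs filtering before it is passed to efetch.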


On Fri, Sep 4, 2009 at 6:19 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Sorry, it's a package deal (and Bio::DB::EUtilities relies on several other
> modules).  I am planning on spinning it out at some point into it's own
> package, but for now the easiest way to install is via 1.6 off CPAN or
> downloading the nightly build:
>
> http://www.bioperl.org/DIST/nightly_builds/
>
> chris
>
> On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:
>
>> Hi,
>>
>> Thanks for your reply. I saw this before and wanted to try it, but I
>> am unable to install the EUtilities module. When I search on CPAN,
>> the download option for this module gives me the entire bioperl package.
>> Can I not get a tar.gz file of this module alone, which I can
>> gunzip, untar and then build with make to install it? I don't want
>> to install the entire bioperl again as I am using an older version. Any
>> suggestions?
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Something like this?
>>>
>>>
>>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>>>
>>> chris
>>>
>>> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>>>
>>>> Thanks. It's an interesting tool.
>>>>
>>>> But I want to do this programmatically.
>>>>
>>>> I have gene ids to start with. I can't find a method to get a
>>>> sequence directly with a gene id as input. So I am using the method of
>>>> getting a sequence with an accession as input, for which I first need to
>>>> know the accessions for my gene ids. Is this the right approach? My
>>>> main aim is to get the nucleotide sequence of a gene from its Entrez
>>>> gene id/gene name. Please guide me. I am confused.
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>>>>
>>>>> Try this:
>>>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>>>
>>>>> Emanuele
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Thanks for the replies.
>>>>>>
>>>>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>>>> id/gene name?
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for the link.
>>>>>>> So I need only the following lines of code to get the sequence?
>>>>>>>
>>>>>>> use Bio::DB::GenBank;
>>>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>>>
>>>>>>> How do I print the sequence?
>>>>>>> $seq_obj->seq ??
>>>>>>>
>>>>>>> -Neeti
>>>>>>> Even my blood says, B positive
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>>>> http://bit.ly/RWIot
>>>>>>>>
>>>>>>>> Trust this helps,
>>>>>>>> Khader Shameer
>>>>>>>> NCBS - TIFR
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an input list of gene names (can get gene ids from a local
>>>>>>>>> db
>>>>>>>>> if required).
>>>>>>>>> I need to fetch sequences of these genes. Can someone please guide
>>>>>>>>> me
>>>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>>>
>>>>>>>>> Any help will be deeply appreciated.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> -Neeti
>>>>>>>>> Even my blood says, B positive
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioperl-l mailing list
>>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From ybolo001 at student.ucr.edu  Sat Sep  5 03:37:58 2009
From: ybolo001 at student.ucr.edu (Eugene Bolotin)
Date: Sat, 5 Sep 2009 00:37:58 -0700
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <941fcc750909050037n3c0f4fc5u89fcf4f5c3e5f34d@mail.gmail.com>

Ok,
this is what I would do.
Download the database of gene names and sequences in FASTA format.
Then loop through it with BioPerl.
Store your gene names in a hash and match them against
$seq->display_id() for each record to pair them up with the gene ids;
$seq->seq() returns the sequence.
Print out the ones that match.
Good luck.
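
The approach above can be sketched roughly as follows. This is a minimal
sketch, not Eugene's actual script: it uses only core Perl so that it runs
without BioPerl installed, the sub name filter_fasta and the demo records
are made up for illustration, and it assumes the gene name is the first
word of each FASTA header. With BioPerl available, Bio::SeqIO's next_seq,
display_id and seq would take the place of the hand-written parser.

```perl
use strict;
use warnings;

# Read FASTA records from a filehandle and return those whose identifier
# (the first word after '>') is a key of %$wanted.
# Returns a list of [id, sequence] pairs in file order.
# NOTE: filter_fasta is a hypothetical helper, not a BioPerl function.
sub filter_fasta {
    my ($fh, $wanted) = @_;
    my (@hits, $id, $seq);

    # Store the record collected so far if its id is one we want.
    my $flush = sub {
        push @hits, [$id, $seq] if defined $id && exists $wanted->{$id};
    };

    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;                    # strip the line ending
        if (my ($new_id) = $line =~ /^>(\S+)/) { # header line
            $flush->();                          # finish the previous record
            ($id, $seq) = ($new_id, '');
        }
        elsif (defined $id) {                    # sequence line
            $line =~ s/\s+//g;
            $seq .= $line;
        }
    }
    $flush->();                                  # last record in the file
    return @hits;
}

# Demo with made-up records (names and sequences are placeholders).
my $fasta = ">geneA some description\nACGT\nACGT\n>geneB\nTTTT\n>geneC\nGGCC\n";
open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";
my %wanted = map { $_ => 1 } qw(geneA geneC);
for my $hit (filter_fasta($fh, \%wanted)) {
    print ">$hit->[0]\n$hit->[1]\n";
}
```

Keeping the wanted names in a hash makes each lookup constant-time, so the
FASTA file only has to be read once however long the gene list is.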

On Thu, Sep 3, 2009 at 11:49 PM, Neeti Somaiya<neetisomaiya at gmail.com> wrote:
> Hi,
>
> I have an input list of gene names (can get gene ids from a local db
> if required).
> I need to fetch sequences of these genes. Can someone please guide me
> as to how this can be done using perl/bioperl?
>
> Any help will be deeply appreciated.
>
> Thanks.
>
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>




-- 
Eugene Bolotin
Ph.D. candidate
Genetics Genomics and Bioinformatics
University of California Riverside
ybolo001 at student.ucr.edu
Dr. Frances Sladek Lab


From maj at fortinbras.us  Sat Sep  5 08:53:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 5 Sep 2009 08:53:12 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org><alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk><203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <E63F6D209AF1432C9B9CAFF6F6182F9C@NewLife>

Hi Pablo-- You're right about the PERL5LIB issue; I had
not set up the module path to handle multiple paths as you
describe. I am working hard on an implementation that can
handle multiple paths; I hope to have it out next week --cheers MAJ
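
The per-path check Pablo describes in the quoted message can be sketched
as below. Note the assumptions: this is written in Perl rather than Emacs
Lisp, it is not the bioperl-init.el implementation, and the sub name
bioperl_roots is made up for illustration. It splits a PERL5LIB-style
string on the platform's path separator and keeps only the entries that
contain a Bio directory.

```perl
use strict;
use warnings;
use Config;

# Split a PERL5LIB-style string on the platform's path separator and
# return the entries that contain a Bio/ subdirectory.
# NOTE: bioperl_roots is a hypothetical helper name.
sub bioperl_roots {
    my ($perl5lib) = @_;
    return () unless defined $perl5lib && length $perl5lib;
    my $sep = quotemeta $Config{path_sep};   # ':' on Unix, ';' on Windows
    return grep { length($_) && -d "$_/Bio" } split /$sep/, $perl5lib;
}

# Print every library root in the current environment that holds Bio/.
print "$_\n" for bioperl_roots($ENV{PERL5LIB});
```

Testing each entry separately avoids the problem in the report, where
'/Bio' was appended once to the whole colon-joined string.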
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works ok in linux 
> emacs 21.4 with latest revision.
>
> About the perl5lib issue, sorry about not reporting the platform: the report 
> was on linux not in mac os X. In the wiki you have a comment about mac OS X 
> separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multi-path 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. Probably my message was misleading. I wrongly appended '/Bio' 
> to the path itself instead of to a temp variable for testing with 
> file-exists-p, and probably gave you the impression that the point was to 
> have /Bio added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> wouldn't this leave pth with a *string* like "lib/path1:lib/path2:lib/path3" 
> in linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' at the end of the whole string 
> 'lib/path1:lib/path2:lib/path3'. and this string path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, after first splitting the pth string by ':' in linux/OS X or the 
> equivalent in windows?
>
> Sorry about not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpath1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to pth, which is a string holding all the PERL5LIB paths. 
>>> Shouldn't this concat rather be applied to each element of PERL5LIB, split by 
>>> ':' in unix or ';' in windows, and then tested for the existence of the files?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "Append /Bio to BIO_PATH."
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in the list of T and F bioperl (and ensembl) paths
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>>
>>> =====================================================================
>>>                      Pablo Marin-Garcia, PhD
>>>
>>>                     \\//          (Argiope bruennichi
>>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>>                     //\\
>>>
>>> Sanger Institute                |  PostDoc / Computer Biologist
>>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>>> Hinxton, Cambridge CB10 1HH     |  room : N333
>>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>>> ====================================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>>> a charity registered in England with number 1021457 and a company registered 
>>> in England with number 2742969, whose registered office is 215 Euston Road, 
>>> London, NW1 2BE. _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>
>


From cjfields at illinois.edu  Sat Sep  5 09:44:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 5 Sep 2009 08:44:54 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>

On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:

> Ok, so I reinstalled bioperl and was able to run the EUtilities code
> for my gene id.
> But I am facing two issues :-
>
> 1) When I give multiple gene ids, it still returns data of only the
> first gene id

This sounds like it's not iterating correctly.  You'll need to post  
your version of the script.

> 2) The script returns the entire entry, and I am not able to figure
> out how to just fetch the sequence, and if possible, in FASTA format.
> I could not figure it out from the documentation.

I recall this working last time I used it (I think June or July).   
Could you post the script you are using?

(realize this is a holiday weekend in the states, so you might have a  
delayed response from me or others)

> Thanks.
>
> -Neeti
> Even my blood says, B positive

chris


From neetisomaiya at gmail.com  Sun Sep  6 12:15:09 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sun, 6 Sep 2009 21:45:09 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
	<218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
Message-ID: <764978cf0909060915t7a2e6e45v4bb194b9cad18e18@mail.gmail.com>

Hi,

Thanks for the reply.

I am using the script exactly as it is given here :

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

-Neeti
Even my blood says, B positive


On Sat, Sep 5, 2009 at 7:14 PM, Chris Fields<cjfields at illinois.edu> wrote:
> On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:
>
>> Ok, so I reinstalled bioperl and was able to run the EUtilities code
>> for my gene id.
>> But I am facing two issues :-
>>
>> 1) When I give multiple gene ids, it still returns data of only the
>> first gene id
>
> This sounds like it's not iterating correctly.  You'll need to post your
> version of the script.
>
>> 2) The script returns the entire entry, and I am not able to figure
>> out how to just fetch the sequence, and if possible, in FASTA format.
>> I could not figure it out from the documentation.
>
> I recall this working last time I used it (I think June or July).  Could you
> post the script you are using?
>
> (realize this is a holiday weekend in the states, so you might have a
> delayed response from me or others)
>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>
> chris
>


From Russell.Smithies at agresearch.co.nz  Sun Sep  6 19:00:24 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 7 Sep 2009 11:00:24 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C50D0@exchsth.agresearch.co.nz>

Grab the gene2accession list from here and do lookups.
Probably the fastest and easiest way.
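
A lookup against such a list could be sketched as below. This is an
illustration under stated assumptions, not Russell's code: it assumes a
tab-separated table whose first four columns are tax_id, GeneID, status
and RNA nucleotide accession.version, with '-' marking an empty field and
'#' starting the header line. The sub name load_gene2accession and the
demo rows (including the XX_ accessions) are made up, so check the real
file's column layout before relying on it.

```perl
use strict;
use warnings;

# Build GeneID => [RNA accession.version, ...] from gene2accession-style lines.
# ASSUMED column order (tab-separated): tax_id, GeneID, status,
# RNA_nucleotide_accession.version, ...; '-' marks an empty field.
# NOTE: load_gene2accession is a hypothetical helper name.
sub load_gene2accession {
    my ($fh) = @_;
    my %acc_for;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;               # skip the header line
        chomp $line;
        my ($tax_id, $gene_id, $status, $rna_acc) = split /\t/, $line;
        next unless defined $rna_acc && $rna_acc ne '-';
        my $seen = $acc_for{$gene_id} ||= [];
        # A gene can appear on many rows; keep each accession once.
        push @$seen, $rna_acc unless grep { $_ eq $rna_acc } @$seen;
    }
    return \%acc_for;
}

# Demo with made-up rows; the accessions are placeholders, not real records.
my $table = join '',
    "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version\n",
    "9606\t2\tREVIEWED\tXX_000001.1\n",
    "9606\t2\tREVIEWED\tXX_000001.1\n",
    "9606\t2\t-\t-\n",
    "9606\t4693\tREVIEWED\tXX_000002.3\n";
open my $fh, '<', \$table or die "cannot open in-memory file: $!";
my $acc_for = load_gene2accession($fh);
for my $gene_id (2, 4693, 3064) {
    my @acc = @{ $acc_for->{$gene_id} || [] };
    print "$gene_id\t", (@acc ? join(',', @acc) : 'not found'), "\n";
}
```

The resulting accessions can then be passed to a fetch by accession, which
the thread reports as already working.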


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809 
F  +64 3 489 9174 
www.agresearch.co.nz 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Saturday, 5 September 2009 12:21 a.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> Thanks. It's an interesting tool.
> 
> But I want to do this programmatically.
> 
> I have gene ids to start with. I can't find a method to get a
> sequence directly with a gene id as input. So I am using the method of
> getting a sequence with an accession as input, for which I first need to
> know the accessions for my gene ids. Is this the right approach? My
> main aim is to get the nucleotide sequence of a gene from its Entrez
> gene id/gene name. Please guide me. I am confused.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> > Try this:
> > http://david.abcc.ncifcrf.gov/conversion.jsp
> >
> > Emanuele
> >
> >
> > On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >>
> >> Thanks for the replies.
> >>
> >> So the get seq by accession/GI worked for me. Now can anyone tell me
> >> the easiest way to get the GI /Accession of a gene from the gene
> >> id/gene name?
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
> >> wrote:
> >> > Thanks for the link.
> >> > So I need only the following lines of code to get the sequence?
> >> >
> >> > use Bio::DB::GenBank;
> >> > $db_obj = Bio::DB::GenBank->new;
> >> > $seq_obj = $db_obj->get_Seq_by_id(2);
> >> >
> >> > How do I print the sequence?
> >> > $seq_obj->seq ??
> >> >
> >> > -Neeti
> >> > Even my blood says, B positive
> >> >
> >> >
> >> >
> >> > On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
> >> >>
> >> >> Retrieving a sequence from a database : BioPerl HOWTO
> >> >> http://bit.ly/RWIot
> >> >>
> >> >> Trust this helps,
> >> >> Khader Shameer
> >> >> NCBS - TIFR
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I have an input list of gene names (can get gene ids from a local db
> >> >>> if required).
> >> >>> I need to fetch sequences of these genes. Can someone please guide me
> >> >>> as to how this can be done using perl/bioperl?
> >> >>>
> >> >>> Any help will be deeply appreciated.
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> -Neeti
> >> >>> Even my blood says, B positive
> >> >>> _______________________________________________
> >> >>> Bioperl-l mailing list
> >> >>> Bioperl-l at lists.open-bio.org
> >> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bnbowman at gmail.com  Mon Sep  7 04:17:25 2009
From: bnbowman at gmail.com (Brett Bowman)
Date: Mon, 7 Sep 2009 01:17:25 -0700
Subject: [Bioperl-l] Protein Sequence QSARs
Message-ID: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>

I've been working on a script, for my personal edification, that annotates
protein sequences for QSARs as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially, it converts a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculates the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full-blown module, but I wanted to ask A) whether it had been done before and
I had just missed it, and B) whether anyone other than me would find such a
module useful.
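
The calculation described above might look roughly like this. It is a
sketch under assumptions, not Brett's script: the sub names encode and acc
are made up, the normalisation by (N - lag) is one common form of the
auto-/cross-covariance and may differ from the paper's, and the descriptor
table holds placeholder numbers rather than published values.

```perl
use strict;
use warnings;

# Auto-/cross-covariance between descriptor rows $j and $k at lag $lag:
#   ACC_{j,k}(lag) = sum_{i=0}^{N-lag-1} z[j][i] * z[k][i+lag] / (N - lag)
# $z is a reference to a D-by-N matrix (one row per descriptor).
# ASSUMPTION: this normalisation is one common form, not taken from the post.
sub acc {
    my ($z, $j, $k, $lag) = @_;
    my $n = @{ $z->[$j] };
    die "lag must satisfy 0 <= lag < sequence length\n"
        unless $lag >= 0 && $lag < $n;
    my $sum = 0;
    $sum += $z->[$j][$_] * $z->[$k][ $_ + $lag ] for 0 .. $n - $lag - 1;
    return $sum / ($n - $lag);
}

# Substitute each residue by its descriptor triple, giving a 3-by-N matrix.
sub encode {
    my ($protein, $scale) = @_;
    my @z = ([], [], []);
    for my $aa (split //, uc $protein) {
        my $desc = $scale->{$aa} or die "no descriptors for residue '$aa'\n";
        push @{ $z[$_] }, $desc->[$_] for 0 .. 2;
    }
    return \@z;
}

# Placeholder descriptor values for three residues, NOT published scales.
my %toy_scale = (A => [1, 0, -1], G => [2, 1, 0], K => [0, -1, 3]);
my $z = encode('AGKA', \%toy_scale);
for my $lag (1 .. 2) {
    for my $j (0 .. 2) {
        printf "lag %d  z%d:  %s\n", $lag, $j + 1,
            join '  ', map { sprintf '%6.3f', acc($z, $j, $_, $lag) } 0 .. 2;
    }
}
```

Each lag yields a 3-by-3 block of values (j = k gives the auto terms, j != k
the cross terms), so a sequence of any length maps to a fixed-size vector.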

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics


From neetisomaiya at gmail.com  Mon Sep  7 06:04:06 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 7 Sep 2009 15:34:06 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
Message-ID: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>

I tried using EntrezGene instead of GenBank, as is given in the link
that you sent :

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html

use Bio::DB::EntrezGene;

    my $db = Bio::DB::EntrezGene->new;

    my $seq = $db->get_Seq_by_id(2); # Gene id

    # or ...

    my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
    while ( my $seq = $seqio->next_seq ) {
	    print "id is ", $seq->display_id, "\n";
    }

This doesn't seem to work.


-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> Hello,
> have you tried this?
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>
> Emanuele
>
> On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> I have an input list of gene names (can get gene ids from a local db
>> if required).
>> I need to fetch sequences of these genes. Can someone please guide me
>> as to how this can be done using perl/bioperl?
>>
>> Any help will be deeply appreciated.
>>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From Russell.Smithies at agresearch.co.nz  Mon Sep  7 16:26:04 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 08:26:04 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>

This example code from the wiki _definitely_ works:
http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
=========================================

use strict;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
$db->verbose(1); # print diagnostic messages while fetching

my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig, $from, $to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        print "$contig\t$from\t$to\n";
    }
}

======================================

So if it doesn't work for you, there are a few things you need to check:
* what version of BioPerl are you using?
* are you behind a firewall?
* are you using a proxy?
* do you need to submit a username/password for either of the 2 above?
* turn on 'verbose' messages; they may help you debug


If you're still having problems, get back to me and I'll see if I can help.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Monday, 7 September 2009 10:04 p.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> I tried using EntrezGene instead of GenBank, as is given in the link
> that you sent :
> 
> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
> 
> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
> 
> use Bio::DB::EntrezGene;
> 
>     my $db = Bio::DB::EntrezGene->new;
> 
>     my $seq = $db->get_Seq_by_id(2); # Gene id
> 
>     # or ...
> 
>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>     while ( my $seq = $seqio->next_seq ) {
> 	    print "id is ", $seq->display_id, "\n";
>     }
> 
> This doesn't seem to work.
> 
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> > Hello,
> > have you tried this?
> >
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBan
> k_when_you_have_genomic_coordinates
> >
> > Emanuele
> >
> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I have an input list of gene names (can get gene ids from a local db
> >> if required).
> >> I need to fetch sequences of these genes. Can someone please guide me
> >> as to how this can be done using perl/bioperl?
> >>
> >> Any help will be deeply appreciated.
> >>
> >> Thanks.
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From cjfields at illinois.edu  Mon Sep  7 16:56:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 15:56:03 -0500
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
Message-ID: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>

All,

I have updated the Changes file in bioperl-live in preparation for  
1.6.1.  The initial release will be an alpha, 1.6.0_1 (probably  
landing about mid-week), and based on CPAN tests, etc the final 1.6.1  
release next week.  I'll start merging changes over from trunk  
tonight, fixing last-minute bugs, etc.  I'm running my work using perl  
5.10.1 (64-bit) on Mac and will likely run these remotely on our local  
linux cluster.  Win tests are gladly welcome (this should work on  
Strawberry Perl now).

I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase,  
Robert Buels, Jay Hannah, Heikki, Sendu come to mind) look over the  
file to update it.  There are a few weak spots in there where I didn't  
make the code change or additions, or where a particular bug was fixed  
but not mentioned.  In particular:

1) Google Summer of Code work from Chase (Mark, Chase)
2) GMOD-related fixes (Lincoln, Scott)
3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
4) Tiling, Restriction refactors (Mark)

Also, please make changes to AUTHORS, etc as needed.

Thanks!

chris


From maj at fortinbras.us  Mon Sep  7 17:21:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 7 Sep 2009 17:21:04 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <29B3F9DC91A1422A89629790DD8CC313@NewLife>

aye-aye skipper--- 


From cjfields at illinois.edu  Tue Sep  8 00:23:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:23:26 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
Message-ID: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>

All,

I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
installed by default), and the latest from Bio::Phylo svn:

https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl

At this moment only the Bio::Phylo code from svn is working with  
BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
1.7_RC9 has some kind of versioning issue (again, all tests fail).   
The problem: CPAN will always install 1.6 (the others are RC, so they  
won't be installed unless the full path is used).  Even so, nothing on  
CPAN even works; one must use the latest Bio::Phylo SVN code.

ATM I'm just not seeing how this can be released with 1.6.1 right now,  
unless one of the following occurs:

1) Rutger V. drops a quick non-RC release to CPAN,
2) check for the minimal working Bio::Phylo version and safely skip  
any Nexml-related tests unless proper version is present (not easy  
with a $VERSION like '1.7_RC9'),
3) push Nexml into its own distribution (something we were planning  
on anyway with a number of modules)

As for #3 above, I think it probably belongs in a larger bioperl-phylo  
as Mark had previously proposed.  I'm open to just about any solution.
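
For option 2, a rough sketch of how a version string such as '1.7_RC9' could be compared is shown here. It is only an illustration under stated assumptions: parse_version and compare_versions are hypothetical helper names, the "_RC<n>" suffix format is inferred from the one example above, and a final release is assumed to sort after any release candidate with the same number.

```perl
use strict;
use warnings;

# Parse '1.6' or '1.7_RC9' into numeric parts plus an optional RC number.
sub parse_version {
    my ($v) = @_;
    my ($num, $rc) = $v =~ /^(\d+(?:\.\d+)*)(?:_RC(\d+))?$/
        or die "Unrecognised version string: $v\n";
    return { parts => [ split /\./, $num ], rc => $rc };
}

# Return -1, 0 or 1, like <=>. A final release (no RC suffix) is treated
# as newer than any release candidate with the same numeric parts.
sub compare_versions {
    my ($x, $y) = map { parse_version($_) } @_;
    my @p = @{ $x->{parts} };
    my @q = @{ $y->{parts} };
    while (@p || @q) {
        # missing trailing parts count as 0, so '1.7' equals '1.7.0'
        my $cmp = (shift(@p) || 0) <=> (shift(@q) || 0);
        return $cmp if $cmp;
    }
    my ($rx, $ry) = ($x->{rc}, $y->{rc});
    return 0  if !defined $rx && !defined $ry;
    return 1  if !defined $rx;    # final release beats an RC
    return -1 if !defined $ry;
    return $rx <=> $ry;
}

# In a test script one might then write (the minimum version here is a
# placeholder, not a known-good release):
#   plan skip_all => "Bio::Phylo $min or later required"
#       if compare_versions($Bio::Phylo::VERSION, $min) < 0;
```

This avoids numeric comparison of the raw $VERSION string, which would warn or misbehave on the "_RC9" suffix.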

chris


From neetisomaiya at gmail.com  Tue Sep  8 00:27:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 8 Sep 2009 09:57:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>

I actually want the nucleotide sequence of the gene. I thought
Bio::DB::EntrezGene would give me a sequence object for an Entrez Gene id,
and that calling $seq_obj->seq() on it would return the actual
genomic nucleotide sequence of the gene. But this doesn't happen. I am
able to print the gene symbol using $seq_obj->display_id and do
other things, but I want the gene's nucleotide sequence.

-Neeti
Even my blood says, B positive




From Russell.Smithies at agresearch.co.nz  Tue Sep  8 00:41:47 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 16:41:47 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>

That bit of code gave you the accession, start and end for the sequence, so you just need to download it.
Bio::DB::EUtilities can do that for you.

Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences ?


--Russell

==================
#!perl -w

use strict;
use Bio::DB::EntrezGene;
use Bio::DB::EUtilities;

no warnings 'deprecated';

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
#$db->verbose(1);
my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($acc,$from,$to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        # skip links whose URL does not carry coordinates
        next unless defined $acc;
        print "$acc\t$from\t$to\n";

        # retrieve the sequence
        my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                               -db      => 'nucleotide',
                                               -rettype => 'fasta');
        $fetcher->set_parameters(-id        => $acc,
                                 -seq_start => $from,
                                 -seq_stop  => $to,
                                 -strand    => 1);
        # named $fasta so it does not shadow the outer $seq
        my $fasta = $fetcher->get_Response->content;
        print $fasta;
    }
}

======================



From cjfields at illinois.edu  Tue Sep  8 00:50:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:50:01 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <76A4757A-80C5-400E-8D3B-C68E968FF581@illinois.edu>

Russell,

Any reason you're using "no warnings 'deprecated'" there?  The  
pseudohash warnings should no longer be showing up with EntrezGene  
stuff.  Or is it something else?

chris



From paola_bisignano at yahoo.it  Tue Sep  8 04:55:21 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 8 Sep 2009 08:55:21 +0000 (GMT)
Subject: [Bioperl-l] problem parsing pdb
Message-ID: <741671.67508.qm@web25705.mail.ukl.yahoo.com>

Hi,

I'm in a bit of trouble: I need to parse a PDB file precisely, to extract the chain id and residue id, but I found that in some PDB files the residue number is followed by a letter, probably because the residue was added by the crystallographers and they didn't want to change the residue numbering of the sequence. One example is 1PXX.pdb, which I parsed with the script below. I didn't find any useful suggestion about this in the BioPerl tutorial or the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb = "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc = $structio->next_structure;
# open the output file once, rather than once per residue
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains)
{
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain))
    {
        my $resid = $res->id;
        my $atoms = $struc->get_atoms($res);
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but the output file has an error for ILE 105A and ILE 2105C, because they have a letter that follows the residue number. Can I solve this problem without writing intermediate files?
I need to have the residue id as 105A, not 105.A,
so
 A          ILE-105A
without a point between the number and the letter.
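
One possible way to get that form without an intermediate file is to strip the point from the id string before printing it. This is only a sketch: it assumes the ids look like 'ILE-105.A' as described above, and compact_resid is a hypothetical helper name, not a BioPerl method.

```perl
use strict;
use warnings;

# Turn 'ILE-105.A' into 'ILE-105A'; ids without an insertion code
# (e.g. 'ILE-105') are returned unchanged.
sub compact_resid {
    my ($resid) = @_;
    (my $compact = $resid) =~ s/(-?\d+)\.([A-Za-z])$/$1$2/;
    return $compact;
}

for my $id ('ILE-105', 'ILE-105.A', 'ILE-2105.C') {
    print compact_resid($id), "\n";
}
```

Inside the residue loop of the script, this would be used as my $resid = compact_resid($res->id); before the print.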


Thank you all,

Paola


From lengjingmao at gmail.com  Tue Sep  8 06:13:05 2009
From: lengjingmao at gmail.com (shaohua.fan)
Date: Tue, 8 Sep 2009 12:13:05 +0200
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
Message-ID: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>

Dear all ,

After reading the documentation and source code of Bio::Tools::RepeatMasker in
the BioPerl 1.6.0 documentation, I have a question about updating this module.

The current RepeatMasker output (.out) provides more information
than the module exposes, for example query (left), repeat
(left), perc div, perc del, and perc ins. These may be useful for some users.

I think it would be good to update this module in the latest BioPerl version.
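
As an illustration of the extra columns, here is a minimal sketch that splits one line of a .out file by hand. It is not the module's code: parse_rm_line and the hash keys are hypothetical names, the sample line is only illustrative, and the column order (including the swapped repeat columns on the complement strand) is an assumption based on the usual RepeatMasker .out layout.

```perl
use strict;
use warnings;

# Parse one data line of a RepeatMasker .out file into a hash reference.
# Returns undef for header or blank lines.
sub parse_rm_line {
    my ($line) = @_;
    $line =~ s/^\s+//;
    my @f = split /\s+/, $line;
    return unless @f >= 14 && $f[0] =~ /^\d+$/;

    my %hit;
    @hit{qw(score perc_div perc_del perc_ins query
            q_begin q_end q_left strand repeat class)} = @f[0 .. 10];

    # Assumed layout: on the '+' strand the repeat columns are
    # begin, end, (left); on the complement strand they are
    # (left), end, begin.
    if ($hit{strand} eq '+') {
        @hit{qw(r_begin r_end r_left)} = @f[11 .. 13];
    }
    else {
        @hit{qw(r_left r_end r_begin)} = @f[11 .. 13];
    }

    # the "(left)" values are written in parentheses; drop them
    tr/()//d for @hit{qw(q_left r_left)};
    return \%hit;
}

my $sample = ' 1306 15.6  6.2  0.0 HSU08988  6563  6781  (22462) C  MER7A    DNA/MER2_type    (0)   336   103';
my $hit = parse_rm_line($sample);
print join("\t", @{$hit}{qw(query perc_div perc_del perc_ins q_left r_left)}), "\n";
```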

shaohua


From maj at fortinbras.us  Tue Sep  8 07:00:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 07:00:31 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <AD2517BD451A403D9FF258B9A07569F2@NewLife>

Chris - 
I would like to vote for option #1, since working on Bio::Nexml with
Chase gave me the opp'y to patch Bio::Phylo some (including fixing
an old "fix" of mine), so (IMO) the CPAN version of Bio::Phylo 
would benefit too. Option #2 is ok, since Bio::Nexml has to be
essentially optional for the user anyway, dependent on whether
the user is willing to install Bio::Phylo, a fairly major commitment
 (nexml.t already skips if Bio::Phylo is unavailable); I think it's 
no problem if we make that dependency more stringent. We could
have nexml.t check the svn revision directly, rather than $VERSION,
as a kludge.
cheers MAJ 
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 08, 2009 12:23 AM
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> All,
> 
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
> the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
> installed by default), and the latest from Bio::Phylo svn:
> 
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
> 
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so they  
> won't be installed unless the full path is used).  Even so, nothing on  
> CPAN even works; one must use the latest Bio::Phylo SVN code.
> 
> ATM I'm just not seeing how this can be released with 1.6.1 right now,  
> unless one of the following occurs:
> 
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
> 
> As for #3 above, I think it probably belongs in a larger bioperl-phylo  
> as Mark had previously proposed.  I'm open to just about any solution.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From hlapp at gmx.net  Tue Sep  8 08:16:12 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 8 Sep 2009 08:16:12 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>

I'd suspect that the latest Bio::Phylo changes have been due for CPAN  
release anyway, so unless those are unstable that seems like the  
easiest fix to me.

If the Nexml code works against not yet stable updates to Bio::Phylo,  
it shouldn't be in a BioPerl stable release, right?

	-hilmar

On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:

> All,
>
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of  
> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
> svn:
>
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so  
> they won't be installed unless the full path is used).  Even so,  
> nothing on CPAN even works; one must use the latest Bio::Phylo SVN  
> code.
>
> ATM I'm just not seeing how this can be released with 1.6.1 right  
> now, unless one of the following occurs:
>
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
>
> As for #3 above, I think it probably belongs in a larger bioperl- 
> phylo as Mark had previously proposed.  I'm open to just about any  
> solution.
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep  8 08:02:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 07:02:53 -0500
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
In-Reply-To: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
References: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
Message-ID: <74B85419-6A37-46CE-AAF3-F33013F4A058@illinois.edu>

Patches are welcome for this (or you can submit an enhancement request  
via bugzilla):

http://bugzilla.open-bio.org/

This won't be in the next point release, sorry.

chris

On Sep 8, 2009, at 5:13 AM, shaohua.fan wrote:

> Dear all ,
>
> After reading the documentation and source code of
> Bio::Tools::RepeatMasker in the BioPerl 1.6.0 docs, I have a question about
> updating this module.
>
> The current RepeatMasker output (.out) provides more information than the
> module exposes, for example query (left), repeat (left), perc div,
> perc del, and perc ins. These may be useful for some users.
>
> I think it would be better to update this module in the latest BioPerl
> version.
>
> shaohua


From cjfields at illinois.edu  Tue Sep  8 09:15:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 08:15:31 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
	<CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
Message-ID: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>

On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:

> I'd suspect that the latest Bio::Phylo changes have been due for  
> CPAN release anyway, so unless those are unstable that seems like  
> the easiest fix to me.

My thought as well, just not sure how stable that code is right now.   
Bio::Phylo has been in RC for a while now, correct?

> If the Nexml code works against not yet stable updates to  
> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?

Right.  That should be sorted out first.

I can wait a bit longer for Rutger to respond; there are a few other  
odds and ends that can be worked on in the meantime.  I would like  
to get the alpha out soon and 1.6.1 in the next week or so though.
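
For option 2 in the quoted message (skipping the Nexml tests unless a
working Bio::Phylo is present), the awkward part is comparing a $VERSION
string such as '1.7_RC9'. A rough sketch of one way to do it in plain Perl
follows. The function names are made up for illustration, and it assumes
that versions look like MAJOR.MINOR with an optional _RCn suffix, that
MINOR compares numerically, and that a release candidate sorts before the
final release of the same number. None of this is checked against how
Bio::Phylo actually numbers its releases.

```perl
use strict;
use warnings;

# Rank given to a final (non-RC) release so it sorts after any RC number.
use constant FINAL_RELEASE => 1_000_000;

# Split '1.6' or '1.7_RC9' into (major, minor, rc_rank).
# Returns an empty list if the string does not match the assumed format.
sub parse_phylo_version {
    my ($string) = @_;
    my ($major, $minor, $rc) = $string =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/
        or return;
    return ($major, $minor, defined $rc ? $rc : FINAL_RELEASE);
}

# Three-way comparison: -1, 0 or 1, like the <=> operator.
sub compare_versions {
    my ($left, $right) = @_;
    my @l = parse_phylo_version($left)  or die "cannot parse '$left'\n";
    my @r = parse_phylo_version($right) or die "cannot parse '$right'\n";
    for my $i (0 .. 2) {
        my $cmp = $l[$i] <=> $r[$i];
        return $cmp if $cmp;
    }
    return 0;
}

my $have = '1.7_RC9';   # would come from $Bio::Phylo::VERSION in a real test
my $need = '1.7';       # hypothetical minimum, chosen only for this example
print compare_versions($have, $need) < 0
    ? "skip Nexml tests\n"
    : "run Nexml tests\n";
```

In a test file the same comparison could decide whether to skip the
Nexml-related tests instead of letting them all fail.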

chris

> 	-hilmar
>
> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>
>> All,
>>
>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
>> Nexml code.  In particular, I have tried three versions of  
>> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
>> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
>> svn:
>>
>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>
>> At this moment only the Bio::Phylo code from svn is working with  
>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6  
>> appears to be missing Bio::Phylo::Factory (all Nexml tests fail),  
>> whereas 1.7_RC9 has some kind of versioning issue (again, all tests  
>> fail).  The problem: CPAN will always install 1.6 (the others are  
>> RC, so they won't be installed unless the full path is used).  Even  
>> so, nothing on CPAN even works; one must use the latest Bio::Phylo  
>> SVN code.
>>
>> ATM I'm just not seeing how this can be released with 1.6.1 right  
>> now, unless one of the following occurs:
>>
>> 1) Rutger V. drops a quick non-RC release to CPAN,
>> 2) check for the minimal working Bio::Phylo version and safely skip  
>> any Nexml-related tests unless proper version is present (not easy  
>> with a $VERSION like '1.7_RC9'),
>> 3) push Nexml into it's own distribution (something we were  
>> planning on anyway with a number of modules)
>>
>> As for #3 above, I think it probably belongs in a larger bioperl- 
>> phylo as Mark had previously proposed.  I'm open to just about any  
>> solution.
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>


From maj at fortinbras.us  Tue Sep  8 10:39:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 10:39:07 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
Message-ID: <1CF993D6D3AC435CA77127466D6C072A@NewLife>

I agree with Hilmar-- I have no problem keeping it in the trunk for a while
longer, as I have an addition for dealing with arbitrary non-seq
data using the Population API sitting in bioperl-dev that's nearly
ready, but prob. not before cjf wants to get the release out.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Hilmar Lapp" <hlapp at gmx.net>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" 
<rutgeraldo at gmail.com>
Sent: Tuesday, September 08, 2009 9:15 AM
Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>
>> I'd suspect that the latest Bio::Phylo changes have been due for  CPAN 
>> release anyway, so unless those are unstable that seems like  the easiest fix 
>> to me.
>
> My thought as well, just not sure how stable that code is right now. 
> Bio::Phylo has been in RC for a while now, correct?
>
>> If the Nexml code works against not yet stable updates to  Bio::Phylo, it 
>> shouldn't be in a BioPerl stable release, right?
>
> Right.  That should be sorted out first.
>
> I can wait a bit longer for Rutger to respond; there are a few other  odds and 
> ends that can been worked on in the meantime.  I would like  to get the alpha 
> out soon and 1.6.1 in the next week or so though.
>
> chris
>
>> -hilmar
>>
>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>
>>> All,
>>>
>>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  Nexml 
>>> code.  In particular, I have tried three versions of  Bio::Phylo; the 
>>> default CPAN installation (1.6), the latest CPAN RC  (1.7_RC9, not installed 
>>> by default), and the latest from Bio::Phylo  svn:
>>>
>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>
>>> At this moment only the Bio::Phylo code from svn is working with  BioPerl's 
>>> Nexml modules.  From my local tests Bio::Phylo 1.6  appears to be missing 
>>> Bio::Phylo::Factory (all Nexml tests fail),  whereas 1.7_RC9 has some kind 
>>> of versioning issue (again, all tests  fail).  The problem: CPAN will always 
>>> install 1.6 (the others are  RC, so they won't be installed unless the full 
>>> path is used).  Even  so, nothing on CPAN even works; one must use the 
>>> latest Bio::Phylo  SVN code.
>>>
>>> ATM I'm just not seeing how this can be released with 1.6.1 right  now, 
>>> unless one of the following occurs:
>>>
>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>> 2) check for the minimal working Bio::Phylo version and safely skip  any 
>>> Nexml-related tests unless proper version is present (not easy  with a 
>>> $VERSION like '1.7_RC9'),
>>> 3) push Nexml into it's own distribution (something we were  planning on 
>>> anyway with a number of modules)
>>>
>>> As for #3 above, I think it probably belongs in a larger bioperl- phylo as 
>>> Mark had previously proposed.  I'm open to just about any  solution.
>>>
>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>


From lincoln.stein at gmail.com  Tue Sep  8 10:58:25 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 8 Sep 2009 10:58:25 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <6dce9a0b0909080758q7334a7b2yc69bc86b96118927@mail.gmail.com>

Will do.

Lincoln

On Mon, Sep 7, 2009 at 4:56 PM, Chris Fields <cjfields at illinois.edu> wrote:

> All,
>
> I have updated the Changes file in bioperl-live in preparation for 1.6.1.
>  The initial release will be an alpha, 1.6.0_1 (probably landing about
> mid-week), and based on CPAN tests, etc the final 1.6.1 release next week.
>  I'll start merging changes over from trunk tonight, fixing last-minute
> bugs, etc.  I'm running my work using perl 5.10.1 (64-bit) on Mac and will
> likely run these remotely on our local linux cluster.  Win tests are gladly
> welcome (this should work on Strawberry Perl now).
>
> I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase, Robert
> Buels, Jay Hannah, Heikki, Sendu come to mind) look over the file to update
> it.  There are a few weak spots in there where I didn't make the code change
> or additions, or where a particular bug was fixed but not mentioned.  In
> particular:
>
> 1) Google Summer of Code work from Chase (Mark, Chase)
> 2) GMOD-related fixes (Lincoln, Scott)
> 3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
> 4) Tiling, Restriction refactors (Mark)
>
> Also, please make changes to AUTHORS, etc as needed.
>
> Thanks!
>
> chris


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From cjfields at illinois.edu  Tue Sep  8 11:43:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 10:43:29 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <1CF993D6D3AC435CA77127466D6C072A@NewLife>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
Message-ID: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>

Mark

We can hold it in trunk until the next point release or we start  
splitting things off (whichever is first).

I have a little more time, though, and I'm thinking it would be a good  
idea to get the Nexml code into the wild sooner rather than later for  
users to test out.  Let's see if Rutger responds.

chris

On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:

> I agree with Hilmar-- I have no problem keeping it in the trunk for  
> a while
> longer, as I have an addition for dealing with arbitrary non-seq
> data using the Population API sitting in bioperl-dev that's nearly
> ready, but prob. not before cjf wants to get the release out.
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "Hilmar Lapp" <hlapp at gmx.net>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com 
> >
> Sent: Tuesday, September 08, 2009 9:15 AM
> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>
>
>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>
>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>> CPAN release anyway, so unless those are unstable that seems like   
>>> the easiest fix to me.
>>
>> My thought as well, just not sure how stable that code is right  
>> now. Bio::Phylo has been in RC for a while now, correct?
>>
>>> If the Nexml code works against not yet stable updates to   
>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>
>> Right.  That should be sorted out first.
>>
>> I can wait a bit longer for Rutger to respond; there are a few  
>> other  odds and ends that can been worked on in the meantime.  I  
>> would like  to get the alpha out soon and 1.6.1 in the next week or  
>> so though.
>>
>> chris
>>
>>> -hilmar
>>>
>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>
>>>> All,
>>>>
>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest from  
>>>> Bio::Phylo  svn:
>>>>
>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>
>>>> At this moment only the Bio::Phylo code from svn is working with   
>>>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6   
>>>> appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue (again,  
>>>> all tests  fail).  The problem: CPAN will always install 1.6 (the  
>>>> others are  RC, so they won't be installed unless the full path  
>>>> is used).  Even  so, nothing on CPAN even works; one must use the  
>>>> latest Bio::Phylo  SVN code.
>>>>
>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>> right  now, unless one of the following occurs:
>>>>
>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>> skip  any Nexml-related tests unless proper version is present  
>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>> 3) push Nexml into it's own distribution (something we were   
>>>> planning on anyway with a number of modules)
>>>>
>>>> As for #3 above, I think it probably belongs in a larger bioperl-  
>>>> phylo as Mark had previously proposed.  I'm open to just about  
>>>> any  solution.
>>>>
>>>> chris
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>


From jason at bioperl.org  Tue Sep  8 15:43:39 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 8 Sep 2009 12:43:39 -0700
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
Message-ID: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>

Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects, which are  
perfectly fine to write with Bio::SeqIO::fasta but not with any of the  
rich-seq writers.
Do we think this is a bug or a feature?  The solution is to write the  
PrimarySeq wrapped in a Bio::Seq object.

See this gist -- I would imagine this as additional test lines in  
t/LocalDB/DBFasta.t, but I don't know what we really expect.
http://gist.github.com/183169

I also notice that $seq->description & $seq->display_id don't allow a  
'set' option, which probably makes sense since this is a read-only  
object that came from the DB, but the call basically silently ignores the  
new value.  I often want to set these when I pull seqs from a DB::Fasta db  
and re-format the IDs or description line, so I end up making a new object  
and copying the data over.  I *think* this is really a feature not a bug,  
just wanted to bring it up.

-jason
--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  8 16:20:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 15:20:32 -0500
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
In-Reply-To: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
References: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
Message-ID: <AEE10370-F2B3-4723-9B79-23A5EBF86A51@illinois.edu>

On Sep 8, 2009, at 2:43 PM, Jason Stajich wrote:

> Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
> perfectly fine to write with Bio::SeqIO::fasta but not for any of  
> the rich-seq writers.
> Do we think this is a bug or feature.  The solution is to write the  
> PrimarySeq wrapped in a Bio::Seq object.

I think SeqIO requires any SeqI but doesn't specify anything for a  
simpler PrimarySeqI.  We could add some kind of general convenience  
wrapper in Bio::SeqIO to convert any PrimarySeqI to a requested SeqI  
class and just delegate to write_seq():

   # $seq is a PrimarySeq obtained somehow; $out is a Bio::SeqIO writer
   $out->write_PrimarySeq($seq); # or somesuch (method name is only a proposal)

> See this gist -- I would imagine this as additional test lines in t/ 
> LocalDB/DBFasta.t but I don't know what we really expect?
> http://gist.github.com/183169
>
> I also notice that $seq->description & $seq->display_id don't allow  
> 'set' option - which probably makes sense since this is a read-only  
> object that came from the DB, but it basically silently ignores  
> set.  I often do this if I pull seqs from a DB::Fasta db and re- 
> format the IDs or description line.  So I end up making a new object  
> and copying the data over.  I *think* this is really a feature not a  
> bug, just wanted to bring it up.
>
> -jason
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org

One can already cheat and do a few things.  For instance:

$seq->{id} = 'Foo';
print $seq->display_id; # should be 'Foo'

Won't work for all of them, though, such as description().   
Personally, if one made clear that such changes aren't retained in the  
database but must be redirected as output to another file then I don't  
see a problem (other PrimarySeqI are mutable, so why not these?).

Would there be any real performance hit from making those get/set  
accessors instead of ro getters?  The class is fairly small.
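
To make the difference concrete, here is a small sketch in plain Perl of a
read-only getter that silently drops its argument next to a get/set
accessor. The two packages are hypothetical stand-ins written for this
example, not the actual Bio::PrimarySeq::Fasta code, so treat it as an
illustration of the pattern only.

```perl
use strict;
use warnings;

package My::ReadOnlySeq;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id} }, $class;
}

# Read-only getter: any value passed in is silently dropped.
sub display_id {
    my ($self) = @_;
    return $self->{id};
}

package My::MutableSeq;     # hypothetical stand-in, not a BioPerl class
our @ISA = ('My::ReadOnlySeq');

# Get/set accessor: stores the new value when one is supplied.
sub display_id {
    my ($self, @value) = @_;
    $self->{id} = $value[0] if @value;
    return $self->{id};
}

package main;

my $ro = My::ReadOnlySeq->new(id => 'seq1');
$ro->display_id('Foo');          # ignored by the read-only getter
print $ro->display_id, "\n";     # still seq1
$ro->{id} = 'Foo';               # poking the hash directly, as in the cheat above
print $ro->display_id, "\n";     # now Foo

my $rw = My::MutableSeq->new(id => 'seq1');
$rw->display_id('Bar');          # stored by the get/set accessor
print $rw->display_id, "\n";     # Bar
```

The get/set version only adds one check on the argument list per call;
whether that is measurable in practice would have to be benchmarked.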

chris


From lelbourn at science.mq.edu.au  Mon Sep  7 03:52:04 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Mon, 7 Sep 2009 17:52:04 +1000
Subject: [Bioperl-l] subsection of genbank file
Message-ID: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>

Hi All,

Is there a method or methodology that will produce a fully fledged Seq  
object with all the associated metadata given a start and end  
position? To clarify, I create a sequence object from a genbank file:


****
my $io  = Bio::SeqIO->new(as per usual);

my $seqobj = $io->next_seq();
****
I now want:

my $sub_seqobj = $seqobj between 300 and 2000

where $sub_seqobj is a Seq object (which I appreciate is an  
'aggregate' of objects) too. The "trunc" method only returns a  
PrimarySeq object which lacks all the annotation etc. I've previously  
done this task by iterating through feature by feature and parsing out  
what I needed, but thought there might be a more elegant approach...


Regards,
Liam Elbourne.


From alpapan at googlemail.com  Thu Sep 10 17:14:11 2009
From: alpapan at googlemail.com (Alexie Papanicolaou)
Date: Thu, 10 Sep 2009 22:14:11 +0100
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable
 end is float
Message-ID: <1252617251.6680.16.camel@alexie-desktop>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090910/f222c627/attachment-0002.pl>

From maj at fortinbras.us  Thu Sep 10 23:52:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 10 Sep 2009 23:52:27 -0400
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln ->
	Bio::Locatable end is float
In-Reply-To: <1252617251.6680.16.camel@alexie-desktop>
References: <1252617251.6680.16.camel@alexie-desktop>
Message-ID: <D2C2357D7A81478B965996CF6DDD4AF2@NewLife>

Hi Alexie--
I am either responsible for this weirdness, or have fixed it in
an unreleased version. Anyway,  can you please make a bug
report at http://bugzilla.bioperl.org, and include some relevant
code and real data, and I will have a look.
Thanks a lot- Mark
----- Original Message ----- 
From: "Alexie Papanicolaou" <alpapan at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 10, 2009 5:14 PM
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable end 
is float


> Hello all,
>
> I get the following warning when parsing a fasty34 HSP using Bio::Search
> and then trying to get the alignment using get_aln:
>
> MSG: In sequence CONTIG residue count gives end value
> 565.333333333333.
> Overriding value [565] with value 565.333333333333 for
> Bio::LocatableSeq::end().
> MAEMFKIGDLVWAKMKGFSPWPGLVSNPTKDLKRPTSKKSAQQ/C/VFFLGTNNYAWIEEANIKPYFEYRDRLVKSNKSGAFKDALDAIEEYIKNNGAKFDDPDAEFNRLRESLAEKKESKPKQRKEKRPAHDDNSAKSPKKVRTNSVEADKESVRADSPILSNHSPRKGPASTLLERPTTIVRPLDDSQD
> STACK
> Bio::LocatableSeq::end /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:196
> STACK
> Bio::LocatableSeq::new /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:140
> STACK
> Bio::Search::HSP::FastaHSP::get_aln 
> /usr/local/share/perl/5.8.8/Bio/Search/HSP/FastaHSP.pm:174
>
> The frameshifts (/ and \ ) are causing this recalculation of length to a
> float (which is a bit weird), but it is not fatal for my program. Is this
> intentional?
>
> My immediate problem is actually the warning message itself, which is
> quite annoying if you have hundreds of such sequences. Is there any way to
> turn them off short of commenting out the line in LocatableSeq.pm?
> (Redirecting STDERR wouldn't be desirable for a production script.)
>
> many thanks
> alexie
>
>
> -- 
> Alexie Papanicolaou
> Richard ffrench-Constant group
> CEC-Biology
> Univ. Exeter in Cornwall
> Penryn
> TR10 9EZ
> United Kingdom
>
>


From gmodhelp at googlemail.com  Fri Sep 11 12:40:43 2009
From: gmodhelp at googlemail.com (Dave Clements, GMOD Help Desk)
Date: Fri, 11 Sep 2009 09:40:43 -0700
Subject: [Bioperl-l] CVS to SVN Conversion, 2009/09/15
In-Reply-To: <71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
References: <71ee57c70908241615w6f82abb6p25b0744e8f5fb006@mail.gmail.com>
	<71ee57c70909110935w2147628cq6e6984feb544e6b9@mail.gmail.com>
	<71ee57c70909110936g34612cf0g5a9d83aeee4e0efd@mail.gmail.com>
	<71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
Message-ID: <71ee57c70909110940y921b1dxfec278422d31be7f@mail.gmail.com>

Hello all,

This is a heads up that GMOD (in the form of Rob Buels) will be moving
its SourceForge source code repository from CVS to SVN on September
15, 2009.

If you have checked out and modified any code from that repository,
please commit your updates before 3am, Eastern US, on September 15.

Some important bits:
* All projects will be frozen in CVS and will remain available from CVS.
* No new updates will be allowed in CVS.
* All projects will be moved to Subversion.
* Inactive projects will be moved to a separate archival directory.

See http://gmod.org/wiki/CVS_to_Subversion_Conversion for full details
and a list of active and inactive projects.

Thanks,

Dave C
--
* Please keep responses on the list!
* Was this helpful? Let us know at http://gmod.org/wiki/Help_Desk_Feedback


From jayoung at fhcrc.org  Fri Sep 11 21:11:00 2009
From: jayoung at fhcrc.org (Janet Young)
Date: Fri, 11 Sep 2009 18:11:00 -0700
Subject: [Bioperl-l] tree splice remove nodes
Message-ID: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>

Hi,

I'm having a problem in a script that I'm hoping someone can help me  
figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
object, and it looks like it worked fine.

However, I'm also trying to keep a separate copy of the original  
(unpruned) tree in a different object but that second object seems to  
get pruned as well.

Here's my tree, stored in a file called testtree2.nwk:

(((A,(B,b)),C),D,E);

---------------------------------------
Here's my script:

#!/usr/bin/perl

use warnings;
use strict;
use Bio::TreeIO;

my $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
while (my $tree = $treeIO->next_tree() ) {

       print "\nfound a tree\n\n";
       my @originalleaves = $tree -> get_leaf_nodes();
       foreach my $originalleaf (@originalleaves) {print "original tree has node with id " . $originalleaf->id() . "\n";}

       my $tree2 = $tree;

       my @remove = ("D","E");
       print "\nremoving nodes @remove\n\n";

       $tree2 -> splice(-remove_id => \@remove);
       my @leaves2 = $tree2 -> get_leaf_nodes();
       foreach my $leaf2 (@leaves2) {print "after removing tree2 has node with id " . $leaf2->id() . "\n";}

       print "\n";

       my @originalleavesafter = $tree -> get_leaf_nodes();
       foreach my $leaf3 (@originalleavesafter) {print "after removing original tree has node with id " . $leaf3->id() . "\n";}

}

---------------------------------------


And here's my output:

found a tree

original tree has node with id A
original tree has node with id B
original tree has node with id b
original tree has node with id C
original tree has node with id D
original tree has node with id E

removing nodes D E

after removing tree2 has node with id A
after removing tree2 has node with id B
after removing tree2 has node with id b
after removing tree2 has node with id C

after removing original tree has node with id A
after removing original tree has node with id B
after removing original tree has node with id b
after removing original tree has node with id C


-------------------------

I want to splice the specified nodes out of $tree2 and leave $tree  
untouched, but both $tree and $tree2 seem to be affected by the splice  
operation. Am I failing to understand something about  
references/dereferencing?  I'm not sure if I just haven't figured this out  
right or if it's a bug.  If it looks like a bug, let me know and I'll post  
it to bugzilla.

thanks in advance for any advice,

Janet

-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From maj at fortinbras.us  Fri Sep 11 22:00:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 11 Sep 2009 22:00:53 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
Message-ID: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>

Hi Janet-
The trouble here is that 
$tree2 = $tree
doesn't create an independent copy of the entire
tree data structure; it only copies the reference. 
So, $tree2 and $tree point to the same object. 
The easiest way to get two independent copies 
is probably to read the file twice:

my $treeIO = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
my $tree   = $treeIO->next_tree;
# reopen the file so the second tree is parsed into a separate object
$treeIO    = Bio::TreeIO->new(-file => "testtree2.nwk", -format => 'newick');
my $tree2  = $treeIO->next_tree;

which will create two copies. This is a little kludgy, but 
unfortunately, there doesn't seem to be any easy way to 
rewind the TreeIO object. 

When you want a copy of a complex object, generally 
you need to "clone" it, and there are a variety of modules
you can use to create clones. [It's probably worth adding 
a clone() method to TreeFunctionsI--maybe I'll do that.]
Get the module Clone from CPAN and do

use Clone qw(clone);
....
$tree2 = clone($tree);
...
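
The difference between copying a reference and cloning can be seen
without BioPerl at all. The sketch below uses a toy nested hash in place
of a tree object, and it swaps in dclone() from the core Storable module
rather than Clone, purely so it runs on a stock Perl. Whether dclone()
copes with a real Bio::Tree::Tree object is an assumption that has not
been tested here.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A toy nested structure standing in for a tree object (hypothetical).
my $tree = { id => 'root', children => [ { id => 'A' }, { id => 'D' } ] };

my $alias = $tree;            # copies the reference only
my $copy  = dclone($tree);    # recursively copies the whole structure

# Drop the child with id 'D' through the alias.
$alias->{children} = [ grep { $_->{id} ne 'D' } @{ $alias->{children} } ];

print scalar @{ $tree->{children} }, "\n";   # 1: the original changed too
print scalar @{ $copy->{children} }, "\n";   # 2: the deep copy is untouched
```

The same pattern applies to the script above: clone first, then splice
the clone.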

hope this helps- cheers 
MAJ
----- Original Message ----- 
From: "Janet Young" <jayoung at fhcrc.org>
To: <bioperl-l at lists.open-bio.org>
Sent: Friday, September 11, 2009 9:11 PM
Subject: [Bioperl-l] tree splice remove nodes


> Hi,
> 
> I'm having a problem in a script that I'm hoping someone can help me  
> figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
> object, and it looks like it worked fine.
> 
> However, I'm also trying to keep a separate copy of the original  
> (unpruned) tree in a different object but that second object seems to  
> get pruned as well.
> 
> Here's my tree, stored in a file called testtree2.nwk:
> 
> (((A,(B,b)),C),D,E);
> 
> ---------------------------------------
> Here's my script:
> 
> #!/usr/bin/perl
> 
> use warnings;
> use strict;
> use Bio::TreeIO;
> 
> my $treeIO = new Bio::TreeIO(-file => "testtree2.nwk",
>                              -format => 'newick');
> while (my $tree = $treeIO->next_tree() ) {
> 
>       print "\nfound a tree\n\n";
>       my @originalleaves = $tree -> get_leaf_nodes();
>       foreach my $originalleaf (@originalleaves) {
>           print "original tree has node with id " . $originalleaf->id() . "\n";
>       }
> 
>       my $tree2 = $tree;
> 
>       my @remove = ("D","E");
>       print "\nremoving nodes @remove\n\n";
> 
>       $tree2 -> splice(-remove_id => \@remove);
>       my @leaves2 = $tree2 -> get_leaf_nodes();
>       foreach my $leaf2 (@leaves2) {
>           print "after removing tree2 has node with id " . $leaf2->id() . "\n";
>       }
> 
>       print "\n";
> 
>       my @originalleavesafter = $tree -> get_leaf_nodes();
>       foreach my $leaf3 (@originalleavesafter) {
>           print "after removing original tree has node with id " . $leaf3->id() . "\n";
>       }
> 
> }
> 
> ---------------------------------------
> 
> 
> And here's my output:
> 
> found a tree
> 
> original tree has node with id A
> original tree has node with id B
> original tree has node with id b
> original tree has node with id C
> original tree has node with id D
> original tree has node with id E
> 
> removing nodes D E
> 
> after removing tree2 has node with id A
> after removing tree2 has node with id B
> after removing tree2 has node with id b
> after removing tree2 has node with id C
> 
> after removing original tree has node with id A
> after removing original tree has node with id B
> after removing original tree has node with id b
> after removing original tree has node with id C
> 
> 
> -------------------------
> 
> I want to splice the specified nodes out of $tree2 and leave $tree  
> untouched, but both $tree and $tree2 seem to be affected by the splice  
> operation. Am I failing to understand something about references/ 
> dereferencing?   I'm not sure if I just haven't figured this out right  
> or if it's a bug.  If it looks like a bug let me know and I'll post it  
> to bugzilla.
> 
> thanks in advance for any advice,
> 
> Janet
> 
> -------------------------------------------------------------------
> 
> Dr. Janet Young (Trask lab)
> 
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Avenue N., C3-168,
> P.O. Box 19024, Seattle, WA 98109-1024, USA.
> 
> tel: (206) 667 1471 fax: (206) 667 6524
> email: jayoung  ...at...  fhcrc.org
> 
> http://www.fhcrc.org/labs/trask/
> 
> -------------------------------------------------------------------
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From cjfields at illinois.edu  Sat Sep 12 00:12:06 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 11 Sep 2009 23:12:06 -0500
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <5BE22FC3-06F3-4D31-BB73-8F2C49D46A03@illinois.edu>

On Sep 11, 2009, at 9:00 PM, Mark A. Jensen wrote:

> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two  
> independent copies is probably to read the file twice:
>
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
>
> which will create two copies. This is a little kludgy, but  
> unfortunately, there doesn't seem to be any easy way to rewind the  
> TreeIO object.

You can rewind the filehandle if it's seekable:

my $fh = $treeIO->_fh;
seek($fh,0,0); # or something like that...

Don't use sysseek (doesn't work with buffered IO).
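
A small sketch of the rewind idea on an ordinary buffered filehandle,
using only core Perl. The temporary file stands in for testtree2.nwk;
whether a TreeIO object picks up cleanly after its handle is rewound
is not shown here and would need testing.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a one-line Newick tree to a temporary file (stand-in for testtree2.nwk).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "(((A,(B,b)),C),D,E);\n";
close $out or die "close failed: $!";

open my $fh, '<', $path or die "cannot open $path: $!";

my $first = <$fh>;                          # read the tree line once
seek($fh, 0, 0) or die "seek failed: $!";   # offset 0, whence 0 = start of file
my $second = <$fh>;                         # the same line is read again

close $fh;
print "same line read twice\n" if $first eq $second;
```

seek works with Perl's buffered reads (readline, <$fh>), which is why
it is the right call here rather than sysseek.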

>  When you want a copy of a complex object, generally you need to  
> "clone" it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone()  
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do
>
> use Clone qw(clone);
> ....
> $tree2 = clone($tree);
> ...
>
> hope this helps- cheers MAJ

This normally works with bioperl objects, just not sure about Tree  
(might be worth testing out).

chris


From bix at sendu.me.uk  Sat Sep 12 04:33:22 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 12 Sep 2009 09:33:22 +0100
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <4AAB5CD2.1040903@sendu.me.uk>

Mark A. Jensen wrote:
> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two 
> independent copies is probably to read the file twice:
> 
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
> 
> which will create two copies. This is a little kludgy, but 
> unfortunately, there doesn't seem to be any easy way to rewind the 
> TreeIO object.
> When you want a copy of a complex object, generally you need to "clone" 
> it, and there are a variety of modules
> you can use to create clones. [It's probably worth adding a clone() 
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do

 From my comments in Bio/Tree/TreeFunctionsI.pm:

Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
faults at end of script.

TreeFunctionsI.pm already has the _clone() method. I suppose you could 
add some POD for it, rename it clone() and update the methods that call 
the private method to call the public version instead, Mark.

Janet: just clone your tree object with:
my $tree2 = $tree->_clone();


From maj at fortinbras.us  Sat Sep 12 07:37:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 07:37:37 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <4AAB5CD2.1040903@sendu.me.uk>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
	<4AAB5CD2.1040903@sendu.me.uk>
Message-ID: <1A0B867B64B347A3B23A2F19EAA2A720@NewLife>

Done-- thanks Sendu. I made _clone alias clone, to keep 
from rocking anyone's boat. 
Janet- definitely do  $tree2 = $tree->_clone.

----- Original Message ----- 
From: "Sendu Bala" <bix at sendu.me.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Janet Young" <jayoung at fhcrc.org>; <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 4:33 AM
Subject: Re: [Bioperl-l] tree splice remove nodes


> Mark A. Jensen wrote:
>> Hi Janet-
>> The trouble here is that $tree2 = $tree
>> doesn't create an independent copy of the entire
>> tree data structure. So, $tree2 and $tree
>> essentially point to the same thing. The easiest way to get two 
>> independent copies is probably to read the file twice:
>> 
>> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
>> $tree = $treeIO->next_tree;
>> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
>> $tree2 = $treeIO->next_tree;
>> 
>> which will create two copies. This is a little kludgy, but 
>> unfortunately, there doesn't seem to be any easy way to rewind the 
>> TreeIO object.
>> When you want a copy of a complex object, generally you need to "clone" 
>> it, and there are a variety of modules
>> you can use to create clones. [It's probably worth adding a clone() 
>> method to TreeFunctionsI--maybe I'll do that.]
>> Get the module Clone from CPAN and do
> 
> From my comments in Bio/Tree/TreeFunctionsI.pm:
> 
> Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
> dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
> faults at end of script.
> 
> TreeFunctionsI.pm already has the _clone() method. I suppose you could 
> add some POD for it, rename it clone() and update the methods that call 
> the private method to call the public version instead, Mark.
> 
> Janet: just clone your tree object with:
> my $tree2 = $tree->_clone();
> 
>


From adlai at refenestration.com  Sat Sep 12 11:18:02 2009
From: adlai at refenestration.com (adlai burman)
Date: Sat, 12 Sep 2009 17:18:02 +0200
Subject: [Bioperl-l] Servers
Message-ID: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>

Can anyone suggest a hosting or server provider that actually has  
Bioperl installed?

Thanks,

Adlai


From maj at fortinbras.us  Sat Sep 12 12:45:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 12:45:35 -0400
Subject: [Bioperl-l] Servers
In-Reply-To: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
References: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
Message-ID: <127343EFA5EF4F7CB756586A1B0B210E@NewLife>

I have a public Amazon machine; see http://fortinbras.us/bioperl-max
cheers MAJ
----- Original Message ----- 
From: "adlai burman" <adlai at refenestration.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 11:18 AM
Subject: [Bioperl-l] Servers


> Can anyone suggest a hosting or server provider that actually has  
> Bioperl installed?
> 
> Thanks,
> 
> Adlai


From hartzell at alerce.com  Sat Sep 12 21:35:44 2009
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 12 Sep 2009 18:35:44 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
Message-ID: <19116.19568.26115.542911@already.dhcp.gene.com>


It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
functionally identical.  They seem to trickle down to the same place
and walking through these two requests yields almost identical http
requests: 

  $db->get_Seq_by_version('J00522.1')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n

  $db->get_Seq_by_acc('J00522')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n

The only difference that I can see is that they index into different
sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
sections contain the same information.

I'd like a general purpose tool that does The Right Thing whether
there's a .1 on the end of an identifier or not, and am just trying to
make sure I'm not doing something troublesome.
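
One way such a tool could decide which call to make is sketched
below. The helper name method_for_id is invented for illustration,
and it assumes a versioned identifier always looks like
"<accession>.<digits>"; only the two method names come from
Bio::DB::GenBank.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the Bio::DB::GenBank method name for an identifier.
# Assumes a version suffix is a dot followed by digits at the end of the string.
sub method_for_id {
    my ($id) = @_;
    return $id =~ /\.\d+\z/ ? 'get_Seq_by_version' : 'get_Seq_by_acc';
}

for my $id (qw(J00522.1 J00522)) {
    printf "%-9s -> %s\n", $id, method_for_id($id);
}

# With a real database handle one could then dispatch by name:
#   my $method = method_for_id($id);
#   my $seq    = $db->$method($id);
```

Dispatching on the method name keeps the identifier untouched, so the
request that reaches the server is the same one either call would
have produced.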

Am I correct about the above?

While I'm at it, I think that the comment

  # note that get_Stream_by_version is not implemented

in Bio::DB::GenBank was made obsolete by whoever commented out the

  $self->throw(...)

in get_Stream_by_version in Bio::WebDBSeqI.pm.

I'll happily commit the trivial doc fix if no one shoots down the
idea. (can't help big, might as well help small...).

Thanks,

g.


From maj at fortinbras.us  Sat Sep 12 23:14:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 23:14:06 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
Message-ID: <DBDF390336FB4D8D935E2395E48207D7@NewLife>

Hi All--

[Future announcements/updates will be made on the wiki-
 http://bioperl.org/wiki/Emacs_bioperl-mode --
 put it on your watchlist...see the page for features and install
 info ]

Bioperl-mode (tar r16070) is improved:
- fancy syntax and header highlighting for pod views
- jump to .pm source from pod view (just press 'f')
- full support for multiple paths
  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
  the completion flattens the paths; if you wind up having to 
  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
  and mytweaks/Bio/Seq.pm), completion will let you choose
  the path at the prompt.
- BPMODE_PATH convenience environment 
  variable is read for the search paths
- other stuff I can't remember
- there is a unit test suite (run under Wang Liang's test.el)
  in the dev path

To do this stuff, I've backed off Emacs 21 compatibility; 
it'll bork (nicely) if you have 21. If there are "enough" complaints,
I will relent, but 22 is cool for people like me with the 
elisp disease.

Other technical issues remain; let me know and 
I'll do my best. My goal is to make this something
you can't live without. (And if you're not using
Emacs, are you really living?)

 M-x thanks

Mark


From bill at genenformics.com  Sun Sep 13 11:47:57 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Sun, 13 Sep 2009 08:47:57 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <19116.19568.26115.542911@already.dhcp.gene.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
Message-ID: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>


I would like to make a few comments about get_Seq_by_version and
get_Seq_by_acc. Although both functions use the same NCBI eUtils API, they
are interpreted differently for a Seq_id with version or without version.

1. If the Seq_id has a version, the GenBank ID server will locate the
corresponding GI and emit the correct sequence.
2. If the Seq_id does not have a version, GBDataLoader will try to find
the latest version number for that Seq_id, which is relatively slow, and
the version number the ID server finds may NOT always be the latest.

IMHO, for both efficiency and consistency, the order of preference is:
get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc

Bill


>
> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
> functionally identical.  They seem to trickle down to the same place
> and walking through these two requests yields almost identical http
> requests:
>
>   $db->get_Seq_by_version('J00522.1')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>
>   $db->get_Seq_by_acc('J00522')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>
> The only difference that I can see is that they index into different
> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
> sections contain the same information.
>
> I'd like a general purpose tool that does The Right Thing whether
> there's a .1 on the end of an identifier or not, and am just trying to
> make sure I'm not doing something troublesome.
>
> Am I correct about the above?
>
> While I'm at it, I think that the comment
>
>   # note that get_Stream_by_version is not implemented
>
> in Bio::DB::GenBank was made obsolete by whoever commented out the
>
>   $self->throw(...)
>
> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>
> I'll happily commit the trivial doc fix if no one shoots down the
> idea. (can't help big, might as well help small...).
>
> Thanks,
>
> g.


From maj at fortinbras.us  Sun Sep 13 21:26:57 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 13 Sep 2009 21:26:57 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
In-Reply-To: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
References: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
Message-ID: <CCFD820881654749B1EA479B45A7EA28@NewLife>

Sorry-- just one more tweak--
the latest tar (r16073) eliminates the dependency on pod2text
entirely; source is now parsed for pod directly by an elisp function.
cheers MAJ 
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 11:14 PM
Subject: [Bioperl-l] Emacs bioperl-mode improved release


> Hi All--
> 
> [Future announcements/updates will be made on the wiki-
> http://bioperl.org/wiki/Emacs_bioperl-mode --
> put it on your watchlist...see the page for features and install
> info ]
> 
> Bioperl-mode (tar r16070) is improved:
> - fancy syntax and header highlighting for pod views
> - jump to .pm source from pod view (just press 'f')
> - full support for multiple paths
>  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
>  the completion flattens the paths; if you wind up having to 
>  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
>  and mytweaks/Bio/Seq.pm), completion will let you choose
>  the path at the prompt.
> - BPMODE_PATH convenience environment 
>  variable is read for the search paths
> - other stuff I can't remember
> - there is a unit test suite under test.el of Wang Liang
>  in the dev path
> 
> To do this stuff, I've backed off Emacs 21 compatibility; 
> it'll bork (nicely) if you have 21. If there are "enough" complaints,
> I will relent, but 22 is cool for people like me with the 
> elisp disease.
> 
> Other technical issues remain; let me know and 
> I'll do my best. My goal is to make this something
> you can't live without. (And if you're not using
> Emacs, are you really living?)
> 
> M-x thanks
> 
> Mark
> 


From neetisomaiya at gmail.com  Mon Sep 14 04:22:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 14 Sep 2009 13:52:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>

Thanks a lot. This works for me.

One more question: can you point me to where exactly we can find
the link to this FASTA sequence, which we are retrieving here through
the code, in its actual entry in Entrez Gene on the NCBI website
(http://www.ncbi.nlm.nih.gov/sites/entrez)?

-Neeti
Even my blood says, B positive


On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> That bit of code gave you the accession, start and end for the sequence so you just needed to download it.
> Bio::DB::EUtilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>        if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
>                my ($acc,$from,$to) = $ann->url =~
>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>                print "$acc\t$from\t$to\n";
>
>                # retrieve the sequence
>                my $fetcher = Bio::DB::EUtilities->new(-eutil => 'efetch',
>                                           -db    => 'nucleotide',
>                                           -rettype => 'fasta');
>            $fetcher->set_parameters(-id => $acc,
>                                                -seq_start => $from,
>                                                -seq_stop  => $to,
>                                                -strand    => 1);
>            my $seq = $fetcher->get_Response->content;
>            print $seq;
>
>        }
> }
>
> ======================
>
>> -----Original Message-----
>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>> To: Smithies, Russell
>> Cc: Emanuele Osimo; bioperl-l
>> Subject: Re: [Bioperl-l] need help urgently
>>
>> I actually want the nucleotide sequence of the gene. I thought the
>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>> then the seq method on that $seq_obj->seq() will give me the actual
>> genomic nucleotide sequence of the gene. But this doesnt happen. I am
>> able to print gene symbol using $seq_obj->display_id and able to do
>> other things, but I wanted the gene nucleotide sequence.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>> > This example code from the wiki _definitely_ works:
>> >
>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>> > =========================================
>> >
>> > use strict;
>> > use Bio::DB::EntrezGene;
>> >
>> > my $id = shift or die "Id?\n"; # use a Gene id
>> >
>> > my $db = new Bio::DB::EntrezGene;
>> > $db->verbose(1); ###
>> >
>> > my $seq = $db->get_Seq_by_id($id);
>> >
>> > my $ac = $seq->annotation;
>> >
>> > for my $ann ($ac->get_Annotations('dblink')) {
>> >        if ($ann->database eq "Evidence Viewer") {
>> >                # get the sequence identifier, the start, and the stop
>> >                my ($contig,$from,$to) = $ann->url =~
>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>> >                print "$contig\t$from\t$to\n";
>> >        }
>> > }
>> >
>> > ======================================
>> >
>> > So if it doesn't work for you, there are a few things you need to check:
>> > * what version of BioPerl are you using?
>> > * are you behind a firewall?
>> > * are you using a proxy?
>> > * do you need to submit username/password for either of the 2 above
>> > * turn on 'verbose' messages, it may help you debug
>> >
>> >
>> > If you're still having problems, get back to me and I'll see if I can help.
>> >
>> > --Russell
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>> >> To: Emanuele Osimo; bioperl-l
>> >> Subject: Re: [Bioperl-l] need help urgently
>> >>
>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>> >> that you sent :
>> >>
>> >>
>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>> >>
>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>> >>
>> >> use Bio::DB::EntrezGene;
>> >>
>> >>     my $db = Bio::DB::EntrezGene->new;
>> >>
>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>> >>
>> >>     # or ...
>> >>
>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>> >>     while ( my $seq = $seqio->next_seq ) {
>> >>           print "id is ", $seq->display_id, "\n";
>> >>     }
>> >>
>> >> This doesnt seem to work.
>> >>
>> >>
>> >> -Neeti
>> >> Even my blood says, B positive
>> >>
>> >>
>> >>
>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>> >> > Hello,
>> >> > have you tried this?
>> >> >
>> >>
>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>> >> >
>> >> > Emanuele
>> >> >
>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com>
>> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I have an input list of gene names (can get gene ids from a local db
>> >> >> if required).
>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>> >> >> as to how this can be done using perl/bioperl?
>> >> >>
>> >> >> Any help will be deeply appreciated.
>> >> >>
>> >> >> Thanks.
>> >> >>
>> >> >> -Neeti
>> >> >> Even my blood says, B positive
>> >> >
>> >> >
>> >
>


From cavin.wardcaviness at gmail.com  Sun Sep 13 22:25:51 2009
From: cavin.wardcaviness at gmail.com (Cavin Ward-Caviness)
Date: Sun, 13 Sep 2009 22:25:51 -0400
Subject: [Bioperl-l] Beginner Script Error
Message-ID: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>

I am very new to perl and bioperl and figured I'd start learning by trying
to run a simple script to get BLAST data.  Here is the code I am trying to
run

use Bio::Perl;

$seq = get_sequence('swiss',"ROA1_HUMAN");

# uses the default database - nr in this case
$blast_result = blast_sequence($seq);

write_blast(">roa1.blast",$blast_result);

Instead of creating a file of the BLAST results, I get the following error
message:
Bio::SeqIO: swiss cannot be found.
Exception
Msg: Failed to load module Bio::SeqIO:swiss

It seems as though I may simply be missing the proper module.  I am running
bioperl 1.5.9_4 installed using the Perl Package Manager from the
instructions on the bioperl wiki page.  If I am simply missing a module
please let me know which one it is - and any other helpful modules that
someone in the bioinformatics field is likely to use.
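
A quick way to check whether a given module is actually installed is
sketched below, using only core Perl. The helper name
module_available is invented for illustration, and the three module
names are simply the ones this script would be expected to load;
which of them are present depends on the local installation.

```perl
use strict;
use warnings;

# Return true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SeqIO::swiss -> Bio/SeqIO/swiss.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Bio::Perl Bio::SeqIO Bio::SeqIO::swiss)) {
    printf "%-20s %s\n", $module, module_available($module) ? 'found' : 'MISSING';
}
```

Any module reported as MISSING is one the installation did not
provide, which narrows down what needs to be reinstalled.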

Thanks,
Cavin


From joseguillin at hotmail.com  Mon Sep 14 08:48:28 2009
From: joseguillin at hotmail.com (Jose .)
Date: Mon, 14 Sep 2009 13:48:28 +0100
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print
	$jcmatrix->print_matrix; 
Message-ID: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign.


My code is basically a modification of the code I found at http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html; it is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                            -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                  -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

>A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

>B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

>C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

>D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
          print $jcmatrix;
instead of
          print $jcmatrix->print_matrix;
I get the reference address, as expected: ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.



From maj at fortinbras.us  Mon Sep 14 13:00:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 14 Sep 2009 13:00:24 -0400
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
Message-ID: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
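
For anyone hitting the same message: Perl says "unblessed reference"
when the value is a reference that was never blessed into a class
(an undefined value produces a different error, "on an undefined
value"). A minimal sketch of how to tell the cases apart with core
Scalar::Util follows; the describe helper, the sample data and the
class name My::Matrix are all invented for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Describe what kind of value a call returned.
sub describe {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return 'plain ' . ref($value) . ' reference' if ref($value);
    return 'plain scalar';
}

# Invented stand-ins: a bare array ref, and a ref blessed into a made-up class.
my $plain  = [ [ 0, 0.5 ], [ 0.5, 0 ] ];
my $object = bless { values => $plain }, 'My::Matrix';

print describe($plain),  "\n";
print describe($object), "\n";

# Guard a method call so it only happens on an object that supports it.
# My::Matrix defines no print_matrix, so nothing is called here.
if (blessed($object) && $object->can('print_matrix')) {
    $object->print_matrix;
}
```

Running the same check on the value returned by distance() would show
whether the installed version hands back an object or a bare array
reference.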
MAJ
----- Original Message ----- 
From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Monday, September 14, 2009 8:48 AM
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, 
<GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign.


My code is basically a modification of the code I found in 
http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, 
as it is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                            -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats->distance(-align  => $aln,
                                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

>A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

>B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

>C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

>D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
          print $jcmatrix;
instead of
          print $jcmatrix->print_matrix;
I get the memory reference, as normal===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, 
<GEN0> line 44.

Thank you very much in advance.

Jose G.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Mon Sep 14 13:54:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 14 Sep 2009 10:54:55 -0700
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
Message-ID: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>

Yeah it seems like more of a bioperl problem -- possible that the  
older code didn't recognize 'jukes-cantor' but you can try the  
abbreviation 'jc' -- better to just upgrade tho!

This isn't the cause of the problem but I would also encourage use of  
Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  
function) rather than print_matrix on the matrix itself.
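
For readers unsure what the 'Jukes-Cantor' method computes, the following is a minimal pure-Perl sketch of the standard formula d = -3/4 * ln(1 - 4/3 * p), where p is the proportion of differing sites. It is not the Bio::Align::DNAStatistics implementation; the function names are hypothetical, and skipping every column that contains a gap is an assumption that may differ from what BioPerl does.

```perl
use strict;
use warnings;

# Proportion of differing sites between two aligned sequences,
# skipping any column where either sequence has a gap (assumption).
sub p_distance {
    my ($seq_a, $seq_b) = @_;
    die "sequences must be the same length\n"
        unless length($seq_a) == length($seq_b);
    my ($compared, $different) = (0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $x = uc substr($seq_a, $i, 1);
        my $y = uc substr($seq_b, $i, 1);
        next if $x eq '-' || $y eq '-';
        $compared++;
        $different++ if $x ne $y;
    }
    return $compared ? $different / $compared : undef;
}

# Jukes-Cantor correction: d = -3/4 * ln(1 - 4/3 * p).
# Undefined when p >= 0.75, where the logarithm's argument is not positive.
sub jukes_cantor {
    my ($p) = @_;
    return undef unless defined $p && $p < 0.75;
    return -0.75 * log(1 - (4 / 3) * $p);
}

# The first ten columns of sequences A and B from the original post.
my $p = p_distance('GGTTATCTCA', 'GGATATCTCG');
printf "p = %.3f, JC distance = %.4f\n", $p, jukes_cantor($p);
```

A hand calculation like this on a pair of sequences can serve as a sanity check on whatever matrix the module returns after upgrading.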

-jason

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Mon Sep 14 15:34:52 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 14 Sep 2009 15:34:52 -0400
Subject: [Bioperl-l] Beginner Script Error
In-Reply-To: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
References: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
Message-ID: <deaa866a0909141234p55341bcbhd4f713551180fed4@mail.gmail.com>

On 9/13/09, Cavin Ward-Caviness <cavin.wardcaviness at gmail.com> wrote:

> $seq = get_sequence('swiss',"ROA1_HUMAN");

Well, I haven't looked at the documentation or the source, but the
working code I have that performs a similar function is:
             # database options include: Swissprot, EMBL, GenBank and RefSeq
            $seq_object = get_sequence('swissprot', $seqname);

I think the database names have to match specific strings but may not
need to be case specific.  The sequence names also tend to be specific
to a database's format, so my "general" fetch function catches
exceptions and then tries other databases if, for example, the name
looks like a PDB identifier.  I'm not sure whether there is a library
function which fetches a "general" sequence based on the sequence name
format.  Presumably one could do something like this with some kind of
"prioritized" list of databases to go through, e.g. GenBank, EMBL,
SwissProt, RefSeq, PDB, JDB, JGI, Broad, NCBI, C. elegans, Drosophila,
Yeast, other organism specific databases.

It might be nice if there were a "general" BioPerl function that would
do this based on sequence name format, locality (fetch from the nearest
database), and up-to-dateness.  Ultimately one might like to have a kind
of sequence "rsync" function of the form UpdateSequence(SeqName, prefDb,
last-update-date, update-size, update-md5sum, ...) which would perform
inexpensive network-based updates for gene-sets of interest.  I'm
presuming that many sequence entries in active databases are
undergoing periodic updates and thus one might be interested in weekly
or monthly "local" db updates.
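
The "prioritized list of databases" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: fetch_with_fallback is a hypothetical helper, the fetcher is passed in as a code reference that is assumed to die (or return undef) on failure, and the stand-in fetcher with its placeholder sequence exists only so that the example runs without network access.

```perl
use strict;
use warnings;

# Try each database in priority order and return the first result found.
# $fetch is a code reference taking (database, name); in real use it might
# wrap a call such as get_sequence(), assumed here to die on failure.
sub fetch_with_fallback {
    my ($fetch, $name, @databases) = @_;
    my @errors;
    for my $db (@databases) {
        my $seq = eval { $fetch->($db, $name) };
        return ($seq, $db) if defined $seq;
        push @errors, "$db: " . ($@ || "no result\n");
    }
    die "could not fetch '$name' from any database:\n", @errors;
}

# Hypothetical stand-in fetcher; the sequence string is a placeholder,
# not the real ROA1_HUMAN sequence.
my %fake = ( swissprot => { ROA1_HUMAN => 'PLACEHOLDERSEQ' } );
my $fake_fetch = sub {
    my ($db, $name) = @_;
    die "unknown database '$db'\n" unless exists $fake{$db};
    return $fake{$db}{$name};
};

my ($seq, $db) = fetch_with_fallback($fake_fetch, 'ROA1_HUMAN',
                                     qw(genbank embl swissprot refseq));
print "found in $db: $seq\n";
```

Collecting the per-database error messages makes it easier to see afterwards whether a name was simply unknown everywhere or whether a particular server was unreachable.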

Robert


From robert.bradbury at gmail.com  Tue Sep 15 04:05:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Tue, 15 Sep 2009 04:05:22 -0400
Subject: [Bioperl-l] Genome scanning questions/strategies
Message-ID: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>

I have several applications which require scanning multiple genomes, in some
cases I can get away with scanning the protein sequences, in other cases I
need to scan the mRNA, or in the worst case the DNA sequences themselves.  I
have most of the available genomes on my hard drive but in cases where they
are not complete or undergo frequent revisions, I may need to interface
through the Genbank | Ensembl | JGI (or other?) databases.

Some of the applications are basic counting statistics:
1) How many proteins?
2) How many amino acids in the proteins?
3) What are the species-specific codon frequencies in the coding sequences?
4) What fraction of the genome is ncRNA, junk DNA, etc.?
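
As an illustration of the third counting statistic, a minimal pure-Perl sketch of codon frequency counting is given here. It assumes the input is a coding sequence already in frame from its first base; the function names are hypothetical, and a real pipeline would read the coding sequences from files (for example with Bio::SeqIO) rather than from a literal string.

```perl
use strict;
use warnings;

# Count codons in one coding sequence, reading in frame from position 0.
# Codons containing anything other than A, C, G or T are skipped, as is
# a trailing partial codon.
sub count_codons {
    my ($cds, $counts) = @_;
    $counts ||= {};
    $cds = uc $cds;
    for (my $i = 0; $i + 3 <= length $cds; $i += 3) {
        my $codon = substr($cds, $i, 3);
        next unless $codon =~ /\A[ACGT]{3}\z/;
        $counts->{$codon}++;
    }
    return $counts;
}

# Convert raw counts into relative frequencies that sum to 1.
sub codon_frequencies {
    my ($counts) = @_;
    my $total = 0;
    $total += $_ for values %$counts;
    return {} unless $total;
    return { map { $_ => $counts->{$_} / $total } keys %$counts };
}

my $counts = count_codons('ATGGCCGCCTAA');
my $freq   = codon_frequencies($counts);
printf "%s %.2f\n", $_, $freq->{$_} for sort keys %$freq;
```

Passing the same $counts hash reference to count_codons for every coding sequence in a genome accumulates genome-wide totals before the frequencies are computed.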

Other applications involve some functional analysis, e.g. find all specified
protein domains of interest (presumably some HMM matching or equivalent),
find all signal sequences (nuclear targeting, mitochondrial targeting, ER
targeting, etc.), find all mRNA restriction enzyme cut sites, etc..

Questions are:
1) Are there "remote" functions that use genome center "supercomputers"
(other than say Remote Blast) that can be used for some of these purposes
and are interfaced in some way to BioPerl?
2) Will I incur genome center wrath by running all my queries "remotely"
(i.e. I do the computing, but they handle the database retrieval & network
distribution)?  If not, what is a good "max query frequency"? [I'm on a DSL
line, so I can't push most servers very hard from an I/O standpoint.]

Finally, is there any "archive of experience" documenting the
information systems limitations of various bioinformatics applications?
I.e. for I/O requirements and/or CPU requirements, is it: BLAST <
HMM-domain-searching < Inter-genome-signal-scanning/matching?  This relates
to the question of when home-based bioinformaticians need to begin
considering switching from DSL to Cable to FIOS, and whether
1/3/4/6/8 core machines/clusters can handle the workload.

Thank you,
Robert Bradbury


From neetisomaiya at gmail.com  Tue Sep 15 04:29:02 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 15 Sep 2009 13:59:02 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
	<764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
Message-ID: <764978cf0909150129s69817921j82a9ca112aefe7ae@mail.gmail.com>

When I use Bio::DB::EntrezGene and EUtilities, the accession and
sequence that it returns to me for a gene is the second accession
mentioned in the "Genome Reference Consortium Human Build 37 Primary
Assembly". For example, if we take Entrez Gene id 3630, the code returns
accession NT_009237.18. But I actually want to take the sequence of
the first accession i.e. NC_000011.9.

Please let me know how I could get that. Any help will be great.

-Neeti
Even my blood says, B positive


On Mon, Sep 14, 2009 at 1:52 PM, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> Thanks a lot. This works for me.
>
> I need help with one more thing: can you point me to where exactly we can find
> the link to this FASTA sequence, that we are retrieving here through
> the code, in its actual entry in Entrez Gene in the NCBI website
> (http://www.ncbi.nlm.nih.gov/sites/entrez)
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
>> That bit of code gave you the accession, start and end for the sequence so you just needed to download it.
>> Bio::DB::Eutilities can do that for you.
>>
>> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>>
>>
>>
>> --Russell
>>
>> ==================
>> #!perl -w
>>
>> use strict;
>> use Bio::DB::EntrezGene;
>> use Bio::DB::EUtilities;
>>
>> no warnings 'deprecated';
>>
>> my $id = shift or die "Id?\n"; # use a Gene id
>>
>> my $db = new Bio::DB::EntrezGene;
>> #$db->verbose(1);
>> my $seq = $db->get_Seq_by_id($id);
>>
>> my $ac = $seq->annotation;
>>
>> for my $ann ($ac->get_Annotations('dblink')) {
>>     if ($ann->database eq "Evidence Viewer") {
>>         # get the sequence identifier, the start, and the stop
>>         my ($acc, $from, $to) = $ann->url =~
>>             /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>         print "$acc\t$from\t$to\n";
>>
>>         # retrieve the sequence
>>         my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>>                                                -db      => 'nucleotide',
>>                                                -rettype => 'fasta');
>>         $fetcher->set_parameters(-id        => $acc,
>>                                  -seq_start => $from,
>>                                  -seq_stop  => $to,
>>                                  -strand    => 1);
>>         my $seq = $fetcher->get_Response->content;
>>         print $seq;
>>     }
>> }
>>
>> ======================
>>
>>> -----Original Message-----
>>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>>> To: Smithies, Russell
>>> Cc: Emanuele Osimo; bioperl-l
>>> Subject: Re: [Bioperl-l] need help urgently
>>>
>>> I actually want the nucleotide sequence of the gene. I thought the
>>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>>> then the seq method on that $seq_obj->seq() will give me the actual
>>> genomic nucleotide sequence of the gene. But this doesnt happen. I am
>>> able to print gene symbol using $seq_obj->display_id and able to do
>>> other things, but I wanted the gene nucleotide sequence.
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>>> > This example code from the wiki _definitely_ works:
>>> >
>>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> > =========================================
>>> >
>>> > use strict;
>>> > use Bio::DB::EntrezGene;
>>> >
>>> > my $id = shift or die "Id?\n"; # use a Gene id
>>> >
>>> > my $db = new Bio::DB::EntrezGene;
>>> > $db->verbose(1); ###
>>> >
>>> > my $seq = $db->get_Seq_by_id($id);
>>> >
>>> > my $ac = $seq->annotation;
>>> >
>>> > for my $ann ($ac->get_Annotations('dblink')) {
>>> >        if ($ann->database eq "Evidence Viewer") {
>>> >                # get the sequence identifier, the start, and the stop
>>> >                my ($contig,$from,$to) = $ann->url =~
>>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>> >                print "$contig\t$from\t$to\n";
>>> >        }
>>> > }
>>> >
>>> > ======================================
>>> >
>>> > So if it doesn't work for you, there are a few things you need to check:
>>> > * what version of BioPerl are you using?
>>> > * are you behind a firewall?
>>> > * are you using a proxy?
>>> > * do you need to submit username/password for either of the 2 above
>>> > * turn on 'verbose' messages, it may help you debug
>>> >
>>> >
>>> > If you're still having problems, get back to me and I'll see if I can help.
>>> >
>>> > --Russell
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>>> >> To: Emanuele Osimo; bioperl-l
>>> >> Subject: Re: [Bioperl-l] need help urgently
>>> >>
>>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>>> >> that you sent :
>>> >>
>>> >>
>>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>>> >>
>>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>>> >>
>>> >> use Bio::DB::EntrezGene;
>>> >>
>>> >>     my $db = Bio::DB::EntrezGene->new;
>>> >>
>>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>>> >>
>>> >>     # or ...
>>> >>
>>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>>> >>     while ( my $seq = $seqio->next_seq ) {
>>> >>           print "id is ", $seq->display_id, "\n";
>>> >>     }
>>> >>
>>> >> This doesnt seem to work.
>>> >>
>>> >>
>>> >> -Neeti
>>> >> Even my blood says, B positive
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>> >> > Hello,
>>> >> > have you tried this?
>>> >> >
>>> >>
>>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>>> >> >
>>> >> > Emanuele
>>> >> >
>>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I have an input list of gene names (can get gene ids from a local db
>>> >> >> if required).
>>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>>> >> >> as to how this can be done using perl/bioperl?
>>> >> >>
>>> >> >> Any help will be deeply appreciated.
>>> >> >>
>>> >> >> Thanks.
>>> >> >>
>>> >> >> -Neeti
>>> >> >> Even my blood says, B positive
>>> >> >
>>> >> >
>>
>


From cjfields at illinois.edu  Tue Sep 15 15:07:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 15 Sep 2009 14:07:40 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
	<4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
Message-ID: <DE7BC2E3-F983-447F-86AD-34BFEA3B232A@illinois.edu>

I don't see an update to Bio::Phylo on CPAN yet, so I'm assuming we  
will leave Nexml off the 1.6.1 alpha for now.  I'll likely be  
releasing it later today or tomorrow to CPAN.
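
For reference, option 2 in the quoted message below (checking for a minimal working version when $VERSION looks like '1.7_RC9') could be approached along these lines. This is only a sketch under stated assumptions: parse_version and version_at_least are hypothetical helpers, the numeric part is treated as a decimal number, and a final release is assumed to rank above any release candidate of the same number.

```perl
use strict;
use warnings;

# Split a version such as '1.7_RC9' into its decimal part and its release
# candidate number. A final release (no _RC suffix) gets an infinite RC
# number so that it ranks above every candidate of the same version.
sub parse_version {
    my ($version) = @_;
    return unless defined $version
        && $version =~ /\A(\d+\.\d+)(?:_RC(\d+))?\z/;
    my ($numeric, $rc) = ($1, $2);
    return [ $numeric + 0, defined $rc ? $rc : 9**9**9 ];
}

# True if the installed version is at least the required minimum.
# An installed version that cannot be parsed counts as too old.
sub version_at_least {
    my ($have, $want) = @_;
    my $h = parse_version($have) or return 0;
    my $w = parse_version($want)
        or die "unparseable minimum version '$want'\n";
    for my $i (0 .. 1) {
        my $cmp = $h->[$i] <=> $w->[$i];
        return $cmp > 0 ? 1 : 0 if $cmp;
    }
    return 1;    # identical versions
}

for my $have (qw(1.6 1.7_RC9 1.7)) {
    printf "%-8s meets 1.7_RC9? %s\n", $have,
        version_at_least($have, '1.7_RC9') ? 'yes' : 'no';
}
```

A test file could use a check like this to skip the Nexml tests when the installed version is too old, although it would not help in the situation described below, where only the svn code works.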

chris

On Sep 8, 2009, at 10:43 AM, Chris Fields wrote:

> Mark
>
> We can hold it in trunk until the next point release or we start  
> splitting things off (whichever is first).
>
> I have a little more time, though, and I'm thinking it would be a  
> good idea to get the Nexml code into the wild (sooner than later)  
> for users to test out.  Let's see if Rutger responds.
>
> chris
>
> On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:
>
>> I agree with Hilmar-- I have no problem keeping it in the trunk for  
>> a while
>> longer, as I have an addition for dealing with arbitrary non-seq
>> data using the Population API sitting in bioperl-dev that's nearly
>> ready, but prob. not before cjf wants to get the release out.
>> ----- Original Message -----
>> From: "Chris Fields" <cjfields at illinois.edu>
>> To: "Hilmar Lapp" <hlapp at gmx.net>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com>
>> Sent: Tuesday, September 08, 2009 9:15 AM
>> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>>
>>
>>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>>
>>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>>> CPAN release anyway, so unless those are unstable that seems  
>>>> like  the easiest fix to me.
>>>
>>> My thought as well, just not sure how stable that code is right  
>>> now. Bio::Phylo has been in RC for a while now, correct?
>>>
>>>> If the Nexml code works against not yet stable updates to   
>>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>>
>>> Right.  That should be sorted out first.
>>>
>>> I can wait a bit longer for Rutger to respond; there are a few  
>>> other  odds and ends that can be worked on in the meantime.  I  
>>> would like  to get the alpha out soon and 1.6.1 in the next week  
>>> or so though.
>>>
>>> chris
>>>
>>>> -hilmar
>>>>
>>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest  
>>>>> from Bio::Phylo  svn:
>>>>>
>>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>>
>>>>> At this moment only the Bio::Phylo code from svn is working  
>>>>> with  BioPerl's Nexml modules.  From my local tests Bio::Phylo  
>>>>> 1.6  appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue  
>>>>> (again, all tests  fail).  The problem: CPAN will always install  
>>>>> 1.6 (the others are  RC, so they won't be installed unless the  
>>>>> full path is used).  Even  so, nothing on CPAN even works; one  
>>>>> must use the latest Bio::Phylo  SVN code.
>>>>>
>>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>>> right  now, unless one of the following occurs:
>>>>>
>>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>>> skip  any Nexml-related tests unless proper version is present  
>>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>>> 3) push Nexml into its own distribution (something we were   
>>>>> planning on anyway with a number of modules)
>>>>>
>>>>> As for #3 above, I think it probably belongs in a larger  
>>>>> bioperl- phylo as Mark had previously proposed.  I'm open to  
>>>>> just about any  solution.
>>>>>
>>>>> chris
>>>>
>>>> -- 
>>>> ===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>>
>>>>
>>>>
>>>
>>>
>>
>


From cjfields at illinois.edu  Wed Sep 16 08:55:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 07:55:56 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
Message-ID: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>

Bill, George,

It's worth clarifying the docs on these and adding a TODO for them  
(and test cases!), but I tend to agree.  I believe, re: version, we  
can possibly use Bio::DB::SeqVersion to grab the right one, but it'll  
need further investigation.

As for generic accession w/o version, efetch does support it but it  
does have problems (pulling up more than one sequence in rare cases,  
for instance).
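
Until the docs are clarified, a caller who wants to do "The Right Thing" with or without a version suffix could pick the method from the shape of the identifier. The following is a minimal sketch, not part of Bio::DB::GenBank: lookup_method_for is a hypothetical helper, the patterns are a simplifying assumption about identifier formats, and 12345 is a made-up GI number used only for illustration.

```perl
use strict;
use warnings;

# Decide which lookup method suits an identifier, judging only by its
# shape: all digits -> GI number, trailing ".N" -> versioned accession,
# anything else -> bare accession.
sub lookup_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /\A\d+\z/;
    return 'get_Seq_by_version' if $id =~ /\A\S+\.\d+\z/;
    return 'get_Seq_by_acc';
}

for my $id (qw(J00522.1 J00522 12345)) {
    printf "%-10s -> %s\n", $id, lookup_method_for($id);
}

# With a real Bio::DB::GenBank object in $db, the result could be used as:
#   my $method = lookup_method_for($id);
#   my $seq    = $db->$method($id);
```

This keeps the preference order Bill suggests below (GI, then versioned accession, then bare accession) in one place.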

chris

On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:

> I would like to make a few comments about get_Seq_by_version and
> get_Seq_by_acc. Although both functions use the same NCBI eUtils  
> API, they
> are interpreted differently for a Seq_id with version or without  
> version.
>
> 1. If the Seq_id has a version, GenBank ID server will locate
> corresponding GI and emit the correct sequence.
> 2. If the Seq_id does not have a version, GBDataLoader will try to find
> the latest version number for that Seq_id, which is relatively slower, and
> the version number the ID server finds may NOT always be the latest.
>
> IMHO, for both efficiency and consistency,
> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>
> Bill
>
>
>>
>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>> functionally identical.  They seem to trickle down to the same place
>> and walking through these two requests yields almost identical http
>> requests:
>>
>>  $db->get_Seq_by_version('J00522.1')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>
>>  $db->get_Seq_by_acc('J00522')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>
>> The only difference that I can see is that they index into different
>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>> sections contain the same information.
>>
>> I'd like a general purpose tool that does The Right Thing whether
>> there's a .1 on the end of an identifier or not, and am just trying  
>> to
>> make sure I'm not doing something troublesome.
>>
>> Am I correct about the above?
>>
>> While I'm at it, I think that the comment
>>
>>  # note that get_Stream_by_version is not implemented
>>
>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>
>>  $self->throw(...)
>>
>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>
>> I'll happily commit the trivial doc fix if no one shoots down the
>> idea. (can't help big, might as well help small...).
>>
>> Thanks,
>>
>> g.
>>
>
>


From cjfields at illinois.edu  Wed Sep 16 09:22:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 08:22:00 -0500
Subject: [Bioperl-l] Genome scanning questions/strategies
In-Reply-To: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
References: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
Message-ID: <8674BA8B-ACCC-4C7D-989E-3532C0659A3F@illinois.edu>

On Sep 15, 2009, at 3:05 AM, Robert Bradbury wrote:

> I have several applications which require scanning multiple genomes,  
> in some
> cases I can get away with scanning the protein sequences, in other  
> cases I
> need to scan the mRNA, or in the worst case the DNA sequences  
> themselves.  I
> have most of the available genomes on my hard drive but in cases  
> where they
> are not complete or undergo frequent revisions, I may need to  
> interface
> through the Genbank | Ensembl | JGI (or other?) databases.
>
> Some of the applications are basic counting statistics:
> 1) How many proteins?
> 2) How many amino acids in the proteins?
> 3) What are the species-specific codon frequencies in the coding sequences?
> 4) What fraction of the genome is ncRNA, junk DNA, etc.?
>
> Other applications involve some functional analysis, e.g. find all  
> specified
> protein domains of interest (presumably some HMM matching or  
> equivalent),
> find all signal sequences (nuclear targeting, mitochondrial  
> targeting, ER
> targeting, etc.), find all mRNA restriction enzyme cut sites, etc..
>
> Questions are:
> 1) Are there "remote" functions that use genome center  
> "supercomputers"
> (other than say Remote Blast) that can be used for some of these  
> purposes
> and are interfaced in some way to BioPerl?

Re: remote tasks, there are a few tools for that.  See  
Bio::Tools::Analysis modules for ones that access remote servers, or  
the HOWTO:

http://www.bioperl.org/wiki/HOWTO:Simple_web_analysis

Setting up modules for these services can be risky, though, as we have  
no control over the continued evolution of the remote servers in  
question.  For instance, we had a set of Pise modules (around 100 I  
think) for remotely accessing services at any Pise server; however,  
these are now obsolete in favor of Mobyle.  I have long thought of  
setting something up to interface with either that service or Galaxy  
(which may be a more stable alternative), just haven't had the time.

Re databases: we have access to NCBI, EMBL, UniProt, and many others.   
NCBI eutils are available via Bio::DB::EUtilities.  You can use the  
Ensembl perl API for accessing Ensembl (including Compara and others),  
and Mark Jensen added Bio::DB::HIV for accessing HIV database  
information at LANL HIV Sequence Database.  These were all working  
with bioperl 1.6 last I tried (ensembl's API is separate and available  
from their website).

We don't have much beyond that, primarily b/c most other centers are  
very particular when queried remotely and will block IPs that spam  
their servers w/o an adequate timeout.  That's completely  
understandable from a webadmin perspective (think: possible denial of  
service attack).

> 2) Will I incur genome center wrath by running all my queries  
> "remotely"
> (i.e. I do the computing, but they handle the database retreival &  
> network
> distribution)?  If not, what is a good "max query frequency"? [I'm  
> on a DSL
> line, so I can't push most servers very hard from an I/O standpoint.]

You may, if you ignore a site's requested delay between queries.  UCSC
and NCBI have both been known to block IPs, but the limits differ
considerably between the two (NCBI recently set theirs at three queries
per second, whereas I last heard UCSC asked for one query per 30 seconds).

The best thing to do is check the documentation for the site in  
question or contact the webadmin to see if there is a requested  
timeout period.
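
A minimal sketch of spacing out remote queries, assuming a limit of roughly three per second as mentioned above for NCBI. The helper make_throttle and the placeholder IDs are made up for illustration, and the actual remote call is left as a comment:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: returns a closure that blocks until at least
# $min_interval seconds have passed since the previous call.
sub make_throttle {
    my ($min_interval) = @_;
    my $last = 0;
    return sub {
        my $wait = $last + $min_interval - time();
        sleep($wait) if $wait > 0;
        $last = time();
        return;
    };
}

my $throttle = make_throttle( 1 / 3 );    # at most ~3 requests per second
my @ids      = qw(id1 id2 id3);           # placeholder identifiers
my @fetched;
for my $id (@ids) {
    $throttle->();
    # the real remote request for $id would go here
    push @fetched, $id;
}
print scalar(@fetched), " queries issued\n";
```

The interval is the only thing to change per site; check each site's documentation for the value it actually asks for.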

> Finally, is there any "archive of experience" documenting the various
> information systems limitations on various bioinformatics  
> applications?
> I.e. for I/O requirements and/or CPU requirements, is: BLAST <
> HMM-domain-searching < Inter-genome-signal-scanning/matching?   
> Relates to
> the question of when home based bioinformaticians need to begin  
> considering
> switching from DSL to Cable to FIOS and/or 1/3/4/6/8 core machines/ 
> clusters
> can handle the workload.
>
> Thank you,
> Robert Bradbury

On that I'm not sure, but I would tend to think they don't want you  
taxing their local servers so there probably is some prioritization of  
tasks.

From my perspective, if I were a home-based bioinformatician I would
look seriously at cloud computing for most high-end tasks (Mark has
even set up one for bioperl, bioperl-max).  It has a cost, but it's
very reasonable compared with the cost of setting up a local cluster,
maintenance, repairs, etc.  In fact, we have been putting serious
thought into testing that direction instead of putting money into
another high-cost local cluster, which would be obsolete in, say, 3-4
years, or once we get Blue Waters in a couple of years.

chris


From jajams at utu.fi  Wed Sep 16 06:04:18 2009
From: jajams at utu.fi (Joonas Jämsen)
Date: Wed, 16 Sep 2009 13:04:18 +0300
Subject: [Bioperl-l] problem with a script
Message-ID: <fb44a91e1ccd0.4ab0e252@utu.fi>

Hi,

I'm trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."


#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use Bio::SearchIO
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $in->next_result ) {
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp-evalue<=10 ) {
             while ( my $hsp = $hit->next_hsp ) {
                 print $hit->accession(), "\n";
         }
     }
 }

Could someone tell me what is wrong?

Thanks.


From maj at fortinbras.us  Wed Sep 16 11:18:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 11:18:26 -0400
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <A9C32C43FB5C46FD9DC320A4DD325104@NewLife>

Hi Joonas-- 

Put a semicolon after "use Bio::SearchIO" in line 2.
If that doesn't work, then the error suggests that $searchio is undefined
because the parser failed for some reason.
You could try
 my $searchio = Bio::SearchIO->new(-format  => 'hmmer',
                                   -file    => '/wrk/xxxx/hmm/hmmsearch_nr.out',
                                   -verbose => 1);
to get more detailed error messages; they may point you to the issue.

cheers MAJ



From Kevin.M.Brown at asu.edu  Wed Sep 16 11:16:51 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 16 Sep 2009 08:16:51 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <1A4207F8295607498283FE9E93B775B4063D4CB6@EX02.asurite.ad.asu.edu>

That's because the variable $in isn't defined, just like the error says. You are setting $searchio to be your input object, but not using it.

#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use strict; #<-- this helps to find those pesky undeclared variables
use warnings;
use Bio::SearchIO;
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $searchio->next_result ) { # <-- changed this line
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp = $hit->next_hsp ) {
             # <-- E-value test moved inside the HSP loop, where $hsp exists
             print $hit->accession(), "\n" if $hsp->evalue <= 10;
         }
     }
 }


Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University  



From rmb32 at cornell.edu  Wed Sep 16 11:05:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 08:05:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <4AB0FEAC.50104@cornell.edu>

1.) You need to use strict.  Always have use strict at the top of your 
code.  That would have caught this error.
2.) The proximate problem here is that your searchio object is called 
$searchio, while you are calling $in->next_result.  You want 
$searchio->next_result instead.

Rob



From bill at genenformics.com  Wed Sep 16 13:22:56 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Wed, 16 Sep 2009 10:22:56 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
Message-ID: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>


>
> As for generic accession w/o version, efetch does support it but it
> does have problems (pulling up more than one sequence in rare cases,
> for instance).
>

This is probably because NCBI ID servers are not completely synchronized
or are in the process of synchronization. get_Seq_by_acc is not as safe as
other functions.
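
A rough sketch of picking the lookup by the shape of the identifier, so that a versioned ID never falls through to the less reliable accession-only path. The helper lookup_method_for is made up for illustration; the method names it returns are the Bio::DB::GenBank ones discussed in this thread, and the real call is only shown as a comment since it needs BioPerl and network access:

```perl
use strict;
use warnings;

# Hypothetical helper: choose a Bio::DB::GenBank method name from the
# form of the identifier.
sub lookup_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /^\d+$/;         # bare GI number
    return 'get_Seq_by_version' if $id =~ /^\w+\.\d+$/;    # accession.version
    return 'get_Seq_by_acc';                               # unversioned accession
}

for my $id (qw(J00522.1 J00522)) {
    my $method = lookup_method_for($id);
    print "$id -> $method\n";
    # With BioPerl installed, the call would look like:
    #   my $seq = Bio::DB::GenBank->new->$method($id);
}
```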

Bill

>
> On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:
>
>> I would like to make a few comments about get_Seq_by_version and
>> get_Seq_by_acc. Although both functions use the same NCBI eUtils
>> API, they
>> are interpreted differently for a Seq_id with version or without
>> version.
>>
>> 1. If the Seq_id has a version, GenBank ID server will locate
>> corresponding GI and emit the correct sequence.
>> 2. If the Seq_id does not have a version, GBDataLoader will try to
>> find the latest version number for that Seq_id, which is relatively
>> slower, and the version number the ID server finds may NOT always be
>> the latest.
>>
>> IMHO, for both efficiency and consistency,
>> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>>
>> Bill
>>
>>
>>>
>>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>>> functionally identical.  They seem to trickle down to the same place
>>> and walking through these two requests yields almost identical http
>>> requests:
>>>
>>>  $db->get_Seq_by_version('J00522.1')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>>
>>>  $db->get_Seq_by_acc('J00522')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>>
>>> The only difference that I can see is that they index into different
>>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>>> sections contain the same information.
>>>
>>> I'd like a general purpose tool that does The Right Thing whether
>>> there's a .1 on the end of an identifier or not, and am just trying
>>> to
>>> make sure I'm not doing something troublesome.
>>>
>>> Am I correct about the above?
>>>
>>> While I'm at it, I think that the comment
>>>
>>>  # note that get_Stream_by_version is not implemented
>>>
>>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>>
>>>  $self->throw(...)
>>>
>>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>>
>>> I'll happily commit the trivial doc fix if no one shoots down the
>>> idea. (can't help big, might as well help small...).
>>>
>>> Thanks,
>>>
>>> g.


From cjfields at illinois.edu  Wed Sep 16 13:29:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 12:29:40 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
	<6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
Message-ID: <B293F929-5714-4840-8FAD-7366F7C36137@illinois.edu>


On Sep 16, 2009, at 12:22 PM, bill at genenformics.com wrote:

>
>>
>> As for generic accession w/o version, efetch does support it but it
>> does have problems (pulling up more than one sequence in rare cases,
>> for instance).
>>
>
> This is probably because NCBI ID servers are not completely  
> synchronized
> or are in the process of synchronization. get_Seq_by_acc is not as  
> safe as
> other functions.
>
> Bill

Right, but unfortunately it's necessary as the default in most cases  
is to grab/display the accession, not the UID.  For instance, BLAST  
output must be specifically flagged to display the GI.

This is an instance where documentation would be a good idea to  
indicate the problem.  I think I have done that but I'll double-check.

chris


From rmb32 at cornell.edu  Wed Sep 16 15:04:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:04:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB1356D.4050307@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi>
Message-ID: <4AB136B0.6050304@cornell.edu>

You should also 'use warnings' at the top of all code.  That would have 
caught THIS error.

You are missing a comma after ....nr.out'
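
Why the missing comma ends up as "Could not open 0" can be seen in a small sketch that needs only core Perl (the array names are made up for illustration). Without the comma, Perl reads the path and -verbose as a numeric subtraction of two non-numeric strings, which is 0, so -file receives 0:

```perl
use strict;
use warnings;

my @broken;
{
    # Silenced here only to keep the demo output clean; with warnings on,
    # Perl complains that the arguments to the subtraction are not numeric.
    no warnings 'numeric';
    # Missing comma: parsed as '/wrk/...' - 'verbose', i.e. 0 - 0.
    @broken = ( -file => '/wrk/xxxx/hmm/hmmsearch_nr.out' -verbose => 1 );
}

# With the comma, -file gets the path and -verbose gets 1.
my @fixed = ( -file => '/wrk/xxxx/hmm/hmmsearch_nr.out', -verbose => 1 );

print "broken: @broken\n";
print "fixed:  @fixed\n";
```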

Rob

Joonas Jämsen wrote:
> Thanks. Im still getting errors. I have no idea what the error means. It 
> says:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Could not open 0: No such file or directory
> STACK: Error::throw
> STACK: Bio::Root::Root::throw 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/Root.pm:357 
> 
> STACK: Bio::Root::IO::_initialize_io 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:310 
> 
> STACK: Bio::Root::IO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:223 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:145 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:177 
> 
> STACK: parser.pl:7
> -----------------------------------------------------------
> 
> And the code im using seems ok now:
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> 
> use strict;
> use Bio::SearchIO;
> 
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out' -verbose=>1);
> while ( my $result = $searchio->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp = $hit->evalue<=10 ) {
>                 while ( my $hsp = $hit->next_hsp ) {
>                         print $hit->accession(), "\n";
>             }
>         }
>     }
> }
> 
> -J.
> 


-- 
Robert Buels
Bioinformatics Analyst, Sol Genomics Network
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca, NY  14853
Tel: 503-889-8539
rmb32 at cornell.edu
http://www.sgn.cornell.edu


From rmb32 at cornell.edu  Wed Sep 16 15:23:27 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:23:27 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB13864.6070707@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi> <4AB136B0.6050304@cornell.edu>
	<4AB13864.6070707@utu.fi>
Message-ID: <4AB13B2F.5060502@cornell.edu>

Your report may not have accessions, try using name() instead of 
accession().


From abhishek.vit at gmail.com  Wed Sep 16 16:13:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 16:13:33 -0400
Subject: [Bioperl-l] About FASTQ parser
Message-ID: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>

Hi Chris

I remember seeing a recent email about a new bioperl fastq parser. Is it
part of the bioperl 1.6 dist? I installed one, and based on the doc
here (http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
I am a bit lost.

I see two methods there: using Bio::SeqIO::fastq and
Bio::Seq::Quality. Do both return the same data, with the latter
just being faster?

This is not to offend any developer, but a small example or two in the
HOWTOs would help a lot.

The current example (copied below) is not working. I guess it is based
on a previous version of the code.

# grabs the FASTQ parser, specifies the Illumina variant
  my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
                           -file      => 'mydata.fq');


My basic requirement is to read each record in a fastq file and split it
into header, read and quality.
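
For reference, that split can be sketched in plain Perl without the BioPerl parser, assuming simple four-line records (sequence and quality not wrapped across lines). The two records below are made up for illustration:

```perl
use strict;
use warnings;

# Two made-up records standing in for a real FASTQ file.
my $fastq = <<'END';
@read1
ACGTN
+
IIII#
@read2
GGCTA
+read2
HHHHH
END

open my $fh, '<', \$fastq or die "cannot open in-memory FASTQ: $!";
my @records;
while ( defined( my $header = <$fh> ) ) {
    my $seq  = <$fh>;
    my $plus = <$fh>;    # separator line, ignored
    my $qual = <$fh>;
    die "truncated record\n" unless defined $qual;
    chomp( $header, $seq, $qual );
    $header =~ s/^@//;    # drop the leading '@'
    push @records, { header => $header, seq => $seq, qual => $qual };
}
close $fh;

printf "%s:%s:%s\n", @{$_}{qw(header seq qual)} for @records;
```

To read a real file, open it by name instead of the in-memory string.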


Thanks,
-Abhi


From abhishek.vit at gmail.com  Wed Sep 16 17:41:50 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 17:41:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>

Hi All

I can't think of a smart way to do sequence matching that allows a
user-defined number of mismatches.

For example:

Given the sequence AGCT, a sequence should be considered a match to the
reference if at most one position (1, 2, 3 or 4) is a mismatch, i.e. any
of [ACGTN], so the possible matches could be

This is for position 1.
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each position.

Is there a nice regular expression for this? One way I could think of is to
generate all the possible tags for a given sequence and then do the
matching, but that would be computationally expensive for a large dataset.
Is there a neater method?

Thanks,
-Abhi


From maj at fortinbras.us  Wed Sep 16 18:33:00 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 22:33:00 +0000
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <W403682148491321253140380@webmail21>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ




From Russell.Smithies at agresearch.co.nz  Wed Sep 16 19:06:45 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 11:06:45 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>

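How about chunking it into overlapping words, skipping any word with 2 or more Ns, then using a regex?

my $seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";

my $motif = "ACGT";
# one character class per motif position: [AN][CN][GN][TN]
my $w = join '', map { "[${_}N]" } split //, $motif;

# the lookahead captures every overlapping 4-letter word
foreach ($seq =~ /(?=(\w{4}))/g){
  next if tr/N// >= 2;           # count the Ns in this word
  print "$_\n" if  eval "/$w/" ; # eval runs the regex built from $w
}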




From abhishek.vit at gmail.com  Wed Sep 16 21:39:13 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 21:39:13 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>

Hi Russell

Thanks for the quick reply. However, I am not following the code or
the reasoning behind it.

Will this work for matching AGCT to ACCT | ANCT | AACT? It didn't give
me the expected output when I ran it. I am more interested in
understanding the logic.

It would be great if you could expand a bit more.


Also, if I do it the brute-force way, as suggested to me by a friend, how
will that scale?

@dna1=split(//,$a);
@dna2=split(//,$b);
$x=0;
for($i=0;$i<@dna1;$i++){
        if ($dna1[$i] ne $dna2[$i]){
                        $x++;
        }
}

if($x<=1){
        print "RESULT: your sequence is true\n";
}

else { print " RESULT: your sequence is false\n";}
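
A more compact variant of the same brute-force idea, as a sketch only: XOR the two strings and count the bytes that differ, then slide the motif along a longer sequence. The names mismatches and fuzzy_positions and the test sequence are made up for illustration; the cost is one comparison per window, so it grows with sequence length times motif length.

```perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
sub mismatches {
    my ( $s1, $s2 ) = @_;
    die "length mismatch\n" unless length($s1) == length($s2);
    my $xor = $s1 ^ $s2;             # "\0" wherever the bytes agree
    return $xor =~ tr/\x01-\xff//;   # count the non-NUL bytes
}

# Offsets of all windows of $seq within $max mismatches of $motif.
sub fuzzy_positions {
    my ( $seq, $motif, $max ) = @_;
    my $len = length $motif;
    my @hits;
    for my $i ( 0 .. length($seq) - $len ) {
        push @hits, $i
          if mismatches( substr( $seq, $i, $len ), $motif ) <= $max;
    }
    return @hits;
}

my @hits = fuzzy_positions( 'TTAGCTTTACCTTT', 'AGCT', 1 );
print "match at offset $_\n" for @hits;
```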

Thanks,
-Abhi




From Russell.Smithies at agresearch.co.nz  Wed Sep 16 21:46:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 13:46:54 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>

I misread your question; my example will match NGCT, ANCT, AGNT, or AGCN with 1 mismatch (or NGNT, NGCN, ANNT, ANCN etc. with 2 mismatches).
The eval is just doing a regex on the match string created by the loop - "[AN][GN][CN][TN]".
If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.
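
One way such a compiled regex could look for the single-mismatch case, as a sketch: build one alternative per position, with that position opened up to [ACGTN]. The helper name one_mismatch_regex is made up for illustration, and the motif is assumed to contain only plain letters.

```perl
use strict;
use warnings;

# Build a compiled regex matching the motif with at most one
# substituted position.
sub one_mismatch_regex {
    my ($motif) = @_;
    my @branches;
    for my $i ( 0 .. length($motif) - 1 ) {
        my $branch = $motif;
        substr( $branch, $i, 1 ) = '[ACGTN]';   # wildcard at position $i
        push @branches, $branch;
    }
    my $alt = join '|', @branches;
    return qr/(?:$alt)/;
}

my $re = one_mismatch_regex('AGCT');
my %result;
for my $candidate (qw(AGCT ACCT ANCT AACT TTCT)) {
    $result{$candidate} = $candidate =~ /\A$re\z/ ? 'match' : 'no match';
    print "$candidate $result{$candidate}\n";
}
```

The number of alternatives grows quickly once more than one mismatch is allowed, which is where counting mismatches directly becomes the simpler option.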


> -----Original Message-----
> From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
> Sent: Thursday, 17 September 2009 1:39 p.m.
> To: Smithies, Russell
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Allowing One error in Sequence matching
> 
> Hi Russell
> 
> Thanks for a quick reply. However I am not following the code clearly
> and the reason behind it.
> 
> Will this work for  matching AGCT  to ACCT | ANCT | AACT. It dint give
> me the expected output when I ran it. I am more interested in
> understanding the logic.
> 
> It would be great if you could expand a bit more.
> 
> 
> Also if I do it the brute force way as suggested to me by a frnd , how
> will that work in terms of scalability.
> 
> @dna1=split(//,$a);
> @dna2=split(//,$b);
> $x=0;
> for($i=0;$i<@dna1;$i++){
>         if ($dna1[$i] ne $dna2[$i]){
>                         $x++;
>         }
> }
> 
> if($x<=1){
>         print "RESULT: your sequence is true\n";
> }
> 
> else { print " RESULT: your sequence is false\n";}
> 
> Thanks,
> -Abhi
> 
> 
> On Wed, Sep 16, 2009 at 7:06 PM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
> > How about chunk it into overlapping words, skip if >2 N, then regex?
> >
> > $seq =
> "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGAT
> CTNACGCACGNATGCTACGTACG";
> >
> > $motif = "ACGT";
> > foreach (split //, $motif) {$w .= "[${_}N]"}
> >
> > foreach ($seq =~ /(?=(\w{4}))/g){
> > ?next if tr/N/N/ >= 2;
> > ?print "$_\n" if ?eval "/$w/" ;
> > }
> >
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
> >> Sent: Thursday, 17 September 2009 9:42 a.m.
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: [Bioperl-l] Allowing One error in Sequence matching
> >>
> >> Hi All
> >>
> >> I am not able to think of a smart way to do sequence matching allowing a
> >> user-defined number of mismatches.
> >>
> >> For example:
> >>
> >> Given sequence AGCT will be considered a match to the reference if any
> >> one base position (1, 2, 3, 4) has a mismatch, that is, [ACGTN], so
> >> the possible matches could be
> >>
> >> This is for position 1.
> >> AGCT
> >> GGCT
> >> CGCT
> >> TGCT
> >> NGCT
> >> and likewise for each position.
> >>
> >> Is there a nice regular expression for this? One way that I could think of
> >> was to generate all the possible tags for a given sequence and then do the
> >> matching. It will be computationally expensive for a long dataset.
> >> Any neat method?
> >>
> >> Thanks,
> >> -Abhi
> >


From abhishek.vit at gmail.com  Wed Sep 16 23:12:20 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 23:12:20 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909162012m5b18bc78u477e15957c88a45d@mail.gmail.com>

Thanks Russell.

I think having an "approx matching" method in bioperl will help,
especially with NGS data where read matching with 1/2/3/4 errors is
sometimes needed.

Cheers,
-Abhi


On Wed, Sep 16, 2009 at 9:46 PM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> I misread your question. My example will match NGCT, ANCT, AGNT, or AGCN with 1 mismatch (or NGNT, NGCN, ANNT, ANCN, etc. with 2 mismatches).
> The eval is just doing a regex on the match string created by the loop - "[AN][GN][CN][TN]".
> If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.


From cjfields at illinois.edu  Thu Sep 17 00:39:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 23:39:03 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
Message-ID: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>

Abhi,

The FASTQ parser hasn't been released to CPAN yet.  It is available  
via bioperl-live.  We haven't added any code yet to the HOWTO's, but  
the SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get  
you started.

Bio::Seq::Quality is the object returned via next_seq(); it can be  
queried for PHRED qual scores and other bits.  If you want to split  
things up you should call next_seq(), then generate a FASTQ output  
stream in the variant you want:

my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>fasta.file');
my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>  
'>qual.file');

while (my $seq = $in->next_seq) {
    $outfasta->write_fasta($seq);
    $outqual->write_qual($seq);
}

Note I haven't tested that yet, but it should work.  Let me know if it  
doesn't.
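
For readers who only need the header, read, and quality of each record, here is a minimal plain-Perl sketch. It is not the Bio::SeqIO::fastq parser. It assumes strict four-line records with no line wrapping, and the name next_fastq_record and the shortened sample record are made up for the example.

```perl
use strict;
use warnings;

# Read one four-line FASTQ record from a filehandle.
# Returns (header, read, quality), or an empty list at end of input.
sub next_fastq_record {
    my ($fh) = @_;
    my $header = <$fh>;
    return unless defined $header;
    my $read = <$fh>;
    my $plus = <$fh>;
    my $qual = <$fh>;
    die "Truncated FASTQ record\n" unless defined $qual;
    chomp($header, $read, $plus, $qual);
    die "Expected '\@' at start of header: $header\n"
        unless $header =~ s/^\@//;
    die "Expected '+' separator line: $plus\n" unless $plus =~ /^\+/;
    die "Read and quality lengths differ for $header\n"
        unless length($read) == length($qual);
    return ($header, $read, $qual);
}

# Usage: read from an in-memory string and print the three parts.
my $fastq = "\@HWI-EAS397:1:1:11:252#NNNTNN/1\nNACAAT\n+\nDNXPMX\n";
open my $fh, '<', \$fastq or die "Cannot open in-memory file: $!\n";
while (my ($header, $read, $qual) = next_fastq_record($fh)) {
    print join(' : ', $header, $read, $qual), "\n";
}
close $fh;
```

The quality string is returned as raw characters. Converting it to PHRED scores depends on the FASTQ variant (Sanger, Illumina, Solexa), which is what the variant-specific format names in Bio::SeqIO are for.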

chris

On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I remember seeing a recent email about new bioperl fastq parser. Is it
> part of bioperl 1.6 dist. I installed one and based on the doc
> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html 
> )
> I am a bit lost.
>
> I see two methods there : using Bio::SeqIO::fastq and
> Bio::Seq::Quality. Are both same in terms of data returned and latter
> giving a scale up in speed ?
>
> This is not to offend any developer but small example/s on the HOWTO's
> helps a lot.
>
> The current example (copied below) is not working. I guess it is based
> on a previous version of code.
>
> # grabs the FASTQ parser, specifies the Illumina variant
> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>                          -file      => 'mydata.fq');
>
>
> My basic requirement is to read each read in fastq record and split it
> into header: read: quality.
>
>
> Thanks,
> -Abhi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Thu Sep 17 00:44:28 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 00:44:28 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909162144g3177f718nf239327e98bd30c2@mail.gmail.com>

Thanks for the quick info Chris.

Cheers,
-Abhi



From amackey at virginia.edu  Thu Sep 17 06:52:31 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 17 Sep 2009 06:52:31 -0400
Subject: [Bioperl-l] Question concerning IUPAC.pm
In-Reply-To: <4AB203EF.6030107@agrar.hu-berlin.de>
References: <4AB203EF.6030107@agrar.hu-berlin.de>
Message-ID: <24c96eca0909170352h34b6a20t8648d4e097d57e1e@mail.gmail.com>

Dear Armin,

Please ask such questions on the BioPerl mailing list.

The Bio::Tools::IUPAC module does the opposite of what you want -- it takes
a sequence containing ambiguous codes (e.g. "Y") and generates all possible
combinations of unambiguous sequences (thus one sequence containing a "C"
instead of "Y", and a second sequence containing a "T" instead of "Y").

However, you can do this:

  my %lookup = Bio::Tools::IUPAC->iupac_rev_iub();

%lookup will now contain the following Perl hash:

A    => 'A',
T    => 'T',
C    => 'C',
G    => 'G',
AC   => 'M',
AG   => 'R',
AT   => 'W',
CG   => 'S',
CT   => 'Y',
GT   => 'K',
ACG  => 'V',
ACT  => 'H',
AGT  => 'D',
CGT  => 'B',
ACGT => 'N',
N    => 'N'
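
As a sketch of how such a table could be used for the "C,T -> Y" case: the keys are the bases of each set in alphabetical order, so the input needs to be upper-cased, de-duplicated, and sorted before the lookup. The hash below is copied from the listing above rather than fetched from Bio::Tools::IUPAC, so the example runs without BioPerl. The helper name ambiguity_code is made up.

```perl
use strict;
use warnings;

# Reverse IUPAC table as listed above (sorted base set => ambiguity code).
my %lookup = (
    A    => 'A', C   => 'C', G   => 'G', T   => 'T',
    AC   => 'M', AG  => 'R', AT  => 'W',
    CG   => 'S', CT  => 'Y', GT  => 'K',
    ACG  => 'V', ACT => 'H', AGT => 'D', CGT => 'B',
    ACGT => 'N', N   => 'N',
);

# Hypothetical helper: collapse a list of bases into one ambiguity symbol.
sub ambiguity_code {
    my @bases = @_;
    my %seen;
    # Upper-case and de-duplicate, then sort so the key matches the table.
    my @unique = grep { !$seen{$_}++ } map { uc $_ } @bases;
    my $key    = join '', sort @unique;
    return $lookup{$key};    # undef if the set is not in the table
}

print ambiguity_code('C', 'T'), "\n";
```

With BioPerl installed, the hard-coded hash could presumably be replaced by the iupac_rev_iub() call shown above.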

-Aaron


On Thu, Sep 17, 2009 at 5:39 AM, Armin Schmitt <
armin.schmitt at agrar.hu-berlin.de> wrote:
>
> Dear Aaron,
>
> can I use your module IUPAC.pm to create
> ambiguity symbols?
>
> I.e. Input C,T -> output Y
>
> If yes, how can I do this? A little piece
> of code would be helpful. Otherwise,
> is there another perl module for this
> purpose?
>
> Thank you very much
>
> Armin Schmitt
>
>
> --
> Dr. Armin Schmitt
> Humboldt-Universität zu Berlin
> Department for Crop and Animal Sciences
> Invalidenstraße 42
> 10115 Berlin
> Tel.:   +49-30-2093-9074
> Fax:    +49-30-2093-6397
> E-mail: armin.schmitt at agrar.hu-berlin.de
>
>


From abhishek.vit at gmail.com  Thu Sep 17 14:16:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 14:16:33 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>

Hi Chris

I am just wondering if the following is intentionally excluded from a
FASTQ record or is a bug.

After reading in each record from a FASTQ file, the output of the
same record ( $out->write_seq($seq) ) has the text after the + sign
missing.


Eg:

@HWI-EAS397:1:1:11:252#NNNTNN/1
NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
+
DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB


PS: In our case we need the exact record to be printed out as we need
to split the fastq file into multiple fastq files based on the read
index in the @ Line. So exact output is needed to avoid conflicts with
downstream processing pipelines.

Thanks,
-Abhi




From cjfields at illinois.edu  Thu Sep 17 16:54:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 15:54:20 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 1 released
Message-ID: <358B9E70-84C7-42DC-A473-C2AACC18A211@illinois.edu>

All,

Just a quick note that I have released the first alpha for the 1.6.1  
point release.  I uploaded it to CPAN, so it should be migrating to  
the various servers in the next few hours or so.  In the meantime, the  
alpha can be directly downloaded using the following links (pick your  
format):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.bz2
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.gz
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.zip

If everything goes well, I'll have a more formalized release ready for  
the weekend.  I will also be attempting (hopefully with some success)  
getting a Windows PPM for the latest ActiveState Perl going over the  
next few days.  Feedback from users trying to install BioPerl using  
the latest Strawberry Perl would also be greatly appreciated.

Thanks!

chris


From cjfields at illinois.edu  Thu Sep 17 17:38:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 16:38:31 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
Message-ID: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>

After uploading the latest bioperl alpha to CPAN I noticed the size of  
the distribution archive has jumped up from ~7 MB to just over 10 MB.   
It looks like a majority of this is attributable to three data files  
for testing in t/data added after the 1.6.0 release:

gmap_f9-multiple_results.txt  (3 MB)
withrefm.906                  (2.5 MB)
1ZZ19XR301R-Alignment.tblastn (2 MB)

I'm not sure there is an easy way around the problem.  We could
try to reduce the file sizes, but I'm not convinced that's a
long-term solution (the test data will only get larger as more test
cases come up).

Any ideas?  Should we try to have a common biodata repo again?

chris


From rmb32 at cornell.edu  Thu Sep 17 18:04:47 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 15:04:47 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <4AB2B27F.8050800@cornell.edu>

Chris Fields wrote:
 > Any ideas?  Should we try to have a common biodata repo again?

Beyond encouraging people to keep the test data smaller (I would think 
that multiple MB in a test data file is quite excessive!), I don't think 
it's worth worrying about that much.  The stuff in bioperl needs a 
significant amount of test data, and I think that's fine.

This problem is also addressed by the ongoing effort to break things up 
into more distros, I think that will help a lot.

Rob


From hlapp at gmx.net  Thu Sep 17 18:33:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Sep 2009 18:33:34 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2B27F.8050800@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
Message-ID: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>


On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:

> I don't think it's worth worrying about that much.  The stuff in  
> bioperl needs a significant amount of test data, and I think that's  
> fine.


I'd agree with that. Storage is cheap these days. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Thu Sep 17 19:26:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:25 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>

On Sep 17, 2009, at 5:33 PM, Hilmar Lapp wrote:

> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  
>> bioperl needs a significant amount of test data, and I think that's  
>> fine.
>
> I'd agree with that. Storage is cheap these days. -hilmar

Kind of my thought as well, just a bit of a shock to see the dist.  
increase by 65% between point releases for just three test data  
files.  I may try paring those down a tad.

chris


From cjfields at illinois.edu  Thu Sep 17 19:26:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:52 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
	<be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
Message-ID: <06B0378C-312F-4F43-A99A-6F6CC1C88F61@illinois.edu>

The default format for most FASTQ parsers is to leave the extra header  
off (it increases the file size substantially).  You can add that back  
by setting quality_header():

my $out = Bio::SeqIO->new(-format         => 'fastq',
                          -file           => $file,
                          -quality_header => 1);

Again, let me know if that works okay.
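
To make the difference concrete, here is a small plain-Perl sketch of the two record layouts. It is not the BioPerl writer. The function format_fastq and its $repeat_header flag are made up for the example and only mimic the effect described above. The sample record is shortened.

```perl
use strict;
use warnings;

# Format one FASTQ record. When $repeat_header is true the identifier is
# repeated after the '+' sign; otherwise the '+' line is left bare.
sub format_fastq {
    my ($id, $seq, $qual, $repeat_header) = @_;
    my $plus = $repeat_header ? "+$id" : '+';
    return "\@$id\n$seq\n$plus\n$qual\n";
}

my @record = ('HWI-EAS397:1:1:11:252#NNNTNN/1', 'NACAAT', 'DNXPMX');
print format_fastq(@record, 0);    # bare '+' line
print format_fastq(@record, 1);    # header repeated on the '+' line
```

Both layouts are valid FASTQ, and repeating the header adds the length of the identifier to every record.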

chris

On Sep 17, 2009, at 1:16 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I am just wondering if the following is intentionally excluded from a
> fasta record or a bug.
>
> After reading in each fastq record from a FASTQ fiel the output of the
> same recored  (  $out->write_seq($seq)  )  has line/text missing after
> the + sign.
>
>
>
> Eg:
>
> @HWI-EAS397:1:1:11:252#NNNTNN/1
> NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
> +
> DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB
>
>
> PS: In our case we need the exact record to be printed out as we need
> to split the fastq file into multiple fastq files based on the read
> index in the @ Line. So exact output is needed to avoid conflicts with
> downstream processing pipelines.
>
> Thanks,
> -Abhi
>
> Thanks,
> -Abhi


From rmb32 at cornell.edu  Thu Sep 17 19:30:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 16:30:16 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
Message-ID: <4AB2C688.2030602@cornell.edu>

Chris Fields wrote:
> Kind of my thought as well, just a bit of a shock to see the dist. 
> increase by 65% between point releases for just three test data files.  
> I may try paring those down a tad.

Yes, those individual files are certainly excessive.


From maj at fortinbras.us  Thu Sep 17 19:36:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 19:36:09 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu><4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <EC73E6B6BD3D468E8C29D138AF64D483@NewLife>

Two of those files are my bad-- the withrefm is prob best in its entirety, since
it contains all the weird extra-site restrictions that the B:Restriction refactor
was meant to handle. The other is a tiling test file that I could probably replace
(or at least edit down)--


From maj at fortinbras.us  Thu Sep 17 22:13:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 22:13:37 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>

t/data compresses from 21M to 9M. We could ship with 

$ tar -czf data.tar.gz data
$ rm -rf data

and do the following in Bio::Root::Test, if we're willing to expect 
Archive::Tar and IO::Zlib :

use vars qw( $ARCHIVE );
$ARCHIVE = "data.tar.gz";
...

sub test_input_file {
    # if it's there, fine
    my $fn =  File::Spec->catfile('t', 'data', @_);
    return $fn if -e $fn;
    # if it's not, expand the archive
    my $arch = File::Spec->catfile('t', $ARCHIVE);
    Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
    my $tar = Archive::Tar->new($arch);
    Bio::Root::Root->throw ("Can't extract test data archive") unless $tar;
    $tar->extract;
    return $fn if -e $fn;
    return;
}
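
One detail worth checking in a sketch like the above: as far as I know, Archive::Tar extracts relative to the current working directory, so an archive made inside t/ would unpack into ./data rather than t/data when the tests run from the distribution root. Below is a stand-alone variant that steps into the test directory first. The function name lazy_test_file and its arguments are made up for the example and are not part of Bio::Root::Test.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use Archive::Tar;

# Hypothetical stand-alone version of the lookup: return the path of a
# test file under $t_dir/data, unpacking $t_dir/$archive on first use.
sub lazy_test_file {
    my ($t_dir, $archive, @path) = @_;
    my $fn = File::Spec->catfile($t_dir, 'data', @path);
    return $fn if -e $fn;    # already unpacked

    my $arch = File::Spec->rel2abs(File::Spec->catfile($t_dir, $archive));
    die "Test data archive not present: $arch\n" unless -e $arch;

    # Extract inside $t_dir, then restore the previous directory.
    my $cwd = getcwd();
    chdir $t_dir or die "Cannot chdir to $t_dir: $!\n";
    my $tar = Archive::Tar->new($arch);
    my $ok  = $tar && $tar->extract;
    chdir $cwd or die "Cannot chdir back to $cwd: $!\n";
    die "Cannot extract test data archive\n" unless $ok;

    return -e $fn ? $fn : undef;
}
```

A call such as lazy_test_file('t', 'data.tar.gz', 'test.fasta') would then unpack once and return the path on every later call. Reading a gzipped archive also needs IO::Zlib, as noted above.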




From cjfields at illinois.edu  Thu Sep 17 22:53:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 21:53:09 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
Message-ID: <04BEE5E7-79C6-45DE-9EC7-D72AE9E881E5@illinois.edu>

Maybe attempt trimming them down a bit first, if that's possible.  If  
not, no worries (breaking up the distribution will help as Robert  
said).  Archive::Tar and IO::Zlib were added in core after 5.8  
(5.009003 to be exact), so I would rather not have to worry about any  
test-specific dependencies.

Anyway, we've got a little more time.  I'm getting a META.yml popping  
up (though everything appears to pass here).  Will look into it; may  
be related to a previously reported bug, but I would like to see some  
CPANPLUS tests coming in first.  That's what an alpha is for!

chris

On Sep 17, 2009, at 9:13 PM, Mark A. Jensen wrote:

> t/data compresses from 21M to 9M. We could ship with
> $ tar -czf data.tar.gz data
> $ rm -rf data
>
> and do the following in Bio::Root::Test, if we're willing to expect  
> Archive::Tar and IO::Zlib :
>
> use vars qw( $ARCHIVE );
> $ARCHIVE = "data.tar.gz";
> ...
>
> sub test_input_file {
>   # if it's there, fine
>   my $fn =  File::Spec->catfile('t', 'data', @_);
>   return $fn if -e $fn;
>   # if it's not, expand the archive
>   my $arch = File::Spec->catfile('t', $ARCHIVE);
>   Bio::Root::Root->throw("Test data archive not present") unless (-e  
> $arch);
>   my $tar = Archive::Tar->new($arch);
>   Bio::Root::Root->throw ("Can't extract test data archive") unless  
> $tar;
>   $tar->extract;
>   return $fn if -e $fn;
>   return;
> }
>
>
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Thursday, September 17, 2009 5:38 PM
> Subject: [Bioperl-l] Size of BioPerl distribution
>
>
>> After uploading the latest bioperl alpha to CPAN I noticed the size  
>> of  the distribution archive has jumped up from ~7 MB to just over  
>> 10 MB.   It looks like a majority of this is attributable to three  
>> data files  for testing in t/data added after the 1.6.0 release:
>> gmap_f9-multiple_results.txt  (3 MB)
>> withrefm.906                  (2.5 MB)
>> 1ZZ19XR301R-Alignment.tblastn (2 MB)
>> I'm not sure there is an easy way around the problem.  We could   
>> attempt to reduce the file size down, but I'm not convinced that's  
>> a  long-term solution (the test data will only get larger as more  
>> test  cases come up).
>> Any ideas?  Should we try to have a common biodata repo again?
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>


From cjfields at illinois.edu  Thu Sep 17 23:48:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 22:48:13 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <19123.504.682683.996798@already.dhcp.gene.com>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
	<19123.504.682683.996798@already.dhcp.gene.com>
Message-ID: <B1B941EE-8F1E-426C-82DC-D89B3A13AD3D@illinois.edu>

On Sep 17, 2009, at 10:43 PM, George Hartzell wrote:

> Robert Buels writes:
>> Chris Fields wrote:
>>> Kind of my thought as well, just a bit of a shock to see the dist.
>>> increase by 65% between point releases for just three test data  
>>> files.
>>> I may try paring those down a tad.
>>
>> Yes, those individual files are certainly excessive.
>
> Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
> side of embarrassment.  Rats.
>
> I'll see about making a smaller test for the gmap_f9 parser, while
> still using real data.
>
> Is there existing support in the searchio infrastructure for reading
> [gb]zip'ed files?
>
> Can it wait a day or three?
>
> g.

Yes, certainly.  I'll be working on a separate issue this weekend  
dealing with the META.yml that CPAN/CPANPLUS appear to be choking on,  
so I'll push back the release until early next week.

chris


From hartzell at alerce.com  Thu Sep 17 23:43:52 2009
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 17 Sep 2009 20:43:52 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2C688.2030602@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
Message-ID: <19123.504.682683.996798@already.dhcp.gene.com>

Robert Buels writes:
 > Chris Fields wrote:
 > > Kind of my thought as well, just a bit of a shock to see the dist. 
 > > increase by 65% between point releases for just three test data files.  
 > > I may try paring those down a tad.
 > 
 > Yes, those individual files are certainly excessive.

Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
side of embarrassment.  Rats.

I'll see about making a smaller test for the gmap_f9 parser, while
still using real data.

Is there existing support in the searchio infrastructure for reading
[gb]zip'ed files?

Can it wait a day or three?

g.


From roy.chaudhuri at gmail.com  Fri Sep 18 06:43:29 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 18 Sep 2009 11:43:29 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
Message-ID: <4AB36451.3030207@gmail.com>

Hi Liam,

I just discovered your message, which has not yet been replied to. What 
you require has been discussed in a recent thread:
http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html

Try using trunc_with_features from Bio::SeqUtils:

my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
Cheers.
Roy.

Liam Elbourne wrote:
> Hi All,
> 
> Is there a method or methodology that will produce a fully fledged Seq  
> object with all the associated metadata given a start and end  
> position? To clarify, I create a sequence object from a genbank file:
> 
> 
> ****
> my $io  = Bio::SeqIO->new(as per usual);
> 
> my $seqobj = $io->next_seq();
> ****
> I now want:
> 
> my $sub_seqobj = $seqobj between 300 and 2000
> 
> where $sub_seqobj is a Seq object (which I appreciate is an  
> 'aggregate' of objects) too. The "trunc" method only returns a  
> PrimarySeq object which lacks all the annotation etc. I've previously  
> done this task by iterating through feature by feature and parsing out  
> what I needed, but thought there might be a more elegant approach...
> 
> 
> Regards,
> Liam Elbourne.

-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From maj at fortinbras.us  Fri Sep 18 08:11:11 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 08:11:11 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>

Hi Paola--
I will look at this. Stay tuned-
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to
extract the chain id and residue id, but I found that in some PDB files the
residue number is followed by a letter, probably because the residue was
added by the crystallographers and they didn't want to change the residue
numbering in the sequence. For example, I parsed 1PXX.pdb with my script
below. I didn't find any useful suggestion about this in the BioPerl
tutorial or in the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die "Cannot write $pdb_file: $!";
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
my $struc=$structio->next_structure;
# open the output file once, not once per residue
open my $out, '>>', '1pxx.parsed' or die "Cannot write 1pxx.parsed: $!";
for my $chain ($struc->get_chains)
{
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain))
    {
        my $resid = $res->id;
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but the output file is wrong for ILE 105A and ILE 2105C, because a letter
follows the residue number. Can I solve this problem without writing
intermediate files? I need the residue id as 105A, not 105.A, so:
A ILE-105A
without a point between the number and the letter.
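
One way to get that form without intermediate files is to rewrite the id
string in memory. A small sketch, assuming the ids come back looking like
"ILE-105.A" (residue name, dash, number, dot, insertion code), as the
description above suggests; normalize_resid is a made-up helper name, not a
BioPerl method:

```perl
use strict;
use warnings;

# Drop the dot between residue number and insertion code:
# "ILE-105.A" -> "ILE-105A"; ids without an insertion code are unchanged.
sub normalize_resid {
    my ($resid) = @_;
    (my $clean = $resid) =~ s/^(\w+-[-]?\d+)\.(\w)$/$1$2/;
    return $clean;
}

for my $id ('ILE-105.A', 'ILE-2105.C', 'GLY-12') {
    print normalize_resid($id), "\n";
}
```

In the script, the call would wrap the value of $res->id just before it is
printed.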


Thank you all,

Paola




From scott at scottcain.net  Fri Sep 18 10:11:23 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 18 Sep 2009 10:11:23 -0400
Subject: [Bioperl-l] test failures in main trunk
Message-ID: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>

With Chris trying to get a release out, I wanted to report these test  
failures from a fairly virgin system Ubuntu server 8.04.

Scott


t/SeqIO/raw.t ................................ 1/24 Can't locate  
Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 / 
usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 / 
usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
# Looks like you planned 24 tests but ran 1.
# Looks like your test exited with 2 just after 1.
t/SeqIO/raw.t ................................ Dubious, test returned  
2 (wstat 512, 0x200)

t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
@INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/gmod/ 
bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/ 
lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/ 
perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
site_perl .) at t/SeqTools/Backtranslate.t line 9.
BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
# Looks like your test exited with 2 before it could output anything.
t/SeqTools/Backtranslate.t ................... Dubious, test returned  
2 (wstat 512, 0x200)
Failed 8/8 subtests

t/SeqTools/SeqPattern.t ...................... 1/28
#   Failed test 'use Bio::Tools::SeqPattern;'
#   at t/SeqTools/SeqPattern.t line 12.
#     Tried to use 'Bio::Tools::SeqPattern'.
#     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains: t/ 
lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/ 
arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/ 
local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/ 
5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at Bio/Tools/ 
SeqPattern/Backtranslate.pm line 22.
# BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
Backtranslate.pm line 22.
# Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
# Compilation failed in require at (eval 17) line 2.
# BEGIN failed--compilation aborted at (eval 17) line 2.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 25.
#          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
#     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 31.
#          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
#     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 371.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 372.

#   Failed test at t/SeqTools/SeqPattern.t line 38.
#          got: 'A[][]H'
#     expected: 'A[EQ][DN]H'
"_reverse_translate_motif" is not exported by the  
Bio::Tools::SeqPattern::Backtranslate module
Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
# Looks like you planned 28 tests but ran 9.
# Looks like you failed 4 tests of 9 run.
# Looks like your test exited with 255 just after 9.
t/SeqTools/SeqPattern.t ...................... Dubious, test returned  
255 (wstat 65280, 0xff00)
Failed 23/28 subtests


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From dan.bolser at gmail.com  Fri Sep 18 10:11:30 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:11:30 +0100
Subject: [Bioperl-l] construct chromosome sequences from bac sequences
In-Reply-To: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
References: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
Message-ID: <2c8757af0909180711t7212f5aak9bc3c7f4e8d16120@mail.gmail.com>

Did you try loading the sequences into an alignment or an assembly object?

As far as I know BioPerl won't call a consensus for you, but you can
post process the alignment or assembly to do that.
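
As a starting point for that post-processing, here is a plain-Perl sketch
(no BioPerl objects) that lays sequences onto an N-filled chromosome string
using 1-based start positions like the chr_start column quoted below. It
makes a loud assumption: where two BACs overlap, the one with the later
start simply overwrites the earlier one, which is not a consensus call.
The name build_chromosome is hypothetical:

```perl
use strict;
use warnings;

# Lay sequences onto an N-filled chromosome string.
# $placements is an array ref of [ $chr_start, $sequence ], 1-based starts.
sub build_chromosome {
    my ($length, $placements) = @_;
    my $chr = 'N' x $length;                  # unassigned positions stay N
    for my $p (sort { $a->[0] <=> $b->[0] } @$placements) {
        my ($start, $seq) = @$p;
        my $room = $length - ($start - 1);    # bases left before the end
        next if $start < 1 || $room <= 0;
        $seq = substr($seq, 0, $room) if length($seq) > $room;
        # later placements overwrite earlier ones in overlaps
        substr($chr, $start - 1, length($seq), $seq);
    }
    return $chr;
}

print build_chromosome(12, [ [2, 'ACGT'], [5, 'GGA'], [10, 'TT'] ]), "\n";
```

Overlap discrepancies could be detected at the substr step by comparing the
bases already in place with the incoming ones before overwriting.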

Can an alignment hold sequences with qualities?


Sorry for the late reply, I'm just trawling the list for potential
answers to the question I'm about to post ;-)

Dan.


2008/12/30 Alper Yilmaz <alperyilmaz at gmail.com>:
> Hi,
>
> I have FPC report and BAC sequences in hand. I was wondering what is the
> most practical way to build chromosomes from these available information.
>
> I HAVE:
> FPC file:
> accession    chr   chr_start   chr_end   contig   contig_start   contig_end
> aaaaaaaaaa   1     14700       215600    ctg1     14700          215600
> bbbbbbbbbb   1     196000      362600    ctg1     196000         362600
> cccccccccc   1     352800      524300    ctg1     352800         524300
> .
> .
>
> BAC fasta file:
>>aaaaaaaaaa
> GATCGATCAGCATCGACTACGACT...
>>bbbbbbbbbb
> AGTAGCAGTAGCTAGCACTACGAC...
>>cccccccccc
> ACGATCAGCATCAGCATCGACTAC...
> .
> .
> .
>
> I WANT:
>>chr1
> GACGACTAGCTACGACTAC...
>>chr2
> AGCTGATCACGATCACGAC...
>
> In theory a sequence object called "Chr1" can be created and then according
> to start and end locations of each BAC in FPC file, subsequences of Chr1 can
> be retrieved. However, there are two facts which might prevent using
> standard sequence objects.
> 1) There will be gaps in chromosomes. Is there a function to convert
> unassigned locations to N?
> 2) There are overlaps between BAC sequences. If the overlapping sequences
> are exactly same, it won't be problem, but if there are discrepancies
> between them, a decision has to be made as to which sequence to use in final
> Chr1 sequence.
>
> thanks,
>
> Alper Yilmaz


From dan.bolser at gmail.com  Fri Sep 18 10:27:27 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:27:27 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
Message-ID: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>

2009/1/6 Chris Fields <cjfields at illinois.edu>:
> Could you archive the files and attach them to a bug report (you can mark it
> as an enhancement request).  We can take a look.
>
> http://bugzilla.open-bio.org/

Out of interest, has this been added? Where is it documented?

Cheers,
Dan.


> chris
>
> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>
>> Chris et al. -
>>
>> A student and I have written code to do this - write ace files as well as
>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>> was
>> in 1.5, we ran into problems with large ace files containing many entries
>> because of file handle limit issues with the inherited implementation
>> DB_File.  Our implementation simply reads one contig at a time instead of
>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to someone,
>> could they help me get it into bioperl?  It may not be perfect either, but
>> it should be a good start.
>>
>> Josh


From bosborne11 at verizon.net  Fri Sep 18 09:48:55 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 18 Sep 2009 09:48:55 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
	<DBEE748776B74A7988A942A7BBE13AA3@NewLife>
Message-ID: <AC62DAB3-3334-44A6-8172-753519B083FF@verizon.net>

Mark,

There was an interesting exchange about StructureIO::pdb a few years  
ago:

http://portal.open-bio.org/pipermail/bioperl-l/2006-September/022990.html

I don't think anyone has actually worked on this code since then and I  
also don't know if Paola's question relates to the content of the  
thread, but it's a good overview.

Brian O.


On Sep 18, 2009, at 8:11 AM, Mark A. Jensen wrote:

> Hi Paola--
> I will look at this. Stay tuned-
> Mark
> ----- Original Message ----- From: "Paola Bisignano" <paola_bisignano at yahoo.it>
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 08, 2009 4:55 AM
> Subject: [Bioperl-l] problem parsing pdb
>
>
> Hi,
>
> I'm in a little trouble because I need to parse a PDB file exactly, to
> extract the chain id and residue id, but I found that in some PDB files
> the residue number is followed by a letter, probably because the residue
> was added by the crystallographers and they didn't want to change the
> residue numbering in the sequence. For example, I parsed 1PXX.pdb with
> my script below. I didn't find any useful suggestion about this in the
> BioPerl tutorial or in the online BioPerl documentation.
>
> #!/usr/local/bin/perl
> use strict;
> use warnings;
> use Bio::Structure::IO;
> use LWP::Simple;
>
>
>
> my $urlpdb= "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
> my $content = get($urlpdb);
> die "Could not download $urlpdb\n" unless defined $content;
> my $pdb_file = qq{1pxx.pdb};
> open my $f, '>', $pdb_file or die "Cannot write $pdb_file: $!";
> binmode $f;
> print $f $content;
> print qq{$pdb_file\n};
> close $f;
>
>
>
> my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
> my $struc=$structio->next_structure;
> # open the output file once, not once per residue
> open my $out, '>>', '1pxx.parsed' or die "Cannot write 1pxx.parsed: $!";
> for my $chain ($struc->get_chains)
> {
>     my $chainid = $chain->id;
>     for my $res ($struc->get_residues($chain))
>     {
>         my $resid = $res->id;
>         print $out "$chainid\t$resid\n";
>     }
> }
> close $out;
>
>
>
> but the output file is wrong for ILE 105A and ILE 2105C, because a
> letter follows the residue number. Can I solve this problem without
> writing intermediate files? I need the residue id as 105A, not 105.A, so:
> A ILE-105A
> without a point between the number and the letter.
>
>
>
>
> Thank you all,
>
> Paola
>
>
>
>


From dan.bolser at gmail.com  Fri Sep 18 10:55:57 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:55:57 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
Message-ID: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>

Dear Perl Monkeys,

I wrote a little demo script for Bio::Assembly::IO here:

http://www.bioperl.org/wiki/Module:Bio::Assembly::IO


I would very much appreciate comments, criticisms and corrections on
that script (please just edit the wiki). For a newbie it's always the
same question: am I doing it right?

In particular, I read about the 4 possible coordinates of a read in an
assembly. My script only retrieves two (?) of the possible four. How
should it be adjusted to print all four coordinates for each read?

Additionally, I'm not sure how to distinguish between the trimmed read
vs. the full length read and/or the aligned portion of the read vs.
the full length read.

What I *really* want is the coordinates of the aligned portion of the
read in gapped read and gapped consensus space, along with the quality
trimmed range of the read.
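
For reference, here is a rough sketch that reads those numbers straight from
the ACE text, without Bio::Assembly::IO. It assumes the usual ACE layout: an
AF line gives the padded consensus position of the read's first base, and
the QA line after each RD record gives the quality-clipping and
align-clipping ranges in padded read coordinates. The function name
ace_read_coords and the hash keys are illustrative only, and gsMapper
output may differ in details:

```perl
use strict;
use warnings;

# Collect AF / RD / QA values per read from ACE-formatted text.
sub ace_read_coords {
    my ($text) = @_;
    my (%read, $current);
    for my $line (split /\n/, $text) {
        if ($line =~ /^AF\s+(\S+)\s+([CU])\s+(-?\d+)/) {
            # AF <read> <C|U> <padded consensus position of first read base>
            @{ $read{$1} }{qw(strand offset)} = ($2, $3);
        }
        elsif ($line =~ /^RD\s+(\S+)\s+(\d+)/) {
            # RD <read> <padded length> ...; the next QA line belongs to it
            $current = $1;
            $read{$current}{padded_length} = $2;
        }
        elsif (defined $current
            && $line =~ /^QA\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/) {
            @{ $read{$current} }{qw(qual_start qual_end align_start align_end)}
                = ($1, $2, $3, $4);
            undef $current;
        }
    }
    # Project the aligned range into padded (gapped) consensus space.
    for my $r (values %read) {
        next unless defined $r->{offset} && defined $r->{align_start};
        next if $r->{align_start} < 1;    # e.g. QA -1 -1: nothing usable
        $r->{cons_start} = $r->{offset} + $r->{align_start} - 1;
        $r->{cons_end}   = $r->{offset} + $r->{align_end}   - 1;
    }
    return \%read;
}

my $ace = <<'ACE';
AF read1 U 5
RD read1 10 0 0
ACGTACGTAC

QA 2 9 3 8
ACE

my $r = ace_read_coords($ace)->{read1};
print "read1 aligned: read $r->{align_start}-$r->{align_end}, ",
      "consensus $r->{cons_start}-$r->{cons_end}, ",
      "quality $r->{qual_start}-$r->{qual_end}\n";
```

That gives the aligned range in gapped read space, the same range in gapped
consensus space, and the quality-trimmed range for each read.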

The ACE file in question is produced by the gsMapper program, which is
part of Newbler from Roche (454), so it has some small
'peculiarities', but I don't think they are critical for the task at
hand.


Thanks very much for any help you can provide on any of the above issues.

Sincerely,
Dan.


From maj at fortinbras.us  Fri Sep 18 11:11:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:11:05 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
Message-ID: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>

Dan -- I don't know much about Assembly, so can't help there. But can I  
encourage you and perhaps one or two others (steganographic content: fangly) 
to create a HOWTO stub out of this? Would be excellent-
cheers MAJ
----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 18, 2009 10:55 AM
Subject: [Bioperl-l] Getting read position information from an ACE file?


> Dear Perl Monkeys,
> 
> I wrote a little demo script for Bio::Assembly::IO here:
> 
> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
> 
> 
> I would very much appreciate comments, criticisms and corrections on
> that script (please just edit the wiki). For a newbie it's always the
> same question: am I doing it right?
> 
> In particular, I read about the 4 possible coordinates of a read in an
> assembly. My script only retrieves two (?) of the possible four. How
> should it be adjusted to print all four coordinates for each read?
> 
> Additionally, I'm not sure how to distinguish between the trimmed read
> vs. the full length read and/or the aligned portion of the read vs.
> the full length read.
> 
> What I *really* want is the coordinates of the aligned portion of the
> read in gapped read and gapped consensus space, along with the quality
> trimmed range of the read.
> 
> The ACE file in question is produced by the gsMapper program, which is
> part of Newbler from Roche (454), so it has some small
> 'peculiarities', but I don't think they are critical for the task at
> hand.
> 
> 
> Thanks very much for any help you can provide on any of the above issues.
> 
> Sincerely,
> Dan.


From anupam.contact at gmail.com  Fri Sep 18 11:20:03 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Fri, 18 Sep 2009 20:50:03 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
Message-ID: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>

Dear all,
I have installed BioPerl-1.6.0.tar.gz and Bioperl-run-1.6.0.tar.gz on a
Fedora 7 system. I am trying to run the /usr/bin/bp_pairwise_kaks.pl
script but keep getting this error:

Must have bioperl-run pkg installed to run this script at
/usr/bin/bp_pairwise_kaks.pl line 69.

I have installed the run package from BioPerl, though. Can anyone help me
out? Thanks in advance.
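
One way to narrow this down, sketched under assumptions: the message
usually means that loading some bioperl-run module failed, so it can help to
ask the same perl that runs the script whether it can find those modules in
@INC. The two module names below are a guess at what the script loads and
should be checked against the script itself; module_path is a made-up
helper:

```perl
use strict;
use warnings;
use File::Spec;

# Return the first file in @INC that would satisfy "require $module",
# or undef if none is found.
sub module_path {
    my ($module) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;
    for my $dir (@INC) {
        next if ref $dir;                   # skip @INC hooks
        my $path = File::Spec->catfile($dir, $rel);
        return $path if -f $path;
    }
    return;
}

# Assumed module names; adjust to whatever the script actually requires.
for my $module (qw(
    Bio::Tools::Run::Alignment::Clustalw
    Bio::Tools::Run::Phylo::PAML::Codeml
)) {
    my $path = module_path($module);
    print "$module: ", (defined $path ? $path : 'NOT FOUND in @INC'), "\n";
}
```

If the modules are reported as not found, the run package was probably
installed for a different perl or into a directory outside @INC.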


Regards,


Anupam Sinha


From cjfields at illinois.edu  Fri Sep 18 11:59:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 10:59:11 -0500
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
Message-ID: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>

Interesting, will look into those.  The first one is troubling (that's  
set up to skip for Algorithm::Diff), the others should be a bit more  
straightforward.

Will have to see why List::MoreUtils is being used, but if it's  
necessary it's an additional dep.

chris

On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:

> With Chris trying to get a release out, I wanted to report these  
> test failures from a fairly virgin system Ubuntu server 8.04.
>
> Scott
>
>
>
> t/SeqIO/raw.t ................................ 1/24 Can't locate  
> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
> # Looks like you planned 24 tests but ran 1.
> # Looks like your test exited with 2 just after 1.
> t/SeqIO/raw.t ................................ Dubious, test  
> returned 2 (wstat 512, 0x200)
>
> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
> site_perl .) at t/SeqTools/Backtranslate.t line 9.
> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
> 9.
> # Looks like your test exited with 2 before it could output anything.
> t/SeqTools/Backtranslate.t ................... Dubious, test  
> returned 2 (wstat 512, 0x200)
> Failed 8/8 subtests
>
> t/SeqTools/SeqPattern.t ...................... 1/28
> #   Failed test 'use Bio::Tools::SeqPattern;'
> #   at t/SeqTools/SeqPattern.t line 12.
> #     Tried to use 'Bio::Tools::SeqPattern'.
> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
> Backtranslate.pm line 22.
> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
> # Compilation failed in require at (eval 17) line 2.
> # BEGIN failed--compilation aborted at (eval 17) line 2.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 25.
> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 31.
> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 371.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 372.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 38.
> #          got: 'A[][]H'
> #     expected: 'A[EQ][DN]H'
> "_reverse_translate_motif" is not exported by the  
> Bio::Tools::SeqPattern::Backtranslate module
> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
> # Looks like you planned 28 tests but ran 9.
> # Looks like you failed 4 tests of 9 run.
> # Looks like your test exited with 255 just after 9.
> t/SeqTools/SeqPattern.t ...................... Dubious, test  
> returned 255 (wstat 65280, 0xff00)
> Failed 23/28 subtests
>
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>
>
>
>


From cjfields at illinois.edu  Fri Sep 18 12:09:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 11:09:26 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
Message-ID: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>

Dan,

No, it hasn't made it in.  Currently, the problem is it doesn't have  
any tests attached, but that could be easily fixed if anyone wanted to  
donate a little time to getting them running.  My hands are a bit full  
with other stuff for the release.

We should have some ace files already to go in t/data somewhere if one  
were so inclined to do that, BTW  ;>

chris

On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:

> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>> Could you archive the files and attach them to a bug report (you  
>> can mark it
>> as an enhancement request).  We can take a look.
>>
>> http://bugzilla.open-bio.org/
>
> Out of interest, has this been added? Where is it documented?
>
> Cheers,
> Dan.
>
>
>> chris
>>
>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>
>>> Chris et al. -
>>>
>>> A student and I have written code to do this - write ace files as  
>>> well as
>>> parse them one entry at a time.  In trying to use the Assembly::IO  
>>> as it
>>> was
>>> in 1.5, we ran into problems with large ace files containing many  
>>> entries
>>> because of file handle limit issues with the inherited  
>>> implementation
>>> DB_File.  Our implementation simply reads one contig at a time  
>>> instead of
>>> first trying to slurp the whole ace into memory.  I'm happy to add  
>>> it to
>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to  
>>> someone,
>>> could they help me get it into bioperl?  It may not be perfect  
>>> either, but
>>> it should be a good start.
>>>
>>> Josh
>


From maj at fortinbras.us  Fri Sep 18 12:20:22 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 12:20:22 -0400
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
Message-ID: <E019D53941DD48E4B3294E113771B711@NewLife>


> Will have to see why List::MoreUtils is being used, but if it's  
> necessary it's an additional dep.

I didn't do it, officer....

> 
> chris
> 
> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
> 
>> With Chris trying to get a release out, I wanted to report these  
>> test failures from a fairly virgin system Ubuntu server 8.04.
>>
>> Scott
>>
>>
>>
>> t/SeqIO/raw.t ................................ 1/24 Can't locate  
>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
>> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
>> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
>> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>> # Looks like you planned 24 tests but ran 1.
>> # Looks like your test exited with 2 just after 1.
>> t/SeqIO/raw.t ................................ Dubious, test  
>> returned 2 (wstat 512, 0x200)
>>
>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
>> site_perl .) at t/SeqTools/Backtranslate.t line 9.
>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
>> 9.
>> # Looks like your test exited with 2 before it could output anything.
>> t/SeqTools/Backtranslate.t ................... Dubious, test  
>> returned 2 (wstat 512, 0x200)
>> Failed 8/8 subtests
>>
>> t/SeqTools/SeqPattern.t ...................... 1/28
>> #   Failed test 'use Bio::Tools::SeqPattern;'
>> #   at t/SeqTools/SeqPattern.t line 12.
>> #     Tried to use 'Bio::Tools::SeqPattern'.
>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
>> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
>> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
>> Backtranslate.pm line 22.
>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>> # Compilation failed in require at (eval 17) line 2.
>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 371.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 372.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>> #          got: 'A[][]H'
>> #     expected: 'A[EQ][DN]H'
>> "_reverse_translate_motif" is not exported by the  
>> Bio::Tools::SeqPattern::Backtranslate module
>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>> # Looks like you planned 28 tests but ran 9.
>> # Looks like you failed 4 tests of 9 run.
>> # Looks like your test exited with 255 just after 9.
>> t/SeqTools/SeqPattern.t ...................... Dubious, test  
>> returned 255 (wstat 65280, 0xff00)
>> Failed 23/28 subtests
>>
>>
>> -----------------------------------------------------------------------
>> Scott Cain, Ph. D. scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>> Ontario Institute for Cancer Research
>>


From maj at fortinbras.us  Fri Sep 18 11:55:47 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:55:47 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <72DA6CA1499D4F67909197901218A9FF@NewLife>

Hi Paola--
My research suggests that this is a "standard kludge" in the pdb format. A letter 
following a residue number is called an "insertion code" or "icode", and my 
understanding is that it allows residues to be inserted without 
upsetting the rest of the coordinates. (This is a feature, and not laziness, 
since people very quickly begin to refer to amino acid coordinates based on a 
reference sequence in an interesting region, and you can't easily say to the 
community, "hey, that's 22 now, not 20...")

Since it's standard, you should expect it. Bio::Structure handles the icode by 
creating the residue id as follows:

   #my $res_name_num = $resname."-".$resseq;
   my $res_name_num = $resname."-".$resseq;
   $res_name_num .= '.'.$icode if $icode;

so you can get back the residue's 3-letter name, its numerical position, and its 
insertion code by doing

 my ($name, $number, $icode) = $res->id =~ /(.*?)-([0-9]+)\.?([A-Z]?)/;

In this case, if the icode is not present, then $icode eq '' (not undef).
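
For what it's worth, here is a minimal sketch of putting those pieces back 
together in the form you asked for (ILE-105A rather than ILE-105.A). The helper 
name format_res_id is made up for this example and is not part of 
Bio::Structure; it works on the id string alone, so inside your loop you would 
pass it $res->id. The two ids at the bottom are invented test values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rebuild a residue id
# without the '.' that separates the number from the insertion code.
sub format_res_id {
    my ($id) = @_;
    my ($name, $number, $icode) = $id =~ /(.*?)-([0-9]+)\.?([A-Z]?)/
        or return $id;    # leave ids we don't recognize untouched
    return "$name-$number$icode";
}

# Made-up ids in the form described above, with and without an icode.
for my $id ('ILE-105.A', 'ILE-106') {
    print format_res_id($id), "\n";
}
```

Because $icode is '' when there is no insertion code, the same line handles 
both cases without a separate test.
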
Hope this helps-
Mark

----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a pdb file exactly, to extract 
the chain id and res id, but I found that in some pdb files the residue number is 
followed by a letter, probably because the residue was added by crystallographers 
and they didn't want to change the residue numbering in the sequence. One example 
is 1PXX.pdb, which I parsed with my script below. I didn't find any useful 
suggestion about this in the bioperl tutorial or in the online bioperl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;

# Download the PDB entry and save it locally.
my $urlpdb =
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "Could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;

# Write one "chain<TAB>residue id" line per residue.
my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc    = $structio->next_structure;
open my $out, '>>', '1pxx.parsed' or die $!;
for my $chain ($struc->get_chains) {
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain)) {
        my $resid = $res->id;
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but it gives me a file with an error at ILE 105A and ILE 2105C, because they have a 
letter that follows the residue number. Can I solve that problem without 
writing intermediate files?
I need to have the residue id as 105A, not 105.A,
so
A ILE-105A
without a point between the number and the letter.


Thank you all,

Paola




From abhishek.vit at gmail.com  Fri Sep 18 12:31:00 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Fri, 18 Sep 2009 12:31:00 -0400
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <be9b52410909180931w2951318eqfa01c109a032bf9d@mail.gmail.com>

I have negligible experience with ace but will be happy to do some
testing. Please let me know what code and functionality need
to be checked.

Cheers,
-Abhi

On Fri, Sep 18, 2009 at 12:09 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh


From vecchi.b at gmail.com  Fri Sep 18 12:44:37 2009
From: vecchi.b at gmail.com (Bruno Vecchi)
Date: Fri, 18 Sep 2009 09:44:37 -0700
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <E019D53941DD48E4B3294E113771B711@NewLife>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
	<E019D53941DD48E4B3294E113771B711@NewLife>
Message-ID: <1a0c1b750909180944p55b226cbi18e3c608f401d951@mail.gmail.com>

The second test ("Can't locate ok.pm in @INC...") can be fixed by
using use_ok('My::Module') instead of use ok 'My::Module' in the test
files.

I've had a few of those in the past, and that fix did the trick.
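
A minimal sketch of what I mean, with List::Util standing in for the module
under test. That choice is only for illustration: List::Util ships with Perl,
so this should run on a bare system.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# use_ok() comes with Test::More, so the test needs no ok.pm on the system.
# Wrapping it in BEGIN loads the module at compile time, as 'use' would.
BEGIN { use_ok('List::Util') }

# The module is loaded, so its functions can be called by full name.
is( List::Util::sum(1, 2, 3), 6, 'List::Util::sum works after use_ok' );
```

The plain 'use ok' form needs the separate ok.pm module to be installed,
which is what the "Can't locate ok.pm" message is complaining about.
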

Cheers,

Bruno.


2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>
>> Will have to see why List::MoreUtils is being used, but if it's necessary
>> it's an additional dep.
>
> I didn't do it, officer....
>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
>>
>>> With Chris trying to get a release out, I wanted to report these test
>>> failures from a fairly virgin system Ubuntu server 8.04.
>>>
>>> Scott
>>>
>>>
>>>
>>> t/SeqIO/raw.t ................................ 1/24 Can't locate
>>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-
>>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- live
>>> /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 5.8.8
>>> /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ perl/5.8
>>> /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>>> # Looks like you planned 24 tests but ran 1.
>>> # Looks like your test exited with 2 just after 1.
>>> t/SeqIO/raw.t ................................ Dubious, test returned 2
>>> (wstat 512, 0x200)
>>>
>>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in
>>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/
>>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/
>>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/
>>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ site_perl
>>> .) at t/SeqTools/Backtranslate.t line 9.
>>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
>>> # Looks like your test exited with 2 before it could output anything.
>>> t/SeqTools/Backtranslate.t ................... Dubious, test returned 2
>>> (wstat 512, 0x200)
>>> Failed 8/8 subtests
>>>
>>> t/SeqTools/SeqPattern.t ...................... 1/28
>>> #   Failed test 'use Bio::Tools::SeqPattern;'
>>> #   at t/SeqTools/SeqPattern.t line 12.
>>> #     Tried to use 'Bio::Tools::SeqPattern'.
>>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:
>>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ blib/arch
>>> /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 5.8.8
>>> /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /
>>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at
>>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/
>>> Backtranslate.pm line 22.
>>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>>> # Compilation failed in require at (eval 17) line 2.
>>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 371.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 372.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>>> #          got: 'A[][]H'
>>> #     expected: 'A[EQ][DN]H'
>>> "_reverse_translate_motif" is not exported by the
>>> Bio::Tools::SeqPattern::Backtranslate module
>>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>>> # Looks like you planned 28 tests but ran 9.
>>> # Looks like you failed 4 tests of 9 run.
>>> # Looks like your test exited with 255 just after 9.
>>> t/SeqTools/SeqPattern.t ...................... Dubious, test returned
>>> 255 (wstat 65280, 0xff00)
>>> Failed 23/28 subtests
>>>
>>>
>>> -----------------------------------------------------------------------
>>> Scott Cain, Ph. D. scott at scottcain dot net
>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>> Ontario Institute for Cancer Research


From dan.bolser at gmail.com  Fri Sep 18 12:54:36 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 17:54:36 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>

Please can you link to the bug that includes the code?


2009/9/18 Chris Fields <cjfields at illinois.edu>:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh


From dan.bolser at gmail.com  Fri Sep 18 13:09:09 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 18:09:09 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>

2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
> Dan -- I don't know much about Assembly, so can't help there. But can I
> encourage you and perhaps one or two others (steganographic content:
> fangly) to create a HOWTO stub out of this? Would be excellent-

I'd love to. ACE files are pretty ubiquitous, so any additional info on how
to work with them using BioPerl should help a lot of people.

The problem is that I'm one of those people ;-)


I'm working on an 'ace2tab.plx' script that should encompass this
info. I'm finding that some 'read ids' have the .range format, i.e.
"read123455.23-239". However, some do not, i.e. "read123456". I'm not sure
where this ID comes from, but I think it's telling me something about
partially aligned reads. The problem is that the coordinates I'm
seeing don't reflect that (they are just the start and the end point
of the full read).
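
To make the two ID shapes concrete, here is a rough sketch of how I'd split
them apart. The function name split_read_id is invented for this example, and
treating the suffix as a start-end range is only my guess at what the numbers
mean:

```perl
use strict;
use warnings;

# Hypothetical helper: split a read id such as "read123455.23-239" into
# (name, start, end). Ids without a ".start-end" suffix come back with
# start and end undefined.
sub split_read_id {
    my ($id) = @_;
    if ( $id =~ /^(.+)\.(\d+)-(\d+)$/ ) {
        return ( $1, $2, $3 );
    }
    return ( $id, undef, undef );
}

# The two example ids from above; 'NA' marks a missing range.
for my $id ( 'read123455.23-239', 'read123456' ) {
    my ( $name, $start, $end ) = split_read_id($id);
    print join( "\t",
        $name,
        defined $start ? $start : 'NA',
        defined $end   ? $end   : 'NA' ), "\n";
}
```

That at least gives one column layout for both kinds of id, whatever the range
turns out to mean.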

A 'proper' ace2tab script would be very nice.


> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE file?
>
>
>> Dear Perl Monkeys,
>>
>> I wrote a little demo script for Bio::Assembly::IO here:
>>
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>>
>>
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>>
>> In particular, I read about the 4 possible coordinates of a read in an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>>
>> Additionally, I'm not sure how to distinguish between the trimmed read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>>
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the quality
>> trimmed range of the read.
>>
>> The ACE file in question is produced by the gsMapper program, which is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>>
>>
>> Thanks very much for any help you can provide on any of the above issues.
>>
>> Sincerely,
>> Dan.


From cjfields at illinois.edu  Fri Sep 18 14:00:17 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:00:17 -0500
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <DCEC55AD-5B4E-42E6-9A7E-FB52E19EADA5@illinois.edu>

Agreed, and it may spur others to get involved, fix bugs, donate code,  
etc.

chris

On Sep 18, 2009, at 10:11 AM, Mark A. Jensen wrote:

> Dan -- I don't know much about Assembly, so can't help there. But  
> can I  encourage you and perhaps one or two others (steganographic  
> content: fangly) to create a HOWTO stub out of this? Would be  
> excellent-
> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE  
> file?
>
>
>> Dear Perl Monkeys,
>> I wrote a little demo script for Bio::Assembly::IO here:
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>> In particular, I read about the 4 possible coordinates of a read in  
>> an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>> Additionally, I'm not sure how to distinguish between the trimmed  
>> read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the  
>> quality
>> trimmed range of the read.
>> The ACE file in question is produced by the gsMapper program, which  
>> is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>> Thanks very much for any help you can provide on any of the above  
>> issues.
>> Sincerely,
>> Dan.


From cjfields at illinois.edu  Fri Sep 18 14:03:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:03:13 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
Message-ID: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>

Bug 2726

http://bugzilla.open-bio.org/show_bug.cgi?id=2726

chris

On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:

> Please can you link to the bug that includes the code?
>
>
> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>> Dan,
>>
>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>> have any
>> tests attached, but that could be easily fixed if anyone wanted to  
>> donate a
>> little time to getting them running.  My hands are a bit full with  
>> other
>> stuff for the release.
>>
>> We should have some ace files already to go in t/data somewhere if  
>> one were
>> so inclined to do that, BTW  ;>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>
>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Could you archive the files and attach them to a bug report (you  
>>>> can mark
>>>> it
>>>> as an enhancement request).  We can take a look.
>>>>
>>>> http://bugzilla.open-bio.org/
>>>
>>> Out of interest, has this been added? Where is it documented?
>>>
>>> Cheers,
>>> Dan.
>>>
>>>
>>>> chris
>>>>
>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>
>>>>> Chris et al. -
>>>>>
>>>>> A student and I have written code to do this - write ace files  
>>>>> as well
>>>>> as
>>>>> parse them one entry at a time.  In trying to use the  
>>>>> Assembly::IO as it
>>>>> was
>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>> entries
>>>>> because of file handle limit issues with the inherited  
>>>>> implementation
>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>> instead
>>>>> of
>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>> add it to
>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>> someone,
>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>> either,
>>>>> but
>>>>> it should be a good start.
>>>>>
>>>>> Josh


From e.osimo at gmail.com  Fri Sep 18 18:33:22 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sat, 19 Sep 2009 00:33:22 +0200
Subject: [Bioperl-l] Getting all annotations
Message-ID: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>

Hello,
I was trying to figure out how to get from the Entrez database all the
reference annotation for a given genomic zone.
For example: I want to know which genes, transcripts, microRNAs etc are
present in chr 6 from 100kbp to 200kbp.
Is there a database that is arranged as a continuum (by sequence) instead of
by feature (gene, transcript etc)?

Thanks
Emanuele


From florent.angly at gmail.com  Sat Sep 19 22:20:31 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 19 Sep 2009 19:20:31 -0700
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
Message-ID: <4AB5916F.1090104@gmail.com>

I suppose it is a good idea to wait until bioperl-live 1.6.1 is out 
before doing any significant work on the sequence assembly module.
Also, remember the assembly-related todo list: 
http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
Florent


Chris Fields wrote:
> Bug 2726
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>
> chris
>
> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>
>> Please can you link to the bug that includes the code?
>>
>>
>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>> Dan,
>>>
>>> No, it hasn't made it in.  Currently, the problem is it doesn't have 
>>> any
>>> tests attached, but that could be easily fixed if anyone wanted to 
>>> donate a
>>> little time to getting them running.  My hands are a bit full with 
>>> other
>>> stuff for the release.
>>>
>>> We should have some ace files already to go in t/data somewhere if 
>>> one were
>>> so inclined to do that, BTW  ;>
>>>
>>> chris
>>>
>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>
>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>
>>>>> Could you archive the files and attach them to a bug report (you 
>>>>> can mark
>>>>> it
>>>>> as an enhancement request).  We can take a look.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>
>>>> Out of interest, has this been added? Where is it documented?
>>>>
>>>> Cheers,
>>>> Dan.
>>>>
>>>>
>>>>> chris
>>>>>
>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>
>>>>>> Chris et al. -
>>>>>>
>>>>>> A student and I have written code to do this - write ace files as 
>>>>>> well
>>>>>> as
>>>>>> parse them one entry at a time.  In trying to use the 
>>>>>> Assembly::IO as it
>>>>>> was
>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>> entries
>>>>>> because of file handle limit issues with the inherited 
>>>>>> implementation
>>>>>> DB_File.  Our implementation simply reads one contig at a time 
>>>>>> instead
>>>>>> of
>>>>>> first trying to slurp the whole ace into memory.  I'm happy to 
>>>>>> add it to
>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>> someone,
>>>>>> could they help me get it into bioperl?  It may not be perfect 
>>>>>> either,
>>>>>> but
>>>>>> it should be a good start.
>>>>>>
>>>>>> Josh


From dan.bolser at gmail.com  Sun Sep 20 08:26:06 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 13:26:06 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <2c8757af0909200526u3bb1766eo5d316dc5d7a2e1a5@mail.gmail.com>

2009/9/20 Florent Angly <florent.angly at gmail.com>:

...

> Also, remember the assembly-related todo list:
> http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related

Thanks for that link Florent. It's great to see the wiki being put to
such good use in the context of OSS development! I need to make a
mental note - before posting, check the mailing list archives _and_
the wiki!

Cheers,
Dan.


> Florent
>
>
> Chris Fields wrote:
>>
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't have any
>>>> tests attached, but that could be easily fixed if anyone wanted to
>>>> donate a
>>>> little time to getting them running.  My hands are a bit full with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere if one
>>>> were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report (you can
>>>>>> mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files as
>>>>>>> well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the Assembly::IO as
>>>>>>> it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to add it
>>>>>>> to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>
>>>>
>>
>>
>
>


From cjfields at illinois.edu  Sun Sep 20 10:34:08 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 20 Sep 2009 09:34:08 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <F25C4EA4-1DB4-44F3-AB66-F58E6A90E302@illinois.edu>

Never hurts to get started, just make sure that there is a note  
indicating the status of Bio::Assembly.  In fact, the discussion page  
for it might make a good spot for Bio::Assembly design.

chris
On Sep 19, 2009, at 9:20 PM, Florent Angly wrote:

> I suppose it is a good idea to wait until bioperl-live 1.6.1 is out  
> before doing any significant work on the sequence assembly module.
> Also, remember the assembly-related todo list: http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
> Florent
>
>
> Chris Fields wrote:
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>>>> have any
>>>> tests attached, but that could be easily fixed if anyone wanted  
>>>> to donate a
>>>> little time to getting them running.  My hands are a bit full  
>>>> with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere  
>>>> if one were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report  
>>>>>> (you can mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files  
>>>>>>> as well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the  
>>>>>>> Assembly::IO as it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing  
>>>>>>> many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited  
>>>>>>> implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>>>> add it to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files  
>>>>>>> to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>
>>>>
>>
>>
>


From dan.bolser at gmail.com  Sun Sep 20 11:09:19 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 16:09:19 +0100
Subject: [Bioperl-l] Getting all annotations
In-Reply-To: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
References: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
Message-ID: <2c8757af0909200809g1f6c41eeyabfc8bdaac1fc19f@mail.gmail.com>

Hi Emanuele,

I guess you were Emos in irc://irc.freenode.net/#bioperl ?


I think the answer to your question can be found here:

http://www.biodas.org


All the best,
Dan.

2009/9/18 Emanuele Osimo <e.osimo at gmail.com>:
> Hello,
> I was trying to figure out how to get from the Entrez database all the
> reference annotation for a given genomic zone.
> For example: I want to know which genes, transcripts, microRNAs etc are
> present in chr 6 from 100kbp to 200kbp.
> Is there a database that is arranged as a continuum (by sequence) instead of
> by feature (gene, transcript etc)?
>
> Thanks
> Emanuele
>


From maj at fortinbras.us  Mon Sep 21 00:22:54 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 00:22:54 -0400
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>

Hello all,

As Brian articulated so well for many of us, 
the wiki main page is, well, butt-ugly.
Please check out the Main Page Beta at
http://www.bioperl.org/wiki/Main_Page_Beta
and respond to this thread or on the discussion 
page. 

cheers and thanks, 
MAJ


From bix at sendu.me.uk  Mon Sep 21 02:25:04 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 21 Sep 2009 07:25:04 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <4AB71C40.10902@sendu.me.uk>

Mark A. Jensen wrote:
> Hello all,
> 
> As Brian articulated so well for many of us, 
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion 
> page. 

I never thought the main page was 'butt-ugly' (rather, what I expect 
from a wiki), but, to put it bluntly, the graphical flourishes in your 
proposal are cringe-worthy. I couldn't do any better. I think for 
graphical things you'd need a professional graphics designer or similar.

The actual content and organisation of your version is probably an 
improvement though.


From rmb32 at cornell.edu  Mon Sep 21 03:40:31 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Mon, 21 Sep 2009 00:40:31 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB71C40.10902@sendu.me.uk>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk>
Message-ID: <4AB72DEF.2010008@cornell.edu>

Sendu Bala wrote:
> from a wiki), but, to put it bluntly, the graphical flourishes in your 
> proposal are cringe-worthy. I couldn't do any better. I think for 

I think what Sendu was trying to say is that he didn't like the gradient 
section heads?  There are only two graphical things on that page, and 
the other one is an enlargement of the existing logo, so I suppose 
that's what he means.

They're not my absolute favorite either, but I certainly wouldn't 
describe them as cringe-worthy!  :-P

Rob


From biopython at maubp.freeserve.co.uk  Mon Sep 21 05:45:48 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 10:45:48 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
Message-ID: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>

On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads?  There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter


From maj at fortinbras.us  Mon Sep 21 07:20:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:20:31 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
Message-ID: <22244A89D06E4F9B8D5F70A833E1C0DE@NewLife>

Hey, if Sendu cringed, he cringed. If I had one, I'd keep my 
day job. In the meantime, the graphics are removed. 
MAJ
----- Original Message ----- 
From: "Robert Buels" <rmb32 at cornell.edu>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 3:40 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> Sendu Bala wrote:
>> from a wiki), but, to put it bluntly, the graphical flourishes in your 
>> proposal are cringe-worthy. I couldn't do any better. I think for 
> 
> I think what Sendu was trying to say is that he didn't like the gradient 
> section heads?  There are only two graphical things on that page, and 
> the other one is an enlargement of the existing logo, so I suppose 
> that's what he means.
> 
> They're not my absolute favorite either, but I certainly wouldn't 
> describe them as cringe-worthy!  :-P
> 
> Rob
> 
> 
> 
>


From e.osimo at gmail.com  Mon Sep 21 07:35:00 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 21 Sep 2009 13:35:00 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <2ac05d0f0909210435k66bd0ed3x9fd13d9f4ec44634@mail.gmail.com>

I can say that, for a neophyte, the contents are a great improvement.
You can find with a lot more ease what you are searching for.

Emanuele

On Mon, Sep 21, 2009 at 06:22, Mark A. Jensen <maj at fortinbras.us> wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ
>


From maj at fortinbras.us  Mon Sep 21 07:32:08 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:32:08 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
Message-ID: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>

I'd appreciate those links, Peter- thanks
MAJ
----- Original Message ----- 
From: "Peter" <biopython at maubp.freeserve.co.uk>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 5:45 AM
Subject: Re: [Bioperl-l] a Main Page proposal


On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads? There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter



From pmiguel at purdue.edu  Mon Sep 21 08:01:03 2009
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 21 Sep 2009 08:01:03 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
	<2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
Message-ID: <4AB76AFF.7050902@purdue.edu>

Dan Bolser wrote:
> 2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>   
>> Dan -- I don't know much about Assembly, so can't help there. But can I
>>  encourage you and perhaps one or two others (steganographic content:
>> fangly) to create a HOWTO stub out of this? Would be excellent-
>>     
>
> I'd love to. ACE is pretty ubiquitous, so any additional info on how
> to work with them using BioPerl should help a lot of people.
>   
> The problem is that I'm one of those people ;-)
>
>
> I'm working on an 'ace2tab.plx' script that should encompass this
> info. I'm finding that some 'read ids' have the .range format. i.e.
> "read123455.23-239". However, some do not. i.e. "read123456". Not sure
> where this ID comes from, but I think it's telling me something about
> partially aligned reads. 

I think you are right. I have heard that Newbler (the 454 assembler) 
does this insane thing, where it will rip reads apart into segments and 
cluster parts of reads in different contigs.
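
For anyone who wants to separate the two kinds of read IDs quoted above,
here is a minimal sketch. It assumes that the part after the last dot is a
start-end range within the read; that interpretation is a guess from the
discussion, not something confirmed by the ACE format or by Newbler. The
helper name parse_read_id is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a read ID such as "read123455.23-239" into
# its base name and the (assumed) start/end range. IDs without a
# ".start-end" suffix come back with undefined start and end.
sub parse_read_id {
    my ($id) = @_;
    if ( $id =~ /^(.+)\.(\d+)-(\d+)$/ ) {
        return { name => $1, start => $2, end => $3 };
    }
    return { name => $id, start => undef, end => undef };
}

for my $id ( 'read123455.23-239', 'read123456' ) {
    my $r = parse_read_id($id);
    if ( defined $r->{start} ) {
        printf "%s: segment %d-%d of read %s\n",
          $id, $r->{start}, $r->{end}, $r->{name};
    }
    else {
        printf "%s: whole read\n", $id;
    }
}
```

Grouping the parsed records by name would then show which reads were split
into several segments, possibly across contigs.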

> The problem is that the coordinates I'm
> seeing don't reflect that (they are just the start and the end point
> of the full read).
>   

That sounds similar to how phrap/consed handle "chimeric" reads. But my 
experience is that phrap is pretty parsimonious with numbers of 
chimerics it will allow.  (That isn't entirely fair to Newbler -- I've 
never been able to get phrap to consistently assemble ESTs. Phrap seems 
tuned to assemble BAC shotgun reads. ESTs seem to drive it a little 
crazy. It will create contigs from a set of reads that have essentially 
no similarity to each other, nor to the consensus sequence phrap creates 
for them.)

-- 
Phillip


From hlapp at gmx.net  Mon Sep 21 08:22:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 21 Sep 2009 08:22:34 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <03B93F96-E28D-45CF-BD94-AD33634476AA@gmx.net>

What's probably worth looking at as an example is the gmod.org home  
page. Stylistically, one thing you want to get out of the way is the  
auto-generated TOC.

	-hilmar

On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Mon Sep 21 08:28:28 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 13:28:28 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
Message-ID: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>

Peter wrote:
>> We had some similar discussions about the Biopython wiki
>> based homepage - although our old one was nowhere near
>> as busy as the current BioPerl main page, it was still not as
>> welcoming as our current version *tries* to be.
>> ...
>> I can dig out links to our mailing list archive if anyone is
>> interested in the discussion.

On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>
> I'd appreciate those links, Peter- thanks
> MAJ

OK, here you are - this was most of it, I'd have to dig through
my old emails to see what else I can find:
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html

Remember Biopython went from a very minimal home page, to
something aiming to be more newcomer friendly. BioPerl on the
other hand seems to want to move away from the current very
text heavy information rich page to something more focused and
newcomer friendly. To me at least the current page is too dense,
intimidating, and the important bits get lost in all the content.

[My apologies if any of this feedback comes across too blunt.]

If you haven't already looked at them, you should check out the
other OBF project pages for ideas. The BioJava homepage is
also using the wiki - in my opinion it is a bit cluttered, but is
still more accessible than the current BioPerl page. Also,
the BioRuby page is very nice - although not wiki based.

Regards,

Peter


From mwachholtz at unomaha.edu  Thu Sep 17 20:31:13 2009
From: mwachholtz at unomaha.edu (Michael UNO)
Date: Thu, 17 Sep 2009 17:31:13 -0700 (PDT)
Subject: [Bioperl-l]  Genome Scanning Question
Message-ID: <25497856.post@talk.nabble.com>


What objects & methods could be used if I wanted to determine if a gene is
located at a specific location within a genome at the Ensembl database. For
example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
method that will simply tell me "yes, there is a gene at this location". And
can it tell what gene(s) are located at this coordinate?
-- 
View this message in context: http://www.nabble.com/Genome-Scanning-Question-tp25497856p25497856.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From sdavis2 at mail.nih.gov  Mon Sep 21 09:04:36 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 21 Sep 2009 09:04:36 -0400
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <264855a00909210604o826871dr7121e3f26c0e34aa@mail.gmail.com>

On Thu, Sep 17, 2009 at 8:31 PM, Michael UNO <mwachholtz at unomaha.edu> wrote:

>
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location".
> And
> can it tell what gene(s) are located at this coordinate?
>

There are a number of ways to go about this.

If you want to go with perl, object-oriented, and ensembl, check out:

http://www.ensembl.org/info/docs/api/core/core_tutorial.html

If you want to start with tab-delimited text files, check out downloading
the text files from the UCSC genome browser.
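
As a rough sketch of the text-file route: the helper below reports which
genes contain a given position. It assumes a simplified four-column,
tab-delimited layout (name, chrom, start, end) with 1-based inclusive
coordinates. That layout, the helper name genes_at_position and the example
rows are all made up for illustration; real downloaded tables have more
columns and may use a different coordinate convention, so map the columns
and check the table description before relying on it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of all genes whose interval
# contains the given position. Expects tab-delimited lines of
# name, chrom, start, end (assumed 1-based, inclusive).
sub genes_at_position {
    my ( $fh, $chrom, $pos ) = @_;
    my @hits;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^#/ || $line !~ /\S/;   # skip comments and blanks
        my ( $name, $gchrom, $start, $end ) = split /\t/, $line;
        push @hits, $name
          if $gchrom eq $chrom && $start <= $pos && $pos <= $end;
    }
    return @hits;
}

# Made-up example rows, not real annotation.
my $table = join "\n",
  "#name\tchrom\tstart\tend",
  "geneA\tchr15\t66400000\t66600000",
  "geneB\tchr15\t70000000\t70010000",
  "geneC\tchr13\t66500000\t66500500";

# Read the table from an in-memory string so the example is self-contained.
open my $fh, '<', \$table or die "Cannot open in-memory table: $!";
my @hits = genes_at_position( $fh, 'chr15', 66_500_123 );
close $fh;

my $msg = @hits
  ? "Genes at chr15:66500123: @hits\n"
  : "No gene at chr15:66500123\n";
print $msg;
```

For a real file, open it by name instead of the in-memory string; a linear
scan like this is fine for a handful of lookups, but many queries would be
better served by an index or a database.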

Sean


>


From cjfields at illinois.edu  Mon Sep 21 09:05:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 08:05:25 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
Message-ID: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>


On Sep 21, 2009, at 7:28 AM, Peter wrote:

> Peter wrote:
>>> We had some similar discussions about the Biopython wiki
>>> based homepage - although our old one was nowhere near
>>> as busy as the current BioPerl main page, it was still not as
>>> welcoming as our current version *tries* to be.
>>> ...
>>> I can dig out links to our mailing list archive if anyone is
>>> interested in the discussion.
>
> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>
>> I'd appreciate those links, Peter- thanks
>> MAJ
>
> OK, here you are - this was most of it, I'd have to dig through
> my old emails to see what else I can find:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>
> Remember Biopython went from a very minimal home page, to
> something aiming to be more newcomer friendly. BioPerl on the
> other hand seems to want to move away from the current very
> text heavy information rich page to something more focused and
> newcomer friendly. To me at least the current page is too dense,
> intimidating, and the important bits get lost in all the content.
>
> [My apologies if any of this feedback comes across too blunt.]

Not at all; I'm thinking the same thing.

> If you haven't already looked at them, you should check out the
> other OBF project pages for ideas. The BioJava homepage is
> also using the wiki - in my opinion it is a bit cluttered, but is
> still more accessible than the current BioPerl page. Also,
> the BioRuby page is very nice - although not wiki based.
>
> Regards,
>
> Peter

I think the Biopython layout is very nice and focused.  Maybe a bit  
too minimal, but then again I don't like scrolling up and down the  
page to find the relevant bits, so less may be better.

Reminds me of the simplified design on the perl6 main page (just don't  
stare at the hallucinogenic butterfly too long):

http://www.perl6.org/

So, maybe a structured layout with the most important links, and  
additional links on a separate page.

chris


From maj at fortinbras.us  Mon Sep 21 09:22:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 09:22:35 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <0F980234804C4B3EA08E810E043F2537@NewLife>

Ah! I don't need a degree in design, just a dose of whatever Madame Butterfly 
was taking!
(Erdos had it right...)

----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't  stare 
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
>
> 


From biopython at maubp.freeserve.co.uk  Mon Sep 21 09:58:21 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 14:58:21 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <320fb6e00909210658n70f96727g1eb190579a746cfa@mail.gmail.com>

On Mon, Sep 21, 2009 at 2:05 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I think the Biopython layout is very nice and focused.  Maybe
> a bit too minimal, but then again I don't like scrolling up and
> down the page to find the relevant bits, so less may be better.

Yes, trying to get everything on one screen was deliberate
(and works for most screen sizes).

> Reminds me of the simplified design on the perl6 main page
> (just don't stare at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links,
> and additional links on a separate page.

Butterflies aside, yes - that is what we tried to do on the
Biopython page - just provide an "abstract", and links to
get people to the main content.

Peter


From ak at ebi.ac.uk  Mon Sep 21 10:06:44 2009
From: ak at ebi.ac.uk (Andreas Kähäri)
Date: Mon, 21 Sep 2009 15:06:44 +0100
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <20090921140644.GB12734@qux.windows.ebi.ac.uk>

On Thu, Sep 17, 2009 at 05:31:13PM -0700, Michael UNO wrote:
> 
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location". And
> can it tell what gene(s) are located at this coordinate?

Here's a basic script to do something like what you want to do, for a
specific species, chromosome, and region:

#!/usr/bin/perl -w

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  '-host' => 'ensembldb.ensembl.org',
  '-user' => 'anonymous'
);

my $species = 'Dog';

my ( $chrname, $chrstart, $chrend ) = ( '13', 40_500_000, 41_000_000 );

my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );

my $slice =
  $slice_adaptor->fetch_by_region( 'Chromosome', $chrname, $chrstart,
  $chrend );

my @genes = @{ $slice->get_all_Genes() };

if ( !@genes ) {
  print("No genes on that interval\n");
} else {
  printf( "%d genes on the interval:\n", scalar(@genes) );
  foreach my $gene (@genes) {
    printf(
      "%s (%s) [%s,%s,%s]\n",
      $gene->stable_id(), $gene->external_name() || 'No external name',
      $gene->start(), $gene->end(), $gene->strand() );
  }
}
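
The script above queries an interval; for the single-coordinate case in the
question (e.g. Chr15:66,500,123), a slice whose start and end are equal should
cover exactly one base. The following is a minimal sketch under that
assumption, not a tested Ensembl recipe. The helper names parse_coordinate and
genes_at are hypothetical, and $slice_adaptor is assumed to be the object
returned by get_adaptor( $species, 'Core', 'Slice' ) as in the script above.

```perl
use strict;
use warnings;

# Split a browser-style coordinate such as "Chr15:66,500,123" into
# a chromosome name and a 1-based position. (Hypothetical helper.)
sub parse_coordinate {
    my ($coord) = @_;
    my ( $chr, $pos ) = $coord =~ /^(?:chr)?([^:]+):([\d,]+)$/i
        or die "Cannot parse coordinate '$coord'\n";
    $pos =~ tr/,//d;    # drop thousands separators
    return ( $chr, $pos + 0 );
}

# Return the genes overlapping a single base. (Hypothetical helper.)
# $slice_adaptor is assumed to come from
# $registry->get_adaptor( $species, 'Core', 'Slice' ).
sub genes_at {
    my ( $slice_adaptor, $coord ) = @_;
    my ( $chr, $pos ) = parse_coordinate($coord);

    # A slice with identical start and end covers exactly one base.
    my $slice =
      $slice_adaptor->fetch_by_region( 'chromosome', $chr, $pos, $pos );
    return $slice ? @{ $slice->get_all_Genes() } : ();
}

# The parsing step runs without any database connection:
printf "%s:%d\n", parse_coordinate('Chr15:66,500,123');

# With a live adaptor, usage would look like:
#   my @genes = genes_at( $slice_adaptor, 'Chr15:66,500,123' );
#   print @genes ? "yes, there is a gene here\n" : "no gene here\n";
```

Each returned gene can then be printed with stable_id() and external_name()
exactly as in the loop above.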


Are you aware of the ensembl-dev mailing list and of the ensembl
helpdesk at helpdesk at ensembl.org (or via the "he!p" button in the genome
browser itself)?


Regards,
Andreas


-- 
Andreas Kähäri, Ensembl Software Developer            ()[]()[]
European Bioinformatics Institute (EMBL-EBI)          []()[]()
Wellcome Trust Genome Campus, Hinxton                 ()[]()[]
Cambridge CB10 1SD, United Kingdom                    []()[]()


From bosborne11 at verizon.net  Mon Sep 21 09:15:03 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 21 Sep 2009 09:15:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7E8EC05A-ED60-4F70-850D-16DD7E037281@verizon.net>

Mark,

That's nice! I wonder if we can move some content up-top, on the  
right, for less scrolling. I will play with this later today...

Brian O.


On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From anupam.contact at gmail.com  Mon Sep 21 10:18:52 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Mon, 21 Sep 2009 19:48:52 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
In-Reply-To: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
References: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
Message-ID: <82ec54570909210718v180f604btc835d88f2a9ec2fd@mail.gmail.com>

On Fri, Sep 18, 2009 at 8:50 PM, anupam sinha <anupam.contact at gmail.com>wrote:

> Dear all,
>                  I have installed the BioPerl-1.6.0.tar.gz and
> Bioperl-run-1.6.0.tar.gz on a Fedora 7 system. I am trying to run *
> /usr/bin/bp_pairwise_kaks.pl* script but keep on getting this error :
>
> *Must have bioperl-run pkg installed to run this script at
> /usr/bin/bp_pairwise_kaks.pl line 69*.
>
> I have installed the run package from Bioperl, though. Can anyone help me
> out? Thanks in advance.
>
>
>
> Regards,
>
>
> Anupam Sinha
>


From maj at fortinbras.us  Mon Sep 21 10:49:25 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 10:49:25 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>

Please view the latest 
http://www.bioperl.org/wiki/Main_Page_Beta
No graphics. I incline towards more text, but you
already knew that.
MAJ


From David.Messina at sbc.su.se  Mon Sep 21 13:03:56 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 21 Sep 2009 19:03:56 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
Message-ID: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>

Hi Mark,
Thanks for taking on this (much needed) refresh.

I think your current version is substantially better than what we have now.
Still, I'd argue that something much more concise like the Biopython page
would make a bigger impact on visitors' ability to find what they're looking
for.

It's not that the details you have under each section shouldn't be
available, but rather that they could be clicked through to instead of being
on the front page.

The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, and that section has the most valuable real estate on the page.
Those who don't know and are curious will probably be able to find it (the
word About on the front page of a website has become an idiom for "click here
to read the details about this").


Dave


From cjfields at illinois.edu  Mon Sep 21 13:42:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 12:42:10 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <5C240DA6-6B3D-4E64-A8BC-1FBC90FFA471@illinois.edu>

On Sep 21, 2009, at 12:03 PM, Dave Messina wrote:

> Hi Mark,
> Thanks for taking on this (much needed) refresh.
>
> I think your current version is substantially better than what we  
> have now.
> Still, I'd argue that something much more concise like the Biopython  
> page
> would make a bigger impact on visitors' ability to find what they're  
> looking
> for.
>
> It's not that the details you have under each section shouldn't be
> available, but rather that they could be clicked through to instead  
> of being
> on the front page.
>
> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, and that section has the most valuable real estate on  
> the page.
> Those who don't know and are curious will probably be able to find  
> it (the
> word About on the front page of a website has become an idiom for  
> "click here
> to read the details about this").
>
>
>
> Dave

How about this version (it's on my talk page):

http://www.bioperl.org/wiki/User_talk:Cjfields

chris


From maj at fortinbras.us  Mon Sep 21 13:45:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 13:45:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <42FBB964C0EA44FABCB50364C567A009@NewLife>

A nearly completely minimal solution is at Main Page Beta


From armendarez77 at hotmail.com  Mon Sep 21 17:01:12 2009
From: armendarez77 at hotmail.com (armendarez77 at hotmail.com)
Date: Mon, 21 Sep 2009 14:01:12 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
Message-ID: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>


Hello,

Is there a function to blast one query sequence against multiple blast databases?  For example, I want to blast a sequence against all Microbial Genomes.  Currently, I can do it by placing multiple Microbial databases (eg. Microbial/100226, Microbial/101510, etc) into an array and iterate through them using a foreach loop.  Each individual database is placed in the '-data' parameter and the blast is performed.

Example Code:

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;

my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690 Microbial/1063);
my $prog  = 'blastn';    # assumed value; set this to the BLAST program you need
my $e_val = '1e-3';

foreach my $db (@microbDbs) {
  my @params = ( '-prog'       => $prog,
                 '-data'       => $db,
                 '-expect'     => $e_val,
                 '-readmethod' => 'xml' );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
  my $v = 1;
  my $str = Bio::SeqIO->new( -file => 'test.fa', '-format' => 'fasta' );
  while ( my $input = $str->next_seq() ) {
    my $r = $factory->submit_blast($input);

    #Code continues...

  }
}

Is there a more efficient way to accomplish this?

If this topic has been discussed please point the way.

Thank you,

Veronica

 		 	   		  


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:10:56 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:10:56 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>

You may need to set up BLAST locally (not a big job), as I don't think you can blast against multiple databases with B:T:R:RemoteBlast.
Or you could do it manually on NCBI's site, where you can filter results by Entrez query (e.g. 1239[taxid] for Firmicutes): http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query

--Russell


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bill at genenformics.com  Mon Sep 21 18:21:26 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Mon, 21 Sep 2009 15:21:26 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
Message-ID: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>

BLAST DBs can be concatenated into a single target (.nal or .pal) file.

Check this out:

http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
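
As a rough illustration of the alias-file idea, a nucleotide alias file is a
small text file whose DBLIST line names the databases to search together. The
sketch below writes one from Perl. The file name, title, and database names
are hypothetical, and it assumes the listed databases already exist locally
(built with formatdb) in the same directory.

```perl
use strict;
use warnings;

# Write a nucleotide alias file (<name>.nal) that groups existing local
# BLAST databases under one name. (Hypothetical helper.)
sub write_alias_file {
    my ( $name, $title, @dbs ) = @_;
    my $file = "$name.nal";
    open( my $out, '>', $file ) or die "Cannot write $file: $!";
    print {$out} "TITLE $title\n";
    print {$out} "DBLIST @dbs\n";    # database names separated by spaces
    close $out or die "Cannot close $file: $!";
    return $file;
}

# Hypothetical database names; replace with your own formatted databases.
my $file = write_alias_file( 'microbial_all', 'Combined microbial genomes',
    qw(db_100226 db_101510 db_103690) );
print "wrote $file\n";
```

A local blastall run could then name microbial_all as its database; protein
databases would use a .pal file instead.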

Bill



From Russell.Smithies at agresearch.co.nz  Mon Sep 21 18:48:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:48:26 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>

That doesn't work with remote databases though.
B:T:R:RemoteBlast uses the QBlast API (I think) so you're limited to the prebuilt databases NCBI offers.
http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html 

Another thing to try is space-separating your db list - I know it works with local blasts.
You could also bypass RemoteBlast and do it yourself by POSTing via URL.

This seems to work with multiple databases but you'd need to experiment:

http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microbial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&EXPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides&CMD=Put
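
A URL like the one above can be assembled in Perl rather than by hand. This is
only a sketch of the idea: the helper names url_escape and build_put_url are
hypothetical, the parameter set is a reduced version of the URL above, and
whether the server really searches all listed databases still needs the
experimenting mentioned above.

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved URL characters and "/".
sub url_escape {
    my ($text) = @_;
    $text =~ s{([^A-Za-z0-9\-._~/])}{sprintf( '%%%02X', ord $1 )}ge;
    return $text;
}

# Build a QBlast "Put" URL that names several databases in a single
# DATABASE parameter: a quoted, space-separated list, as in the URL above.
sub build_put_url {
    my ( $query, @dbs ) = @_;
    my @params = (
        [ QUERY        => $query ],
        [ DATABASE     => '"' . join( ' ', @dbs ) . '"' ],
        [ PROGRAM      => 'blastn' ],
        [ HITLIST_SIZE => 10 ],
        [ EXPECT       => 10 ],
        [ FORMAT_TYPE  => 'Text' ],
        [ CMD          => 'Put' ],
    );
    return 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?'
      . join( '&', map { $_->[0] . '=' . url_escape( $_->[1] ) } @params );
}

print build_put_url( '257700677', qw(Microbial/100226 Microbial/101510) ),
  "\n";
```

The quotes and spaces come out as %22 and %20, matching the hand-written URL.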


--Russell




From Russell.Smithies at agresearch.co.nz  Mon Sep 21 19:04:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 11:04:54 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA19@exchsth.agresearch.co.nz>

If you want to "manually" use Perl and QBlast, here's some example code.
I don't remember where it came from but it works well  :-)

**Ignore the UserAgent stuff, our firewall is fairly well tied down.

--Russell

============================

#!perl -w
$| = 1;

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

$ua = LWP::UserAgent->new;
push @{ $ua->requests_redirectable }, 'POST';   #LWP doesn't redirect by default
$ua->agent('Mozilla/5.0');

#$ua->proxy( [ 'http', 'ftp' ] => 'http://username:password@your.proxy.if.required:8080' );

my $verbose = 1;
my $seq     = getSequence();
my ( $blast, $taxonomy ) = queryQBlast($seq);
$verbose && print "saving result\n";
saveToFile( $blast,    "blast.txt" );
saveToFile( $taxonomy, "taxonomy.html" );
$verbose && print "Done.\n";

sub queryQBlast {
  my ($seq) = @_;
  $seq =~ s/[\d\n\W]//g;
  my $sleepTime          = 0;
  my $sleepTimeIncrement = 5;
  my $totalSleepTime     = 0;
  my $maxSleepTime       = 600;    # 10 min
  my ( $rid, $rtoe ) = startQBlast($seq);
  my ( $blast, $taxonomy );

  while ( !$blast ) {
    $verbose && printf "wait %3d seconds\n", $sleepTime;
    sleep $sleepTime;
    $totalSleepTime += $sleepTime;    # count the time actually spent waiting
    ( $blast, $taxonomy ) = retrieveQBlastResult($rid);
    $sleepTime += $sleepTimeIncrement unless ( $sleepTime > 100 );
    last if ( $totalSleepTime > $maxSleepTime );
  }
  return ( $blast, $taxonomy );
}

sub startQBlast {
  my ($sequence) = @_;
  my ( $expect, $wsize, $filter, $mega );
  my $hitList = 100;
  if ( length($sequence) <= 20 ) {
    $expect = 1000;
    $wsize  = 7;
    $mega   = "on";
    $filter = "";
  }
  else {
    $expect = 10;
    $wsize  = 28;
    $mega   = "on";
    $filter = "L";    # Low complexity
  }
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL . "QUERY=$sequence";
  $url .=
"&DATABASE=nr&HITLIST_SIZE=${hitList}&FILTER=${filter}&EXPECT=${expect}&FORMAT_TYPE=Text";
  $url .=
    "&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides";
  $url .= "&SHOW_OVERVIEW=&WORD_SIZE=${wsize}&MEGABLAST=${mega}&CMD=Put";
  my $req = HTTP::Request->new( GET => $url );
  my $content = $ua->request($req)->content;
  $content =~ s/\s+/ /g;
  my ( $rid, $rtoe ) = $content =~
    /QBlastInfoBegin RID = ([\d\-\.\w]+) RTOE = (\d+) QBlastInfoEnd/;
  if ( !$rid ) { print qq{\nERROR missing RID:\n}; exit; }
  $verbose && print "RID $rid\n";
  return ( $rid, $rtoe );
}

sub retrieveQBlastResult {
  my ($rid)     = @_;
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL
    . "RID=$rid&CMD=Get&SHOW_OVERVIEW=&SHOW_LINKOUT=&FORMAT_TYPE=Text";
  my ( $blast, $taxonomy, $req );
  $req = HTTP::Request->new( GET => $url );
  $blast = $ua->request($req)->content;
  if ( $blast =~ /\s+Status=WAITING/ ) {
    $blast = "";
  }
  elsif ( $blast =~ /\s+Status=UNKNOWN/ ) {
    print "Error in processing\nRID $rid\n";
    exit;
  }
  else {
    $verbose && print "got blast result\n";
    $verbose && print "retrieving taxonomy data\n";
    $url = $qblastURL . "CMD=Get&RID=$rid&FORMAT_OBJECT=TaxBlast&NCBI_GI=on";
    $req = HTTP::Request->new( GET => $url );
    $taxonomy = $ua->request($req)->content;
    $taxonomy = "" if ( $taxonomy =~ /No valid taxids found in the alignment/ );
  }
  return ( $blast, $taxonomy );
}

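sub saveToFile {
  my ( $data, $file ) = @_;
  # three-argument open with a lexical filehandle; stop if the file cannot be written
  open( my $out, '>', $file ) or die "Cannot write $file: $!";
  print {$out} $data;
  close $out;
}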

sub getSequence {
  return qq{
AAAGGATTTATTGACGATGCGAACTACTCCGTTGGCCTGTTGGATGAAGGAACAAA
CCTTGGAAATGTTATTGATAACTATGTTTATGAACATACCCTGACAGGAAAAAATGCAT
TTTTTGTGGGGGATCTTGGGAAGATCGTGAAGAAGCACAGTCAGTGGCAGACCGTGGTG
GCTCAGATAAAGCCGTTTTACACGGTGAAGTGCAACTCCACTCCAGCCGTGCTTGAGAT
CTTGGCAGCTCTTGGAACTGGGTTTGCTTGTTCCAGCAAAAATGAAATGGCTTTAGTGC
AAGAATTGGGTGTATCTCCAGAAAACATCATTTTCACAAGTCCTTGTAAGCAAGTGTCT
CAGATAAAGTATGCAGCAAAAGTTGGAGTAAATATTATGACATGTGACAATGAGATTGA
ATTAAAGAAAATTGCAAGGAATCACCCAAATGCCAAGGTCTTACTACATATTGCAACAG
AAGATAATATTGGAGGTGAAGATGGTAACATGAAGTTTGGCACTACACTGAAGAATTGT
AGGCATCTTTTGGAATGTGCCAAGGAACTTGATGTCCAAATAATTGGGGTTAAATTTCA
TGTTTCAAGTGCTTGCAAAGAATATCAAGTATATGTACATGCCCTGTCTGATGCTCGAT
GTGTGTTTGACATGGCTGGAGAGTTTGGCTTTACAATGAACATGTTAGACATCGGTGGA
GGCTTCACAGGAACTGAAATTCAGTTGGAAGAGGTTAATCATGTTATCAGTCCTCTGTT
GGATATTTACTTCCCTGAAGGATCTGGCATTCAGATAATTTCAGAACCTGGAAGCTACT
ATGTATCTTCTGCGTTTACACTTGCAGTCAATATTATTGCTAAGAAAGTTGTTGAAAAT
GATAAATTTTCCTCTGGAGTAGAAAAAAATGGGAGTGATGAGCCAGCCTTCGTGTATTA
CATGAATGATGGTGTTTATGGTTCTTTTGCGAGTAAGCTTTCTGAGGACTTAAATACCA
TTCCAGAGGTTCACAAGAAATACAAGGAAGATGAGCCTCTGTTTACAAGCAGCCTTTGG
GGTCCATCCTGTGATGAGCTTGATCAAATTGTGGAAAGCTGTCTTCTTCCTGAGCTGAA
TGTGGGAGATTGGCTTATCTTTGATAACATGGGAGCAGATTCTTTCCACGAACCATCTG
CTTTTAATGATTTTCAGAGGCCAGCTATTTATTTCATGATGTCATTCAGTGATTGGTAT
GAGATGCAAGATGCTGGAATTACTTCAGATGCAATGATGAAAAACTTCTTCTTTGCACC
CTCTTGTATTCAGCTGAGCCAAGAAGACAGCTTTTCCACTGAAGCT};
}

================================
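
The step in startQBlast that pulls the request ID out of the server reply can
be tried on its own, without touching the network. The sketch below isolates
it; the helper name parse_qblast_info is hypothetical and the sample reply is
made up, shaped only after the pattern the script matches, so a real reply may
look different.

```perl
use strict;
use warnings;

# Extract the request ID (RID) and the estimated wait in seconds (RTOE)
# from the text of a QBlast "Put" reply. Returns an empty list on failure.
sub parse_qblast_info {
    my ($content) = @_;
    $content =~ s/\s+/ /g;    # collapse line breaks so one pattern matches
    my ( $rid, $rtoe ) = $content =~
      /QBlastInfoBegin RID = ([\w.\-]+) RTOE = (\d+) QBlastInfoEnd/;
    return ( $rid, $rtoe );
}

# Made-up reply fragment for illustration only.
my $sample = <<'END';
<!--QBlastInfoBegin
    RID = ABC123XYZ
    RTOE = 12
QBlastInfoEnd
-->
END

my ( $rid, $rtoe ) = parse_qblast_info($sample);
print defined $rid
  ? "RID $rid, wait about $rtoe s\n"
  : "no RID found\n";
```

Checking for a missing RID before polling avoids waiting on a request that
was never accepted.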

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Smithies, Russell
> Sent: Tuesday, 22 September 2009 10:48 a.m.
> To: 'bill at genenformics.com'
> Cc: 'bioperl-l at lists.open-bio.org'; 'armendarez77 at hotmail.com'
> Subject: Re: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> 
> That doesn't work with remote databases though.
> B:T:R:RemoteBlast uses the QBlast API (I think) so you're limited to the
> prebuilt databases NCBI offers.
> http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
> 
> Another thing to try is space-seperating your db list - I know it works with
> local blasts.
> You could also bypass RemoteBlast and do it yourself by POSTing via URL.
> 
> This seems to work with multiple databases but you'd need to experiment:
> 
> http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microb
> ial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&E
> XPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&P
> AGE=Nucleotides&CMD=Put
> 
> 
> --Russell
> 
> 
> > -----Original Message-----
> > From: bill at genenformics.com [mailto:bill at genenformics.com]
> > Sent: Tuesday, 22 September 2009 10:21 a.m.
> > To: Smithies, Russell
> > Cc: 'armendarez77 at hotmail.com'; 'bioperl-l at lists.open-bio.org'
> > Subject: Re: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> >
> > BLAST DBs can be concatenated into a single target (.nal or .pal) file.
> >
> > Check this out:
> >
> > http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
> >
> > Bill
> >
> > > You may need to setup blast locally (not a big job) as I don't think you
> > > can blast against multiple databases with B:T:R:RemoteBlast.
> > > Or you could do it manually on NCBI's site where you can filter results by
> > > entrez query (eg. 1239[taxid] for fermicutes)
> > > http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
> > >
> > > --Russell
> > >
> > >
> > >> -----Original Message-----
> > >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > >> bounces at lists.open-bio.org] On Behalf Of armendarez77 at hotmail.com
> > >> Sent: Tuesday, 22 September 2009 9:01 a.m.
> > >> To: bioperl-l at lists.open-bio.org
> > >> Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> Hello,
> > >>
> > >> Is there a function to blast one query sequence against multiple blast
> > >> databases?  For example, I want to blast a sequence against all
> > >> Microbial
> > >> Genomes.  Currently, I can do it by placing multiple Microbial databases
> > >> (eg.
> > >> Microbial/100226, Microbial/101510, etc) into an array and iterate
> > >> through
> > >> them using a foreach loop.  Each individual database is placed in the
> > >> '-data'
> > >> parameter and the blast is performed.
> > >>
> > >> Example Code:
> > >>
> > >> use strict;
> > >> use Bio::SeqIO;
> > >> use Bio::Tools::Run::RemoteBlast;
> > >>
> > >> my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690
> > >> Microbial/1063);
> > >> my $prog = 'blastn';  # assumed value; $prog was not set in the posted snippet
> > >> my $e_val= '1e-3';
> > >>
> > >> foreach my $db(@microbDbs){
> > >>   my @params = ( '-prog' => $prog,
> > >>                          '-data' => $db,
> > >>                          '-expect' => $e_val,
> > >>                          '-readmethod' => 'xml' );
> > >>
> > >>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> > >>   my $v = 1;
> > >>   my $str = Bio::SeqIO->new(-file=>'test.fa' , '-format' => 'fasta' );
> > >>   while (my $input = $str->next_seq()){
> > >>     my $r = $factory->submit_blast($input);
> > >>
> > >>     #Code continues...
> > >>
> > >>   }
> > >> }
> > >>
> > >> Is there a more efficient way to accomplish this?
> > >>
> > >> If this topic has been discussed please point the way.
> > >>
> > >> Thank you,
> > >>
> > >> Veronica
> > >>
> > >>
> > >> _______________________________________________
> > >> Bioperl-l mailing list
> > >> Bioperl-l at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > > =======================================================================
> > > Attention: The information contained in this message and/or attachments
> > > from AgResearch Limited is intended only for the persons or entities
> > > to which it is addressed and may contain confidential and/or privileged
> > > material. Any review, retransmission, dissemination or other use of, or
> > > taking of any action in reliance upon, this information by persons or
> > > entities other than the intended recipients is prohibited by AgResearch
> > > Limited. If you have received this message in error, please notify the
> > > sender immediately.
> > > =======================================================================
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > >
> >
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
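
The URL quoted at the top of this message passes several databases in one
request by putting a quoted, space-separated list in the DATABASE parameter.
A minimal core-Perl sketch of building such a URL follows. The helper names
(escape_param, multi_db_blast_url) are made up for illustration, the
parameter values are copied from that URL, and the sketch does not check
whether the NCBI server accepts the request.

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved characters and '/',
# which the quoted URL leaves unescaped inside the database names.
sub escape_param {
    my ($text) = @_;
    $text =~ s{([^A-Za-z0-9\-._~/])}{sprintf '%%%02X', ord $1}ge;
    return $text;
}

# Build a Blast.cgi "Put" URL that names several databases at once.
sub multi_db_blast_url {
    my ($query_gi, @dbs) = @_;

    # The databases travel as one quoted, space-separated value.
    my $db_value = '"' . join(' ', @dbs) . '"';

    # Ordered key/value list, mirroring the parameters of the quoted URL.
    my @params = (
        QUERY        => $query_gi,
        DATABASE     => $db_value,
        HITLIST_SIZE => 10,
        FILTER       => 'L',
        EXPECT       => 10,
        FORMAT_TYPE  => 'HTML',
        PROGRAM      => 'blastn',
        CLIENT       => 'web',
        SERVICE      => 'plain',
        NCBI_GI      => 'on',
        PAGE         => 'Nucleotides',
        CMD          => 'Put',
    );

    my @pairs;
    while (my ($key, $value) = splice(@params, 0, 2)) {
        push @pairs, "$key=" . escape_param($value);
    }
    return 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?' . join('&', @pairs);
}

print multi_db_blast_url(
    257700677,
    qw(Microbial/100226 Microbial/101510 Microbial/103690),
), "\n";
```

Keeping the parameters in an ordered list rather than a hash means the
output can be compared character for character with a URL copied from the
browser.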


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 16:51:51 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 08:51:51 +1200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>

Here are a few comments to ignore at will :-)

How about using a different default skin so it doesn't look like all the other installations of MediaWiki?
I've attached a screenshot of one of my wikis using the "Daddio" skin but a bit of crafty CSS can do wonders.
Also, there's a lot of duplication: most of the links on MediaWiki:Sidebar also appear in the main page content.
The "Treeview" extension is also nice for tidying up complex menus: http://semeb.com/dpldemo/index.php?title=Treeview_extension

I've got a bit of experience with wikis and extensions (we use LOTS of extensions) so let me know if there's anything you need.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Monday, 21 September 2009 4:23 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] a Main Page proposal
> 
> Hello all,
> 
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
> 
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: daddio.png
Type: image/png
Size: 51263 bytes
Desc: daddio.png
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090922/643d7f79/attachment-0003.png>

From cjfields at illinois.edu  Mon Sep 21 23:38:18 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:38:18 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
Message-ID: <B9C1E8A4-BDE0-45E7-858B-8BFABA1D2480@illinois.edu>

Russell, Mark,

It would be nice to change the background, just don't want it to be  
too distracting.

Also (I mentioned this to Mark off-list), I think the sidebar could be
cleaned up considerably, but not until this becomes the default.  I
also like the use of the TreeView extension, very nice!  Does anyone
have privs for the wiki to test it out?

chris

On Sep 21, 2009, at 3:51 PM, Smithies, Russell wrote:

> Here's a few comments to ignore at will :-)
>
> How about using a different default skin so it doesn't look like all  
> the other installations of MediaWiki?
> I've attached a screenshot of one of my wikis using the "Daddio"  
> skin but a bit of crafty CSS can do wonders.
> Also, there's a lot of duplication with most of the links on  
> Mediawiki:Sidebar also appearing on the main page content.
> The "Treeview" is a nice extension as well for tidying up complex  
> menus http://semeb.com/dpldemo/index.php?title=Treeview_extension
>
> I've got a bit of experience with wikis and extensions (we use LOTS  
> of extensions) so let me know if there's anything you need.
>
> --Russell
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
>> Sent: Monday, 21 September 2009 4:23 p.m.
>> To: BioPerl List
>> Subject: [Bioperl-l] a Main Page proposal
>>
>> Hello all,
>>
>> As Brian articulated so well for many of us,
>> the wiki main page is, well, butt-ugly.
>> Please check out the Main Page Beta at
>> http://www.bioperl.org/wiki/Main_Page_Beta
>> and respond to this thread or on the discussion
>> page.
>>
>> cheers and thanks,
>> MAJ
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> <daddio.png>_______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Mon Sep 21 23:56:58 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:56:58 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
Message-ID: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>

Just a note that the second alpha is out and propagating its way
around the intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This should address the bugs reported by Scott from the last release.   
Just a note, but I am seeing a warning popping up with 64-bit perl  
5.10.1 on Mac with PopGen tests (I think it's a floating point  
addition issue).  Let me know if this is popping up elsewhere.

Enjoy!

chris


From jcline at ieee.org  Mon Sep 21 23:59:09 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 21 Sep 2009 22:59:09 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
Message-ID: <4AB84B8D.5080005@ieee.org>

Throwing this out there:

- there should be a screenshot section (whatever that means for bioperl)

- the grammar of the beta page should be more correct.

"Welcome to BioPerl, a community effort to produce Perl code which is
useful in biology. "
==> "Welcome to BioPerl, a community effort to produce Perl code serving
as a useful tool in the field of Biology."

>>The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, ...  Dave<<


Most good software front pages say, in a couple sentences, "what it is
and what it's for", including pictures (as screenshots).

I would bet a ton of visitors don't know what bioperl is, or what it is
used for, or how it can benefit.  There is likely a metric for this (web
stats) as the ratio of new page visits that bounce away vs. new
clickthrus from the front page to the download or docs section.   i.e. a
visitor found the page and didn't continue reading.  I don't really know
all the things bioperl is good for and I've been reading about it here &
there for a while.

I like the following from the About and I believe it fits well on a
front page, expanding "toolkit" to "software library":

"What is Bioperl? It is an open source bioinformatics software library
used by researchers all over the world. If you're looking for a script
built to fit your exact needs you probably won't find it in Bioperl.
What you will find is a diverse set of Perl modules that will enable you
to write your own script, and a community of people who are willing to
help you. "

The old school definition of software library is something like: "useful
routines which can be used by an application (& not itself an
application)" which is basically the description above.

I also like the intro from wikipedia, which I found more informative
about bioperl, and would be good for a front page:

"BioPerl [1] is a collection of Perl modules that facilitate the
development of Perl scripts for bioinformatics applications. It has
played an integral role in the Human Genome Project.[2]  It is an active
open source software project supported by the Open Bioinformatics
Foundation.  In order to take advantage of BioPerl, the user needs a
basic understanding of the Perl programming language including an
understanding of how to use Perl references, modules, objects and methods."

The screenshots could also include pics of books on bioperl or perl+bio,
that would be neat.  (Tisdall's book comes to mind here)


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From lelbourn at science.mq.edu.au  Tue Sep 22 01:05:28 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 15:05:28 +1000
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB36451.3030207@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
Message-ID: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>

Hi Roy,

Thanks for that, it works well, but there are no _gsf_tag_hash values.
I'm particularly interested in the locus id. Obviously the translation
could be problematic if the whole gene is not included after
truncation, but things like the note, product and protein_id would be
good. I had a look at the code for the method and couldn't see any
obvious reason why those values didn't make it across. Should I submit
this as a bug, or is there something I'm missing?


Regards,
Liam.


On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:

> Hi Liam,
>
> I just discovered your message, which has not yet been replied to.  
> What you require has been discussed in a recent thread:
> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>
> Try using trunc_with_features from Bio::SeqUtils:
>
> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
> Cheers.
> Roy.
>
> Liam Elbourne wrote:
>> Hi All,
>> Is there a method or methodology that will produce a fully fledged  
>> Seq  object with all the associated metadata given a start and end   
>> position? To clarify, I create a sequence object from a genbank file:
>> ****
>> my $io  = Bio::SeqIO->new(as per usual);
>> my $seqobj = $io->next_seq();
>> ****
>> I now want:
>> my $sub_seqobj = $seqobj between 300 and 2000
>> where $sub_seqobj is a Seq object (which I appreciate is an   
>> 'aggregate' of objects) too. The "trunc" method only returns a   
>> PrimarySeq object which lacks all the annotation etc. I've  
>> previously  done this task by iterating through feature by feature  
>> and parsing out  what I needed, but thought there might be a more  
>> elegant approach...
>> Regards,
>> Liam Elbourne.
>
> -- 
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________

Dr Liam Elbourne
Research Fellow (Bioinformatics)
Paulsen Laboratory
Macquarie University
Sydney
Australia.

http://www2.oxfam.org.au/trailwalker/Sydney/team/228


From roy.chaudhuri at gmail.com  Tue Sep 22 03:17:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 08:17:26 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
Message-ID: <4AB87A06.4000209@gmail.com>

Hi Liam,

Yes, that is a bug. I think it is to do with the Feature Annotation
rollback from 1.6; it works fine with 1.5.2. It looks like the tests I
wrote only check the coordinates of the feature, not the presence of
tags, so this hasn't been picked up. Submit it to Bugzilla, and I'll
take a look when I get a chance.

Cheers.
Roy.

Liam Elbourne wrote:
> Hi Roy,
> 
> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
> particularly interested in the locus id, obviously the translation could 
> be problematic if the whole gene is not included after truncation, but 
> things like the note, product, protein_id would be good. I had a look at 
> the code for the method and couldn't see any obvious why those values 
> didn't make it across. Should I submit this as a bug, or is there 
> something I'm missing?
> 
> 
> Regards,
> Liam.
> 
> 
> 
> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
> 
>> Hi Liam,
>>
>> I just discovered your message, which has not yet been replied to. 
>> What you require has been discussed in a recent thread:
>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>
>> Try using trunc_with_features from Bio::SeqUtils:
>>
>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>> Cheers.
>> Roy.
>>
>> Liam Elbourne wrote:
>>> Hi All,
>>> Is there a method or methodology that will produce a fully fledged 
>>> Seq  object with all the associated metadata given a start and end 
>>>  position? To clarify, I create a sequence object from a genbank file:
>>> ****
>>> my $io  = Bio::SeqIO->new(as per usual);
>>> my $seqobj = $io->next_seq();
>>> ****
>>> I now want:
>>> my $sub_seqobj = $seqobj between 300 and 2000
>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>> previously  done this task by iterating through feature by feature 
>>> and parsing out  what I needed, but thought there might be a more 
>>> elegant approach...
>>> Regards,
>>> Liam Elbourne.
>>
>> -- 
>> Dr. Roy Chaudhuri
>> Department of Veterinary Medicine
>> University of Cambridge, U.K.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> ______________________________
> 
> Dr Liam Elbourne
> Research Fellow (Bioinformatics)
> Paulsen Laboratory
> Macquarie University
> Sydney
> Australia.
> 
> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
> 
> 
> 
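
A small way to reproduce the missing-tags report, for anyone attaching a
test case to the Bugzilla entry: the sketch below builds a toy sequence in
memory, truncates it with trunc_with_features as in the quoted call, and
lists whatever tags survive on each feature. The sequence, the feature
coordinates and the tag values (DEMO_0001 and so on) are invented for
illustration; what gets printed depends on the installed BioPerl version.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqUtils;

# Toy 2400 bp sequence with one tagged CDS feature (all values made up).
my $seqobj = Bio::Seq->new(
    -id       => 'demo',
    -seq      => 'ACGT' x 600,
    -alphabet => 'dna',
);
$seqobj->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start       => 500,
        -end         => 900,
        -strand      => 1,
        -primary_tag => 'CDS',
        -tag         => {
            locus_tag => 'DEMO_0001',
            product   => 'hypothetical protein',
        },
    )
);

# Same call as in the thread: keep positions 300..2000 plus features.
my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);

# Report each surviving feature together with its tags, if any.
for my $feat ($sub_seqobj->get_SeqFeatures) {
    my @tags = sort $feat->get_all_tags;
    my $tag_text = @tags
        ? join(', ', map { "$_=" . join('|', $feat->get_tag_values($_)) } @tags)
        : '(none)';
    printf "%s %d..%d tags: %s\n",
        $feat->primary_tag, $feat->start, $feat->end, $tag_text;
}
```

If the tags line reads "(none)" on 1.6 but lists locus_tag and product on
1.5.2, that would match the behaviour described above.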


From lelbourn at science.mq.edu.au  Tue Sep 22 03:14:44 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 17:14:44 +1000
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
Message-ID: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>

So I also had no problem running the code as written by Jose (Bioperl  
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.  
Work is underway to correct them, but do not expect this code to give  
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it  
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it  
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change  
your life....

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  
> older code didn't recognize 'jukes-cantor' but you can try the  
> abbreviation 'jc' -- better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  
> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  
> function) rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  
>> upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  
>> reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix- 
>> >print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  
>> following message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html 
>> , as it is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________
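
On the "unblessed reference" question quoted above: the ARRAY(0x859f08)
output suggests that distance() handed back a plain array reference on that
installation rather than a matrix object, and method calls only work on
blessed references. Below is a minimal core-Perl sketch of how to tell the
cases apart before calling a method, with Data::Dumper to look inside. The
describe() helper and the My::Matrix class are stand-ins invented for the
example, not BioPerl names.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use Data::Dumper;

# Say what kind of value we were handed.
sub describe {
    my ($thing) = @_;
    return 'undef' unless defined $thing;
    return 'object of class ' . blessed($thing) if blessed($thing);
    return 'plain ' . ref($thing) . ' reference' if ref($thing);
    return 'plain scalar';
}

# Stand-ins: a bare array of arrays, and the same data wrapped in an object.
my $plain  = [ [ 0, 0.5 ], [ 0.5, 0 ] ];
my $object = bless { values => $plain }, 'My::Matrix';  # hypothetical class

print describe($plain),  "\n";
print describe($object), "\n";

# Data::Dumper shows the structure either way.
$Data::Dumper::Indent = 1;
print Dumper($plain);
```

A guard such as "die unless blessed($jcmatrix)" right after the distance()
call turns the confusing method-call error into an explicit one.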


From maj at fortinbras.us  Tue Sep 22 07:12:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 07:12:38 -0400
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>

Thanks Liam-- I think the discrepancy between dnadist and the
module is worth a bug report. Can you file one and include the
data (or part of it) you were using?
Jason, is that work really underway, or should someone pick up
that ball?
----- Original Message ----- 
From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>; "Jose ." 
<joseguillin at hotmail.com>
Sent: Tuesday, September 22, 2009 3:14 AM
Subject: [Bioperl-l] dnastatistics


So I also had no problem running the code as written by Jose (Bioperl
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.
Work is underway to correct them, but do not expect this code to give
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change
your life....

Regards,
Liam.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
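
For a bug report it may help to have a hand-computed reference value next
to the dnadist and DNAStatistics numbers. Here is a minimal core-Perl sketch
of the textbook Jukes-Cantor distance, d = -3/4 * ln(1 - 4/3 * p), where p
is the proportion of differing sites. Skipping every column that contains a
gap or a non-ACGT symbol is an assumption of this sketch; dnadist and the
module may treat such columns differently, which could itself account for
part of a discrepancy. The two strings are the first 28 columns of
sequences A and B from the thread.

```perl
use strict;
use warnings;

# Proportion of differing sites between two aligned sequences.
# Columns with a gap or a non-ACGT symbol in either sequence are skipped.
sub p_distance {
    my ($seq_a, $seq_b) = @_;
    die "aligned sequences must have equal length\n"
        unless length($seq_a) == length($seq_b);

    my ($compared, $different) = (0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $x = uc substr($seq_a, $i, 1);
        my $y = uc substr($seq_b, $i, 1);
        next unless $x =~ /^[ACGT]$/ && $y =~ /^[ACGT]$/;
        $compared++;
        $different++ if $x ne $y;
    }
    die "no comparable columns\n" unless $compared;
    return $different / $compared;
}

# Jukes-Cantor correction; only defined for p < 0.75.
sub jukes_cantor {
    my ($p) = @_;
    die "p-distance too large for Jukes-Cantor\n" if $p >= 0.75;
    return 0 if $p == 0;
    return -0.75 * log(1 - 4 * $p / 3);
}

my $p = p_distance(
    'GGTTATCTCAACAACTGTCACC--GTGG',
    'GGATATCTCGACAACTTTTAGC--CTGG',
);
printf "p = %.4f, JC distance = %.4f\n", $p, jukes_cantor($p);
```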


From dan.bolser at gmail.com  Tue Sep 22 09:09:50 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 14:09:50 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
Message-ID: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>

Hi all,

I'm reading in a blasttable format blast result, filtering, and hoping
to write out a similarly formatted result. Based on experience with
SeqIO, I expected to do something like the following:

use Bio::SearchIO;

## Open the sequence search report
my $seqI = Bio::SearchIO->
  new( -file   => $file,
       -format => $format,
     );

## Open the output report
my $seqO = Bio::SearchIO->
  new( -file   => ">OUTPUT",
       -format => $format,
     );

while( my $result = $seqI->next_result ) {
  ## Do some filtering...

  $seqO->write_result( $result );
}


However, the above method does not work here. Is this for some deep
reason, or could the above method (based on the way SeqIO works) be
made to work? I'm guessing that the SearchIO object conversion is
simply harder to do than with SeqIO?

So now I'm trying to use the correct method, via
Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
1 to 1 correspondence between the fields in the blasttable and the
columns provided by the writer. So far I have something like this:

blasttable ->             HSPTableWriter

(result) query_name       query_name
(hit) name                hit_name
(hsp) frac_identical      frac_identical_query?
                          frac_identical_hit?
(hsp) hsp_length          length_aln_query?
                          length_aln_hit?
(?) mismatches            ?
(hsp) gaps                ?
                          gaps_query?
                          gaps_hit?
                          gaps_total?
(hsp) start('query')      start_query
(hsp) end('query')        end_query
(hsp) start('hit')        start_hit
(hsp) end('hit')          end_hit
(hsp) significance        expect
(hsp) bits                bits


For (hsp) frac_identical, it seems as if the (undocumented)
frac_identical_total column is giving the right value. However, it's
hard to be certain because the format of the value is different
(the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
change the output format of HSPTableWriter?

Is there any improvement on the above mapping? It seems strange that I
can read in a blasttable, but I can't write one out (using a generic
object interface). For example, where do I get the hsp length from
(which column)?

I'm sure this has come up before, so apologies for not being able to
track down the appropriate docs.


Thanks for any help,
Dan.

P.S. when dumping a blasttable from a blasttable using HSP methods,
how should I calculate the number of mismatches? Currently I'm trying:

      my $len = $hsp->length;
      my $match = $len * $hsp->frac_identical;
      my $mismatch = $len - $match;

but the resulting values differ from those in the original blasttable.
I have the feeling this is a FAQ ...
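
For illustration, here is a minimal sketch of writing one blasttable-style
line by hand once the values have been pulled from the result, hit and HSP
objects. It uses plain Perl only. The hash keys and the helper name
blasttable_line are hypothetical (they are not BioPerl API), and the column
order assumes the usual 12-column tabular BLAST layout (query, subject,
percent identity, alignment length, mismatches, gap openings, query
start/end, subject start/end, e-value, bit score).

```perl
use strict;
use warnings;

# Assumed column order of the 12-column tabular BLAST format.
# The key names are made up for this sketch.
my @BLASTTABLE_COLUMNS = qw(
    query_name hit_name percent_identity aln_length mismatches gap_openings
    query_start query_end hit_start hit_end evalue bits
);

# Build one tab-separated line from a hash of already-extracted values.
sub blasttable_line {
    my (%hsp) = @_;
    my @missing = grep { !defined $hsp{$_} } @BLASTTABLE_COLUMNS;
    die "missing fields: @missing\n" if @missing;

    my %out = %hsp;
    # Percent identity is printed with two decimals (e.g. 93.51).
    $out{percent_identity} = sprintf '%.2f', $hsp{percent_identity};
    return join "\t", @out{@BLASTTABLE_COLUMNS};
}

# Hypothetical values, not taken from any real report.
print blasttable_line(
    query_name  => 'q1',  hit_name  => 'h1',
    percent_identity => 93.51, aln_length => 77,
    mismatches  => 5,     gap_openings => 0,
    query_start => 1,     query_end => 77,
    hit_start   => 10,    hit_end   => 86,
    evalue      => '2e-35', bits    => 140,
), "\n";
```

The e-value is passed through as a string so that whatever formatting the
original report used is preserved.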


From cjfields at illinois.edu  Tue Sep 22 10:00:44 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:00:44 -0500
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
Message-ID: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>

On Sep 22, 2009, at 8:09 AM, Dan Bolser wrote:

> Hi all,
>
> I'm reading in a blasttable format blast result, filtering, and hoping
> to write out a similarly formatted result. Based on experience with
> SeqIO, I expected to do something like the following:
>
> use Bio::SearchIO;
>
> ## Open the sequence search report
> my $seqI = Bio::SearchIO->
> new( -file   => $file,
>      -format => $format,
>    );
>
> ## Open the output report
> my $seqO = Bio::SearchIO->
> new( -file   => ">OUTPUT",
>      -format => $format,
>    );
>
> while( my $result = $seqI->next_result ) {
> ## Do some filtering...
>
> $seqO->write_result( $result );
> }
>
>
> However, the above method does not work here. Is this for some deep
> reason, or could the above method (based on the way SeqIO works) be
> made to work? I'm guessing that the SearchIO object conversion is
> simply harder to do than with SeqIO?

This is something Jason could probably speak up on, but from my
perspective it comes down to 'why?'.  This opens up a very
hard-to-implement door (converting to and from, for instance, BLAST to
HMMER), which doesn't make sense from the end-user perspective.  What
most users want out of those formats is to get at the data in an easily
accessible way, to further process them (filter, convert to GFF, etc.),
or to have them summarized.  The Writer classes take care of the latter.

There is a very generic, all-purpose write_result in Bio::SearchIO
that just calls a ResultWriter object (and dies if one isn't
present).  Note that this expects a ResultWriter, not a Hit/HSPWriter;
it is write_result() after all.  I think this kind of goes against the
well-established API that exists with the other write_foo
implementations for the IO classes, where the input/output format
should match, but there you have it.

> So now I'm trying to use the correct method, via
> Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
> 1 to 1 correspondence between the fields in the blasttable and the
> columns provided by the writer. So far I have something like this:
>
> blasttable ->		HSPTableWriter
>
> (result) query_name	query_name
> (hit) name		hit_name
> (hsp) frac_identical	frac_identical_query?
> 			frac_identical_hit?
> (hsp) hsp_length	length_aln_query?
> 			length_aln_hit?
> (?) mismatches		?
> (hsp) gaps		?
> 			gaps_query?
> 			gaps_hit?
> 			gaps_total?
> (hsp) start('query')	start_query
> (hsp) end('query')	end_query
> (hsp) start('hit')	start_hit
> (hsp) end('hit')	end_hit
> (hsp) significance	expect
> (hsp) bits		bits
>
>
> For (hsp) frac_identical, it seems as if the (undocumented)
> frac_identical_total column is giving the right value. However, it's
> hard to be certain because the format of the value is different
> (the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
> change the output format of HSPTableWriter?

Not sure but it appears hard-coded.  This could probably be rewritten  
to spit out certain data attributes by name (e.g. you could ask for  
percent_identity), but I'm not sure.
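
As a small sketch of the formatting difference only: the same ratio printed
as a percentage with two decimals looks like the blasttable value, while the
two-decimal fraction looks like the HSPTableWriter value. The helper name
percent_identity and the counts (72 identical positions over 77 columns) are
hypothetical. In a real script the fraction would presumably come from
something like $hsp->frac_identical('total'), which is an assumption about
the HSP interface to check against the POD.

```perl
use strict;
use warnings;

# Turn a count of identical positions and an alignment length into a
# percentage string with two decimals (blasttable style).
sub percent_identity {
    my ($identical, $aln_length) = @_;
    die "alignment length must be positive\n" unless $aln_length > 0;
    return sprintf '%.2f', 100 * $identical / $aln_length;
}

# Hypothetical HSP: 72 identical positions over 77 alignment columns.
print percent_identity(72, 77), "\n";   # prints 93.51
printf "%.2f\n", 72 / 77;               # prints 0.94
```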

> Is there any improvement on the above mapping? It seems strange that I
> can read in a blasttable, but I can't write one out (using a generic
> object interface). For example, where do I get the hsp length from
> (which column)?
>
> I'm sure this has come up before, so apologies for not being able to
> track down the appropriate docs.

 From the POD:

'Here are the columns that can be specified in the -columns
parameter when creating a HSPTableWriter object.  If a -columns
parameter is not specified, this list, in this order, will be used
as the default.'

In other words, you keep track of the columns (which appear 1-based).

> Thanks for any help,
> Dan.
> P.S. when dumping a blasttable from a blasttable using HSP methods,
> how should I calculate the number of mismatches? Currently I'm trying:
>
>     my $len = $hsp->length;
>     my $match = $len * $hsp->frac_identical;
>     my $mismatch = $len - $match;
>
> but the resulting values differ from those in the original blasttable.
> I have the feeling this is a FAQ ...

Maybe use seq_inds instead?

BTW, HSP length() defaults to the 'total' length (includes gaps).  The
above calculation doesn't account for that.

With seq_inds, 'mismatch' positions are residue-only (no gaps); 'no_match'
is mismatched residues + gaps (you also have to indicate whether this is
based on the query or hit).

Also note that seq_inds deals with (1) mapping differences, e.g. any  
query that requires translation, and (2) frameshifts, such as from  
FASTX/Y output (again translated sequence output).  If you are dealing  
with a translated sequence you will want to account for those bits as  
well.
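
To make the gap issue concrete, here is a rough plain-Perl sketch (no BioPerl
calls) that classifies the columns of a pairwise alignment given as two
gapped strings. It assumes '-' is the gap character and that the mismatch
count wanted is "mismatched residues, gaps excluded". The function name
classify_columns and the example strings are made up for illustration.

```perl
use strict;
use warnings;

# Count identical, mismatched and gapped columns in a pairwise alignment.
# Both strings must be the aligned (gapped) sequences of equal length.
sub classify_columns {
    my ($query, $hit) = @_;
    die "aligned strings must have equal length\n"
        unless length($query) == length($hit);

    my %count = (identical => 0, mismatch => 0, gap => 0);
    for my $i (0 .. length($query) - 1) {
        my $q = substr $query, $i, 1;
        my $h = substr $hit,   $i, 1;
        if    ($q eq '-' or $h eq '-') { $count{gap}++ }        # gap in either sequence
        elsif (uc $q eq uc $h)         { $count{identical}++ }  # same residue
        else                           { $count{mismatch}++ }   # residue vs. different residue
    }
    return \%count;
}

my $c = classify_columns('ACGT-ACGTA', 'ACCTTAC-TA');
printf "identical=%d mismatch=%d gap=%d\n",
    @{$c}{qw(identical mismatch gap)};   # identical=7 mismatch=1 gap=2
```

Here the total length is 10, so "length minus identical" gives 3, while the
residue-only mismatch count is 1. That difference (the 2 gapped columns) is
the kind of discrepancy described above.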

chris


From cjfields at illinois.edu  Tue Sep 22 10:20:47 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:20:47 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB84B8D.5080005@ieee.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
Message-ID: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>

On Sep 21, 2009, at 10:59 PM, Jonathan Cline wrote:

> Throwing this out there:
>
> - there should be a screenshot section (whatever that means for  
> bioperl)

The only area that would apply is for Gbrowse/Bio::Graphics.  For much  
of the rest that's a bit trickier, but it's possible.

> - the grammar of the beta page should be more correct.
>
> "Welcome to BioPerl, a community effort to produce Perl code which is
> useful in biology. "
> ==> "Welcome to BioPerl, a community effort to produce Perl code  
> serving
> as useful tool in the field of Biology."
>
>>> The About section is a good example. I would bet most visitors to  
>>> the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, ...  Dave<<
>
> Most good software front pages say, in a couple sentences, "what it is
> and what it's for", including pictures (as screenshots).

Right.

> I would bet a ton of visitors don't know what bioperl is, or what it  
> is
> used for, or how it can benefit.  There is likely a metric for this  
> (web
> stats) as the ratio of new page visits that bounce away vs. new
> clickthrus from the front page to the download or docs section.    
> i.e. a
> visitor found the page and didn't continue reading.  I don't really  
> know
> all the things bioperl is good for and I've been reading about it  
> here &
> there for a while.
>
> I like the following from the About and I believe it fits well on a
> front page, expanding "toolkit" to "software library":
>
> "What is Bioperl? It is an open source bioinformatics software library
> used by researchers all over the world. If you're looking for a script
> built to fit your exact needs you probably won't find it in Bioperl.
> What you will find is a diverse set of Perl modules that will enable  
> you
> to write your own script, and a community of people who are willing to
> help you. "
>
> The old school definition of software library is something like:  
> "useful
> routines which can be used by an application (& not itself an
> application)" which is basically the description above.
>
> I also like the intro from wikipedia, which I found more informative
> about bioperl, and would be good for a front page:
>
> "BioPerl [1] is a collection of Perl modules that facilitate the
> development of Perl scripts for bioinformatics applications. It has
> played an integral role in the Human Genome Project.[2]  It is an  
> active
> open source software project supported by the Open Bioinformatics
> Foundation.  In order to take advantage of BioPerl, the user needs a
> basic understanding of the Perl programming language including an
> understanding of how to use Perl references, modules, objects and  
> methods."
>
> The screenshots could also include pics of books on bioperl or perl 
> +bio,
> that would be neat.  (Tisdall's book comes to mind here)

I tend to agree here, but Tisdall only discusses BioPerl in detail in  
the second book (Mastering Perl for Bioinformatics).  I think we're  
safe as long as we indicate that, just don't want to run into a  
situation like the recent issue that some users had with Gentleman's  
'R for Bioinformatics' book released last year.

I don't think it was intentional, but a lot of users purchased it  
thinking it would be a BioConductor book, mainly b/c it was advertised  
on the BioConductor website.  Unfortunately it had very little to do  
with BioC (or bioinformatics, really), and the reviews of the book  
reflect that.  It's unfortunate, as I found it to be a pretty good  
book on R.

-c

> ## Jonathan Cline
> ## jcline at ieee.org
> ## Mobile: +1-805-617-0223
> ########################


From cjfields at illinois.edu  Tue Sep 22 11:53:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 10:53:13 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
In-Reply-To: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
References: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
Message-ID: <2ED641E3-F69E-4513-B261-0949FDE35EBB@illinois.edu>

And just as quickly, I'm getting back lots of reports from CPAN Testers
indicating more problems.  Some can be ignored (they appear to be due to
the local perl testing environment, so are local to the tester).  The
following are the most significant; it appears a hard-coded
SeqFeature_SQLite.t got bundled in somehow, so I'll drop an alpha 3 shortly.

chris

#   Failed test 'use Bio::SeqFeature::Annotated;'
#   at t/Annotation/Annotation.t line 23.
#     Tried to use 'Bio::SeqFeature::Annotated'.
#     Error:  Can't locate URI/Escape.pm in @INC (@INC contains: t/ 
lib . /Users/david/cpantesting/perl-5.10.1/.cpan/build/ 
BioPerl-1.6.0._2-QVXU9n/blib/lib /Users/david/cpantesting/ 
perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n/blib/arch /Users/david/ 
cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n /sw/lib/ 
perl5 /sw/lib/perl5/darwin /Users/david/cpantesting/perl-5.10.1/lib/ 
5.10.1/darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/ 
lib/5.10.1 /Users/david/cpantesting/perl-5.10.1/lib/site_perl/5.10.1/ 
darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/lib/ 
site_perl/5.10.1) at Bio/SeqFeature/Annotated.pm line 100.
# BEGIN failed--compilation aborted at Bio/SeqFeature/Annotated.pm  
line 100.
# Compilation failed in require at (eval 60) line 2.
# BEGIN failed--compilation aborted at (eval 60) line 2.
# Looks like you failed 1 test of 159.
t/Annotation/Annotation.t ....................
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/159 subtests
	(less 12 skipped subtests: 146 okay)


t/LocalDB/SeqFeature.t ....................... ok
DBD::SQLite::db prepare_cached failed: near "INDEXED": syntax error(1)  
at dbdimp.c line 271 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1678.

-------------------- EXCEPTION --------------------
MSG: near "INDEXED": syntax error(1) at dbdimp.c line 271
STACK Bio::DB::SeqFeature::Store::DBI::mysql::_prepare Bio/DB/ 
SeqFeature/Store/DBI/mysql.pm:1678
STACK Bio::DB::SeqFeature::Store::DBI::SQLite::_features Bio/DB/ 
SeqFeature/Store/DBI/SQLite.pm:665
STACK Bio::DB::SeqFeature::Store::get_features_by_attribute Bio/DB/ 
SeqFeature/Store.pm:961
STACK toplevel t/LocalDB/SeqFeature.t:135
-------------------------------------------
# Looks like you planned 69 tests but only ran 40.
# Looks like your test died just after 40.
t/LocalDB/SeqFeature_SQLite.t ................
Failed 29/69 subtests


On Sep 21, 2009, at 10:56 PM, Chris Fields wrote:

> Just a note that the second alpha is out and propagating its way
> around the intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This should address the bugs reported by Scott from the last  
> release.  Just a note, but I am seeing a warning popping up with 64- 
> bit perl 5.10.1 on Mac with PopGen tests (I think it's a floating  
> point addition issue).  Let me know if this is popping up elsewhere.
>
> Enjoy!
>
> chris


From jason at bioperl.org  Tue Sep 22 12:01:51 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:01:51 -0700
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
	<39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
Message-ID: <1027EFFB-18B5-446B-A5B0-9DA628EEEF08@bioperl.org>

Someone should pick up the ball.

On Sep 22, 2009, at 4:12 AM, Mark A. Jensen wrote:

> Thanks Liam-- I think the discrepancy between dnadist and the
> module is worth making a bug report for- can you do that and
> include the data (or part of it) you were using?
> Jason, is that work really underway, or should someone pick up
> that ball?
> ----- Original Message ----- From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
> To: "Jason Stajich" <jason at bioperl.org>
> Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>;  
> "Jose ." <joseguillin at hotmail.com>
> Sent: Tuesday, September 22, 2009 3:14 AM
> Subject: [Bioperl-l] dnastatistics
>
>
> So I also had no problem running the code as written by Jose (Bioperl
> 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
>
> "The routines are not well tested and do contain errors at this point.
> Work is underway to correct them, but do not expect this code to give
> you the right answer currently!"!
>
> So I'm using dnadist (as I think the documentation recommends), and it
> does produce different numbers to $stats->distance(-).
>
> I tried write_matrix from Bio::Matrix::IO - got a message saying it
> hasn't been implemented yet?
>
> And if Jose hasn't already found it, try Data::Dumper; it will change
> your life....
>
> Regards,
> Liam.
>
> On 15/09/2009, at 3:54 AM, Jason Stajich wrote:
>
>> Yeah it seems like more of a bioperl problem -- possible that the   
>> older code didn't recognize 'jukes-cantor' but you can try the   
>> abbreviation 'jc' --  better to just upgrade tho!
>>
>> This isn't the cause of the problem but I would also encourage use
>> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'
>> function) rather than print_matrix on the matrix itself.
>>
>> -jason
>> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>>
>>> Hi Jose--
>>> I don't get any problem with your script as written. You should   
>>> upgrade to
>>> BioPerl 1.6 and try again.
>>> The "unblessed reference" is $jcmatrix. It may be undef for some   
>>> reason.
>>> MAJ
>>> ----- Original Message ----- From: "Jose ."  
>>> <joseguillin at hotmail.com>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Monday, September 14, 2009 8:48 AM
>>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix-
>>> >print_matrix;
>>>
>>>
>>>
>>>
>>>
>>> Hello,
>>>
>>> I'm trying to use Bio::Align::DNAStatistics, but I get the   
>>> following message:
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Other modules do work, such as Bio::SimpleAlign.
>>>
>>>
>>>
>>>
>>> My code is basically a modification of the code I found at http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html
>>> and is as follows:
>>>
>>> use strict;
>>> use Bio::AlignIO;
>>> use Bio::Align::DNAStatistics;
>>>
>>>
>>> my $stats = Bio::Align::DNAStatistics->new();
>>>
>>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>>                         -format => 'fasta');
>>> my $aln = $alignin->next_aln;
>>>
>>> my $jcmatrix = $stats-> distance (-align => $aln,
>>>               -method => 'Jukes-Cantor');
>>>
>>> print $jcmatrix->print_matrix;
>>>
>>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>>
>>>> A
>>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>>> GTCAGCGTAGGCCTAGACGGCT
>>>
>>>> B
>>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>>> ATAAGAGTAGGTCGGGATGGCA
>>>
>>>> C
>>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>>> GTATGTGCAGGTCGAAACGAGT
>>>
>>>> D
>>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>>> CTAAGAGTAGCTCGACACGGCT
>>>
>>>
>>>
>>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>>
>>> Moreover, if I write
>>>       print $jcmatrix;
>>> instead of
>>>       print $jcmatrix->print_matrix;
>>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>>
>>> So my question is:
>>>
>>> Why do I have an unblessed reference?
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Thank you very much in advance.
>>>
>>> Jose G.
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>
> ______________________________
>
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Tue Sep 22 12:07:14 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:07:14 -0700
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
Message-ID: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>

>>
>>
>> However, the above method does not work here. Is this for some deep
>> reason, or could the above method (based on the way SeqIO works) be
>> made to work? I'm guessing that the SearchIO object conversion is
>> simply harder to do than with SeqIO?
>
> This is something Jason could probably speak up on, but from my  
> perspective it comes down to 'why?'.  This opens up a very hard-to- 
> implement door (converting to and from, for instance, BLAST to  
> HMMER), which doesn't make sense from the end-user perspective.   
> What most users want out of those formats is getting at the data in  
> an easily accessible way, to further process them (filter, to GFF,  
> etc), or to have them summarized.  the Writer classes take care of  
> the latter.
>


> There is a very generic, all-purpose write_result in Bio::SearchIO  
> that just calls the a ResultWriter object (and dies if it isn't  
> present).  Note that this expects a ResultWriter, not a Hit/ 
> HSPWriter; it is write_result() after all. I think this kind of goes  
> against the well-established API that exists with the other  
> write_foo implementations for the IO classes, where the input/output  
> format should match, but there you have it.
>

Dan -
I'm confused about what you are trying to do or what is broken - are
you just annoyed that the API isn't the same style as Bio::SeqIO?


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From shalabh.sharma7 at gmail.com  Tue Sep 22 12:48:39 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 12:48:39 -0400
Subject: [Bioperl-l] Stockholm to fasta
Message-ID: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>

Hi All,

I am trying to convert Stockholm to FASTA format. I am using
"sreformat" for this purpose. I am getting a FASTA file, but the problem is
that I want the header information from the Stockholm file in my FASTA file.
For example:
# STOCKHOLM 1.0

#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
- - - - - - - - - -  - - - -
- - - - - - - - - - - -- -
- - - - - - -- - - - - -
#=GF RL   J Biol Chem 2001;276:21476-21481.
#=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
#=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
#=GF CC   base pairs with the 5' splice junction, thus defining the 5'
#=GF CC   donor site of an intron.
#=GF CC   There are significant differences in sequence and secondary
#=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
#=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
#=GF CC   human). Nevertheless, secondary structure predictions suggest
#=GF CC   that all U1 snRNAs share a 'common core' consisting of helices I,
#=GF CC   II, the proximal region of III, and IV [1].
#=GF CC   This family does not contain the larger yeast sequences.
#=GF SQ   100


X63783.1/2024-2186
UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/1394-1556
UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161
..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/596-756
UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
M29062.1/238-387
UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.

As output I am just getting a FASTA file with headers like
"X63783.1/2024-2186", but what I want is for it to include some
information like U1 or U1 spliceosomal RNA from the Stockholm headers.

I would really appreciate it if anyone could help me out.

Thanks
Shalabh
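
One possible approach, sketched in plain Perl without sreformat or BioPerl:
read the #=GF ID and #=GF DE lines, then write each sequence with those
values appended to its FASTA header. This is only a sketch under several
assumptions: each sequence line has the form "name<whitespace>residues",
only the first alignment in the file is read (up to "//"), and gap
characters ('.' and '-') are removed. The function name stockholm_to_fasta
and the shortened example sequences are made up for illustration.
(Bio::AlignIO also has a 'stockholm' format, which may be a better route if
the rest of the alignment metadata is needed.)

```perl
use strict;
use warnings;

# Convert the first alignment of a Stockholm-formatted string to FASTA,
# copying the #=GF ID and DE values into every header line.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, %seq, @order);

    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;
        last if $line =~ m{^//};                 # end of first alignment
        if ($line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/) {
            # Repeated tags (e.g. CC) are joined with a space.
            $gf{$1} = defined $gf{$1} ? "$gf{$1} $2" : $2;
            next;
        }
        next if $line =~ /^#/;                   # other markup lines
        my ($name, $residues) = split ' ', $line, 2;
        next unless defined $residues;
        push @order, $name unless exists $seq{$name};
        $seq{$name} .= $residues;                # handles interleaved blocks
    }

    my $desc  = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        (my $ungapped = $seq{$name}) =~ s/[.\-\s]//g;
        my $header = length $desc ? "$name $desc" : $name;
        $fasta .= ">$header\n$ungapped\n";
    }
    return $fasta;
}

# Shortened, hypothetical input in the same style as the file above.
my $example = <<'END';
# STOCKHOLM 1.0
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
X58845.1/1-161   ..ACUUACCUGG.AG
M29062.1/238-387 UUACUUACCUGGCAU
//
END

print stockholm_to_fasta($example);
```

With this input the headers come out as ">X58845.1/1-161 U1 U1 spliceosomal
RNA" and so on; which #=GF tags to include is just a matter of changing the
qw(ID DE) list.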


From roy.chaudhuri at gmail.com  Tue Sep 22 12:44:47 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 17:44:47 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB87A06.4000209@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com>
Message-ID: <4AB8FEFF.6060408@gmail.com>

Hi Liam,

My mistake, it looks like the bug had already been reported and fixed, 
which means I get to go home earlier. I've marked your bug as a 
duplicate of bug 2810.

You can get the patched version by installing bioperl-live (just 
downloading the bioperl-live SeqUtils.pm and putting it in the correct 
place on your system would probably also work).

Cheers.
Roy.

Roy Chaudhuri wrote:
> Hi Liam,
> 
> Yes, that is a bug - I think it is to do with the Feature Annotation 
> rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
> wrote don't check for the presence of tags, just the coordinates of the 
> feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
> take a look when I get a chance.
> 
> Cheers.
> Roy.
> 
> Liam Elbourne wrote:
>> Hi Roy,
>>
>> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
>> particularly interested in the locus id, obviously the translation could 
>> be problematic if the whole gene is not included after truncation, but 
>> things like the note, product, protein_id would be good. I had a look at 
>> the code for the method and couldn't see any obvious why those values 
>> didn't make it across. Should I submit this as a bug, or is there 
>> something I'm missing?
>>
>>
>> Regards,
>> Liam.
>>
>>
>>
>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>
>>> Hi Liam,
>>>
>>> I just discovered your message, which has not yet been replied to. 
>>> What you require has been discussed in a recent thread:
>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>
>>> Try using trunc_with_features from Bio::SeqUtils:
>>>
>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>>> Cheers.
>>> Roy.
>>>
>>> Liam Elbourne wrote:
>>>> Hi All,
>>>> Is there a method or methodology that will produce a fully fledged 
>>>> Seq  object with all the associated metadata given a start and end 
>>>>  position? To clarify, I create a sequence object from a genbank file:
>>>> ****
>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>> my $seqobj = $io->next_seq();
>>>> ****
>>>> I now want:
>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>>> previously  done this task by iterating through feature by feature 
>>>> and parsing out  what I needed, but thought there might be a more 
>>>> elegant approach...
>>>> Regards,
>>>> Liam Elbourne.
>>> -- 
>>> Dr. Roy Chaudhuri
>>> Department of Veterinary Medicine
>>> University of Cambridge, U.K.
>> ______________________________
>>
>> Dr Liam Elbourne
>> Research Fellow (Bioinformatics)
>> Paulsen Laboratory
>> Macquarie University
>> Sydney
>> Australia.
>>
>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>
>>
>>
> 


From cjfields at illinois.edu  Tue Sep 22 13:12:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:12:10 -0500
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB8FEFF.6060408@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com> <4AB8FEFF.6060408@gmail.com>
Message-ID: <1F043B63-3DD1-49DD-86F3-B2FB9AD34725@illinois.edu>

That should be out in the latest alpha on CPAN as well (the final  
1.6.1 should be out this week).

chris

On Sep 22, 2009, at 11:44 AM, Roy Chaudhuri wrote:

> Hi Liam,
>
> My mistake, it looks like the bug had already been reported and  
> fixed, which means I get to go home earlier. I've marked your bug as  
> a duplicate of bug 2810.
>
> You can get the patched version by installing bioperl-live (just  
> downloading the bioperl-live SeqUtils.pm and putting it in the  
> correct place on your system would probably also work).
>
> Cheers.
> Roy.
>
> Roy Chaudhuri wrote:
>> Hi Liam,
>> Yes, that is a bug - I think it is to do with the Feature  
>> Annotation rollback from 1.6, it works fine with 1.5.2. Looks like  
>> the tests I wrote don't check for the presence of tags, just the  
>> coordinates of the feature, so this hasn't been picked up. Submit  
>> it to Bugzilla, and I'll take a look when I get a chance.
>> Cheers.
>> Roy.
>> Liam Elbourne wrote:
>>> Hi Roy,
>>>
>>> Thanks for that, works well, but there are no _gsf_tag_hash  
>>> values? I'm particularly interested in the locus id, obviously the  
>>> translation could be problematic if the whole gene is not included  
>>> after truncation, but things like the note, product, protein_id  
>>> would be good. I had a look at the code for the method and  
>>> couldn't see any obvious why those values didn't make it across.  
>>> Should I submit this as a bug, or is there something I'm missing?
>>>
>>>
>>> Regards,
>>> Liam.
>>>
>>>
>>>
>>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>>
>>>> Hi Liam,
>>>>
>>>> I just discovered your message, which has not yet been replied  
>>>> to. What you require has been discussed in a recent thread:
>>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>>
>>>> Try using trunc_with_features from Bio::SeqUtils:
>>>>
>>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300,  
>>>> 2000);
>>>> Cheers.
>>>> Roy.
>>>>
>>>> Liam Elbourne wrote:
>>>>> Hi All,
>>>>> Is there a method or methodology that will produce a fully  
>>>>> fledged Seq  object with all the associated metadata given a  
>>>>> start and end  position? To clarify, I create a sequence object  
>>>>> from a genbank file:
>>>>> ****
>>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>>> my $seqobj = $io->next_seq();
>>>>> ****
>>>>> I now want:
>>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>>> where $sub_seqobj is a Seq object (which I appreciate is an   
>>>>> 'aggregate' of objects) too. The "trunc" method only returns a   
>>>>> PrimarySeq object which lacks all the annotation etc. I've  
>>>>> previously  done this task by iterating through feature by  
>>>>> feature and parsing out  what I needed, but thought there might  
>>>>> be a more elegant approach...
>>>>> Regards,
>>>>> Liam Elbourne.
>>>> -- 
>>>> Dr. Roy Chaudhuri
>>>> Department of Veterinary Medicine
>>>> University of Cambridge, U.K.
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> ______________________________
>>>
>>> Dr Liam Elbourne
>>> Research Fellow (Bioinformatics)
>>> Paulsen Laboratory
>>> Macquarie University
>>> Sydney
>>> Australia.
>>>
>>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>>
>>>
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
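
For readers following this thread, here is a minimal sketch of the trunc_with_features call Roy suggests. It assumes a BioPerl release that includes the fix for bug 2810. The sequence, the feature and its tag values are made up for illustration; in practice $seqobj would come from Bio::SeqIO reading a GenBank file.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqUtils;

# Hypothetical stand-in for a sequence read from a GenBank file.
my $seqobj = Bio::Seq->new(
    -id       => 'demo',
    -seq      => 'ACGT' x 25,    # 100 bp
    -alphabet => 'dna',
);

# One feature carrying the kind of tags Liam mentions (values are made up).
$seqobj->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start       => 20,
        -end         => 40,
        -strand      => 1,
        -primary_tag => 'CDS',
        -tag         => { locus_tag => 'DEMO_0001', product => 'demo protein' },
    )
);

# Keep bases 11..60 together with the features inside that window.
my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 11, 60);

for my $feat ($sub_seqobj->get_SeqFeatures) {
    # has_tag guards against releases in which the tags were dropped.
    my $locus = $feat->has_tag('locus_tag')
        ? join(',', $feat->get_tag_values('locus_tag'))
        : '(no locus_tag)';
    printf "%s %d..%d %s\n", $feat->primary_tag, $feat->start, $feat->end, $locus;
}
```

If the loop prints "(no locus_tag)", the installed SeqUtils.pm most likely predates the fix discussed above.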


From cjfields at illinois.edu  Tue Sep 22 13:13:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:13:53 -0500
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
Message-ID: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>

The POD for Bio::AlignIO::stockholm indicates where the various bits  
of information are stored.  Everything from the header should be in  
there in the latest bioperl; in many cases it's not ideally stored,  
but it's accessible.

You'll need to preprocess your seqs in the SimpleAlign returned  
(iterate through them and change the relevant bits like desc(),  
displayname(), seq_id, etc) and may need to do other modifications,  
but it should work.

chris

On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:

> Hi All,      I am trying to convert stockholm to fasta format. I am  
> using
> "sreformat" for this purpose. I am getting a fasta file but the  
> problem is i
> want header information from stockholm in my fasta file.
> Like:
> # STOCKHOLM 1.0
>
> #=GF AC   RF00003
> #=GF ID   U1
> #=GF DE   U1 spliceosomal RNA
> - - - - - - - - - -  - - - -
> - - - - - - - - - - - -- -
> - - - - - - -- - - - - -
> #=GF RL   J Biol Chem 2001;276:21476-21481.
> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the  
> spliceosome
> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms  
> complementary
> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
> #=GF CC   donor site of an intron.
> #=GF CC   There are significant differences in sequence and secondary
> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter  
> being
> #=GF CC   much longer (568 nucleotides as compared to 164  
> nucleotides in
> #=GF CC   human). Nevertheless, secondary structure predictions  
> suggest
> #=GF CC   that all U1 snRNAs share a 'common core' consisting of  
> helices I,
> #=GF CC   II, the proximal region of III, and IV [1].
> #=GF CC   This family does not contain the larger yeast sequences.
> #=GF SQ   100
>
>
> X63783.1/2024-2186
> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/1394-1556
> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
> X58845.1/1-161
> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/596-756
> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
> M29062.1/238-387
> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>
> As a output i am just getting a fasta file with the headers like
> "X63783.1/2024-2186" but what i want is that it should include some
> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>
> I would really appreciate if anyone can help me out.
>
> Thanks
> Shalabh
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
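
As a rough illustration of what is being asked for, here is a plain-Perl sketch that copies the #=GF ID and DE fields into every FASTA header. Note that this is not the Bio::AlignIO::stockholm route Chris describes; it is a BioPerl-free pass over the text. It assumes each sequence line holds the name and the residues on one line, as in an unwrapped Stockholm file. The function name stockholm_to_fasta is hypothetical.

```perl
use strict;
use warnings;

# Convert Stockholm text to FASTA, appending the #=GF ID and DE fields
# to every header. Returns the FASTA text as one string.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, %seq, @order);
    for my $line (split /\n/, $text) {
        if ($line =~ /^#=GF\s+(\S+)\s+(.*)$/) {
            # Multi-line fields such as CC are joined with a space.
            $gf{$1} = defined $gf{$1} ? "$gf{$1} $2" : $2;
        }
        elsif ($line =~ /^(#|\/\/|\s*$)/) {
            next;    # other markup, the // terminator, blank lines
        }
        elsif ($line =~ /^(\S+)\s+(\S+)\s*$/) {
            push @order, $1 unless exists $seq{$1};
            $seq{$1} .= $2;    # interleaved blocks are concatenated
        }
    }
    my $extra = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        (my $residues = $seq{$name}) =~ tr/.-//d;    # drop gap characters
        $fasta .= ">$name $extra\n$residues\n";
    }
    return $fasta;
}

my $sto = <<'END';
# STOCKHOLM 1.0
#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
X63783.1/2024-2186   UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161       ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
//
END

print stockholm_to_fasta($sto);
```

Gaps are removed here so that the output is unaligned FASTA; drop the tr line to keep the alignment columns.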


From shalabh.sharma7 at gmail.com  Tue Sep 22 16:17:11 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 16:17:11 -0400
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
	<EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
Message-ID: <9fcc48c70909221317i509a45cbm19783c1210f7c69b@mail.gmail.com>

Hi Chris,           Thanks a lot it was really helpful.

Thanks
Shalabh


On Tue, Sep 22, 2009 at 1:13 PM, Chris Fields <cjfields at illinois.edu> wrote:

> The POD for Bio::AlignIO::stockholm indicates where the various bits of
> information are stored.  Everything from the header should be in there in
> the latest bioperl; in many cases it's not ideally stored, but it's
> accessible.
>
> You'll need to preprocess your seqs in the SimpleAlign returned (iterate
> through them and change the relevant bits like desc(), displayname(),
> seq_id, etc) and may need to do other modifications, but it should work.
>
> chris
>
>
> On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:
>
>  Hi All,      I am trying to convert stockholm to fasta format. I am using
>> "sreformat" for this purpose. I am getting a fasta file but the problem is
>> i
>> want header information from stockholm in my fasta file.
>> Like:
>> # STOCKHOLM 1.0
>>
>> #=GF AC   RF00003
>> #=GF ID   U1
>> #=GF DE   U1 spliceosomal RNA
>> - - - - - - - - - -  - - - -
>> - - - - - - - - - - - -- -
>> - - - - - - -- - - - - -
>> #=GF RL   J Biol Chem 2001;276:21476-21481.
>> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
>> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
>> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
>> #=GF CC   donor site of an intron.
>> #=GF CC   There are significant differences in sequence and secondary
>> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
>> #=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
>> #=GF CC   human). Nevertheless, secondary structure predictions suggest
>> #=GF CC   that all U1 snRNAs share a 'common core' consisting of helices
>> I,
>> #=GF CC   II, the proximal region of III, and IV [1].
>> #=GF CC   This family does not contain the larger yeast sequences.
>> #=GF SQ   100
>>
>>
>> X63783.1/2024-2186
>> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/1394-1556
>> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X58845.1/1-161
>> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/596-756
>> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
>> M29062.1/238-387
>> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>>
>> As a output i am just getting a fasta file with the headers like
>> "X63783.1/2024-2186" but what i want is that it should include some
>> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>>
>> I would really appreciate if anyone can help me out.
>>
>> Thanks
>> Shalabh
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>


From cjfields at illinois.edu  Tue Sep 22 16:29:28 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 15:29:28 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
Message-ID: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>

The third alpha is now out and propagating its way around the  
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file  
autogenerated with Build.PL tests that somehow made its way into the  
last alpha release.  This is now properly cleaned up along with its  
test database using './Build clean'.  BTW, very nice SQLite  
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris


From dan.bolser at gmail.com  Tue Sep 22 17:33:13 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 22:33:13 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
	<CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
Message-ID: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>

2009/9/22 Jason Stajich <jason at bioperl.org>

>
> However, the above method does not work here. Is this for some deep
> reason, or could the above method (based on the way SeqIO works) be
> made to work? I'm guessing that the SearchIO object conversion is
> simply harder to do than with SeqIO?
>
>
> This is something Jason could probably speak up on, but from my perspective
> it comes down to 'why?'.  This opens up a very hard-to-implement door
> (converting to and from, for instance, BLAST to HMMER), which doesn't make
> sense from the end-user perspective.  What most users want out of those
> formats is getting at the data in an easily accessible way, to further
> process them (filter, to GFF, etc), or to have them summarized.  the Writer
> classes take care of the latter.
>
>
> There is a very generic, all-purpose write_result in Bio::SearchIO that
> just calls a ResultWriter object (and dies if it isn't present).  Note
> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
> after all. I think this kind of goes against the well-established API that
> exists with the other write_foo implementations for the IO classes, where
> the input/output format should match, but there you have it.
>
> Dan -
> I'm confused about what you are trying to do or what is broken - are you
> just annoyed that the API isn't the same style as Bio::SeqIO?
>

No, I'm not annoyed. I was just confused initially because it didn't work as
'expected', and then I was wondering why (I was just curious). I take Chris's
point that this could be a lot of work to implement for a very marginal use
case.

Very simply, what I am trying to do is this: a) read in a blasttable, b)
filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
in blasttable format.

I was stuck at step c, but I'm not saying anything is broken (just my
understanding of how to use SearchIO::Writer::HSPTableWriter).

I'll look again at Chris's suggestions to see if I can get code to just
'round trip' the blasttable format. From there I think I should be able to
do what I want.


Cheers,
Dan.


--
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
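
One way to sketch steps a) to c) without going through SearchIO at all is shown below. This is a plain-Perl illustration, not the HSPTableWriter approach discussed in the thread. It assumes the standard 12-column tab-delimited blasttable layout; the function name filter_blasttable, the field names and the example rows are all made up.

```perl
use strict;
use warnings;

# Keep only the HSP lines for which $keep returns true, and return them
# in their original 12-column form, grouped by query in order of first
# appearance.
sub filter_blasttable {
    my ($lines, $keep) = @_;
    my (%by_query, @queries);
    for my $line (@$lines) {
        next if $line =~ /^#/ || $line !~ /\S/;    # skip comments and blanks
        chomp(my $copy = $line);
        my %hsp;
        @hsp{qw(query subject pident length mismatch gapopen
                qstart qend sstart send evalue bits)} = split /\t/, $copy;
        next unless $keep->(\%hsp);
        push @queries, $hsp{query} unless $by_query{ $hsp{query} };
        push @{ $by_query{ $hsp{query} } }, $copy;
    }
    return map { @{ $by_query{$_} } } @queries;
}

# Made-up input rows in blasttable format.
my @table = (
    join("\t", qw(q1 s1 98.50 200 3 0 1 200 1 200 1e-100 380)),
    join("\t", qw(q1 s2 80.00 50 10 0 1 50 5 54 0.5 30.1)),
    join("\t", qw(q2 s3 91.00 100 9 0 1 100 1 100 2e-40 150)),
);

# Example filter: keep HSPs with an e-value of 1e-5 or better.
my @kept = filter_blasttable(\@table, sub { $_[0]{evalue} <= 1e-5 });
print "$_\n" for @kept;
```

Because the surviving lines are printed unchanged, the output is the same blasttable format as the input, which sidesteps the round-trip question.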


From maj at fortinbras.us  Tue Sep 22 18:32:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 18:32:15 -0400
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com><B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu><CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
	<2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
Message-ID: <9C7D7F02BFBD4F2AA16E151B52125C93@NewLife>

Apropos this, here's something I ran across the other day:

"Just remember when using BioPerl that it was never designed
to 'round trip' your favorite formats. Rather, it was designed to
store sequence data from many widely different formats into a
common object framework and make that framework available
to other sequence manipulation tasks in a programmatic fashion."

from HOWTO:SeqIO#Caveats

Food for thought, anyway--- MAJ

----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 22, 2009 5:33 PM
Subject: Re: [Bioperl-l] Converting between allowed SearchIO formats?


> 2009/9/22 Jason Stajich <jason at bioperl.org>
>
>>
>>
>> However, the above method does not work here. Is this for some deep
>>
>> reason, or could the above method (based on the way SeqIO works) be
>>
>> made to work? I'm guessing that the SearchIO object conversion is
>>
>> simply harder to do than with SeqIO?
>>
>>
>> This is something Jason could probably speak up on, but from my perspective
>> it comes down to 'why?'.  This opens up a very hard-to-implement door
>> (converting to and from, for instance, BLAST to HMMER), which doesn't make
>> sense from the end-user perspective.  What most users want out of those
>> formats is getting at the data in an easily accessible way, to further
>> process them (filter, to GFF, etc), or to have them summarized.  the Writer
>> classes take care of the latter.
>>
>>
>> There is a very generic, all-purpose write_result in Bio::SearchIO that
>> just calls a ResultWriter object (and dies if it isn't present).  Note
>> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
>> after all. I think this kind of goes against the well-established API that
>> exists with the other write_foo implementations for the IO classes, where
>> the input/output format should match, but there you have it.
>>
>> Dan -
>> I'm confused about what you are trying to do or what is broken - are you
>> just annoyed that the API isn't the same style as Bio::SeqIO.
>>
>
> No, I'm not annoyed. I was just confused initially because it didn't work as
> 'expected', and then I was wondering why (I was just curious). I take Chris's
> point that this could be a lot of work to implement for a very marginal use
> case.
>
> Very simply, what I am trying to do is this: a) read in a blasttable, b)
> filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
> in blasttable format.
>
> I was stuck at step c, but I'm not saying anything is broken (just my
> understanding of how to use SearchIO::Writer::HSPTableWriter).
>
> I'll look again at Chris's suggestions to see if I can get code to just
> 'round trip' the blasttable format. From there I think I should be able to
> do what I want.
>
>
> Cheers,
> Dan.
>
>
> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From clements at nescent.org  Tue Sep 22 19:15:50 2009
From: clements at nescent.org (Dave Clements)
Date: Tue, 22 Sep 2009 16:15:50 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
Message-ID: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>

Hello all,

For open source project wikis, it's nice if the home page
1) Lets new users know that this is an active project with a lot going on.
2) Encourages people to contribute to the project and the wiki.

Both the BioPython.org and GMOD.org sites include a list of links to news
items on the home page.  This is done in both sites with a MediaWiki
extension.

The GMOD.org home page also includes a list of new and recently updated wiki
pages.  This achieves both goals, by showing what's happening, and by giving
people a slight reward for updating the wiki by placing a link to the page
on the wiki.  This is also done with MediaWiki extensions.

My 2¢,

Dave C

-- 
GMOD News: http://gmod.org/wiki/GMOD_News


From David.Messina at sbc.su.se  Wed Sep 23 07:37:02 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 23 Sep 2009 13:37:02 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
Message-ID: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>

I think either Chris' version or Mark's earlier, slightly more verbose
version would work well and fulfill the goals of reducing clutter and making
it easier to find what you're looking for for visitors new and old.

I do like the idea of a newsfeed, which summarizes what's been going on
lately and lets new users know the project is active. Embedding the BioPerl
twitter feed would be an easy solution.


> The GMOD.org home page also includes a list of new and recently updated
> wiki pages.  This achieves both goals, by showing what's happening, and by
> giving people a slight reward for updating the wiki by placing a link to the
> page on the wiki.
>

I like this idea too.


Dave


From maj at fortinbras.us  Wed Sep 23 07:47:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 07:47:24 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <0AD07A69C66B4B5BB8599BA5483145D7@NewLife>

Johnathan, Dave and Dave -- thanks for these helpful comments-
I'm beginning to think there is a happy medium for this medium.
MAJ
  ----- Original Message ----- 
  From: Dave Messina 
  To: Dave Clements 
  Cc: bioperl-l at lists.open-bio.org ; Mark A. Jensen ; Chris Fields 
  Sent: Wednesday, September 23, 2009 7:37 AM
  Subject: Re: [Bioperl-l] a Main Page proposal


  I think either Chris' version or Mark's earlier, slightly more verbose version would work well and fulfill the goals of reducing clutter and making it easier to find what you're looking for for visitors new and old.


  I do like the idea of a newsfeed, which summarizes what's been going on lately and lets new users know the project is active. Embedding the BioPerl twitter feed would be an easy solution.


    The GMOD.org home page also includes a list of new and recently updated wiki pages.  This achieves both goals, by showing what's happening, and by giving people a slight reward for updating the wiki by placing a link to the page on the wiki.


  I like this idea too.


  Dave 


From biopython at maubp.freeserve.co.uk  Wed Sep 23 08:12:56 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Sep 2009 13:12:56 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>

On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> I think either Chris' version or Mark's earlier, slightly more verbose
> version would work well and fulfill the goals of reducing clutter and making
> it easier to find what you're looking for for visitors new and old.
>
> I do like the idea of a newsfeed, which summarizes what's been going on
> lately and let's new users know the project is active. Embedding the BioPerl
> twitter feed would be an easy solution.

Embedding your news feed would be just as easy:

http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom

Which (news server vs twitter feed) is preferable is down to you guys,
although for 2009 at least there has been more activity on twitter.
I'm not sure if you have the news posts re-tweeted or not (the last
news server post was back in Feb), but Biopython and the OBF
twitter accounts are doing this via twitterfeed.

Peter


From maj at fortinbras.us  Wed Sep 23 08:51:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 08:51:15 -0400
Subject: [Bioperl-l] Protein Sequence QSARs
In-Reply-To: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
References: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
Message-ID: <3B9AACAB654F4F4DBB6CE00A9B26FBF6@NewLife>

Hi Brett--
I doubt if anything this specialized exists in BioPerl.
I'd say go for it, but R may be better suited for the calculations you
want to do. For dealing with matrices, you may want to check out
the Bio::Matrix namespace.
cheers Mark
----- Original Message ----- 
From: "Brett Bowman" <bnbowman at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 07, 2009 4:17 AM
Subject: [Bioperl-l] Protein Sequence QSARs


I've been working on a script for my personal edification for annotating
protein sequence for QSARs, as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially converting a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculating the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full-blown module, but I wanted to ask if A) it had been done before and I
had just missed it, and B) whether anyone other than me would find such a
module useful.

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Wed Sep 23 09:04:48 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 09:04:48 -0400
Subject: [Bioperl-l] Fw:  problem parsing msf file
Message-ID: <4851B51372DE4761B8CC26D685B57344@NewLife>

neglected the list
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Paola Bisignano" <paola.bisignano at gmail.com>
Sent: Wednesday, September 23, 2009 9:04 AM
Subject: Re: [Bioperl-l] problem parsing msf file


> Hi Paola--
> I think you need column_from_residue_number() off the SimpleAlign object,
> and location_from_column off the LocatableSeq object. For your example, 
> try
> 
> $alnio = Bio::AlignIO->new( -file=>"my.msf");
> $aln = $alnio->next_aln;
> 
> $s1 = $aln->get_seq_by_pos(1);
> $s2 = $aln->get_seq_by_pos(2);
> 
> $col = $aln->column_from_residue_number( $s1->id, 28);
> $s2coord = $s2->location_from_column( $col - 1);
> 
> Now, $s2coord should equal 4 (the coordinate of the R before the I
> that aligns with the V in sequence 1).
> MAJ
> 
> 
> ----- Original Message ----- 
> From: "Paola Bisignano" <paola.bisignano at gmail.com>
> To: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 04, 2009 8:28 AM
> Subject: [Bioperl-l] problem parsing msf file
> 
> 
>>I have a problem with the parsing of msf file...I can't find the exact
>> object of Bio::SimpleAlign for my case...
>> I have to identify residues (from a list) in aligned sequences...but
>> when I parse the alignment from fasta file, I save as msf file, where
>> I have to identify my residue (from the list, numbering as the pdb
>> file) and the residue aligned in the aligned sequences...
>> 
>> this is a piece of the file...
>> 
>> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>> 
>> Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>> Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>> 
>> //
>> 
>> 
>>                      1                                                   50
>> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
>> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>> 
>> 
>>                      51                                                 100
>> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
>> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>> 
>> 
>>                      101                                                150
>> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
>> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>> 
>> 
>>                      151                                                200
>> Sequence/23-178       QQPDML
>> 2zhz:A/1-148          GGADVL
>> 
>> for example in this I have to identify the residue that is in front of
>> Val 28 (that is in Sequence) in 2zhz:A (that, counting manually, is Ile
>> 5)....
>> Tyr4-> has no residue in front of it because the alignment starts from
>> N23 of Sequence...
>> how can I find the way to enter the residue of my sequen, and extract
>> the residue from the other????
>> 
>> 
>> I wish you all well, dear friends... and I'm actually in trouble with this..
>> Thanks for suggestions
>> 
>> Paola
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>>
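
To make the mapping concrete, here is a plain-Perl sketch of the two lookups involved: residue number to alignment column, then column to residue number in the other row. It does not use SimpleAlign; the helper names are hypothetical. The start numbers are taken from the /23-178 and /1-148 ranges in the MSF names, which is an assumption. Under that numbering the V is residue 27, so an offset would be needed if the PDB file numbers it 28.

```perl
use strict;
use warnings;

# 1-based alignment column holding residue number $resnum, where $start
# is the number of the first residue in the row. Empty list if absent.
sub column_of_residue {
    my ($aligned, $start, $resnum) = @_;
    my $n   = $start - 1;
    my $col = 0;
    for my $ch (split //, $aligned) {
        $col++;
        next if $ch =~ /[-.]/;    # gaps do not advance the residue count
        $n++;
        return $col if $n == $resnum;
    }
    return;
}

# (residue number, residue letter) at a 1-based column, or an empty list
# if the column is a gap or lies outside the row.
sub residue_at_column {
    my ($aligned, $start, $col) = @_;
    return if $col < 1 || $col > length $aligned;
    my $ch = substr($aligned, $col - 1, 1);
    return if $ch =~ /[-.]/;
    my $count = () = substr($aligned, 0, $col) =~ /[^-.]/g;
    return ($start + $count - 1, $ch);
}

# First 50 columns of the alignment quoted above.
my $seq1 = 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQVLSNELEEIQQLLFDCGHDL';    # Sequence/23-178
my $seq2 = 'DDARIAAIGDVDELNSQIGVL--LAEPLPDDVRAALSAIQHDLFDLGGEL';    # 2zhz:A/1-148

my $col = column_of_residue($seq1, 23, 27);
my ($num, $aa) = residue_at_column($seq2, 1, $col);
print defined $num ? "column $col: $aa$num\n" : "column $col: gap\n";
```

With this numbering the script reports column 5 and Ile 5 in 2zhz:A, which matches the manual count in Paola's message.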


From cjfields at illinois.edu  Wed Sep 23 10:41:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 09:41:14 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
Message-ID: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>

On Sep 23, 2009, at 7:12 AM, Peter wrote:

> On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se 
> > wrote:
>> I think either Chris' version or Mark's earlier, slightly more  
>> verbose
>> version would work well and fulfill the goals of reducing clutter  
>> and making
>> it easier to find what you're looking for for visitors new and old.
>>
>> I do like the idea of a newsfeed, which summarizes what's been  
>> going on
>> lately and let's new users know the project is active. Embedding  
>> the BioPerl
>> twitter feed would be an easy solution.
>
> Embedding your news feed would be just as easy:
>
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom
>
> Which (news server vs twitter feed) is preferable is down to you guys,
> although for 2009 at least there has been more activity on twitter.
> I'm not sure if you have the news posts re-tweeted or not (the last
> news server post was back in Feb), but Biopython and the OBF
> twitter accounts are doing this via twitterfeed.
>
> Peter

Not to add yet more to the list, but I also think a concise list of  
projects using (or 'powered by') bioperl should be front-and-center;  
not a lot of users know when/where bioperl is used.  This applies to  
the other bio* as well, particularly biopython (seeing it popping up  
more and more).

For an example, see the biomart homepage:

http://www.biomart.org/

chris


From adlai at refenestration.com  Wed Sep 23 10:38:32 2009
From: adlai at refenestration.com (adlai burman)
Date: Wed, 23 Sep 2009 16:38:32 +0200
Subject: [Bioperl-l] Newbie: Format GenBank
Message-ID: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>

I have finally got past two major hurdles (for me) only to get stumped:
1. I have written a perl script that can take a genbank formatted text  
file as a filehandle and do all sorts of nifty (for me) things with it.
2. I have gotten my BioPerl installation working on a web hosting  
service so my advisor can use this through a browser.

BUT the code I have to fetch a GB record can only print it as a single HTML  
line, and what I need is for it to assign the retrieved file to a  
scalar variable. I am going blind trying to figure out how to access  
(not write) the gb file from a SeqIO object and assign it to a  
variable.

Here's an example of the code I have going on the server:

#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::DB::GenBank;

print "Content-type: text/html\n\n";

my $genBank = Bio::DB::GenBank->new;  # This object knows how to talk to GenBank

my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession

my $seqOut = Bio::SeqIO->new(-format => 'genbank');  # no -file or -fh, so output goes to STDOUT

$seqOut->write_seq($seq);

exit;

where 'DQ897681' will be replaced by a CGI post.

I know that write_seq is not what I need, and I assume that this is a  
simple problem but can anyone tell me how to assign the retrieved gb  
file to a scalar?

Thanks,
Adlai


From joseguillin at hotmail.com  Tue Sep 22 10:39:52 2009
From: joseguillin at hotmail.com (Jose .)
Date: Tue, 22 Sep 2009 15:39:52 +0100
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <BLU104-W475752FF9D5EADD0269E7A0DC0@phx.gbl>


Hi Liam,
I've tried analyzing the same alignment with both programs (DNAStatistics and dnadist), using the same analysis method (Jukes-Cantor), and I got pretty much the same results:

use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;

my $stats   = Bio::Align::DNAStatistics->new();
my $alignin = Bio::AlignIO->new(-file   => 'e1_output_uno_solo.fas',
                                -format => 'fasta');
my $aln      = $alignin->next_aln;
my $jcmatrix = $stats->distance(-align  => $aln,
                                -method => 'Jukes-Cantor');
print $jcmatrix->print_matrix;

RESULT:
A              0.00000  0.40900  0.41834  0.38044
B              0.40900  0.00000  0.41358  0.37240
C              0.41834  0.41358  0.00000  0.37809
D              0.38044  0.37240  0.37809  0.00000

I used the web-based dnadist ( http://mobyle.pasteur.fr/cgi-bin/portal.py?form=dnadist ), which is mentioned in the CPAN dnadist documentation ( http://search.cpan.org/~birney/bioperl-run-1.4/Bio/Tools/Run/PiseApplication/dnadist.pm ), setting Jukes-Cantor as Distance (D), and these are the results:

    4
A          0.000000 0.408996 0.418335 0.380436
B          0.408996 0.000000 0.413575 0.372400
C          0.418335 0.413575 0.000000 0.378086
D          0.380436 0.372400 0.378086 0.000000

The difference is due to rounding. Could it be by any chance that your analyses were made using two different methods, by default? (I think dnadist uses F84 instead of Jukes-Cantor by default.)

Using F84 instead of Jukes-Cantor in dnadist gives:
    4
A          0.000000 0.470013 0.479477 0.435071
B          0.470013 0.000000 0.468730 0.417669
C          0.479477 0.468730 0.000000 0.421582
D          0.435071 0.417669 0.421582 0.000000

On the other hand, the DNAStatistics documentation offers the possibility of using F84, but it's not yet implemented:

MSG: Abstract method "Bio::Align::DNAStatistics::D_F84" is not implemented by package Bio::Align::DNAStatistics.
This is not your fault - author of Bio::Align::DNAStatistics should be blamed!


So, I think Jukes-Cantor works the same in Bio::Align::DNAStatistics and web-based dnadist; but other methods maybe not.
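
For reference, the Jukes-Cantor distance is simple enough to check by hand: d = -(3/4) ln(1 - (4/3) p), where p is the proportion of sites that differ between two aligned sequences. Below is a minimal sketch in plain Perl. It assumes that columns where either sequence has a gap or any non-ACGT character are skipped, which may not match how DNAStatistics or dnadist treat gaps; jc_distance is a hypothetical helper name, not part of BioPerl.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned sequences of equal length.
# Assumption: columns where either sequence is not A, C, G or T are ignored.
sub jc_distance {
    my ($s1, $s2) = @_;
    die "Sequences must have equal length\n"
        unless length($s1) == length($s2);

    my ($sites, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my $x = uc(substr($s1, $i, 1));
        my $y = uc(substr($s2, $i, 1));
        next unless $x =~ /^[ACGT]$/ && $y =~ /^[ACGT]$/;
        $sites++;
        $diffs++ if $x ne $y;
    }
    die "No comparable sites\n" unless $sites;

    my $p = $diffs / $sites;                # proportion of differing sites
    die "Distance undefined for p >= 0.75\n" if $p >= 0.75;
    return -0.75 * log(1 - 4 * $p / 3);     # d = -(3/4) ln(1 - (4/3) p)
}

# Toy pair differing at 1 of 10 sites (made-up data, not the alignment above).
printf "%.5f\n", jc_distance('ACGTACGTAC', 'ACGTACGTAA');
```

Running this over each pair of sequences in an alignment gives a matrix that can be compared against either tool, keeping the gap-handling assumption in mind.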
I want to thank you for letting me know about Data::Dumper. I've read the documentation and it seems very handy. I think it could help me sooner or later. I'll try it out!!

As I'm using DNAStatistics for a project, please let me know if you find what is wrong, or if I can help you further somehow.
Regards,
Jose G.


Subject: dnastatistics
From: lelbourn at science.mq.edu.au
Date: Tue, 22 Sep 2009 17:14:44 +1000
CC: maj at fortinbras.us; bioperl-l at bioperl.org; joseguillin at hotmail.com
To: jason at bioperl.org


So I also had no problem running the code as written by Jose (Bioperl 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
"The routines are not well tested and do contain errors at this point. Work is underway to correct them, but do not expect this code to give you the right answer currently!"!
So I'm using dnadist (as I think the documentation recommends), and it does produce different numbers to $stats->distance(-).
I tried write_matrix from Bio::Matrix::IO - got a message saying it hasn't been implemented yet?
And if Jose hasn't already found it, try Data::Dumper; it will change your life....
Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

Yeah it seems like more of a bioperl problem -- possible that the older code didn't recognize 'jukes-cantor' but you can try the abbreviation 'jc' -- better to just upgrade tho!

This isn't the cause of the problem but I would also encourage use of Bio::Matrix::IO for printing the matrix (use the 'write_matrix' function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ
----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Monday, September 14, 2009 8:48 AM
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign.


My code is basically a modification of the code I found at http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, and is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                          -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
        print $jcmatrix;
instead of
        print $jcmatrix->print_matrix;
I get the memory reference, as normal ===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From A.J.Pemberton at bham.ac.uk  Tue Sep 22 13:06:04 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Tue, 22 Sep 2009 18:06:04 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>

Folks,

I am experiencing problems installing bioperl-db. I followed the instructions on the website, installing both via CPAN and from the downloaded source tarball, and get the same error either way. I think I have missing prerequisites; the first error I get is:

Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/local/BioPerl-db-1.6.0/blib/lib 
/usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 
/usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 
/usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.

Can anyone help?

Regards,

Tony P.


**************************************************************
Mr. A. Pemberton			Tel:+44 121 414 3388
School of Biosciences,			Fax:+44 121 414 5925
The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
Birmingham B15 2TT U.K.
**************************************************************


From joseguillin at hotmail.com  Wed Sep 23 11:08:04 2009
From: joseguillin at hotmail.com (Jose .)
Date: Wed, 23 Sep 2009 16:08:04 +0100
Subject: [Bioperl-l] Bio::Matrix::IO
Message-ID: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma missing after 'phylip':
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also on the CPAN site: http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
and on the BioPerl site: http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl beginners (like me) or absent-minded advanced BioPerl users who rely on the SYNOPSIS code.
Thank you! :)


From maj at fortinbras.us  Wed Sep 23 11:36:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:36:59 -0400
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
Message-ID: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>

Hi Tony, yes, missing prereqs are the issue with this message.
The brute force approach would be to install each of these
as they come up; you can do

$ cpan
cpan> install Array::Compare

etc., then attempt the bioperl-db install again; lather, rinse, repeat.
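
To shorten the loop a little, you can check up front which modules Perl can actually load. A minimal sketch follows; only Array::Compare comes from the error above, missing_modules is a hypothetical helper name, and the list passed to it is whatever your own error messages name.

```perl
use strict;
use warnings;

# Return the subset of the given module names that cannot be loaded.
sub missing_modules {
    my @names = @_;
    my @missing;
    for my $module (@names) {
        # A string eval of "require" reports a load failure without dying.
        push @missing, $module unless eval "require $module; 1";
    }
    return @missing;
}

# Extend this list with any other module named in a "Can't locate" error.
my @missing = missing_modules('Array::Compare');
my $report  = @missing
    ? "Missing: @missing\n"
    : "All listed modules are installed\n";
print $report;
```

Anything reported as missing can then be installed from the cpan shell in one go.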
MAJ
----- Original Message ----- 
From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 22, 2009 1:06 PM
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)


> Folks,
>
> I am experiencing problems installing bioperl-db. I followed the instructions 
> on the website both installing via CPAN and downloading the source tarball. 
> Get the same error. I think I have missing prerequistes, the first error I get 
> is:
>
> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t 
> /usr/local/BioPerl-db-1.6.0/blib/lib
> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 
> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/5.8.5 
> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/site_perl/5.8.5
> /usr/lib/perl5/site_perl 
> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl/5.8.5
> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>
> Can anyone help?
>
> Regards,
>
> Tony P.
>
>
> **************************************************************
> Mr. A. Pemberton Tel:+44 121 414 3388
> School of Biosciences, Fax:+44 121 414 5925
> The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
> Birmingham B15 2TT U.K.
> **************************************************************
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Wed Sep 23 11:46:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:46:03 -0400
Subject: [Bioperl-l] Bio::Matrix::IO
In-Reply-To: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
References: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
Message-ID: <E37AFAC689C84477817EFF38511B5709@NewLife>

thanks Jose - fixed it
MAJ
----- Original Message ----- 
From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Wednesday, September 23, 2009 11:08 AM
Subject: [Bioperl-l] Bio::Matrix::IO


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma 
missing,
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma 
missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also in the CPAN 
web:http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And the BioPerl 
web:http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl begginers (like me) or absentminded BioPerl advanced 
who rely on the SYNOPSIS code.
Thank you! :)
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Wed Sep 23 12:27:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 23 Sep 2009 17:27:26 +0100
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <4ABA4C6E.60609@gmail.com>

Hi Adlai,

In Perl you can open a string as if it was a file:

my $string;
open my $fh, '>', \$string or die $!;
my $seqOut = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');

$seqOut->write_seq($seq) should now write to the string.
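
The in-memory filehandle is a core Perl feature (5.8 and later) and can be tried without BioPerl at all. A minimal sketch, with made-up lines standing in for what write_seq would print:

```perl
use strict;
use warnings;

# Open a filehandle that writes into a scalar instead of a file.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory filehandle: $!";

# Anything printed to $fh ends up in $buffer. A SeqIO object created
# with -fh => $fh would write here in the same way.
print {$fh} "LOCUS EXAMPLE\n";   # made-up content, not a real record
print {$fh} "//\n";
close $fh or die "Cannot close in-memory filehandle: $!";

print "Captured ", length($buffer), " characters\n";
```

After the close, $buffer is an ordinary scalar that you can search, split, or print later.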

However, are you sure this is your problem? Printing to STDOUT (which is 
what SeqIO does if you don't specify a file) should work fine with a CGI 
script. Your sequence is being displayed as one line because HTML 
ignores newline characters, but you can get around that by using a <pre> 
tag to specify pre-formatted text:

my $seqOut = new Bio::SeqIO(-format => 'genbank');
print "<pre>\n";
$seqOut->write_seq($seq);
print "</pre>\n";
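
One caveat with <pre>: GenBank records can contain '<' and '>' (for example in partial feature locations), and a browser may treat those as markup. If the record is first captured in a string, it can be escaped before printing. A minimal sketch in plain Perl; escape_html is a hypothetical helper name and the feature line is made up:

```perl
use strict;
use warnings;

# Escape the three characters that matter inside <pre>...</pre>.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must come first, so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

my $record = "     CDS             <1..>200\n";   # made-up feature line
print "<pre>\n", escape_html($record), "</pre>\n";
```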

Hope this helps.
Roy.

adlai burman wrote:
> I have finally got past two major hurdles (for me) only to get stumped:
> 1. I have written a perl script that can take a genbank formated text  
> file as a filehandle and do all sorts of nifty (for me) things with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
> 
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
> 
> Here's an example of the code I have going on the server:
> 
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
> 
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
> 
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
> 
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
> 
> $seqOut->write_seq($seq);
> 
> 
> exit;
> 
> where 'DQ897861' will be replaced by a CGI post.
> 
> I know that write_seq is not what I need, and I assume that this is a  
> simple problem but can anyone tell me how to assign the retrieved gb  
> file to a scaler?
> 
> Thanks,
> Adlai
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 13:47:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:51 -0500
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <16121E7E-7619-4F02-82CC-20C6F5F6B230@illinois.edu>

On Sep 23, 2009, at 9:38 AM, adlai burman wrote:

> I have finally got past two major hurdles (for me) only to get  
> stumped:
> 1. I have written a perl script that can take a genbank formated  
> text file as a filehandle and do all sorts of nifty (for me) things  
> with it.
> 2. I have gotten my BioPerl installation working on a web hosting  
> service so my advisor can use this through a browser.
>
> BUT the code I have to fetch GB record can print it as a single HTML  
> line, and what I need is for it to assign the retrieved file to a  
> scaler variable. I am going blind trying to figure out how access  
> (not write) the gb file from an SeqIO object and assign it to a  
> variable.
>
> Here's an example of the code I have going on the server:
>
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to  
> GenBank
>
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by  
> accession
>
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
>
> $seqOut->write_seq($seq);
>
> exit;
>
> where 'DQ897861' will be replaced by a CGI post.
>
> I know that write_seq is not what I need, and I assume that this is  
> a simple problem but can anyone tell me how to assign the retrieved  
> gb file to a scaler?
>
> Thanks,
> Adlai

Actually, there are two ways you can do this, one involving write_seq.

(1) The first is to just grab the raw data using Bio::DB::EUtilities:

use Bio::DB::EUtilities;

my $eutil = Bio::DB::EUtilities->new(-eutil     => 'efetch',
                                      -db        => 'nuccore',
                                      -id        => 'DQ897681',
                                      -rettype   => 'gb');

my $var = $eutil->get_Response->content;

(2) Use IO::String (see the SeqIO HOWTO), or Roy's example code.  That  
would 'filter' everything through SeqIO via next_seq/write_seq, so the  
output is what BioPerl spits out and may not be exactly the same.

chris


From cjfields at illinois.edu  Wed Sep 23 13:47:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:56 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
Message-ID: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>

Appears Array::Compare is used for Test::Warn, so it isn't a true  
requirement (probably a test_requires or somesuch).

chris

On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:

> hi Tony- missing prereqs are the issue with this message,yes-
> the brute force approach would be to install each of these
> as they come up; you can do
>
> $ cpan
> cpan> install Array::Compare
>
> etc., then attempt the bioperl-db install again; lather, rinse,  
> repeat.
> MAJ
> ----- Original Message ----- From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk 
> >
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 22, 2009 1:06 PM
> Subject: [Bioperl-l] Problems installing latest stable bioperl-db  
> (1.6)
>
>
>> Folks,
>>
>> I am experiencing problems installing bioperl-db. I followed the  
>> instructions on the website both installing via CPAN and  
>> downloading the source tarball. Get the same error. I think I have  
>> missing prerequistes, the first error I get is:
>>
>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/ 
>> local/BioPerl-db-1.6.0/blib/lib
>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 / 
>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux- 
>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64- 
>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/ 
>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>
>> Can anyone help?
>>
>> Regards,
>>
>> Tony P.
>>
>>
>> **************************************************************
>> Mr. A. Pemberton Tel:+44 121 414 3388
>> School of Biosciences, Fax:+44 121 414 5925
>> The University of Birmingham                     
>> Email:a.j.pemberton at bham.ac.uk
>> Birmingham B15 2TT U.K.
>> **************************************************************
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 16:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 15:58:37 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
Message-ID: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating it's way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made it's way into the
> last alpha release.  This is now properly cleaned up along with it's
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 16:52:09 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 16:52:09 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>

Hi Chris,

We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot regressions and noticed a small problem.  The fasta validation check for '>' in SeqIO::fasta (line 127) throws when used with Index::Fasta on Windows because the position after '>' is being indexed.  It looks like you already fixed the same problem for Linux (comment in line 190 of Index::Fasta).  Do you want me to put this into bugzilla?  Let me know if you have any questions.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Tuesday, September 22, 2009 1:29 PM
To: BioPerl List
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

The third alpha is now out and propagating its way around the
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file
autogenerated with Build.PL tests that somehow made its way into the
last alpha release.  This is now properly cleaned up along with its
test database using './Build clean'.  BTW, very nice SQLite
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From KBriedis at accelrys.com  Wed Sep 23 18:40:10 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 18:40:10 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>

Hi Chris,

ActivePerl.  I'll open a bug.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: Chris Fields [mailto:cjfields at illinois.edu] 
Sent: Wednesday, September 23, 2009 1:59 PM
To: Kristine Briedis
Cc: BioPerl List
Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating it's way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed is are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made it's way into the
> last alpha release.  This is now properly cleaned up along with it's
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Wed Sep 23 18:49:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 17:49:45 -0500
Subject: [Bioperl-l] BioPerl.pm and 1.6.1
References: <1253727169.18486.1336281841@webmail.messagingengine.com>
Message-ID: <1AF393BC-2352-4ADA-A4E3-3EF13B99CAE8@illinois.edu>

All,

I've recently noticed that CPAN is not grabbing the correct  
descriptive information from Build.PL.  The current description is  
coming from Bio::LiveSeq::IO::BioPerl, which is the first module found
whose name matches 'BioPerl':

http://search.cpan.org/search?query=bioperl&mode=dist

Therefore we need something that acts as the description and main page  
for the distributions.  We have a bioperl.pod already, just need to  
update it and add it to trunk, and maybe release another alpha with it  
included to make sure it's working.  I also want to fix the recent  
Windows issue reported by Kristine.

Therefore, I will be adding this for core and the other
distributions per Curtis Jewell's suggestion (below).  Please let me
know if there are any disagreements with this; I'll probably push
another alpha out with this in the next few days (also hopefully
containing the bug fix mentioned above).

chris

Begin forwarded message:

> From: "Curtis Jewell" <lists.perl.module-authors at csjewell.fastmail.us>
> Date: September 23, 2009 12:32:49 PM CDT
> To: "Chris Fields" <cjfields at illinois.edu>
> Subject: Re: distribution description
>
> Chris, I'd make it a BioPerl.pm that just declares a package and
> version and does nothing else other than being a holder for Pod -
> because the first thing I wanted to do when I heard about it and wanted
> to check whether it worked in Strawberry is to do 'cpan BioPerl', which
> of course, blows up.
>
> --Curtis
>
> On Tue, 22 Sep 2009 22:23 -0500, "Chris Fields"  
> <cjfields at illinois.edu>
> wrote:
>> I've noticed in the last number of CPAN releases of BioPerl that the
>> description for the distribution is being pulled from one of our
>> modules (Bio::LiveSeq::IO::BioPerl).  I'm guessing this is b/c it's
>> the first match to the distribution name.
>>
>> Is there any way to make sure the description is pulled from the
>> abstract?  We're using a subclass of Module::Build and have defined
>> dist_abstract (I'm thinking of adding a BioPerl.pod to the root
>> directory just to catch this).
>>
>> chris
> --
> Curtis Jewell
> swordsman at csjewell.fastmail.us
>
> %DCL-E-MEM-BAD, bad memory
> -VMS-F-PDGERS, pudding between the ears
>
> [I use PC-Alpine, which deliberately does not display colors and  
> pictures in HTML mail]
>
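
Curtis's suggestion above can be sketched roughly as follows. This is a
minimal, hypothetical BioPerl.pm and not the file that was actually
committed: the version string and the POD wording are placeholders, and
it assumes that the CPAN search site takes a distribution's description
from the NAME section of the module whose name matches the distribution.

```perl
package BioPerl;

use strict;
use warnings;

# Placeholder version; a real file would track the distribution release.
our $VERSION = '1.006001';

=head1 NAME

BioPerl - Perl modules for biology (placeholder abstract)

=head1 DESCRIPTION

This module does nothing. It only declares a package and a version and
holds the top-level documentation for the distribution.

=cut

1;
```

With a file like this under lib/, a command such as 'cpan BioPerl' would
presumably have a real module name to resolve.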


From cjfields at illinois.edu  Wed Sep 23 19:00:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 18:00:55 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
Message-ID: <D704BD1B-C44B-4AB5-9C14-9F4F63A46FEE@illinois.edu>

Kristine,

I have been planning on installing a temp WinXP VM using VirtualBox,  
so this'll give me an excuse to set that up ;>

chris

On Sep 23, 2009, at 5:40 PM, Kristine Briedis wrote:

> Hi Chris,
>
> ActivePerl.  I'll open a bug.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: Wednesday, September 23, 2009 1:59 PM
> To: Kristine Briedis
> Cc: BioPerl List
> Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> Yes, that would be good.  I don't have immediate access to anything
> running WinXP/vista/7 but I can probably look into this sometime
> tomorrow or Monday.
>
> Just to make sure, is this with ActivePerl or Strawberry Perl?
>
> chris
>
> On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:
>
>> Hi Chris,
>>
>> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot
>> regressions and noticed a small problem.  The fasta validation check
>> for '>' in SeqIO::fasta (line 127) throws when used with
>> Index::Fasta on Windows because the position after '>' is being
>> indexed.  It looks like you already fixed the same problem for Linux
>> (comment in line 190 of Index::Fasta).  Do you want me to put this
>> into bugzilla?  Let me know if you have any questions.  Thanks!
>>
>> Cheers,
>> Kristine
>>
>>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Chris Fields
>> Sent: Tuesday, September 22, 2009 1:29 PM
>> To: BioPerl List
>> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>>
>> The third alpha is now out and propagating its way around the
>> intertubes:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>>
>> Pick your favorite archive here:
>>
>> http://bioperl.org/DIST/RC/
>>
>> This includes some unmerged changes from 1.6.0.  Test failures from
>> the last alpha indicated these somehow were missed, so I basically ran
>> a global diff against main trunk to check for missing commits (all
>> located in t/ as it turned out).
>>
>> Also fixed are the SeqFeature_SQLite.t failures; this is a file
>> autogenerated with Build.PL tests that somehow made its way into the
>> last alpha release.  This is now properly cleaned up along with its
>> test database using './Build clean'.  BTW, very nice SQLite
>> implementation; I may be using it!
>>
>> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
>> this Thursday-Friday.
>>
>> Enjoy!
>>
>> chris


From David.Messina at sbc.su.se  Thu Sep 24 05:38:19 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 24 Sep 2009 11:38:19 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
Message-ID: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>

>
> Not to add yet more to the list, but I also think a concise list of
> projects using (or 'powered by') bioperl should be front-and-center; not a
> lot of users know when/where bioperl is used.  This applies to the other
> bio* as well, particularly biopython (seeing it popping up more and more).
>


Along these lines, it'd be great to publicize not only
BioPerl-*powered* projects, but ones which interface with it, too.

Just this week, for example, there is this, which could go both on a static
page and in the newsfeed:
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1

MOODS: fast search for position weight matrix matches in DNA sequences.

Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
Department of Computer Science and Helsinki Institute for Information
Technology,
University of Helsinki, Helsinki, Finland.

SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
matching position weight matrices against DNA sequences. MOODS implements
state-of-the-art on-line matching algorithms, achieving considerably faster
scanning speed than with a simple brute-force search. MOODS is written in C++,
with bindings for the popular BioPerl and Biopython toolkits. It can easily be
adapted for different purposes and integrated into existing workflows. It can
also be used as a C++ library. AVAILABILITY: The package with documentation and
examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
source code is also available under the terms of a GNU General Public License
(GPL). CONTACT: janne.h.korhonen at helsinki.fi.

PMID: 19773334 [PubMed - as supplied by publisher]


From maj at fortinbras.us  Thu Sep 24 10:17:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 10:17:26 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
Message-ID: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>

Gurus of a db stripe:
 
ActiveState 5.10 has such a problem with BDB that it
disables their ppm build of the DB_File module. I know
what the *ultimate* solution is...however...

I did a quick grep of 'use DB_File' across the trunk, and 
it seems there are two categories of dependency--

(1) use of BDB is an option among other dbms
      (e.g., among the  Bio::DB::GFF::Adaptor::)

(2) BDB is the developer's personal choice
    (e.g., possibly Bio::DB::FileCache)

In Bio::DB::Fasta, AnyDBM_File is used to allow the 
user a choice. Are there fundamental reasons not to 
convert the type (2) dependencies to AnyDBM_File?
I will try to do this (on a branch) if there are no technical
objections. General derision, however, will only goad
me into action-

Thanks,
MAJ
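
The conversion described above can be sketched as below. It is an
illustration only, assuming a simple on-disk key/value cache: the file
name, the key, and the backend order are made up for the example and are
not taken from Bio::DB::FileCache. AnyDBM_File loads the first backend
listed in @AnyDBM_File::ISA that is actually installed, so code written
this way keeps running when DB_File is missing.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);

# The backend preference must be set before AnyDBM_File is loaded.
BEGIN { @AnyDBM_File::ISA = qw(DB_File SDBM_File) }
use AnyDBM_File;

# Hypothetical cache file in a scratch directory.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/seqcache";

my %cache;
tie( %cache, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0644 )
    or die "Cannot tie $file: $!";

$cache{seq1} = 'ACGT';
my $stored  = $cache{seq1};
my $backend = $AnyDBM_File::ISA[0];    # the backend that was loaded

untie %cache;
print "stored '$stored' using $backend\n";
```

One caveat to keep in mind: SDBM_File ships with Perl but limits the
size of each record, so a module that stores long values may still need
DB_File in practice.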


From A.J.Pemberton at bham.ac.uk  Thu Sep 24 11:08:06 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Thu, 24 Sep 2009 16:08:06 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>

Chris, Mark,

Thank you, I have made significant progress with the install. I had to do a

cpan> force install Array::Compare

to get the module properly installed.

However, I now have a new error. When I do

cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz

I get the following error (now only 1 of the 16 tests fails):

t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
Exception
------------- EXCEPTION -------------
MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.


Can you help with this one?

Regards,

Tony Pemberton


> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: 23 September 2009 18:48
> To: Mark A. Jensen
> Cc: Anthony Pemberton; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
> (1.6)
> 
> Appears Array::Compare is used for Test::Warn, so it isn't a true
> requirement (probably a test_requires or somesuch).
> 
> chris
> 
> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
> 
> > hi Tony- missing prereqs are the issue with this message,yes-
> > the brute force approach would be to install each of these
> > as they come up; you can do
> >
> > $ cpan
> > cpan> install Array::Compare
> >
> > etc., then attempt the bioperl-db install again; lather, rinse,
> > repeat.
> > MAJ
> > ----- Original Message ----- From: "Anthony Pemberton"
> > <A.J.Pemberton at bham.ac.uk>
> > To: <bioperl-l at bioperl.org>
> > Sent: Tuesday, September 22, 2009 1:06 PM
> > Subject: [Bioperl-l] Problems installing latest stable bioperl-db
> > (1.6)
> >
> >
> >> Folks,
> >>
> > >> I am experiencing problems installing bioperl-db. I followed the
> > >> instructions on the website both installing via CPAN and
> > >> downloading the source tarball. I get the same error. I think I
> > >> have missing prerequisites; the first error I get is:
> > >>
> > >> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t
> > >> /usr/local/BioPerl-db-1.6.0/blib/lib
> > >> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0
> > >> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> > >> /usr/lib/perl5/5.8.5
> > >> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
> > >> /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl
> > >> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
> > >> /usr/lib/perl5/vendor_perl/5.8.5
> > >> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi
> > >> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
> >>
> >> Can anyone help?
> >>
> >> Regards,
> >>
> >> Tony P.
> >>
> >>
> >> **************************************************************
> >> Mr. A. Pemberton Tel:+44 121 414 3388
> >> School of Biosciences, Fax:+44 121 414 5925
> >> The University of Birmingham
> >> Email:a.j.pemberton at bham.ac.uk
> >> Birmingham B15 2TT U.K.
> >> **************************************************************
> >>
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Thu Sep 24 12:23:44 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Sep 2009 09:23:44 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
Message-ID: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>

If someone also wants to volunteer to keep up the publications page -
this is where I *had* been curating a list built up from citations and
Google Scholar searches for 'bioperl' and things that reference the 2002
paper.

Seems like this is where the static copy of that information should go
- but highlighting things on a page with a circulating list or
something that just listed recent additions to the list could be done
by the web dev gurus and could be kewl.
The current issue is that it is large, so I think the pubmed plugin
rendering can be slow (or gets broken, as it seems to be now).
http://bioperl.org/wiki/BioPerl_publications
http://bioperl.org/wiki/BioPerl_publications/2008
http://bioperl.org/wiki/BioPerl_publications/2007
etc....

-jason
On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:

>>
>> Not to add yet more to the list, but I also think a concise list of
>> projects using (or 'powered by') bioperl should be front-and-center;
>> not a lot of users know when/where bioperl is used.  This applies to
>> the other bio* as well, particularly biopython (seeing it popping up
>> more and more).
>>
>
>
> Along these lines, it'd be great to publicize not only
> BioPerl-*powered* projects, but ones which interface with it, too.
>
> Just this week, for example, there is this, which could go both on a
> static page and in the newsfeed:
> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>
> MOODS: fast search for position weight matrix matches in DNA sequences.
>
> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
> Department of Computer Science and Helsinki Institute for Information
> Technology,
> University of Helsinki, Helsinki, Finland.
>
> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
> matching position weight matrices against DNA sequences. MOODS implements
> state-of-the-art on-line matching algorithms, achieving considerably faster
> scanning speed than with a simple brute-force search. MOODS is written in C++,
> with bindings for the popular BioPerl and Biopython toolkits. It can easily be
> adapted for different purposes and integrated into existing workflows. It can
> also be used as a C++ library. AVAILABILITY: The package with documentation and
> examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
> source code is also available under the terms of a GNU General Public License
> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>
> PMID: 19773334 [PubMed - as supplied by publisher]

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From rmb32 at cornell.edu  Thu Sep 24 12:28:08 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 24 Sep 2009 09:28:08 -0700
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <4ABB9E18.3060003@cornell.edu>

Sounds like a good idea to me.

Rob


From cjfields at illinois.edu  Thu Sep 24 12:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 11:58:32 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
	<3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
Message-ID: <2BDD197A-3DEF-44CE-9F98-6B3F117084EE@illinois.edu>

Tony,

The error should point out the problem: install Graph::Directed via
CPAN.

That said, we need to add that as a 'recommends' for the db package
and skip those tests if Graph::Directed isn't present.  Will do that
now.

chris
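
A hedged sketch of the skip logic mentioned above, written with plain
Test::More rather than BioPerl's own test helpers. The name
module_available is a hypothetical helper invented for this example, and
the single ok() merely stands in for the real ontology tests.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Graph::Directed -> Graph/Directed.pm
    return eval { require $file; 1 } ? 1 : 0;
}

SKIP: {
    # Skip (rather than fail) when the optional prerequisite is absent.
    skip 'Graph::Directed not installed', 1
        unless module_available('Graph::Directed');

    # Stand-in for the tests that need Bio::OntologyIO::soflat.
    ok( Graph::Directed->can('new'), 'Graph::Directed loaded' );
}

done_testing();
```

On the Build.PL side, Module::Build accepts a 'recommends' hash for
optional prerequisites, which is presumably what the 'recommends' above
refers to.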

On Sep 24, 2009, at 10:08 AM, Anthony Pemberton wrote:

> Chris, Mark,
>
> Thank you, I have made significant progress with the install. I had
> to do a
>
> cpan> force install Array::Compare
>
> to get the module properly installed.
>
> However, I now have a new error. When I do
>
> cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz
>
> I get the following error (now only 1 of the 16 tests fails):
>
> t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate
> Graph/Directed.pm in @INC (@INC contains: t/lib t
> /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/lib
> /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/arch
> /root/.cpan/build/BioPerl-db-1.6.0-xim2YV
> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5
> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl
> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/vendor_perl/5.8.5
> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi
> /usr/lib/perl5/vendor_perl .) at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
> BEGIN failed--compilation aborted at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
>
>
> Can you help with this one?
>
> Regards,
>
> Tony Pemberton
>
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at illinois.edu]
>> Sent: 23 September 2009 18:48
>> To: Mark A. Jensen
>> Cc: Anthony Pemberton; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
>> (1.6)
>>
>> Appears Array::Compare is used for Test::Warn, so it isn't a true
>> requirement (probably a test_requires or somesuch).
>>
>> chris
>>
>> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
>>
>>> hi Tony- missing prereqs are the issue with this message,yes-
>>> the brute force approach would be to install each of these
>>> as they come up; you can do
>>>
>>> $ cpan
>>> cpan> install Array::Compare
>>>
>>> etc., then attempt the bioperl-db install again; lather, rinse,
>>> repeat.
>>> MAJ
>>> ----- Original Message ----- From: "Anthony Pemberton"
>>> <A.J.Pemberton at bham.ac.uk>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Tuesday, September 22, 2009 1:06 PM
>>> Subject: [Bioperl-l] Problems installing latest stable bioperl-db
>>> (1.6)
>>>
>>>
>>>> Folks,
>>>>
>>>> I am experiencing problems installing bioperl-db. I followed the
>>>> instructions on the website both installing via CPAN and
>>>> downloading the source tarball. I get the same error. I think I
>>>> have missing prerequisites; the first error I get is:
>>>>
>>>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t
>>>> /usr/local/BioPerl-db-1.6.0/blib/lib
>>>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0
>>>> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/5.8.5
>>>> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl
>>>> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/vendor_perl/5.8.5
>>>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>>>
>>>> Can anyone help?
>>>>
>>>> Regards,
>>>>
>>>> Tony P.
>>>>
>>>>
>>>> **************************************************************
>>>> Mr. A. Pemberton Tel:+44 121 414 3388
>>>> School of Biosciences, Fax:+44 121 414 5925
>>>> The University of Birmingham
>>>> Email:a.j.pemberton at bham.ac.uk
>>>> Birmingham B15 2TT U.K.
>>>> **************************************************************


From cjfields at illinois.edu  Thu Sep 24 13:50:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 12:50:34 -0500
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>

I do support doing this for sheer flexibility, but it's not an  
absolute showstopper for ActivePerl.  There is a working DB_File PPM  
available for ActivePerl 5.10.1 in the Trouchelle PPM repo:

http://trouchelle.com/ppm10/

That repo is listed in the 'Suggested' list in the latest PPM4
Preferences (Repositories tab). I had to install it to fix that WinXP
Bio::Index bug.

(Based on that, the Bio::Index modules also appear to have this
requirement; at least, tests were being skipped for lack of DB_File.)

chris

On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:

> Gurus of a db stripe:
>
> ActiveState 5.10 has such a problem with BDB that it
> disables their ppm build of the DB_File module. I know
> what the *ultimate* solution is...however...
>
> I did a quick grep of 'use DB_File' across the trunk, and
> it seems there are two categories of dependency--
>
> (1) use of BDB is an option among other dbms
>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>
> (2) BDB is the developer's personal choice
>    (e.g., possibly Bio::DB::FileCache)
>
> In Bio::DB::Fasta, AnyDBM_File is used to allow the
> user a choice. Are there fundamental reasons not to
> convert the type (2) dependencies to AnyDBM_File?
> I will try to do this (on a branch) if there are no technical
> objections. General derision, however, will only goad
> me into action-
>
> Thanks,
> MAJ


From cjfields at illinois.edu  Thu Sep 24 14:03:48 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 13:03:48 -0500
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
Message-ID: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>

Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
for Windows using the last alpha or bioperl-live?  I'm getting a  
pretty significant fail with the last alpha release (I've managed to  
fix the others) via my remote desktop setup (haven't set up virtualbox  
yet).  I just want to confirm this is occurring elsewhere and plan  
accordingly, namely indicating the module doesn't work with windows  
for the time being.

Build test --test-files t/SeqIO/scf.t --verbose

chris


From maj at fortinbras.us  Thu Sep 24 14:39:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:39:38 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
	<759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
Message-ID: <3715F68607084E4684A4B54E542468E4@NewLife>

All righty. I did find the trouchelle repo, but my ppm
didn't believe that DB_File was in it.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 1:50 PM
Subject: Re: [Bioperl-l] DB_File dependency and ActiveState 5.10


>I do support doing this for sheer flexibility, but it's not an  
> absolute showstopper for ActivePerl.  There is a working DB_File PPM  
> available for ActivePerl 5.10.1 in the Trouchelle PPM repo:
> 
> http://trouchelle.com/ppm10/
> 
> That repo is listed in the 'Suggested' list in the latest PPM4
> Preferences (Repositories tab). I had to install it to fix that WinXP
> Bio::Index bug.
>
> (Based on that, the Bio::Index modules also appear to have this
> requirement; at least, tests were being skipped for lack of DB_File.)
> 
> chris
> 
> On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:
> 
>> Gurus of a db stripe:
>>
>> ActiveState 5.10 has such a problem with BDB that it
>> disables their ppm build of the DB_File module. I know
>> what the *ultimate* solution is...however...
>>
>> I did a quick grep of 'use DB_File' across the trunk, and
>> it seems there are two categories of dependency--
>>
>> (1) use of BDB is an option among other dbms
>>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>>
>> (2) BDB is the developer's personal choice
>>    (e.g., possibly Bio::DB::FileCache)
>>
>> In Bio::DB::Fasta, AnyDBM_File is used to allow the
>> user a choice. Are there fundamental reasons not to
>> convert the type (2) dependencies to AnyDBM_File?
>> I will try to do this (on a branch) if there are no technical
>> objections. General derision, however, will only goad
>> me into action-
>>
>> Thanks,
>> MAJ


From maj at fortinbras.us  Thu Sep 24 14:40:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:40:03 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
In-Reply-To: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
References: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
Message-ID: <791B5C5CB3C34A8AAC348DC59E934198@NewLife>

aye-aye
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 2:03 PM
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?


> Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
> for Windows using the last alpha or bioperl-live?  I'm getting a  
> pretty significant fail with the last alpha release (I've managed to  
> fix the others) via my remote desktop setup (haven't set up virtualbox  
> yet).  I just want to confirm this is occurring elsewhere and plan  
> accordingly, namely indicating the module doesn't work with windows  
> for the time being.
> 
> Build test --test-files t/SeqIO/scf.t --verbose
> 
> chris


From e.osimo at gmail.com  Fri Sep 25 03:59:10 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 25 Sep 2009 09:59:10 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com> 
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
Message-ID: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>

Dear Jason,
I have been trying to connect to
http://bioperl.org/wiki/BioPerl_publications for more than 24 hours,
but it won't load.
Emanuele


On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:

> If someone also wants to volunteer to keep up the publications page - this
> is where I *had* been curating a list built up from citations and Google
> Scholar searches for 'bioperl' and things that reference the 2002 paper.
>
> Seems like this is where the static copy of that information should go -
> but highlighting things on a page with a circulating list or something
> that just listed recent additions to the list could be done by the web dev
> gurus and could be kewl.
> The current issue is that it is large, so I think the pubmed plugin
> rendering can be slow (or gets broken, as it seems to be now).
> http://bioperl.org/wiki/BioPerl_publications
> http://bioperl.org/wiki/BioPerl_publications/2008
> http://bioperl.org/wiki/BioPerl_publications/2007
> etc....
>
> -jason
>
> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>
>
>>> Not to add yet more to the list, but I also think a concise list of
>>> projects using (or 'powered by') bioperl should be front-and-center;
>>> not a lot of users know when/where bioperl is used.  This applies to
>>> the other bio* as well, particularly biopython (seeing it popping up
>>> more and more).
>>>
>>>
>>
>> Along these lines, it'd be great to publicize not only
>> BioPerl-*powered* projects, but ones which interface with it, too.
>>
>> Just this week, for example, there is this, which could go both on a
>> static page and in the newsfeed:
>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>
>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>
>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>> Department of Computer Science and Helsinki Institute for Information
>> Technology,
>> University of Helsinki, Helsinki, Finland.
>>
>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
>> matching position weight matrices against DNA sequences. MOODS implements
>> state-of-the-art on-line matching algorithms, achieving considerably faster
>> scanning speed than with a simple brute-force search. MOODS is written in C++,
>> with bindings for the popular BioPerl and Biopython toolkits. It can easily be
>> adapted for different purposes and integrated into existing workflows. It can
>> also be used as a C++ library. AVAILABILITY: The package with documentation and
>> examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
>> source code is also available under the terms of a GNU General Public License
>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>
>> PMID: 19773334 [PubMed - as supplied by publisher]
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>
>


From hlapp at gmx.net  Fri Sep 25 07:26:37 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 25 Sep 2009 07:26:37 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
Message-ID: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>

Odd. Something's going on in the page that upsets MediaWiki. I can  
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages  
look odd.

	-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> I have been trying for more than 24 hours to connect to
> http://bioperl.org/wiki/BioPerl_publications, but it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:
>
>> If someone also wants to volunteer to keep up the publications page - this
>> is where I *had* been curating a list, built from citations and Google Scholar
>> searches for 'bioperl' and for things that reference the 2002 paper.
>>
>> Seems like this is where the static copy of that information should go -
>> but highlighting things on a page with a circulating list, or something
>> that just listed recent additions to the list, could be done by the web dev
>> gurus and could be kewl.
>> The current issue is that it is large, so I think the pubmed plugin rendering
>> can be slow (or gets broken, as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and-center; not
>>>> a lot of users know when/where bioperl is used.  This applies to the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered* projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for Information
>>> Technology, University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package
>>> for matching position weight matrices against DNA sequences. MOODS
>>> implements state-of-the-art on-line matching algorithms, achieving
>>> considerably faster scanning speed than with a simple brute-force search.
>>> MOODS is written in C++, with bindings for the popular BioPerl and
>>> Biopython toolkits. It can easily be adapted for different purposes and
>>> integrated into existing workflows. It can also be used as a C++ library.
>>> AVAILABILITY: The package with documentation and examples of usage is
>>> available at http://www.cs.helsinki.fi/group/pssmfind. The source code is
>>> also available under the terms of a GNU General Public License (GPL).
>>> CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Fri Sep 25 07:40:33 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 25 Sep 2009 12:40:33 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <320fb6e00909250440i18ee4216o80cedd418feed842@mail.gmail.com>

On Fri, Sep 25, 2009 at 12:26 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
> Odd. Something's going on in the page that upsets MediaWiki. I can actually
> pull up the page in edit mode.
>
> Is the citation extension working correctly? The year-by-year pages look
> odd.

It is working on the Biopython and BioJava pages (which use the same
server and mediawiki installation, right?),

http://biopython.org/wiki/Documentation#Papers
http://biopython.org/wiki/Publications
http://biojava.org/wiki/BioJava:BioJavaInside

[I know there are references with a funny character in them, the extension
doesn't like accents. I normally redo those references by hand but it is
a hassle and just giving a PMID is much easier]

Peter


From maj at fortinbras.us  Fri Sep 25 08:50:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 08:50:26 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org><4AB84B8D.5080005@ieee.org><2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu><f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com><628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com><320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com><9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu><628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com><3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org><2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <4E35933353E14BB98975BCCAF79F5E0B@NewLife>

I've been playing with this. I think it's either a numbers problem (>230
references => bork) or a timeout problem. Attempting to isolate a single
"BioPerl publications/200x" page for the error gives inconsistent
results, but including enough of these pages to give more than about 230
references gives the error (using preview).
----- Original Message ----- 
From: "Hilmar Lapp" <hlapp at gmx.net>
To: "Emanuele Osimo" <e.osimo at gmail.com>
Cc: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 7:26 AM
Subject: Re: [Bioperl-l] a Main Page proposal


Odd. Something's going on in the page that upsets MediaWiki. I can
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages
look odd.

-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> I have been trying for more than 24 hours to connect to
> http://bioperl.org/wiki/BioPerl_publications, but it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:
>
>> If someone also wants to volunteer to keep up the publications page - this
>> is where I *had* been curating a list, built from citations and Google Scholar
>> searches for 'bioperl' and for things that reference the 2002 paper.
>>
>> Seems like this is where the static copy of that information should go -
>> but highlighting things on a page with a circulating list, or something
>> that just listed recent additions to the list, could be done by the web dev
>> gurus and could be kewl.
>> The current issue is that it is large, so I think the pubmed plugin rendering
>> can be slow (or gets broken, as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and-center; not
>>>> a lot of users know when/where bioperl is used.  This applies to the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered* projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for Information
>>> Technology, University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package
>>> for matching position weight matrices against DNA sequences. MOODS
>>> implements state-of-the-art on-line matching algorithms, achieving
>>> considerably faster scanning speed than with a simple brute-force search.
>>> MOODS is written in C++, with bindings for the popular BioPerl and
>>> Biopython toolkits. It can easily be adapted for different purposes and
>>> integrated into existing workflows. It can also be used as a C++ library.
>>> AVAILABILITY: The package with documentation and examples of usage is
>>> available at http://www.cs.helsinki.fi/group/pssmfind. The source code is
>>> also available under the terms of a GNU General Public License (GPL).
>>> CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================




From maj at fortinbras.us  Fri Sep 25 09:08:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:08:10 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <3327C0C1167C4889A980809FD642A0A2@NewLife>

The idea I now have is that <biblio> is hitting the server too rapidly and 
getting bounced after a while.
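
A minimal sketch of how lookups could be spaced out, assuming the bounce really is rate-related (that is only a guess at this point). The Biblio extension itself is PHP; this is illustrative Python, and `Throttle`, `fetch_all` and the `fetch_one` callback are hypothetical names, not part of any extension or of PubMed's API:

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested offline
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(pmids, fetch_one, throttle):
    """Look up each PMID in turn, pausing between requests.

    fetch_one is whatever function actually talks to the remote server
    (hypothetical here); it takes a PMID and returns the reference data.
    """
    results = {}
    for pmid in pmids:
        throttle.wait()
        results[pmid] = fetch_one(pmid)
    return results
```

With something like `fetch_all(pmids, fetch_one, Throttle(0.5))`, a page with a few hundred references would take a couple of minutes to build, which is another reason to render it once and keep the result rather than look everything up on each page view.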
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across as too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't stare
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
>
> 


From maj at fortinbras.us  Fri Sep 25 09:30:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:30:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <A06AF115F63B4C558D368B730BFB441D@NewLife>

It's ugly, but it works now.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across as too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't stare
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
>
> 


From jason at bioperl.org  Fri Sep 25 11:47:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 08:47:55 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <A06AF115F63B4C558D368B730BFB441D@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
	<A06AF115F63B4C558D368B730BFB441D@NewLife>
Message-ID: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>

thanks - yeah I had separated it by year to make it easier to update  
them since the main file was too large, but I liked having them all  
pulled in onto one page in order to see the total number of cites.  
Brian's graphic is nice but a little out of date, and only reflects a  
pubmed query.

Basically that system doesn't work well enough with biblio since it  
isn't caching the lookups very well.   We can probably do better  
somehow, but someone would have to really be dedicated to it, so I can  
kind of see now why we could use something like this to generate the  
citations so they'd be static.
http://sumsearch.uthscsa.edu/cite/

I had used Biblio extension as it was so easy but maybe it just can't  
scale for that number of needed refs as it doesn't do very good local  
caching AFAIK.
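
A minimal sketch of the kind of local caching meant here, assuming each reference is keyed by its PMID and only fetched when it is not already stored. This is illustrative Python rather than anything from the Biblio extension (which is PHP); `CitationCache` and the `fetch_one` callback are hypothetical names:

```python
import json
import os


class CitationCache:
    """Keep fetched citation records in a local JSON file, keyed by PMID."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        # Reuse whatever an earlier run already stored.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self._data = json.load(fh)

    def get(self, pmid, fetch_one):
        """Return the record for pmid, calling fetch_one only on a cache miss.

        fetch_one is the function that performs the real remote lookup
        (hypothetical here); it must return JSON-serialisable data.
        """
        key = str(pmid)
        if key not in self._data:
            self._data[key] = fetch_one(key)
            self._save()
        return self._data[key]

    def _save(self):
        # Write to a temporary file first so an interrupted run cannot
        # leave a half-written cache behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self._data, fh, indent=2, sort_keys=True)
        os.replace(tmp, self.path)
```

The point of the design is that a page listing hundreds of references would only hit the remote server for PMIDs added since the last run; everything else is read from disk, which is close to the "static" citations idea above.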

-jason
On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:

> It's ugly, but it works now.
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "Peter" <biopython at maubp.freeserve.co.uk>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" <maj at fortinbras.us 
> >
> Sent: Monday, September 21, 2009 9:05 AM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>>
>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>
>>> Peter wrote:
>>>>> We had some similar discussions about the Biopython wiki
>>>>> based homepage - although our old one was nowhere near
>>>>> as busy as the current BioPerl main page, it was still not as
>>>>> welcoming as our current version *tries* to be.
>>>>> ...
>>>>> I can dig out links to our mailing list archive if anyone is
>>>>> interested in the discussion.
>>>
>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>
>>>> I'd appreciate those links, Peter- thanks
>>>> MAJ
>>>
>>> OK, here you are - this was most of it, I'd have to dig through
>>> my old emails to see what else I can find:
>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>
>>> Remember Biopython went from a very minimal home page, to
>>> something aiming to be more newcomer friendly. BioPerl on the
>>> other hand seems to want to move away from the current very
>>> text heavy information rich page to something more focused and
>>> newcomer friendly. To me at least the current page is too dense,
>>> intimidating, and the important bits get lost in all the content.
>>>
>>> [My apologies if any of this feedback comes across as too blunt.]
>>
>> Not at all; I'm thinking the same thing.
>>
>>> If you haven't already looked at them, you should check out the
>>> other OBF project pages for ideas. The BioJava homepage is
>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>> still more accessible than the current BioPerl page. Also,
>>> the BioRuby page is very nice - although not wiki based.
>>>
>>> Regards,
>>>
>>> Peter
>>
>> I think the Biopython layout is very nice and focused.  Maybe a  
>> bit  too minimal, but then again I don't like scrolling up and down  
>> the  page to find the relevant bits, so less may be better.
>>
>> Reminds me of the simplified design on the perl6 main page (just
>> don't stare at the hallucinogenic butterfly too long):
>>
>> http://www.perl6.org/
>>
>> So, maybe a structured layout with the most important links, and   
>> additional links on a separate page.
>>
>> chris
>>
>>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Fri Sep 25 12:54:36 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 09:54:36 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
	<3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
Message-ID: <7275015E-45FC-4E2A-9379-89F7447DEB32@bioperl.org>

cheers- any efforts are appreciated.  I am not sure what the best
way is to provide this info to folks without spending a ton of time
curating.  What would be ideal is if the software worked well enough
that a volunteer only spent time adding the info, not debugging the
display or code.  It might be that something better exists -- online
reference management like citeulike or mendeley -- that could then be
linked in via an API.  .... Webservices, etc. will save us all, right?
Okay, not really, but at least we can try to keep this organized until
it is clear what the alternative solutions are.  Martin has stopped working
on Biblio as far as I know, and php-hacking is not my favorite pastime.

-jason
On Sep 25, 2009, at 9:38 AM, Mark A. Jensen wrote:

> I figured you really wanted the 'hundreds-o-cites' effect-- I'm just
> thinking of this as a workaround until the issues are resolved. Not sure
> I can devote too much time to playing with it now (procrastinating using
> other projects at the mo') but I can put it in the todo list on the
> Documentation Project page....
> cheers MAJ
> ----- Original Message ----- From: "Jason Stajich" <jason at bioperl.org>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" <bioperl-l at lists.open-bio.org 
> >; "Peter" <biopython at maubp.freeserve.co.uk>
> Sent: Friday, September 25, 2009 11:47 AM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> thanks - yeah I had separated it by year to make it easier to  
>> update  them since the main file was too large, but I liked having  
>> them all  pulled in onto one page in order to see the total number  
>> of cites.  Brian's graphic is nice but a little out of date, and  
>> only reflects a  pubmed query.
>>
>> Basically that system doesn't work well enough with biblio since  
>> it  isn't caching the lookups very well.   We can probably do  
>> better  somehow, but someone would have to really be dedicated to  
>> it, so I can  kind of see now why we could use something like this  
>> to generate the  citations so they'd be static.
>> http://sumsearch.uthscsa.edu/cite/
>>
>> I had used Biblio extension as it was so easy but maybe it just  
>> can't  scale for that number of needed refs as it doesn't do very  
>> good local  caching AFAIK.
>>
>> -jason
>> On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:
>>
>>> It's ugly, but it works now.
>>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu
>>> >
>>> To: "Peter" <biopython at maubp.freeserve.co.uk>
>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A.  
>>> Jensen" <maj at fortinbras.us
>>> >
>>> Sent: Monday, September 21, 2009 9:05 AM
>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>
>>>
>>>>
>>>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>>>
>>>>> Peter wrote:
>>>>>>> We had some similar discussions about the Biopython wiki
>>>>>>> based homepage - although our old one was nowhere near
>>>>>>> as busy as the current BioPerl main page, it was still not as
>>>>>>> welcoming as our current version *tries* to be.
>>>>>>> ...
>>>>>>> I can dig out links to our mailing list archive if anyone is
>>>>>>> interested in the discussion.
>>>>>
>>>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>>>
>>>>>> I'd appreciate those links, Peter- thanks
>>>>>> MAJ
>>>>>
>>>>> OK, here you are - this was most of it, I'd have to dig through
>>>>> my old emails to see what else I can find:
>>>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>>>
>>>>> Remember Biopython went from a very minimal home page, to
>>>>> something aiming to be more newcomer friendly. BioPerl on the
>>>>> other hand seems to want to move away from the current very
>>>>> text heavy information rich page to something more focused and
>>>>> newcomer friendly. To me at least the current page is too dense,
>>>>> intimidating, and the important bits get lost in all the content.
>>>>>
>>>>> [My apologies if any of this feedback comes across as too blunt.]
>>>>
>>>> Not at all; I'm thinking the same thing.
>>>>
>>>>> If you haven't already looked at them, you should check out the
>>>>> other OBF project pages for ideas. The BioJava homepage is
>>>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>>>> still more accessible than the current BioPerl page. Also,
>>>>> the BioRuby page is very nice - although not wiki based.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Peter
>>>>
>>>> I think the Biopython layout is very nice and focused.  Maybe a   
>>>> bit  too minimal, but then again I don't like scrolling up and  
>>>> down  the  page to find the relevant bits, so less may be better.
>>>>
>>>> Reminds me of the simplified design on the perl6 main page (just
>>>> don't stare at the hallucinogenic butterfly too long):
>>>>
>>>> http://www.perl6.org/
>>>>
>>>> So, maybe a structured layout with the most important links, and  
>>>> additional links on a separate page.
>>>>
>>>> chris
>>>>
>>>>
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Fri Sep 25 12:38:40 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 12:38:40 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
Message-ID: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>

I figured you really wanted the 'hundreds-o-cites' effect-- I'm just thinking of
this as a workaround until the issues are resolved. Not sure I can devote too much
time to playing with it now (procrastinating using other projects at the mo') but
I can put it in the todo list on the Documentation Project page....
cheers MAJ
----- Original Message ----- 
From: "Jason Stajich" <jason at bioperl.org>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>; "Peter" <biopython at maubp.freeserve.co.uk>
Sent: Friday, September 25, 2009 11:47 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> thanks - yeah I had separated it by year to make it easier to update them
> since the main file was too large, but I liked having them all pulled in onto
> one page in order to see the total number of cites.  Brian's graphic is nice
> but a little out of date, and only reflects a pubmed query.
>
> Basically that system doesn't work well enough with biblio since it isn't
> caching the lookups very well.  We can probably do better somehow, but
> someone would have to really be dedicated to it, so I can kind of see now why
> we could use something like this to generate the citations so they'd be
> static.
> http://sumsearch.uthscsa.edu/cite/
>
> I had used Biblio extension as it was so easy but maybe it just can't scale
> for that number of needed refs as it doesn't do very good local caching
> AFAIK.
>
> -jason
> On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:
>
>> It's ugly, but it works now.
>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu>
>> To: "Peter" <biopython at maubp.freeserve.co.uk>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" <maj at fortinbras.us>
>> Sent: Monday, September 21, 2009 9:05 AM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>>
>>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>>
>>>> Peter wrote:
>>>>>> We had some similar discussions about the Biopython wiki
>>>>>> based homepage - although our old one was nowhere near
>>>>>> as busy as the current BioPerl main page, it was still not as
>>>>>> welcoming as our current version *tries* to be.
>>>>>> ...
>>>>>> I can dig out links to our mailing list archive if anyone is
>>>>>> interested in the discussion.
>>>>
>>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>>
>>>>> I'd appreciate those links, Peter- thanks
>>>>> MAJ
>>>>
>>>> OK, here you are - this was most of it, I'd have to dig through
>>>> my old emails to see what else I can find:
>>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>>
>>>> Remember Biopython went from a very minimal home page, to
>>>> something aiming to be more newcomer friendly. BioPerl on the
>>>> other hand seems to want to move away from the current very
>>>> text heavy information rich page to something more focused and
>>>> newcomer friendly. To me at least the current page is too dense,
>>>> intimidating, and the important bits get lost in all the content.
>>>>
>>>> [My apologies if any of this feedback comes across too blunt.]
>>>
>>> Not at all; I'm thinking the same thing.
>>>
>>>> If you haven't already looked at them, you should check out the
>>>> other OBF project pages for ideas. The BioJava homepage is
>>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>>> still more accessible than the current BioPerl page. Also,
>>>> the BioRuby page is very nice - although not wiki based.
>>>>
>>>> Regards,
>>>>
>>>> Peter
>>>
>>> I think the Biopython layout is very nice and focused.  Maybe a bit too
>>> minimal, but then again I don't like scrolling up and down the page to
>>> find the relevant bits, so less may be better.
>>>
>>> Reminds me of the simplified design on the perl6 main page (just don't
>>> stare at the hallucinogenic butterfly too long):
>>>
>>> http://www.perl6.org/
>>>
>>> So, maybe a structured layout with the most important links, and 
>>> additional links on a separate page.
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>


From jcline at ieee.org  Fri Sep 25 15:11:20 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Fri, 25 Sep 2009 14:11:20 -0500
Subject: [Bioperl-l] LIMS::Controller and LIMS::Web
Message-ID: <4ABD15D8.9020304@ieee.org>

Anyone using the CPAN LIMS::Web or associated modules, have a web site
which demonstrates functionality?  The links in the .pod are not current.

>From CPAN:

DESCRIPTION

LIMS::Controller is a versatile object-oriented Perl module designed to
control a LIMS database and its web interface. Inheriting from the
LIMS::Web::Interface and LIMS::Database::Util classes, the module
provides automation for many core and advanced functions required of a
web/database object layer, enabling rapid development of Perl CGI scripts.

-- 

## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From bosborne11 at verizon.net  Fri Sep 25 22:13:16 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:13:16 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <42FBB964C0EA44FABCB50364C567A009@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
	<42FBB964C0EA44FABCB50364C567A009@NewLife>
Message-ID: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>

Mark,

Really nice, and a significant improvement over the existing.

You've gotten good feedback, you've considered these thoughts and  
incorporated them - is it time to move the beta to Main? Yes. In my  
opinion your 'beta' is far superior - just do it.

Brian O.


On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

> A nearly completely minimal solution is at Main Page Beta
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 21, 2009 1:03 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Hi Mark,
>> Thanks for taking on this (much needed) refresh.
>> I think your current version is substantially better than what we  
>> have now.
>> Still, I'd argue that something much more concise like the  
>> Biopython page
>> would make a bigger impact on visitors' ability to find what  
>> they're looking
>> for.
>> It's not that the details you have under each section shouldn't be
>> available, but rather that they could be clicked through to instead  
>> of being
>> on the front page.
>> The About section is a good example. I would bet most visitors to the
>> BioPerl website skip over the About section because they already  
>> know what
>> BioPerl is, and that section has the most valuable real estate on  
>> the page.
>> Those who don't know and are curious will probably be able to find  
>> it (the
>> word About on the front page of a website has become an idiom for  
>> "click here
>> to read the details about this").
>> Dave


From maj at fortinbras.us  Fri Sep 25 22:22:49 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:22:49 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
Message-ID: <ACA5C04C052442259262125A5F0B8E74@NewLife>

Cheers, Brian-- I am becoming swayed now by Chris' whack 
at it, on his talk page. My thought is that we'll hammer out the 
final version after the release, then pull the trigger-- Your thoughts?
MAJ
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 10:13 PM
Subject: Re: [Bioperl-l] a Main Page proposal


> Mark,
> 
> Really nice, and a significant improvement over the existing.
> 
> You've gotten good feedback, you've considered these thoughts and  
> incorporated them - is it time to move the beta to Main? Yes. In my  
> opinion your 'beta' is far superior - just do it.
> 
> Brian O.
> 
> 
> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
> 
>> A nearly completely minimal solution is at Main Page Beta
>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>> Sent: Monday, September 21, 2009 1:03 PM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>> Hi Mark,
>>> Thanks for taking on this (much needed) refresh.
>>> I think your current version is substantially better than what we  
>>> have now.
>>> Still, I'd argue that something much more concise like the  
>>> Biopython page
>>> would make a bigger impact on visitors' ability to find what  
>>> they're looking
>>> for.
>>> It's not that the details you have under each section shouldn't be
>>> available, but rather that they could be clicked through to instead  
>>> of being
>>> on the front page.
>>> The About section is a good example. I would bet most visitors to the
>>> BioPerl website skip over the About section because they already  
>>> know what
>>> BioPerl is, and that section has the most valuable real estate on  
>>> the page.
>>> Those who don't know and are curious will probably be able to find  
>>> it (the
>>> word About on the front page of a website has become an idiom for  
>>> "click here
>>> to read the details about this").
>>> Dave


From maj at fortinbras.us  Fri Sep 25 22:45:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:45:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <E53214D989154E8184BA97573C925DF9@NewLife>

sounds good-- I can make the changes (soon) and we'll tweak it from the echte page
(unless I hear diff'rnt)
cheers MAJ
  ----- Original Message ----- 
  From: Brian Osborne 
  To: Mark A. Jensen 
  Cc: BioPerl List 
  Sent: Friday, September 25, 2009 10:42 PM
  Subject: Re: [Bioperl-l] a Main Page proposal


  Mark,


  I don't love the italics in the version that Chris made but that's just personal preference. He's right in thinking that putting more in the top of the page is good: less scrolling.


  One could color the backgrounds of his tables, that might look nice.


  Either way, or a combination of both, is preferable to what we have. There really is no need to wait since the current page is abysmal. I can say that freely since I'm probably one of its authors!


  One thought though: move the "search" up to a center-left location, below "main links". The Wiki search is pretty good at finding pages so if someone doesn't find what they're looking for in the main section they might be drawn to search for it.


  Brian O.


  On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:


    Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on his talk page. My thought is that we'll hammer out the final version after the release, then pull the trigger-- Your thoughts?
    MAJ
    ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
    To: "Mark A. Jensen" <maj at fortinbras.us>
    Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
    Sent: Friday, September 25, 2009 10:13 PM
    Subject: Re: [Bioperl-l] a Main Page proposal


      Mark,

      Really nice, and a significant improvement over the existing.

      You've gotten good feedback, you've considered these thoughts and  incorporated them - is it time to move the beta to Main? Yes. In my  opinion your 'beta' is far superior - just do it.

      Brian O.

      On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

        A nearly completely minimal solution is at Main Page Beta

        ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
        To: "Mark A. Jensen" <maj at fortinbras.us>
        Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
        Sent: Monday, September 21, 2009 1:03 PM
        Subject: Re: [Bioperl-l] a Main Page proposal


          Hi Mark,
          Thanks for taking on this (much needed) refresh.
          I think your current version is substantially better than what we have now.
          Still, I'd argue that something much more concise like the Biopython page
          would make a bigger impact on visitors' ability to find what they're looking
          for.
          It's not that the details you have under each section shouldn't be
          available, but rather that they could be clicked through to instead of being
          on the front page.
          The About section is a good example. I would bet most visitors to the
          BioPerl website skip over the About section because they already know what
          BioPerl is, and that section has the most valuable real estate on the page.
          Those who don't know and are curious will probably be able to find it (the
          word About on the front page of a website has become an idiom for "click here
          to read the details about this").
          Dave



From bosborne11 at verizon.net  Fri Sep 25 22:42:38 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:42:38 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <ACA5C04C052442259262125A5F0B8E74@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
Message-ID: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>

Mark,

I don't love the italics in the version that Chris made but that's  
just personal preference. He's right in thinking that putting more in  
the top of the page is good: less scrolling.

One could color the backgrounds of his tables, that might look nice.

Either way, or a combination of both, is preferable to what we have.  
There really is no need to wait since the current page is abysmal. I  
can say that freely since I'm probably one of its authors!

One thought though: move the "search" up to a center-left location,  
below "main links". The Wiki search is pretty good at finding pages so  
if someone doesn't find what they're looking for in the main section  
they might be drawn to search for it.

Brian O.


On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:

> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
> his talk page. My thought is that we'll hammer out the final version  
> after the release, then pull the trigger-- Your thoughts?
> MAJ
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 25, 2009 10:13 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Mark,
>> Really nice, and a significant improvement over the existing.
>> You've gotten good feedback, you've considered these thoughts and   
>> incorporated them - is it time to move the beta to Main? Yes. In  
>> my  opinion your 'beta' is far superior - just do it.
>> Brian O.
>> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
>>> A nearly completely minimal solution is at Main Page Beta
>>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>> Sent: Monday, September 21, 2009 1:03 PM
>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>
>>>
>>>> Hi Mark,
>>>> Thanks for taking on this (much needed) refresh.
>>>> I think your current version is substantially better than what  
>>>> we  have now.
>>>> Still, I'd argue that something much more concise like the   
>>>> Biopython page
>>>> would make a bigger impact on visitors' ability to find what   
>>>> they're looking
>>>> for.
>>>> It's not that the details you have under each section shouldn't be
>>>> available, but rather that they could be clicked through to  
>>>> instead  of being
>>>> on the front page.
>>>> The About section is a good example. I would bet most visitors to  
>>>> the
>>>> BioPerl website skip over the About section because they already   
>>>> know what
>>>> BioPerl is, and that section has the most valuable real estate  
>>>> on  the page.
>>>> Those who don't know and are curious will probably be able to  
>>>> find  it (the
>>>> word About on the front page of a website has become an idiom  
>>>> for "click here
>>>> to read the details about this").
>>>> Dave


From cjfields at illinois.edu  Sat Sep 26 00:04:57 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:04:57 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <68A162A4-45F1-4ADC-87C9-57E388DF2666@illinois.edu>

Brian, Mark,

Agreed about the italics; there's a lot more that can be done with  
tables if needed:

http://meta.wikimedia.org/wiki/Help:Table

I say go ahead and pull the trigger.  No need to wait 'til 1.6.1 on  
this, the sooner it's fixed the better.  We can tweak the rest (add  
News updates, etc) along the way.

chris

On Sep 25, 2009, at 9:42 PM, Brian Osborne wrote:

> Mark,
>
> I don't love the italics in the version that Chris made but that's  
> just personal preference. He's right in thinking that putting more  
> in the top of the page is good: less scrolling.
>
> One could color the backgrounds of his tables, that might look nice.
>
> Either way, or a combination of both, is preferable to what we have.  
> There really is no need to wait since the current page is abysmal. I  
> can say that freely since I'm probably one of its authors!
>
> One thought though: move the "search" up to a center-left location,  
> below "main links". The Wiki search is pretty good at finding pages  
> so if someone doesn't find what they're looking for in the main  
> section they might be drawn to search for it.
>
> Brian O.
>
>
> On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:
>
>> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
>> his talk page. My thought is that we'll hammer out the final  
>> version after the release, then pull the trigger-- Your thoughts?
>> MAJ
>> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>> Sent: Friday, September 25, 2009 10:13 PM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>> Mark,
>>> Really nice, and a significant improvement over the existing.
>>> You've gotten good feedback, you've considered these thoughts and   
>>> incorporated them - is it time to move the beta to Main? Yes. In  
>>> my  opinion your 'beta' is far superior - just do it.
>>> Brian O.
>>> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
>>>> A nearly completely minimal solution is at Main Page Beta
>>>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>>> Sent: Monday, September 21, 2009 1:03 PM
>>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>>
>>>>
>>>>> Hi Mark,
>>>>> Thanks for taking on this (much needed) refresh.
>>>>> I think your current version is substantially better than what  
>>>>> we  have now.
>>>>> Still, I'd argue that something much more concise like the   
>>>>> Biopython page
>>>>> would make a bigger impact on visitors' ability to find what   
>>>>> they're looking
>>>>> for.
>>>>> It's not that the details you have under each section shouldn't be
>>>>> available, but rather that they could be clicked through to  
>>>>> instead  of being
>>>>> on the front page.
>>>>> The About section is a good example. I would bet most visitors  
>>>>> to the
>>>>> BioPerl website skip over the About section because they  
>>>>> already  know what
>>>>> BioPerl is, and that section has the most valuable real estate  
>>>>> on  the page.
>>>>> Those who don't know and are curious will probably be able to  
>>>>> find  it (the
>>>>> word About on the front page of a website has become an idiom  
>>>>> for "click here
>>>>> to read the details about this").
>>>>> Dave


From cjfields at illinois.edu  Sat Sep 26 00:52:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:52:35 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 4 released
Message-ID: <2EDBBBF5-2109-456A-B768-178B012A8192@illinois.edu>

All,

Core 1.6.0 alpha 4 is now floating about on the intertubes and CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_4/

http://bioperl.org/DIST/RC/

So far this is passing all tests for ActivePerl on WinXP once DB_File  
is installed.  I'll try running some tests for Strawberry Perl, but no  
promises.

At this late stage any additional updates will only be doc tweaks and
small bug fixes prior to 1.6.1.  The only renaming issue is that I
need to rename BioPerl.pod to BioPerl.pm and add a simple VERSION
to it (per Curtis Jewell's suggestion).  I may post a very
short alpha 5 to test that, with 1.6.1 posted by Sunday.

Enjoy!

chris


From e.osimo at gmail.com  Sun Sep 27 05:00:17 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sun, 27 Sep 2009 11:00:17 +0200
Subject: [Bioperl-l] setting a strand in Bio::Graphics
Message-ID: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>

Hello,
I've tried all the arrows suggested in
http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
but I can't figure out how to tell in the options of $panel->add_track the
strand of the feature I'm adding.
I'm drawing DNA elements from a local DB, and I have a field "strand" which
can be + or -.
Please help!
Emanuele


From maj at fortinbras.us  Sun Sep 27 20:54:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 27 Sep 2009 20:54:04 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
Message-ID: <6CF05E74FEAE45679CDEDF48B7E15856@NewLife>

Emos- Without the code, I can only guess, but you might not be providing
the options correctly. Have a look at
http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
for something that may help.
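
In case it helps, here is a rough, untested sketch of one way the strand could be passed along, assuming the DB field holds the string '+' or '-'. The helper strand_to_int, the coordinates, and the feature name are made up for illustration; the -northeast/-southwest options are the ones listed on the arrow glyph page linked in the question, given here as callbacks that receive the feature. The drawing part only runs if Bio::Graphics is installed.

```perl
use strict;
use warnings;

# Hypothetical helper: map a DB "strand" field ('+' or '-') to the
# 1 / -1 convention that BioPerl features use; anything else becomes 0.
sub strand_to_int {
    my ($s) = @_;
    return 0 unless defined $s;
    return $s eq '+' ? 1 : $s eq '-' ? -1 : 0;
}

# Only attempt the drawing part if Bio::Graphics (and GD) can be loaded.
if ( eval { require Bio::Graphics; require Bio::SeqFeature::Generic; 1 } ) {
    my $feature = Bio::SeqFeature::Generic->new(
        -start        => 100,                  # made-up coordinates
        -end          => 400,
        -strand       => strand_to_int('-'),   # value would come from the DB row
        -display_name => 'element1',           # made-up name
    );
    my $panel = Bio::Graphics::Panel->new( -length => 1000, -width => 600 );
    $panel->add_track(
        $feature,
        -glyph     => 'arrow',
        # Option values may be code refs; they are called with the feature.
        -northeast => sub { $_[0]->strand > 0 },
        -southwest => sub { $_[0]->strand < 0 },
    );
    # print $panel->png;   # uncomment to emit the image
}
```

The point of the sketch is that the strand lives on the feature object rather than in the add_track options; the track options then only decide how a given strand is drawn.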
MAJ
----- Original Message ----- 
From: "Emanuele Osimo" <e.osimo at gmail.com>
To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Sunday, September 27, 2009 5:00 AM
Subject: [Bioperl-l] setting a strand in Bio::Graphics


> Hello,
> I've tried all the arrows suggested in
> http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
> but I can't figure out how to tell in the options of $panel->add_track the
> strand of the feature I'm adding.
> I'm drawing DNA elements from a local DB, and I have a field "strand" which
> can be + or -.
> Please help!
> Emanuele


From cjfields at illinois.edu  Mon Sep 28 00:34:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 27 Sep 2009 23:34:01 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 5 released
Message-ID: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>

All,

The last alpha for the 1.6.1 release is out and should be propagating  
around CPAN now.  This should be a quick one (it has a few last-minute  
bug fixes for some problems that popped up on CPAN RT and fixes one  
mistake I made in the last alpha).

You can currently get it here (.tar.gz only for now):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz

The final 1.6.1 release should drop in the next day or two.

chris


From adsj at novozymes.com  Mon Sep 28 03:51:15 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Mon, 28 Sep 2009 09:51:15 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
Message-ID: <87hbunv764.fsf@topper.koldfront.dk>

  Hi.


I am wondering whether this is a buglet or just a case of "Don't do
that":

If I set a very long /label on a feature and output the sequence in EMBL
format, the qualifier value gets wrapped, but not quoted.

When BioPerl reads such a file, an exception is thrown.

I probably shouldn't be setting very long labels... But oughtn't BioPerl
throw an exception when a too long label is set, or automatically quote
the value when it is long enough to be wrapped, or know how to read a
wrapped yet unquoted value?

I will be happy to try and provide a patch for whichever solution is
preferred.

Here is an example script:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use IO::String;

  use Bio::Seq;
  use Bio::SeqFeature::Generic;
  use Bio::SeqIO;

  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";

  my $seq=Bio::Seq->new(-seq=>'ATG');
  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
  $seq->add_SeqFeature($feature);

  my $out_string=out($seq);
  print $out_string;

  my $fh=IO::String->new($out_string);
  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
  my $in_seq=$in->next_seq;

  print "Done\n";

  sub out {
      my ($seq)=@_;

      my $string='';
      my $fh=IO::String->new($string);
      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
      $out->write_seq($seq);

      return $string;
  }

Which gives this output when run:

  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  FT                   youthink
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //

  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: Can't see new qualifier in: youthink
  from:
  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  youthink

  STACK: Error::throw
  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
  -----------------------------------------------------------

If I change the value to include "-quotes ("simulating" that embl.pm
quotes the value), BioPerl can read the EMBL string it produces fine:

  -----------------------------------------------------------
  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl 
  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
  FT                   ntyouthink""
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //
  Done
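
As a stopgap on the calling side, the "add the quotes yourself" workaround could be wrapped in a small check, sketched below. This is only an illustration: the 59-character width is inferred from where the output above wraps, not taken from embl.pm, and would_wrap and quote_if_long are hypothetical helper names.

```perl
use strict;
use warnings;

# Assumption: 59 characters of qualifier text fit on one FT line, as
# inferred from the wrapped output above (not taken from embl.pm).
my $FT_WIDTH = 59;

# Hypothetical helper: true if "/tag=value" would spill onto a second line.
sub would_wrap {
    my ( $tag, $value ) = @_;
    return length("/$tag=$value") > $FT_WIDTH ? 1 : 0;
}

# Hypothetical helper: wrap the value in double quotes only when needed.
sub quote_if_long {
    my ( $tag, $value ) = @_;
    return would_wrap( $tag, $value ) ? qq{"$value"} : $value;
}

my $label = 'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink';
print quote_if_long( label => $label ), "\n";
print quote_if_long( label => 'short' ), "\n";
```

The result of quote_if_long would then be handed to add_tag_value in place of the raw label, which mirrors the manual quoting shown above rather than fixing the reader or writer.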


  Best regards,

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From paola_bisignano at yahoo.it  Mon Sep 28 06:00:07 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Mon, 28 Sep 2009 10:00:07 +0000 (GMT)
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
Message-ID: <504748.72296.qm@web25704.mail.ukl.yahoo.com>

Hi dear friends,


I used Bio::AlignIO to parse the msf file and, as you suggested, the
column_from_residue_number method to obtain the alignment position of the
residues of interest (those in contact with my ligand). Now I need to check
the residue itself: I want to extract the residue type. I ask my question
using the residue number in the PDB, and I want the script to also return
the residue. So if I want to know the position of Ala21, I do:


my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);


It returns the position (e.g. 5), but I want to check whether position 5
of the alignment holds an A (for Ala). I looked in the documentation, but I
couldn't find anything for that.


Thank you all for the help you have given and will give me,


best regards,


paola


From David.Messina at sbc.su.se  Mon Sep 28 07:28:27 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 13:28:27 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
References: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com>

Hi Paola,

> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
>
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
>
> my $col = $aln->column_from_residue_number( $s1->id, 21);


# extract sequences and check the residue in alignment column $col
foreach my $seq ($aln->each_seq) {
    # character in column $col of this row; it may be a gap
    my $res = $seq->subseq($col, $col);
    if ($res eq 'A') {
        # do something
    }
}


Please try the above code. I haven't tested it, but I think it will do what
you want.
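
To make the two steps concrete (residue number to alignment column, then
column to character), here is a plain-Perl sketch of the idea. It does not
use BioPerl at all, so it runs anywhere; the subroutine names and the toy
row are made up for illustration, and this is not how Bio::SimpleAlign
implements column_from_residue_number.

```perl
use strict;
use warnings;

# Map a residue number to a 1-based alignment column.
#   $aligned : one gapped row of the alignment, as a string
#   $start   : residue number of the first residue in that row
#   $resnum  : residue number to look up (e.g. the PDB number)
# Returns undef if the residue number is not present in the row.
sub column_from_residue {
    my ($aligned, $start, $resnum) = @_;
    my $current = $start - 1;
    my $col     = 0;
    for my $char (split //, $aligned) {
        $col++;
        next if $char =~ /[-.]/;    # gaps do not consume a residue number
        $current++;
        return $col if $current == $resnum;
    }
    return undef;
}

# Character found in a given 1-based column of a gapped row.
sub residue_at_column {
    my ($aligned, $col) = @_;
    return substr($aligned, $col - 1, 1);
}

# Hypothetical row: residues 19..23 with two gap columns.
my %row = (id => 'seqA', start => 19, seq => 'M-KA-LG');

my $col = column_from_residue($row{seq}, $row{start}, 21);
my $res = residue_at_column($row{seq}, $col);
print "residue 21 of $row{id} is in column $col: $res\n";
```

When the same column is read from another row of the alignment, the
character there may be a gap, so a check such as $res eq 'A' should be
prepared to see '-' as well.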

Best,
Dave

PS - I found that code in the documentation for Bio::Align::AlignI. Right
now there is an effort to improve the BioPerl documentation, and it would be
helpful if you could let us know where you looked for the answer to your
question so we can try to make it easier to find.

Did you look in Bio::AlignIO? Did you also look anywhere else?

Thanks for your help!


From David.Messina at sbc.su.se  Mon Sep 28 08:05:58 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 14:05:58 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <678730.88068.qm@web25708.mail.ukl.yahoo.com>
References: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com> 
	<678730.88068.qm@web25708.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280505l2c5f02b7k8387d5dfd3643575@mail.gmail.com>

On Mon, Sep 28, 2009 at 13:56, Paola Bisignano <paola_bisignano at yahoo.it> wrote:

> Yes, I had a look at
> http://doc.bioperl.org/releases/bioperl-1.0/Bio/AlignIO.html
>
> but I didn't find your suggestion.


> Thanks,
> I'll try it in a while.
> Sorry, I did not search in AlignI.


No problem, Paola -- thanks for letting us know.

Dave


From maj at fortinbras.us  Mon Sep 28 10:32:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 10:32:39 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
	<6CF05E74FEAE45679CDEDF48B7E15856@NewLife>
	<2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
Message-ID: <A45CF4D6E34B405B86E5F2DF651B8964@NewLife>

Now that's what I call user-friendly.
  ----- Original Message ----- 
  From: Emanuele Osimo 
  To: Mark A. Jensen 
  Sent: Monday, September 28, 2009 10:28 AM
  Subject: Re: [Bioperl-l] setting a strand in Bio::Graphics


  Hello everyone,
  thank you, I found what I needed. You have to add

  -strand_arrow => 1

  in $panel->add_track, and 

  -strand        => +/-1,

  in $feature = Bio::SeqFeature::Generic->new options.

  Thanks
  Emanuele
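
For the case described in this thread (a database field holding "+" or
"-"), the field has to be turned into the +1/-1 value before it is passed
as -strand. The following is a minimal sketch of that conversion only; the
subroutine name, the example row and the -display_name value are
hypothetical, and the Bio::Graphics calls are left as comments so the
snippet runs without that module installed.

```perl
use strict;
use warnings;

# Lookup from the database strand field to the numeric strand value.
my %STRAND = ('+' => 1, '-' => -1);

sub strand_value {
    my ($field) = @_;
    die "unexpected strand '$field'\n" unless exists $STRAND{$field};
    return $STRAND{$field};
}

# Hypothetical row as it might come back from the local DB.
my %row = (name => 'element1', start => 100, end => 250, strand => '-');

# Options for the feature and for the track, as named in the message above.
my @feature_args = (
    -display_name => $row{name},
    -start        => $row{start},
    -end          => $row{end},
    -strand       => strand_value($row{strand}),
);
my @track_args = (-strand_arrow => 1);

print "feature args: @feature_args\n";
print "track args:   @track_args\n";

# With Bio::Graphics available, these lists would be handed on roughly as:
#   my $feature = Bio::SeqFeature::Generic->new(@feature_args);
#   $panel->add_track($feature, @track_args);
```

Dying on an unexpected field value is a design choice for the sketch: a
silently defaulted strand would draw an arrow in the wrong direction
without any hint that the data was bad.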


  On Mon, Sep 28, 2009 at 02:54, Mark A. Jensen <maj at fortinbras.us> wrote:

    Emos- Without the code, I can only guess, but you might not be providing
    the options correctly. Have a look at
    http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
    for something that may help.
    MAJ
    ----- Original Message ----- From: "Emanuele Osimo" <e.osimo at gmail.com>
    To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
    Sent: Sunday, September 27, 2009 5:00 AM
    Subject: [Bioperl-l] setting a strand in Bio::Graphics


      Hello,
      I've tried all the arrows suggested in
      http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
      but I can't figure out how to tell in the options of $panel->add_track the
      strand of the feature I'm adding.
      I'm drawing DNA elements from a local DB, and I have a field "strand" which
      can be + or -.
      Please help!
      Emanuele

      _______________________________________________
      Bioperl-l mailing list
      Bioperl-l at lists.open-bio.org
      http://lists.open-bio.org/mailman/listinfo/bioperl-l


From paolo.pavan at gmail.com  Mon Sep 28 11:51:52 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 28 Sep 2009 17:51:52 +0200
Subject: [Bioperl-l] BioPerl object deep copy
Message-ID: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>

Hi all,
I would just like a programming hint: is there a way in
bioperl (or just in perl) to get a deep copy or a clone of an object?
That is, to get a new object with all the fields copied one by one.

At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?

Thank you,
Paolo


From s.denaxas at gmail.com  Mon Sep 28 11:56:09 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Mon, 28 Sep 2009 16:56:09 +0100
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <bba689ec0909280856q3fa3c8b1pf5b5dd48bc493eb4@mail.gmail.com>

Hi Paolo,

You can use Clone [1]. Blindly cloning blessed objects though is not a
good idea so make sure you know what each one instantiates.

Spiros

[1] http://perldoc.net/Clone.pm

On Mon, Sep 28, 2009 at 4:51 PM, Paolo Pavan <paolo.pavan at gmail.com> wrote:
> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
>
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
>
> Thank you,
> Paolo


From maj at fortinbras.us  Mon Sep 28 12:05:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 12:05:42 -0400
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <5A61641A14AE4D80A495047A56659894@NewLife>

For some relatively careful examples of cloning code, 
you can look at the source for 
Bio::Tree::TreeFunctionsI::clone
and 
Bio::Restriction::Enzyme::clone (not clone_depr)
MAJ

----- Original Message ----- 
From: "Paolo Pavan" <paolo.pavan at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 28, 2009 11:51 AM
Subject: [Bioperl-l] BioPerl object deep copy


> Hi all,
> I would like to have just a programming hint, there is a way in
> bioperl (or just in perl) to get an deep copy or a clone of an object?
> That is, I get a new object with all the fields copied one by one.
> 
> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?
> 
> Thank you,
> Paolo


From cjfields at illinois.edu  Mon Sep 28 12:29:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 11:29:14 -0500
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <5A61641A14AE4D80A495047A56659894@NewLife>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
	<5A61641A14AE4D80A495047A56659894@NewLife>
Message-ID: <05BB0DB4-6017-40A1-92B2-6F441CCACDC6@illinois.edu>

As Spiros points out, Clone works in almost all cases and is very fast  
(XS-based I think).  IIRC the only time it borks out is if there is a  
code ref, as with Bio::Tree::Tree, but if it doesn't work you should  
get an error indicating the problem.
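
As a small illustration of what a deep copy buys you, here is a sketch
that uses Storable's dclone (a core module) in place of Clone, purely so
it runs without installing anything. The class name My::ToySeq and the
hash layout are invented for the example and are not a BioPerl object.
Storable refuses code refs by default, which the last lines show; whether
Clone behaves the same way is not checked here.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A toy blessed object with a nested structure (not a real Bio::Seq).
my $orig = bless {
    id       => 'seq1',
    seq      => 'ATGC',
    features => [ { start => 1, end => 3 } ],
}, 'My::ToySeq';

# Deep copy: nested arrays and hashes are duplicated, blessing is kept.
my $copy = dclone($orig);
$copy->{features}[0]{end} = 4;

printf "original end: %d, copy end: %d, copy class: %s\n",
    $orig->{features}[0]{end}, $copy->{features}[0]{end}, ref $copy;

# A structure holding a code ref: dclone throws instead of copying it.
my $ok = eval { dclone({ callback => sub { 1 } }); 1 };
print "code ref copy ", ($ok ? "succeeded\n" : "failed: $@");
```

Wrapping the copy in eval, as in the last step, is one way to find out
early that an object holds something the cloning routine cannot handle.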

chris

On Sep 28, 2009, at 11:05 AM, Mark A. Jensen wrote:

> For some relatively careful examples of cloning code, you can look  
> at the source for Bio::Tree::TreeFunctionsI::clone
> and Bio::Restriction::Enzyme::clone (not clone_depr)
> MAJ
>
> ----- Original Message ----- From: "Paolo Pavan" <paolo.pavan at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 28, 2009 11:51 AM
> Subject: [Bioperl-l] BioPerl object deep copy
>
>
>> Hi all,
>> I would like to have just a programming hint, there is a way in
>> bioperl (or just in perl) to get an deep copy or a clone of an  
>> object?
>> That is, I get a new object with all the fields copied one by one.
>> At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant  
>> object?
>> Thank you,
>> Paolo


From cjfields at illinois.edu  Mon Sep 28 13:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:00:09 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 6 released (?!?)
In-Reply-To: <20090928063013.GB1081@kunpuu.plessy.org>
References: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>
	<20090928063013.GB1081@kunpuu.plessy.org>
Message-ID: <CFD37E37-2B74-402F-BA0F-898A1642FFE8@illinois.edu>

Charles (and everyone else),

This bug was a bit sneaky.  The tests were skipped on pretty much every
system because they required both DB_File and BerkeleyDB (i.e. unless
both were installed, the tests were skipped).  I committed a fix for it;
unfortunately that means I need to set up another alpha for testing,
so...

The final final alpha has just been uploaded to CPAN and is now  
available here:

http://bioperl.org/DIST/RC/BioPerl-1.6.0_6.tar.gz

The final 1.6.1 release should still be in the next day or two, just  
awaiting some test reports via CPAN...

chris

On Sep 28, 2009, at 1:30 AM, Charles Plessy wrote:

> On Sun, Sep 27, 2009 at 11:34:01PM -0500, Chris Fields wrote:
>>
>> http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz
>>
>
> Hi Chris,
>
> I have the following errors when building bioperl with perl 5.10.1 on Debian:
>
> Test Summary Report
> -------------------
> t/LocalDB/Registry.t                       (Wstat: 2304 Tests: 13 Failed: 1)
>  Failed test:  13
>  Non-zero exit status: 9
>  Parse errors: Bad plan.  You planned 14 tests but ran 13.
> t/RemoteDB/EUtilities.t                    (Wstat: 256 Tests: 309 Failed: 1)
>  Failed test:  309
>  Non-zero exit status: 1
> t/Tools/Run/RemoteBlast.t                  (Wstat: 65280 Tests: 13 Failed: 0)
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 16 tests but ran 13.
> Files=329, Tests=20766, 434 wallclock secs ( 2.64 usr  0.51 sys + 100.55 cusr  6.24 csys = 109.94 CPU)
> Result: FAIL
>
>
> t/Align/AlignStats.t ......................... ok
> t/Align/AlignUtil.t .......................... ok
> t/Align/SimpleAlign.t ........................ ok
> t/Align/TreeBuild.t .......................... ok
> t/Align/Utilities.t .......................... ok
> t/AlignIO/AlignIO.t .......................... ok
> t/AlignIO/arp.t .............................. ok
> t/AlignIO/bl2seq.t ........................... ok
> t/AlignIO/clustalw.t ......................... ok
> t/AlignIO/emboss.t ........................... ok
> t/AlignIO/fasta.t ............................ ok
> t/AlignIO/largemultifasta.t .................. ok
> t/AlignIO/maf.t .............................. ok
> t/AlignIO/mase.t ............................. ok
> t/AlignIO/mega.t ............................. ok
> t/AlignIO/meme.t ............................. ok
> t/AlignIO/metafasta.t ........................ ok
> t/AlignIO/msf.t .............................. ok
> t/AlignIO/nexus.t ............................ ok
> t/AlignIO/pfam.t ............................. ok
> t/AlignIO/phylip.t ........................... ok
> t/AlignIO/po.t ............................... ok
> t/AlignIO/prodom.t ........................... ok
> t/AlignIO/psi.t .............................. ok
> t/AlignIO/selex.t ............................ ok
> t/AlignIO/stockholm.t ........................ ok
> t/AlignIO/xmfa.t ............................. ok
> t/Alphabet.t ................................. ok
> t/Annotation/Annotation.t .................... ok
> t/Annotation/AnnotationAdaptor.t ............. ok
> t/Assembly/Assembly.t ........................ ok
> t/Assembly/ContigSpectrum.t .................. ok
> t/Biblio/Biblio.t ............................ ok
> t/Biblio/References.t ........................ ok
> t/Biblio/biofetch.t .......................... ok
> t/Biblio/eutils.t ............................ ok
> t/ClusterIO/ClusterIO.t ...................... ok
> t/ClusterIO/SequenceFamily.t ................. ok
> t/ClusterIO/unigene.t ........................ ok
> t/Coordinate/CoordinateGraph.t ............... ok
> t/Coordinate/CoordinateMapper.t .............. ok
> t/Coordinate/GeneCoordinateMapper.t .......... ok
> t/LiveSeq/Chain.t ............................ ok
> t/LiveSeq/LiveSeq.t .......................... ok
> t/LiveSeq/Mutation.t ......................... ok
> t/LiveSeq/Mutator.t .......................... ok
> t/LocalDB/BioDBGFF.t ......................... ok
> t/LocalDB/BlastIndex.t ....................... ok
> t/LocalDB/DBFasta.t .......................... ok
> t/LocalDB/DBQual.t ........................... ok
> t/LocalDB/Flat.t ............................. ok
> t/LocalDB/Index.t ............................ ok
> t/LocalDB/Registry.t ......................... 1/14
> --------------------- WARNING ---------------------
> MSG:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
> STACK: Bio::SeqIO::fasta::next_seq Bio/SeqIO/fasta.pm:127
> STACK: Bio::DB::Flat::BDB::get_Seq_by_id Bio/DB/Flat/BDB.pm:143
> STACK: Bio::DB::Failover::get_Seq_by_id Bio/DB/Failover.pm:122
> STACK: t/LocalDB/Registry.t:69
> -----------------------------------------------------------
>
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: No sequence retrieved by database Bio::DB::Flat::BDB::fasta
> ---------------------------------------------------
>
> #   Failed test at t/LocalDB/Registry.t line 70.
> Can't call method "seq" on an undefined value at t/LocalDB/Registry.t line 71, <GEN17> line 1.
> # Looks like you planned 14 tests but ran 13.
> # Looks like you failed 1 test of 13 run.
> # Looks like your test exited with 9 just after 13.
> t/LocalDB/Registry.t ......................... Dubious, test returned 9 (wstat 2304, 0x900)
> Failed 2/14 subtests
> t/LocalDB/SeqFeature.t ....................... ok
> t/LocalDB/transfac_pro.t ..................... ok
> t/Map/Cyto.t ................................. ok
> t/Map/Linkage.t .............................. ok
> t/Map/Map.t .................................. ok
> t/Map/MapIO.t ................................ ok
> t/Map/MicrosatelliteMarker.t ................. ok
> t/Map/Physical.t ............................. ok
> t/Matrix/IO/masta.t .......................... ok
> t/Matrix/IO/psm.t ............................ ok
> t/Matrix/InstanceSite.t ...................... ok
> t/Matrix/Matrix.t ............................ ok
> t/Matrix/ProtMatrix.t ........................ ok
> t/Matrix/ProtPsm.t ........................... ok
> t/Matrix/SiteMatrix.t ........................ ok
> t/Ontology/GOterm.t .......................... ok
> t/Ontology/GraphAdaptor.t .................... ok
> t/Ontology/IO/go.t ........................... ok
> t/Ontology/IO/interpro.t ..................... ok
> t/Ontology/IO/obo.t .......................... ok
> t/Ontology/Ontology.t ........................ ok
> t/Ontology/OntologyEngine.t .................. ok
> t/Ontology/OntologyStore.t ................... ok
> t/Ontology/Relationship.t .................... ok
> t/Ontology/RelationshipType.t ................ ok
> t/Ontology/Term.t ............................ ok
> t/Perl.t ..................................... ok
> t/Phenotype/Correlate.t ...................... ok
> t/Phenotype/MeSH.t ........................... ok
> t/Phenotype/Measure.t ........................ ok
> t/Phenotype/MiniMIMentry.t ................... ok
> t/Phenotype/OMIMentry.t ...................... ok
> t/Phenotype/OMIMentryAllelicVariant.t ........ ok
> t/Phenotype/OMIMparser.t ..................... ok
> t/Phenotype/Phenotype.t ...................... ok
> t/PodSyntax.t ................................ ok
> t/PopGen/Coalescent.t ........................ ok
> t/PopGen/HtSNP.t ............................. ok
> t/PopGen/MK.t ................................ ok
> t/PopGen/PopGen.t ............................ ok
> t/PopGen/PopGenSims.t ........................ ok
> t/PopGen/TagHaplotype.t ...................... ok
> t/RemoteDB/BioFetch.t ........................ ok
> t/RemoteDB/CUTG.t ............................ ok
> t/RemoteDB/EMBL.t ............................ ok
> t/RemoteDB/EUtilities.t ...................... 309/309
> #   Failed test 'EPost to EFetch'
> #   at t/RemoteDB/EUtilities.t line 159.
> #          got: '0'
> #     expected: '5'
> # Looks like you failed 1 test of 309.
> t/RemoteDB/EUtilities.t ...................... Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/309 subtests
> t/RemoteDB/EntrezGene.t ...................... ok
> t/RemoteDB/GenBank.t ......................... ok
> t/RemoteDB/GenPept.t ......................... ok
> t/RemoteDB/HIV/HIV.t ......................... ok
> t/RemoteDB/HIV/HIVAnnotProcessor.t ........... ok
> t/RemoteDB/HIV/HIVQuery.t .................... 22/41 Use of uninitialized value $rest[0] in join or string at (eval 68) line 15.
> t/RemoteDB/HIV/HIVQuery.t .................... ok
> t/RemoteDB/HIV/HIVQueryHelper.t .............. ok
> t/RemoteDB/MeSH.t ............................ ok
> t/RemoteDB/Query/GenBank.t ................... ok
> t/RemoteDB/RefSeq.t .......................... ok
> t/RemoteDB/SeqHound.t ........................ ok
> t/RemoteDB/SeqRead_fail.t .................... ok
> t/RemoteDB/SeqVersion.t ...................... ok
> t/RemoteDB/SwissProt.t ....................... ok
> t/RemoteDB/Taxonomy.t ........................ ok
> t/Restriction/Analysis-refac.t ............... ok
> t/Restriction/Analysis.t ..................... ok
> t/Restriction/Gel.t .......................... ok
> t/Restriction/IO.t ........................... ok
> t/Root/Exception.t ........................... ok
> t/Root/RootI.t ............................... ok
> t/Root/RootIO.t .............................. ok
> t/Root/Storable.t ............................ ok
> t/Root/Tempfile.t ............................ ok
> t/Root/Utilities.t ........................... ok
> t/SearchDist.t ............................... skipped: The optional module Bio::Ext::Align (or dependencies thereof) was not installed
> t/SearchIO/CigarString.t ..................... ok
> t/SearchIO/SearchIO.t ........................ ok
> t/SearchIO/SimilarityPair.t .................. ok
> t/SearchIO/Tiling.t .......................... ok
> t/SearchIO/Writer/GbrowseGFF.t ............... ok
> t/SearchIO/Writer/HSPTableWriter.t ........... ok
> t/SearchIO/Writer/HTMLWriter.t ............... ok
> t/SearchIO/Writer/HitTableWriter.t ........... ok
> t/SearchIO/blast.t ........................... ok
> t/SearchIO/blast_pull.t ...................... ok
> t/SearchIO/blasttable.t ...................... ok
> t/SearchIO/blastxml.t ........................ ok
> t/SearchIO/cross_match.t ..................... ok
> t/SearchIO/erpin.t ........................... ok
> t/SearchIO/exonerate.t ....................... ok
> t/SearchIO/fasta.t ........................... ok
> t/SearchIO/gmap_f9.t ......................... ok
> t/SearchIO/hmmer.t ........................... ok
> t/SearchIO/hmmer_pull.t ...................... ok
> t/SearchIO/infernal.t ........................ ok
> t/SearchIO/megablast.t ....................... ok
> t/SearchIO/psl.t ............................. ok
> t/SearchIO/rnamotif.t ........................ ok
> t/SearchIO/sim4.t ............................ ok
> t/SearchIO/waba.t ............................ ok
> t/SearchIO/wise.t ............................ ok
> t/Seq/DBLink.t ............................... ok
> t/Seq/EncodedSeq.t ........................... ok
> t/Seq/LargeLocatableSeq.t .................... ok
> t/Seq/LargePSeq.t ............................ ok
> t/Seq/LocatableSeq.t ......................... ok
> t/Seq/MetaSeq.t .............................. ok
> t/Seq/PrimaryQual.t .......................... ok
> t/Seq/PrimarySeq.t ........................... ok
> t/Seq/PrimedSeq.t ............................ ok
> t/Seq/Quality.t .............................. ok
> t/Seq/Seq.t .................................. ok
> t/Seq/WithQuality.t .......................... ok
> t/SeqEvolution.t ............................. ok
> t/SeqFeature/FeatureIO.t ..................... ok
> t/SeqFeature/Location.t ...................... ok
> t/SeqFeature/LocationFactory.t ............... ok
> t/SeqFeature/Primer.t ........................ ok
> t/SeqFeature/Range.t ......................... ok
> t/SeqFeature/RangeI.t ........................ ok
> t/SeqFeature/SeqAnalysisParser.t ............. ok
> t/SeqFeature/SeqFeatAnnotated.t .............. ok
> t/SeqFeature/SeqFeatCollection.t ............. ok
> t/SeqFeature/SeqFeature.t .................... ok
> t/SeqFeature/SeqFeaturePrimer.t .............. ok
> t/SeqFeature/Unflattener.t ................... ok
> t/SeqFeature/Unflattener2.t .................. ok
> t/SeqIO.t .................................... ok
> t/SeqIO/Handler.t ............................ ok
> t/SeqIO/MultiFile.t .......................... ok
> t/SeqIO/Multiple_fasta.t ..................... ok
> t/SeqIO/SeqBuilder.t ......................... ok
> t/SeqIO/Splicedseq.t ......................... ok
> t/SeqIO/abi.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/ace.t ................................ ok
> t/SeqIO/agave.t .............................. ok
> t/SeqIO/alf.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/asciitree.t .......................... ok
> t/SeqIO/bsml.t ............................... ok
> t/SeqIO/bsml_sax.t ........................... ok
> t/SeqIO/chadoxml.t ........................... ok
> t/SeqIO/chaos.t .............................. ok
> t/SeqIO/chaosxml.t ........................... ok
> t/SeqIO/ctf.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/embl.t ............................... ok
> t/SeqIO/entrezgene.t ......................... ok
> t/SeqIO/excel.t .............................. ok
> t/SeqIO/exp.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/fasta.t .............................. ok
> t/SeqIO/fastq.t .............................. ok
> t/SeqIO/flybase_chadoxml.t ................... ok
> t/SeqIO/game.t ............................... ok
> t/SeqIO/gcg.t ................................ ok
> t/SeqIO/genbank.t ............................ ok
> t/SeqIO/interpro.t ........................... ok
> t/SeqIO/kegg.t ............................... ok
> t/SeqIO/largefasta.t ......................... ok
> t/SeqIO/lasergene.t .......................... ok
> t/SeqIO/locuslink.t .......................... ok
> t/SeqIO/metafasta.t .......................... ok
> t/SeqIO/phd.t ................................ ok
> t/SeqIO/pir.t ................................ ok
> t/SeqIO/pln.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqIO/qual.t ............................... ok
> t/SeqIO/raw.t ................................ ok
> t/SeqIO/scf.t ................................ ok
> t/SeqIO/strider.t ............................ ok
> t/SeqIO/swiss.t .............................. ok
> t/SeqIO/tab.t ................................ ok
> t/SeqIO/table.t .............................. ok
> t/SeqIO/tigr.t ............................... ok
> t/SeqIO/tigrxml.t ............................ ok
> t/SeqIO/tinyseq.t ............................ ok
> t/SeqIO/ztr.t ................................ skipped: The optional module Bio::SeqIO::staden::read (or dependencies thereof) was not installed
> t/SeqTools/Backtranslate.t ................... ok
> t/SeqTools/CodonTable.t ...................... ok
> t/SeqTools/ECnumber.t ........................ ok
> t/SeqTools/GuessSeqFormat.t .................. ok
> t/SeqTools/OddCodes.t ........................ ok
> t/SeqTools/SeqPattern.t ...................... ok
> t/SeqTools/SeqStats.t ........................ ok
> t/SeqTools/SeqUtils.t ........................ ok
> t/SeqTools/SeqWords.t ........................ ok
> t/Species.t .................................. ok
> t/Structure/IO.t ............................. ok
> t/Structure/Structure.t ...................... ok
> t/Symbol.t ................................... ok
> t/TaxonTree.t ................................ skipped: All tests are being skipped, probably because the module(s) being tested here are now deprecated
> t/Tools/Alignment/Consed.t ................... ok
> t/Tools/Analysis/DNA/ESEfinder.t ............. ok
> t/Tools/Analysis/Protein/Domcut.t ............ ok
> t/Tools/Analysis/Protein/ELM.t ............... ok
> t/Tools/Analysis/Protein/GOR4.t .............. ok
> t/Tools/Analysis/Protein/HNN.t ............... ok
> t/Tools/Analysis/Protein/Mitoprot.t .......... ok
> t/Tools/Analysis/Protein/NetPhos.t ........... ok
> t/Tools/Analysis/Protein/Scansite.t .......... ok
> t/Tools/Analysis/Protein/Sopma.t ............. ok
> t/Tools/EMBOSS/Palindrome.t .................. ok
> t/Tools/EUtilities/EUtilParameters.t ......... ok
> t/Tools/EUtilities/egquery.t ................. ok
> t/Tools/EUtilities/einfo.t ................... ok
> t/Tools/EUtilities/elink_acheck.t ............ ok
> t/Tools/EUtilities/elink_lcheck.t ............ ok
> t/Tools/EUtilities/elink_llinks.t ............ ok
> t/Tools/EUtilities/elink_ncheck.t ............ ok
> t/Tools/EUtilities/elink_neighbor.t .......... ok
> t/Tools/EUtilities/elink_neighbor_history.t .. ok
> t/Tools/EUtilities/elink_scores.t ............ ok
> t/Tools/EUtilities/epost.t ................... ok
> t/Tools/EUtilities/esearch.t ................. ok
> t/Tools/EUtilities/espell.t .................. ok
> t/Tools/EUtilities/esummary.t ................ ok
> t/Tools/Est2Genome.t ......................... ok
> t/Tools/FootPrinter.t ........................ ok
> t/Tools/GFF.t ................................ ok
> t/Tools/Geneid.t ............................. ok
> t/Tools/Genewise.t ........................... ok
> t/Tools/Genomewise.t ......................... ok
> t/Tools/Genpred.t ............................ ok
> t/Tools/Hmmer.t .............................. ok
> t/Tools/IUPAC.t .............................. ok
> t/Tools/Lucy.t ............................... ok
> t/Tools/Match.t .............................. ok
> t/Tools/Phylo/Gerp.t ......................... ok
> t/Tools/Phylo/Molphy.t ....................... ok
> t/Tools/Phylo/PAML.t ......................... ok
> t/Tools/Phylo/Phylip/ProtDist.t .............. ok
> t/Tools/Primer3.t ............................ ok
> t/Tools/Promoterwise.t ....................... ok
> t/Tools/Pseudowise.t ......................... ok
> t/Tools/QRNA.t ............................... ok
> t/Tools/RandDistFunctions.t .................. ok
> t/Tools/RepeatMasker.t ....................... ok
> t/Tools/Run/RemoteBlast.t .................... 13/16
> --------------------- WARNING ---------------------
> MSG: Server failed to return any data
> ---------------------------------------------------
> # Looks like you planned 16 tests but ran 13.
> t/Tools/Run/RemoteBlast.t .................... Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 3/16 subtests
> t/Tools/Run/RemoteBlast_rpsblast.t ........... ok
> t/Tools/Run/StandAloneBlast.t ................ ok
> t/Tools/Run/WrapperBase.t .................... ok
> t/Tools/Seg.t ................................ ok
> t/Tools/SiRNA.t .............................. ok
> t/Tools/Sigcleave.t .......................... ok
> t/Tools/Signalp.t ............................ ok
> t/Tools/Signalp/ExtendedSignalp.t ............ ok
> t/Tools/Sim4.t ............................... ok
> t/Tools/Spidey/Spidey.t ...................... ok
> t/Tools/TandemRepeatsFinder.t ................ ok
> t/Tools/TargetP.t ............................ ok
> t/Tools/Tmhmm.t .............................. ok
> t/Tools/ePCR.t ............................... ok
> t/Tools/pICalculator.t ....................... ok
> t/Tools/rnamotif.t ........................... skipped: All tests are being skipped, probably because the module(s) being tested here are now deprecated
> t/Tools/tRNAscanSE.t ......................... ok
> t/Tree/Compatible.t .......................... ok
> t/Tree/Node.t ................................ ok
> t/Tree/PhyloNetwork/Factory.t ................ ok
> t/Tree/PhyloNetwork/GraphViz.t ............... ok
> t/Tree/PhyloNetwork/MuVector.t ............... ok
> t/Tree/PhyloNetwork/PhyloNetwork.t ........... ok
> t/Tree/PhyloNetwork/RandomFactory.t .......... skipped: The optional module Math::Random (or dependencies thereof) was not installed
> t/Tree/PhyloNetwork/TreeFactory.t ............ ok
> t/Tree/RandomTreeFactory.t ................... ok
> t/Tree/Tree.t ................................ ok
> t/Tree/TreeIO.t .............................. ok
> t/Tree/TreeIO/lintree.t ...................... ok
> t/Tree/TreeIO/newick.t ....................... ok
> t/Tree/TreeIO/nexus.t ........................ ok
> t/Tree/TreeIO/nhx.t .......................... ok
> t/Tree/TreeIO/phyloxml.t ..................... ok
> t/Tree/TreeIO/svggraph.t ..................... 1/4 Use of uninitialized value $txt[0] in join or string at /usr/share/perl5/SVG/Element.pm line 1195, <GEN0> line 1.
> t/Tree/TreeIO/svggraph.t ..................... ok
> t/Tree/TreeIO/tabtree.t ...................... ok
> t/Tree/TreeStatistics.t ...................... ok
> t/Variation/AAChange.t ....................... ok
> t/Variation/AAReverseMutate.t ................ ok
> t/Variation/Allele.t ......................... ok
> t/Variation/DNAMutation.t .................... ok
> t/Variation/RNAChange.t ...................... ok
> t/Variation/SNP.t ............................ ok
> t/Variation/SeqDiff.t ........................ ok
> t/Variation/Variation_IO.t ................... ok
>
>
> Cheers,
>
> -- 
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Mon Sep 28 13:28:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:28:29 -0500
Subject: [Bioperl-l] Policy on tests
Message-ID: <00F31D5F-D531-4A5E-A11E-F7B67283FA8B@illinois.edu>

All,

This is a bit of a rant related to the spate of alphas I've had to
release over the last few weeks.  We have a fairly loose policy on
testing; for instance, most CPAN installations should not run network-
or DB-dependent tests or other developer-dependent tests by default
(POD formatting, for instance), and tests for a 'recommended' module
should be skipped.  That is currently in place.

However, I do think all tests that are skipped need to be reported  
somehow, and optional tests should NOT skip if they are off by default  
and are specifically requested.  This is not currently the behavior.   
So far I have been bitten twice by this.

The last instance was with the latest alpha, where ODBA-related tests  
were mistakenly skipped when BerkeleyDB wasn't installed.  As it turns  
out, BerkeleyDB isn't required, but (according to standard test  
harness output) t/LocalDB/Registry.t passed w/o reporting any problems  
when in reality it silently skipped over 90% of the tests (this is  
only seen with --verbose output).  In the past I have also run into  
network tests silently passing when the remote server was not in  
service anymore (IIRC this was with XEMBL modules, which are no longer  
in the distribution).

From my point of view, speaking as both a user and developer, I need
to know when these tests are skipped or fail.  In instances where I
specifically request a set of tests to be run and a test fails, it
*should* fail quite loudly and catastrophically (i.e. if there is a
server-side issue, a problem with DB connection, etc).  Tests shouldn't
be skipped over if a problem arises; otherwise, if it is a legitimate
bug, it silently passes.  If it is something I haven't set up correctly
(a DB connection, for instance) I would like to know about it via the
test failures.
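
As a rough illustration of the behavior argued for above, here is a
minimal sketch of a test file. It is not existing BioPerl test code: the
environment variable BIOPERL_OPTIONAL_TESTS and the helper test_action()
are made up for this example, and BerkeleyDB simply stands in for any
optional prerequisite. Only the Test::More calls (ok, fail, skip, diag,
done_testing) are real.

```perl
use strict;
use warnings;
use Test::More;

# Decide what to do with an optional group of tests.
#   $requested : true if the user explicitly asked for these tests
#   $available : true if the prerequisite (module, DB, server) is usable
sub test_action {
    my ($requested, $available) = @_;
    return 'run'  if $available;
    return 'fail' if $requested;   # asked for but cannot run: be loud
    return 'skip';                 # not asked for: skip, but say so
}

# BIOPERL_OPTIONAL_TESTS is a hypothetical switch, not a real BioPerl setting.
my $requested = $ENV{BIOPERL_OPTIONAL_TESTS} ? 1 : 0;
my $available = eval { require BerkeleyDB; 1 } ? 1 : 0;
my $action    = test_action($requested, $available);

SKIP: {
    if ($action eq 'fail') {
        # Requested tests must not pass silently when they cannot run.
        fail('BerkeleyDB tests were requested but BerkeleyDB is not installed');
        last SKIP;
    }
    if ($action eq 'skip') {
        # diag() writes to STDERR, so the note is visible without --verbose.
        diag('skipping 1 BerkeleyDB test: module not installed');
        skip 'BerkeleyDB not installed', 1;
    }
    ok($available, 'BerkeleyDB loaded');
}

done_testing();
```

The point of keeping the decision in one small function is that every
optional test file could apply the same rule: run when possible, fail
when explicitly requested but impossible, and otherwise skip with a
visible note.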

Am I the only one thinking along these lines?  Should we come up with  
a simple policy on how we're setting up and running tests?

chris


From paola.bisignano at gmail.com  Mon Sep 28 05:50:52 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Mon, 28 Sep 2009 11:50:52 +0200
Subject: [Bioperl-l] parsing msf file
Message-ID: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>

Hi dear friends,

I used Bio::AlignIO to parse an msf file, with the method
column_from_residue_number, as you suggested, to obtain the alignment
positions of the residues of interest (those in contact with my ligand).
Now I have to check the residue: I want to extract the residue type. I
query using the residue number from the PDB, and I want the script to
return the residue as well. So if I want to know the position of Ala21,
I do:

my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);

and it returns the position (e.g. 5), but I want to check whether
position 5 of the alignment holds an A (for Ala). I looked in the
documentation, but I couldn't find anything for that.


Thank you all for help you gave and will give to me,

best regards,

paola


From maj at fortinbras.us  Mon Sep 28 21:25:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 21:25:33 -0400
Subject: [Bioperl-l] parsing msf file
In-Reply-To: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
References: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
Message-ID: <1C5008B41F6D4BFF9F5160633D284442@NewLife>

Hi Paola--
I think you're saying you want to see if A is present in other
sequences in the alignment at alignment column 5. Here's
where you use location_from_column, which is a method
off the sequence objects themselves. The idea is to do

# $col is obtained as in your script...
for my $seq ($aln->each_seq) {
  if ( $seq->subseq( $seq->location_from_column($col) ) eq 'A') {
     print "si!";
  } else {
     print "no!";
  }
}

You might find the code at 
http://www.bioperl.org/wiki/Site_entropy_in_an_alignment
helpful since it uses these principles. 
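
For readers without an alignment file at hand, the same idea can be
sketched in plain Perl on gapped strings. This only illustrates what the
residue-number-to-column lookup and the per-column check compute; it is
not BioPerl's implementation. The helper names column_of_residue() and
residue_at_column() and the toy sequences are invented for this example.

```perl
use strict;
use warnings;

# Return the 1-based alignment column holding residue number $resnum,
# where the first residue of the gapped string is numbered $start.
sub column_of_residue {
    my ($gapped, $start, $resnum) = @_;
    my ($col, $n) = (0, $start - 1);
    for my $char (split //, $gapped) {
        $col++;
        next if $char eq '-' or $char eq '.';   # gaps carry no residue number
        $n++;
        return $col if $n == $resnum;
    }
    return;   # residue number not present in this sequence
}

# Return the character (residue or gap) found at a 1-based column.
sub residue_at_column {
    my ($gapped, $col) = @_;
    return if $col < 1 or $col > length($gapped);
    return substr($gapped, $col - 1, 1);
}

# Toy two-sequence alignment, invented for this example.
my %aln = (
    seq1 => { start => 19, gapped => 'DK-AEMER' },
    seq2 => { start => 6,  gapped => 'DEWAEVPR' },
);

# Find the column of residue 21 in seq1, then look at that column everywhere.
my $col = column_of_residue($aln{seq1}{gapped}, $aln{seq1}{start}, 21);
for my $id (sort keys %aln) {
    my $res = residue_at_column($aln{$id}{gapped}, $col);
    printf "%s column %d: %s (%s)\n", $id, $col, $res,
        $res eq 'A' ? 'si' : 'no';
}
```

Note that a column can hold a gap in some sequences, so a real script
should decide how to treat '-' before comparing against a residue code.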
Mark


From martin.senger at gmail.com  Tue Sep 29 01:31:41 2009
From: martin.senger at gmail.com (Martin Senger)
Date: Tue, 29 Sep 2009 13:31:41 +0800
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>

> Martin has stopped working on Biblio as far as I know and php-hacking is
> not my favorite pastime.


That's true. I can still revive the code - but the question is (always has
been) where to host the server (of the web services providing the biblio
data). It was hosted, and maintained, at EBI. But I do not know if EBI is
still maintaining it, or willing to do so.

Cheers,
Martin

-- 
Martin Senger
email: martin.senger at gmail.com,m.senger at cgiar.org
skype: martinsenger


From jason at bioperl.org  Tue Sep 29 01:43:30 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 28 Sep 2009 22:43:30 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
References: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
Message-ID: <E9D67D22-ABC3-4199-B8D9-E0675197B9BF@bioperl.org>

hah! I actually meant the Biblio.php MediaWiki plugin by Martin Jambon
-- but hey the Bio::Biblio db stuff should be discussed too.

-jason
On Sep 28, 2009, at 10:31 PM, Martin Senger wrote:

>> Martin has stopped working on Biblio as far as I know and php-hacking is
>> not my favorite pastime.
>
>
> That's true. I can still revive the code - but the question is (always has
> been) where to host the server (of the web services providing the biblio
> data). It was hosted, and maintained, at EBI. But I do not know if EBI is
> still maintaining it, or willing to do so.
>
> Cheers,
> Martin
>
> -- 
> Martin Senger
> email: martin.senger at gmail.com,m.senger at cgiar.org
> skype: martinsenger

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep 29 14:01:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 13:01:29 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
Message-ID: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>

We are pleased to announce the availability of BioPerl 1.6.1, the  
latest release of BioPerl core code.  You can grab it here:

Via CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Via the BioPerl website:

http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip

The PPM for Windows should also finally be available this week,  
ActivePerl problems permitting (we will post more information when it  
becomes available).

Tons of bug fixes and changes have been incorporated into this  
release.  For a more complete change list please see the 'Changes'  
file included with the distribution.

A few highlights:

* FASTQ parsing and interconversion of the three FASTQ variants  
(Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
* Significant refactoring of Bio::Restriction methods
* Complete refactoring of Bio::Search-related tiling code, including  
HOWTO documentation
* GBrowse-related fixes
    - berkeleydb database now autoindexes wig files and locks correctly
    - add Pg, SQLite, and faster BerkeleyDB implementations
* Infernal 1.0 output is now parsed
* New SearchIO-based parser for gmap -f9 output
* BLAST XML parsing essentially complete
* Installation via CPANPLUS should now work
* For those using Strawberry Perl on Windows, the latest build is  
expected to pass all tests.
* 'raw' sequence format now parsed by line or optionally as a single  
sequence
* SCF parsing/writing now round-trips
* Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
* Bio::Tools::SeqPattern now has a backtranslate() method
* Bio::Tree::Statistics now has methods to calculate Fitch-based  
score, internal trait values, statratio(), sum of leaf distances  
[heikki]
* scripts
    - update to bp_seqfeature_load for SQLite [lstein]
    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
    - fastam9_to_table - fix for MPI output [jason]
    - gccalc - total stats [jason]
    - einfo  - simple script to find up-to-date NCBI database list,  
list field and link values for a specific database
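
To make the FASTQ item above concrete, here is a small plain-Perl
sketch of how the three quality encodings relate. It is not the
Bio::SeqIO code: the function names are invented for this example, and
it only covers conversion to the Sanger encoding. It assumes the usual
conventions (Sanger = Phred + 33, Illumina = Phred + 64, Solexa =
Solexa score + 64, with Phred = 10 * log10(10^(Solexa/10) + 1)).

```perl
use strict;
use warnings;

# ASCII offset added to a quality score in each FASTQ variant.
my %offset = (sanger => 33, illumina => 64, solexa => 64);

# Convert a Solexa-scaled score to the Phred scale.
sub solexa_to_phred {
    my ($qs) = @_;
    return 10 * log(10 ** ($qs / 10) + 1) / log(10);
}

# Re-encode a quality string from the given variant as Sanger qualities.
sub to_sanger {
    my ($qual, $variant) = @_;
    die "unknown FASTQ variant '$variant'\n" unless exists $offset{$variant};
    return join '', map {
        my $q = ord($_) - $offset{$variant};
        # Solexa scores are on a different scale; round to the nearest Phred.
        $q = int(solexa_to_phred($q) + 0.5) if $variant eq 'solexa';
        chr($q + $offset{sanger});
    } split //, $qual;
}

print to_sanger('hh@', 'illumina'), "\n";
print to_sanger('hh@', 'solexa'),   "\n";
```

The same characters decode differently depending on the variant, which
is why the variant has to be stated explicitly when reading FASTQ.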

We will shortly release updates for BioPerl-db, BioPerl-run, and  
BioPerl-network.  Enjoy!

chris


From rmb32 at cornell.edu  Tue Sep 29 14:22:03 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 29 Sep 2009 11:22:03 -0700
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC2504B.1000707@cornell.edu>

Chris Fields wrote:
 > We are pleased to announce the availability of BioPerl 1.6.1, the
 > latest release of BioPerl core code.

Hooray!  You rock Chris!  Tremendous thanks for your many hours of work 
to get it out the door!

Rob


From scott at scottcain.net  Tue Sep 29 14:23:08 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 29 Sep 2009 14:23:08 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <536f21b00909291123h12a7c941tdd3edb7fadbb1149@mail.gmail.com>

Chris,

Congratulations and thanks so much for the time and effort that went into this.

Scott



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From hlapp at gmx.net  Tue Sep 29 15:56:58 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 15:56:58 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>

Congrats from me too - awesome Chris, and thanks on behalf of the  
project!

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep 29 16:38:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 15:38:04 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
Message-ID: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>

No prob.  Next up is db, run, and network!

chris

On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:

> Congrats from me too - awesome Chris, and thanks on behalf of the  
> project!
>
> 	-hilmar


From cjfields at illinois.edu  Tue Sep 29 17:11:33 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 16:11:33 -0500
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
Message-ID: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>

Right now all our subdistributions have a naming scheme like
BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the prior
method of naming doesn't quite match the syntax of other distributions:

Win32-Console
Win32-EventLog
MooseX-Aliases
etc etc

I'll go ahead and make these changes unless there is rabid dissent ;>

chris


From bix at sendu.me.uk  Tue Sep 29 15:06:17 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Sep 2009 20:06:17 +0100
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC25AA9.5080803@sendu.me.uk>

Chris Fields wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> release of BioPerl core code.  You can grab it here:

Great job Chris. *cheers*


From hlapp at gmx.net  Tue Sep 29 17:49:07 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 17:49:07 -0400
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
In-Reply-To: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
References: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
Message-ID: <6C5CBE0E-EDA5-4079-BFD7-DEE95E8C749C@gmx.net>

Fine with me :-)

	-hilmar

On Sep 29, 2009, at 5:11 PM, Chris Fields wrote:

> Right now all our subdistributions have a naming scheme like
> BioPerl-db.  I'm thinking we should subtly change those to BioPerl-DB,
> BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the
> prior method of naming doesn't quite match the syntax of other
> distributions:
>
> Win32-Console
> Win32-EventLog
> MooseX-Aliases
> etc etc
>
> I'll go ahead and make these changes unless there is rabid dissent ;>
>
> chris
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From maj at fortinbras.us  Tue Sep 29 18:33:23 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 29 Sep 2009 18:33:23 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <5D35D16E84554CA687C6CA4758806884@NewLife>

Gnarly, dude.
MAJ


From cjfields at illinois.edu  Tue Sep 29 23:54:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 22:54:04 -0500
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <87hbunv764.fsf@topper.koldfront.dk>
References: <87hbunv764.fsf@topper.koldfront.dk>
Message-ID: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>

Adam,

Not sure, but this could be a case of 'both'.  Qualifiers whose values
are quoted and those whose values aren't are currently distinguished via
a global hash lookup (%FTQUAL_NO_QUOTE) due to the way the parser works;
there is some logic behind this, I just can't quite recall at the moment
why it is this way.  You could set a hash key for the label in cases
where it isn't quoted, that should work.  You can also test out the
Bio::SeqIO::embldriver version (-format => 'embldriver').

If the above doesn't work out it's worth filing a bug for this
behavior, though I'm not sure how easy it will be to fix.

chris

On Sep 28, 2009, at 2:51 AM, Adam Sjøgren wrote:

>  Hi.
>
>
> I am wondering whether this is a buglet or just a case of "Don't do
> that":
>
> If I set a very long /label on a feature and output the sequence in EMBL
> format, the qualifier value gets wrapped, but not quoted.
>
> When BioPerl reads such a file, an exception is thrown.
>
> I probably shouldn't be setting very long labels... But oughtn't BioPerl
> throw an exception when a too long label is set, or automatically quote
> the value when it is long enough to be wrapped, or know how to read a
> wrapped yet unquoted value?
>
> I will be happy to try and provide a patch for whichever solution is
> preferred.
>
> Here is an example script:
>
>  #!/usr/bin/perl
>
>  use strict;
>  use warnings;
>
>  use IO::String;
>
>  use Bio::Seq;
>  use Bio::SeqFeature::Generic;
>  use Bio::SeqIO;
>
>  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";
>
>  my $seq=Bio::Seq->new(-seq=>'ATG');
>  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
>  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
>  $seq->add_SeqFeature($feature);
>
>  my $out_string=out($seq);
>  print $out_string;
>
>  my $fh=IO::String->new($out_string);
>  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>  my $in_seq=$in->next_seq;
>
>  print "Done\n";
>
>  sub out {
>      my ($seq)=@_;
>
>      my $string='';
>      my $fh=IO::String->new($string);
>      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>      $out->write_seq($seq);
>
>      return $string;
>  }
>
> Which gives this output when run:
>
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  FT                   youthink
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>
>  ------------- EXCEPTION: Bio::Root::Exception -------------
>  MSG: Can't see new qualifier in: youthink
>  from:
>  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  youthink
>
>  STACK: Error::throw
>  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
>  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
>  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
>  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
>  -----------------------------------------------------------
>
> If I change the value to include "-quotes ("simulating" that embl.pm
> quotes the value), BioPerl can read the EMBL string it produces fine:
>
>  -----------------------------------------------------------
>  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
>  FT                   ntyouthink""
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>  Done
>
>
>  Best regards,
>
>     Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From adsj at novozymes.com  Wed Sep 30 05:50:36 2009
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Wed, 30 Sep 2009 11:50:36 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu> (Chris
	Fields's message of "Tue, 29 Sep 2009 22:54:04 -0500")
References: <87hbunv764.fsf@topper.koldfront.dk>
	<86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>
Message-ID: <87vdj0g3rn.fsf@topper.koldfront.dk>

On Tue, 29 Sep 2009 22:54:04 -0500, Chris wrote:

> Not sure, but this could be a case of 'both'. Labels that are quoted
> and aren't are currently distinguished via a global hash lookup
> (%FTQUAL_NO_QUOTE) due to the way the parser works; there is some
> logic behind this, just can't quite recall at the moment why it is
> this way.

Yes, I saw that there is a number of qualifiers that aren't quoted
automatically.

The very easy "fix" for me would be to simply remove "label" from
%FTQUAL_NO_QUOTE, but I'm not really sure what the reason for not
quoting all values is, so I was hesitant to just propose that.
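
One of the options raised in this thread, quoting a value whenever it is
long enough to be wrapped, can be sketched in plain Perl as follows. This
is not the Bio::SeqIO::embl code: %no_quote is only a stand-in for
%FTQUAL_NO_QUOTE (its contents here are illustrative), format_qualifier()
is an invented name, and the 59-character width is read off the example
output earlier in the thread.

```perl
use strict;
use warnings;

# Qualifiers normally written without quotes. A stand-in for the real
# lookup table, with made-up contents for this sketch.
my %no_quote = map { $_ => 1 } qw(label codon_start number);

# Format one qualifier as EMBL feature-table lines, wrapping after $width
# characters. A value that has to wrap is always quoted, even when its
# qualifier is listed in %no_quote.
sub format_qualifier {
    my ($name, $value, $width) = @_;
    my $text  = "/$name=$value";
    my $quote = !$no_quote{$name} || length($text) > $width;
    if ($quote) {
        (my $escaped = $value) =~ s/"/""/g;   # embedded quotes are doubled
        $text = qq{/$name="$escaped"};
    }
    # Split into chunks of at most $width characters ('a' keeps spaces).
    my @chunks = unpack("(a$width)*", $text);
    return map { 'FT' . (' ' x 19) . $_ } @chunks;
}

my $long = 'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink';
print "$_\n" for format_qualifier('label', 'short', 59);
print "$_\n" for format_qualifier('label', $long,   59);
```

With this rule short labels stay unquoted, so existing output would not
change, while any wrapped value becomes readable by a parser that expects
continuation lines only inside quotes.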

> You could set a hash key for the label in cases where it isn't quoted,
> that should work. You can also test out the Bio::SeqIO::embldriver
> version (-format => 'embldriver').

Ah, embldriver reads the wrapped qualifier when it isn't quoted without
problem. Nice! I hadn't noticed embldriver.

I wonder which one is correct in this case?

And should I switch to using embldriver to read, or does it make sense
to try and concoct a patch that changes embl?


  Thanks for the feedback!

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From sidd.basu at gmail.com  Wed Sep 30 13:24:53 2009
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 30 Sep 2009 12:24:53 -0500
Subject: [Bioperl-l]  Re: BioPerl 1.6.1 released
In-Reply-To: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
	<5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
Message-ID: <4ac39469.0637560a.5a63.1fee@mx.google.com>

Congrats Chris, I really appreciate your time and effort.

-siddhartha

On Tue, 29 Sep 2009, Chris Fields wrote:

> No prob.  Next up is db, run, and network!
>
> chris
>
> On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:
>
> > Congrats from me too - awesome Chris, and thanks on behalf of the project!
> >
> > 	-hilmar
> >
> > On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
> >
> >> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> >> release of BioPerl core code.  You can grab it here:
> >>
> >> Via CPAN:
> >>
> >> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> >>
> >> Via the BioPerl website:
> >>
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> >> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> >>
> >> The PPM for Windows should also finally be available this week, 
> >> ActivePerl problems permitting (we will post more information when it 
> >> becomes available).
> >>
> >> Tons of bug fixes and changes have been incorporated into this release.  
> >> For a more complete change list please see the 'Changes' file included 
> >> with the distribution.
> >>
> >> A few highlights:
> >>
> >> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger, 
> >> Illumina, Solexa) now works (a concerted OBF effort!)
> >> * Significant refactoring of Bio::Restriction methods
> >> * Complete refactoring of Bio::Search-related tiling code, including 
> >> HOWTO documentation
> >> * GBrowse-related fixes
> >>  - berkeleydb database now autoindexes wig files and locks correctly
> >>  - add Pg, SQLite, and faster BerkeleyDB implementations
> >> * Infernal 1.0 output is now parsed
> >> * New SearchIO-based parser for gmap -f9 output
> >> * BLAST XML parsing essentially complete
> >> * Installation via CPANPLUS should now work
> >> * For those using Strawberry Perl on Windows, the latest build is 
> >> expected to pass all tests.
> >> * 'raw' sequence format now parsed by line or optionally as a single 
> >> sequence
> >> * SCF parsing/writing now round-trips
> >> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> >> * Bio::Tools::SeqPattern now has a backtranslate() method
> >> * Bio::Tree::Statistics now has methods to calculate Fitch-based score, 
> >> internal trait values, statratio(), sum of leaf distances [heikki]
> >> * scripts
> >>  - update to bp_seqfeature_load for SQLite [lstein]
> >>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
> >>  - fastam9_to_table - fix for MPI output [jason]
> >>  - gccalc - total stats [jason]
> >>  - einfo  - simple script to find up-to-date NCBI database list, list 
> >> field and link values for a specific database
> >>
> >> We will shortly release updates for BioPerl-db, BioPerl-run, and 
> >> BioPerl-network.  Enjoy!
> >>
> >> chris
> >>
> >
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================


From antonina.iagovitina at epfl.ch  Wed Sep 30 14:09:17 2009
From: antonina.iagovitina at epfl.ch (Antonina Iagovitina)
Date: Wed, 30 Sep 2009 20:09:17 +0200
Subject: [Bioperl-l] assistance with bioperl
Message-ID: <4AC39ECD.6060405@epfl.ch>

Here is the error message I get when I try to align a sequence to an existing
alignment. Please help.
I am using Windows XP and ClustalW version 1.83.

 MSG:
 ERROR: Could not open sequence file (-profile) 
 No. of seqs. read = -1. No alignment!
 
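use strict;
use warnings;
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::Tools::Run::Alignment::Clustalw;

my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

# Read the existing alignment (the profile)
my $str = Bio::AlignIO->new(-file => 'cysprot1a.msf', -format => 'msf');
my $aln = $str->next_aln();

# Read the sequence to add to the profile
my $str1 = Bio::SeqIO->new(-file => 'cysprot1b.fa', -format => 'fasta');
my $seq  = $str1->next_seq();

# Align the new sequence against the existing alignment
$aln = $factory->profile_align($aln, $seq);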


From maj at fortinbras.us  Wed Sep 30 14:24:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 30 Sep 2009 14:24:59 -0400
Subject: [Bioperl-l] assistance with bioperl
In-Reply-To: <4AC39ECD.6060405@epfl.ch>
References: <4AC39ECD.6060405@epfl.ch>
Message-ID: <569E83EDBFE044638187504E5E7A8C11@NewLife>

Antonina--
Try the following:
Make sure that cysprot1a.msf and cysprot1b.fa are in the current directory, 
or use full path names for the files. 
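One way to rule out a path problem before ClustalW is ever called is to turn
each file name into an absolute path and check that it is readable. A minimal
sketch using only core Perl; the helper name resolve_input is made up for
illustration and is not part of BioPerl:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not a BioPerl function): turn a relative file name
# into an absolute path and die early if it is not a readable plain file.
sub resolve_input {
    my ($path) = @_;
    my $abs = File::Spec->rel2abs($path);
    die "Cannot read input file '$abs'\n" unless -f $abs && -r _;
    return $abs;
}
```

The returned paths, e.g. resolve_input('cysprot1a.msf'), could then be given
as the -file arguments in your script.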
MAJ
----- Original Message ----- 
From: "Antonina Iagovitina" <antonina.iagovitina at epfl.ch>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 30, 2009 2:09 PM
Subject: [Bioperl-l] assistance with bioperl


> Here is the error message I get when I try to align a sequence to an existing
> alignment. Please help
> I am using Windows XP and Clustalw version1.83
> 
> MSG:
> ERROR: Could not open sequence file (-profile) 
> No. of seqs. read = -1. No alignment!
> 
> use Bio::AlignIO;
> use Bio::SeqIO;
> use Bio::Seq;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
> $aln = $str->next_aln();
> $str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
> $seq = $str1->next_seq();
> $aln = $factory->profile_align($aln,$seq);
> end
> 


From me at miguel.weapps.com  Wed Sep 30 18:16:38 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 17:16:38 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file => "<$outfile.aln.fasta", -format => "fasta");
my $nexus_o = Bio::AlignIO->new(-file => ">$outfile.aln.nex", -format => "nexus");
while (my $fasta_aln = $fasta_i->next_aln) {
    $nexus_o->write_aln($fasta_aln);
}

I would like to remove the symbols because they are not compatible with
MrBayes v3.1.2 ("Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From jason at bioperl.org  Wed Sep 30 18:40:33 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 30 Sep 2009 15:40:33 -0700
Subject: [Bioperl-l] Nexus symbols
In-Reply-To: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
References: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
Message-ID: <483DB389-9332-4573-84C7-3AF09AC2BACA@bioperl.org>

-show_symbols => 0

If you use the bp_sreformat.pl script, specify --special="mrbayes"; it will
set both the show_endblock and show_symbols values to 0.


perldoc Bio::AlignIO::nexus

        new

         Title   : new
         Usage   : $alignio = Bio::AlignIO->new(-format => 'nexus', -file => 'filename');
         Function: returns a new Bio::AlignIO object to handle clustalw files
         Returns : Bio::AlignIO::clustalw object
         Args    : -verbose => verbosity setting (-1,0,1,2)
                   -file    => name of file to read in or with ">" - writeout
                   -fh      => alternative to -file param - provide a filehandle
                               to read from/write to
                   -format  => type of Alignment Format to process or produce

                   Customization of nexus flavor output

                   -show_symbols => print the symbols="ATGC" in the data definition
                                    (MrBayes does not like this)
                                    boolean [default is 1]
                   -show_endblock => print an 'endblock;' at the end of the data
                                    (MrBayes does not like this)
                                    boolean [default is 1]

On Sep 30, 2009, at 3:16 PM, Luis M Rodriguez-R wrote:

> Dear all,
>
> Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
> from the "format" line in the Nexus output of Bio::AlignIO?
>
> My code (snippet) is:
>
> my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '- 
> format'=>"fasta");
> my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '- 
> format'=>"nexus");
> while(my $fasta_aln=$fasta_i->next_aln){$nexus_o- 
> >write_aln($fasta_aln);}
>
> And I would like to remove the symbols (is not compatible with  
> MrBayes v3.1.2: "Could not find parameter "symbols"").
>
> Also, it would be nice to be able to change the TITLE comment.
>
> Thanks all!
> Regards,
>
> Luis M. Rodriguez-R
> [http://bioinf.uniandes.edu.co/~miguel/]
> ---------------------------------
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Colombia
> [http://bioinf.uniandes.edu.co]
>
> + 57 1 3394949 ext 2619
> luisrodr at uniandes.edu.co
> me at miguel.weapps.com
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From paola_bisignano at yahoo.it  Tue Sep  1 12:20:25 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 1 Sep 2009 12:20:25 +0000 (GMT)
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
Message-ID: <154614.75143.qm@web25706.mail.ukl.yahoo.com>

Hi, 

I'm trying to parse FASTA files, where I have a couple of alignments. I need to identify my residues in the alignment. The residues come from separate lists that I derived by parsing LIGPLOT files, so I have to manipulate strings, but I don't know how to start and it seems complicated.
I used Bio::AlignIO to parse the FASTA file, so I can have the parsed alignment in MSF or ClustalW format.

here an example:
CLUSTAL W(1.81) multiple sequence alignment


Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
                       *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:


Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
                       ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*

Say I choose two residues, for example: how can I extract them, starting from their positions in the PDB file?
I need to walk from those positions to my sequence.

I don't know if this is clear, because I cannot explain the question well in English. Are there any Italians here?
Could anyone help me?


From scott at scottcain.net  Tue Sep  1 13:21:25 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 1 Sep 2009 09:21:25 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
Message-ID: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>

Hello all,

I just wanted to send out a general announcement about a change that  
is coming for perl modules that are distributed with the gmod/chado  
package.  There are some modules, notably Class::DBI classes that are  
automatically generated, that are currently in the Chado namespace.   
This move has been requested by the CPAN maintainers.  So any  
Chado::*  modules will become Bio::Chado::*, except for the Class::DBI  
classes, which will become Bio::Chado::CDBI::*.

This will probably affect relatively few users, though ModWare in its  
current incarnation will need to be updated.

Scott

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From biopython at maubp.freeserve.co.uk  Tue Sep  1 15:33:13 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Sep 2009 16:33:13 +0100
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
Message-ID: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>

On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>> The two conversions to solexa are still failing.  I'm not sure but I think
>> it's something fairly simple, but I can't work on it until Friday (got too
>> many other things on my plate ATM).  If I get stumped I'll post a message.
>
> ...
>
> This should narrow it down - the bug is in mapping PHRED
> scores (from either Sanger or Illumina 1.3+ files) to the
> Solexa encoding.
>
> Peter

Hi Chris,

I've just noticed BioPerl is treating invalid characters in the quality
string as a warning condition (not an error):
http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html

It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
(character "!" or "@" respectively) which is reasonable. For fastq-solexa
to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
used - a bug?
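
For reference, the PHRED to Solexa mapping under discussion can be sketched
in plain Perl as below. This only illustrates the standard formula
Q_solexa = 10 * log10(10^(Q_phred/10) - 1); it is not the Bio::SeqIO::fastq
code. The function names are made up, and clamping to -5 for the lowest
scores is the convention assumed here.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: convert a PHRED score to the nearest Solexa score.
# The formula is undefined at PHRED 0, so anything at or below 0 (and any
# result under -5) is clamped to the Solexa minimum of -5.
sub phred_to_solexa {
    my ($phred) = @_;
    return -5 if $phred <= 0;
    my $solexa  = 10 * log(10 ** ($phred / 10) - 1) / log(10);
    my $rounded = floor($solexa + 0.5);
    return $rounded < -5 ? -5 : $rounded;
}

# Solexa FASTQ stores the score as chr(score + 64).
sub solexa_char {
    my ($phred) = @_;
    return chr(phred_to_solexa($phred) + 64);
}

print solexa_char(0), "\n";    # prints ";" (ASCII 59, Solexa -5)
```

Under this sketch a PHRED 0 assigned to an invalid character would come out
as ";" on the Solexa side, which is the behaviour I was expecting.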

Also, in all these cases there is currently a spurious "data loss" warning:

$ ./bioperl_sanger2sanger.pl < error_qual_null.fastq

--------------------- WARNING ---------------------
MSG: Unknown symbol with ASCII value 0 outside of quality range,
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Data loss for sanger: following values exceed max 93

---------------------------------------------------
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYY!YYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_397_389
GGTTTGAGAAAGAGAAATGAGATAA
+
YYYYYYYYYWYYYYWWYYYWYWYWW
@SLXA-B3_649_FC8437_R1_1_1_850_123
GAGGGTGTTGATCATGATGATGGCG
+
YYYYYYYYYYYYYWYYWYYSYYYSY
@SLXA-B3_649_FC8437_R1_1_1_362_549
GGAAACAAAGTTTTTCTCAACATAG
+
YYYYYYYYYYYYYYYYYYWWWWYWY
@SLXA-B3_649_FC8437_R1_1_1_183_714
GTATTATTTAATGGCATACACTCAA
+
YYYYYYYYYYWYYYYWYWWUWWWQQ

Regards,

Peter


From jason at bioperl.org  Tue Sep  1 15:49:00 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 1 Sep 2009 08:49:00 -0700
Subject: [Bioperl-l] help parsing msf file or clustalW file reports
In-Reply-To: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
References: <154614.75143.qm@web25706.mail.ukl.yahoo.com>
Message-ID: <90DACEE3-BC71-4D82-A8FF-6441A720BC76@bioperl.org>

I think you might want to use the column_from_residue_number method  
that is part of Bio::SimpleAlign - it lets you get the column from an  
alignment based on the sequence residue, doing some math along the way  
to deal with gaps. That is the residue -> alignment direction.  If you  
are starting at the alignment and want to get the residue's position  
you will use the location_from_column on a particular sequence so

     # select somehow a sequence from the alignment, e.g.
     my $seq = $aln->get_seq_by_pos(1);
     #$loc is undef or Bio::LocationI object
     my $loc = $seq->location_from_column(5);
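
To make the "math along the way" concrete, here is a rough plain-Perl sketch
of the gap counting involved. It is not the Bio::SimpleAlign implementation;
the two function names are hypothetical, and it assumes the start number of
the row (the 6 in 2pl0:A/6-268) matches the PDB numbering, with no missing
residues or insertion codes in between.

```perl
use strict;
use warnings;

# Residue number -> 1-based alignment column (undef if not in this row).
sub column_from_residue {
    my ($aligned, $start, $resnum) = @_;
    my $current = $start - 1;    # number of the last residue seen so far
    my $col     = 0;
    for my $char (split //, $aligned) {
        $col++;
        next if $char eq '-' || $char eq '.';    # gaps carry no residue
        $current++;
        return $col if $current == $resnum;
    }
    return;
}

# 1-based alignment column -> residue number (undef if the column is a gap).
sub residue_from_column {
    my ($aligned, $start, $col) = @_;
    return if $col < 1 || $col > length $aligned;
    my $char = substr($aligned, $col - 1, 1);
    return if $char eq '-' || $char eq '.';
    my $prefix = substr($aligned, 0, $col);
    my $gaps   = ($prefix =~ tr/-.//);           # gaps up to this column
    return $start + $col - $gaps - 1;
}

# Fragment of the 2pl0:A row around its gap; G is residue 36 in that row.
my $row = 'GHT-KVAVK';
print column_from_residue($row, 36, 39), "\n";    # prints 5
print residue_from_column($row, 36, 5),  "\n";    # prints 39
```

Going from a PDB residue in one sequence to the aligned residue in the other
is then two steps: residue to column in the first row, column to residue in
the second.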

-jason

On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:

> Hi,
>
> I'm trying to parse fasta files, where I have couple of  
> alignments....I need to identify my residue in my alignment......I  
> have separate lists that derived from ligplot parsing files.. so I  
> have to manipulate string...but I don't now how to start..it seems  
> complicated..
> I used Bio::AlignIO to parse the fasta file, so I can have a parsed  
> file in msf or clustalW forma
>
> here an example:
> CLUSTAL W(1.81) multiple sequence alignment
>
>
> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>
>
> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>
> I  choose two residue for example...how can I extract  
> them...starting from their position in the pdb file?
> I need to walk...to my sequence
>
> I don't know if it is clear because I cannot explain the question  
> correctly in english...are there any Italians?
> could anyone help me?

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  1 16:05:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 11:05:14 -0500
Subject: [Bioperl-l] Next-Gen and the next point release - updates
In-Reply-To: <320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
References: <ED17AB7F-E2D9-4CFC-AE18-08B1312159C5@illinois.edu>
	<320fb6e00908261416p666b7ab7w8174eb5a48f38c61@mail.gmail.com>
	<F7DAE18A-8224-4721-861F-610D82F4BDFE@illinois.edu>
	<320fb6e00908270455y2a80907chfae8007df60e72e2@mail.gmail.com>
	<320fb6e00909010833p7bffac97je12dc778cdd54971@mail.gmail.com>
Message-ID: <FB130819-94C6-419F-AD3D-BAEEDDE77737@illinois.edu>


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but  
>>> I think
>>> it's something fairly simple, but I can't work on it until Friday  
>>> (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a  
>>> message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the  
> quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq- 
> solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does  
> not get
> used - a bug?
>
> Also, in all these cases there is currently a spurious "data loss"  
> warning:
>
> $ ./bioperl_sanger2sanger.pl < error_qual_null.fastq
>
> --------------------- WARNING ---------------------
> MSG: Unknown symbol with ASCII value 0 outside of quality range,
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: Data loss for sanger: following values exceed max 93
>
> ---------------------------------------------------
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYY!YYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_397_389
> GGTTTGAGAAAGAGAAATGAGATAA
> +
> YYYYYYYYYWYYYYWWYYYWYWYWW
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYYYYYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_362_549
> GGAAACAAAGTTTTTCTCAACATAG
> +
> YYYYYYYYYYYYYYYYYYWWWWYWY
> @SLXA-B3_649_FC8437_R1_1_1_183_714
> GTATTATTTAATGGCATACACTCAA
> +
> YYYYYYYYYYWYYYYWYWWUWWWQQ
>
> Regards,
>
> Peter

Right, per off-list discussion this can be changed (I would rather it  
die there anyway).

chris


From marcelo011982 at gmail.com  Tue Sep  1 17:33:51 2009
From: marcelo011982 at gmail.com (Marcelo Iwata)
Date: Tue, 1 Sep 2009 14:33:51 -0300
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
Message-ID: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>

Hi

I've made a blastn with such arguments:

../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001 -o
Out2Blast.txt -a 8

and I want a script that removes overlapping sequences from the results.
For example, if unigene A has hit start and hit end at 1 and 4, and
B has them at 2 and 3, respectively, the script should remove the second one.

I want to know whether such a script already exists and, if not, whether there
is a library that deals with this kind of problem.

I know that Bio::DB::GFF has overlapping_features, but if something exists
that works directly with the BLAST format, that would be better for me.
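
To show what I mean, here is a small plain-Perl sketch of the filtering
itself. It does not use BioPerl and only removes hits that lie completely
inside another hit (like B inside A above); partial overlaps are kept. The
function name and the hash keys name/start/end are just made up for the
example, and start <= end is assumed for every hit.

```perl
use strict;
use warnings;

# Drop every hit whose range lies entirely inside another hit's range.
sub remove_contained {
    # Sort by start ascending, then by end descending, so that a hit
    # always comes after any hit that could contain it.
    my @hits = sort {
        $a->{start} <=> $b->{start} or $b->{end} <=> $a->{end}
    } @_;
    my @kept;
    my $max_end;    # largest end coordinate among the hits kept so far
    for my $hit (@hits) {
        next if defined $max_end && $hit->{end} <= $max_end;
        push @kept, $hit;
        $max_end = $hit->{end};
    }
    return @kept;
}

my @kept = remove_contained(
    { name => 'A', start => 1, end => 4 },
    { name => 'B', start => 2, end => 3 },
    { name => 'C', start => 3, end => 6 },
);
print join(',', map { $_->{name} } @kept), "\n";    # prints "A,C"
```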

thanks in advance


From cjfields at illinois.edu  Tue Sep  1 18:10:30 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 13:10:30 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
Message-ID: <7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>

Marcelo,

Do you mean tiling?  See:

http://www.bioperl.org/wiki/HOWTO:Tiling

chris

On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:

> Hi
>
> I've made a blastn with such arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001  
> -o
> Out2Blast.txt -a 8
>
> and i want a script that removes overlapped sequences from the  
> results..
> For example, if a unigene A has the hit->start  and hit-end as 1 and  
> 4, and
> the B is at 2 and 3, respectively, the script remove second one.
>
> I want to know if it already exist, and if not, is there a library  
> that
> works with such issue.
>
> I know that at Bio::DB::gff we have overlapping_features. But , if  
> something
> directly exist (works with blast format), is better for me.
>
> thanks in advance


From cain.cshl at gmail.com  Tue Sep  1 19:47:50 2009
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 1 Sep 2009 15:47:50 -0400
Subject: [Bioperl-l] GMOD Chado perl modules moving to the Bio namespace
In-Reply-To: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
References: <CFB4B2A1-6E7F-42D7-BC9A-00C7CB25D185@scottcain.net>
Message-ID: <0CA5287E-BE85-4E7F-8ED3-B453092FACB1@gmail.com>

Hi Don,

I just wanted to let you know that I also updated the code in  
GMODTools, but I don't have a simple way to test it; perhaps you  
should take a look at the cvs diff to make sure what I did makes sense.

Thanks,
Scott

On Sep 1, 2009, at 9:21 AM, Scott Cain wrote:

> Hello all,
>
> I just wanted to send out a general announcement about a change that  
> is coming for perl modules that are distributed with the gmod/chado  
> package.  There are some modules, notably Class::DBI classes that  
> are automatically generated, that are currently in the Chado  
> namespace.  This move has been requested by the CPAN maintainers.   
> So any Chado::*  modules will become Bio::Chado::*, except for the  
> Class::DBI classes, which will become Bio::Chado::CDBI::*.
>
> This will probably affect relatively few users, though ModWare in  
> its current incarnation will need to be updated.
>
> Scott
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From maj at fortinbras.us  Wed Sep  2 04:19:30 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 00:19:30 -0400
Subject: [Bioperl-l] bioperl invades emacs
Message-ID: <56DB0DEEB22645DE94DE0E912A889409@NewLife>

Hi All, 

As part of the Documentation Project, I've written a full-
fledged minor mode for emacs, bioperl-mode. It allows 
the user to access BP pod while coding, using keyboard
shortcuts or menus. Pod pops up in a new view buffer,
which is itself active for quick pod searching. You can 
get the whole pod, pieces of pod, or even the pod headers
of individual methods. 

The best feature (IMHO) is the completion facility. This
not only saves typing, but allows browsing and follow-your-nose
programming (exactly the technique I used to make bioperl-mode,
thanks to the Extensible Self-Documenting Editor).

It's very easy to install, requires only one additional line 
in your .emacs file, and directly infects perl-mode 
(if you so choose) so it's available whenever you
open .pl or .pm files.

For details, screenshots, download and install info,
and soporific design details, see
http://www.bioperl.org/wiki/Emacs_bioperl-mode

Send me the bugs!
cheers, 
MAJ


From rmb32 at cornell.edu  Wed Sep  2 04:31:15 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 01 Sep 2009 21:31:15 -0700
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <4A9DF513.1020607@cornell.edu>

Wow.  Bravo!

Rob


From cjfields at illinois.edu  Wed Sep  2 04:31:46 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Sep 2009 23:31:46 -0500
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <2A49147F-17B4-42EB-A170-52DA009D7E1C@illinois.edu>

Very cool!  Thanks Mark!

chris

On Sep 1, 2009, at 11:19 PM, Mark A. Jensen wrote:

> Hi All,
>
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
>
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
>
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
>
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
>
> Send me the bugs!
> cheers,
> MAJ


From Russell.Smithies at agresearch.co.nz  Wed Sep  2 05:01:34 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed, 2 Sep 2009 17:01:34 +1200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>

emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Wednesday, 2 September 2009 4:20 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] bioperl invades emacs
> 
> Hi All,
> 
> As part of the Documentation Project, I've written a full-
> fledged minor mode for emacs, bioperl-mode. It allows
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which it itself active for quick pod searching. You can
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods.
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line
> in your .emacs file, and directly infects perl-mode
> (if you so choose) so its available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers,
> MAJ


From maj at fortinbras.us  Wed Sep  2 12:28:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:28:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <4A9E2638.8020203@pasteur.fr>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<4A9E2638.8020203@pasteur.fr>
Message-ID: <AC0A7CC6F808466CB15D267CC86AEEE3@NewLife>

Hi Emmanuel-- I'll look into this and report back- thanks!
MAJ
----- Original Message ----- 
From: "Emmanuel Quevillon" <tuco at pasteur.fr>
To: "Mark A. Jensen" <maj at fortinbras.us>
Sent: Wednesday, September 02, 2009 4:00 AM
Subject: Re: [Bioperl-l] bioperl invades emacs


> Hi Mark,
> 
> Great great job.
> But I am using XEmacs and no .emacs file is present in my home
> directory. So is there a trick to make your bioperl-mode work
> under XEmacs?
> 
> Thanks for your help
> 
> Regards
> 
> Emmanuel
> -- 
> -------------------------
> Emmanuel Quevillon
> Biological Software and Databases Group
> Institut Pasteur
> +33 1 44 38 95 98
> tuco at_ pasteur dot fr
> -------------------------
> 
>


From maj at fortinbras.us  Wed Sep  2 12:07:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 08:07:14 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32AAB8A8478@exchsth.agresearch.co.nz>
Message-ID: <B9B317F95CA44F0C9335450D3FDDEC73@NewLife>

I only know one command in vi --- :q
MAJ
----- Original Message ----- 
From: "Smithies, Russell" <Russell.Smithies at agresearch.co.nz>
To: "'Mark A. Jensen'" <maj at fortinbras.us>; "'BioPerl List'" 
<bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 1:01 AM
Subject: RE: [Bioperl-l] bioperl invades emacs


emacs, how quaint  :-)
And here's me thinking you'd be a vi guru...

For those who frequent Windows, Eclipse with EPIC is a real winner!

--Russell




From hlapp at gmx.net  Wed Sep  2 15:51:18 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 2 Sep 2009 11:51:18 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <73A8B147-7605-4E2E-98AF-F3B09AD6046F@gmx.net>

Very nice!! -hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Wed Sep  2 20:23:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 2 Sep 2009 15:23:01 -0500
Subject: [Bioperl-l] remove overlapped sequences from Blastn results
In-Reply-To: <1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
References: <1c9f28970909011033h7f8a1bcl771db039bad384e7@mail.gmail.com>
	<7A89A354-3211-4662-9672-895E16CFDEE8@illinois.edu>
	<1c9f28970909021320o20037e00g871db92a37519f79@mail.gmail.com>
Message-ID: <E39D878B-A6F1-441A-A511-7CA0FF0D1319@illinois.edu>

Marcelo,

(Make sure to keep responses on the main list)

The new Tiling stuff is in bioperl-live (subversion code); it hasn't  
been released yet but should appear in BioPerl 1.6.1 (an alpha will be  
out this week).

chris

On Sep 2, 2009, at 3:20 PM, Marcelo Iwata wrote:

> Thanks, Chris.
> I went to CPAN search to download Bio::Search::Tiling, and it pointed
> me to the BioPerl core distribution:
> BioPerl-1.6.0.tar.gz
> at http://search.cpan.org/~cjfields/BioPerl-1.6.0/Bio/Search/BlastStatistics.pm
>
> I've downloaded it and upgraded my BioPerl version, but I still can't
> find MapTiling.pm.
>
> Could this be the result of some kind of error during the upgrade?
> Thanks.
>
>
> On Tue, Sep 1, 2009 at 3:10 PM, Chris Fields <cjfields at illinois.edu>  
> wrote:
> Marcelo,
>
> Do you mean tiling?  See:
>
> http://www.bioperl.org/wiki/HOWTO:Tiling
>
> chris
>
>
> On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:
>
> Hi,
>
> I ran blastn with these arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt -e 0.00001 -o Out2Blast.txt -a 8
>
> and I want a script that removes overlapping sequences from the
> results. For example, if unigene A has hit->start and hit->end of 1
> and 4, and B has 2 and 3, respectively, the script removes the
> second one.
>
> I want to know whether such a script already exists and, if not,
> whether there is a library that handles this.
>
> I know that Bio::DB::GFF has overlapping_features, but if something
> works directly with the BLAST format, that would be better for me.
>
> Thanks in advance
> thanks in advance
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
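For the 1-4 versus 2-3 example in the original question, a plain containment filter can be sketched in shell. This is not Bio::Search::Tiling and it does not parse BLAST reports; it assumes the hits have already been dumped to whitespace-separated "name start end" lines, and drop_contained_hits is a hypothetical name:

```shell
# Hypothetical helper: reads "name start end" lines on stdin and drops
# every hit whose range lies entirely inside a hit that sorts before it.
drop_contained_hits() {
    # Sort by start ascending, then end descending, so a containing hit
    # always comes before the hits it contains.
    sort -k2,2n -k3,3nr |
    # Keep a hit only if it extends past the furthest end seen so far.
    awk 'NR == 1 || $3 > max_end { print; max_end = $3 }'
}
```

Hits that only partly overlap are all kept; only a hit lying entirely inside (or identical to) another hit is dropped.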


From maj at fortinbras.us  Thu Sep  3 01:04:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 21:04:06 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>

Thanks everyone for your comments so far, on and off-list. 
(You're a terrific audience. I also code for weddings and 
bar mitzvahs. Tip your servers.)
The howto page now has a "Known Issues" section, and
I will be working to eliminate those in the next couple of 
days. 

cheers Mark


From jessica.sun at gmail.com  Tue Sep  1 15:25:36 2009
From: jessica.sun at gmail.com (jsun529)
Date: Tue, 1 Sep 2009 08:25:36 -0700 (PDT)
Subject: [Bioperl-l]  covert CDS coordinates with Gene coordinates
Message-ID: <25242395.post@talk.nabble.com>


Dear all,
  I would like to know how to convert between CDS coordinates and gene
coordinates using Bio::Coordinate::GeneMapper.
The documentation is not very clear, and a working example would help a lot
with using the objects returned by the BioPerl functions and getting the
values out in a readable format.

Thanks,

-- 
View this message in context: http://www.nabble.com/covert-CDS-coordinates-with-Gene-coordinates-tp25242395p25242395.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From pg4 at sanger.ac.uk  Wed Sep  2 23:35:07 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 00:35:07 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
Message-ID: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>


Hello Mark,

It sounds fantastic.

Unfortunately I was unable to use it:

It does not find pod2text on my Mac OS X machine and fails to find my BioPerl
paths on Linux (probably due to a bug in the PERL5LIB parsing, but I am a Lisp
novice so I could be wrong).

==  macosX ==

On my MacBook (Mac OS X 10.5, Emacs 22.3) it does not find pod2text:
GNU Emacs 22.3.1 (i386-apple-darwin9.6.0, X toolkit)

   - I have installed your modules in my local-lisp and added the
require, and now Emacs fails with the error:

   File error: Searching for program, invalid argument, pod2text

   - I have pod2text in /usr/bin, which is in my $PATH (I use the Fink
Emacs in no-window mode), but the same happens with Carbon Emacs.

==  debian etch with an old emacs 21 ==

GNU Emacs 21.4.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of 
2007-06-19 on ninsei, modified by Debian

It loads ok but when asking for the pods

[pod] Namespace: Bio::

it does not autocomplete from there, and if I have the cursor over a 'use 
Bio::xxx', and select [BP Docs] 'view methods' or 'view pod' it says 'no 
match'

# [pod mth] Namespace: Bio::PrimarySeq [No match]

Reading bioperl-mode.el and bioperl-init.el, I see that the variable
that stores the path to BioPerl contains no paths apart from the
current directory:

# c-h v bioperl-module-path [ret] => bioperl-module-path's value is "."


== bug when parsing perl5lib? ==

Please correct me if I am wrong, but extracting the BioPerl paths from
PERL5LIB in bioperl-init.el is not working for me on Linux.

While debugging bioperl-init.el:
# (setq pth (getenv "PERL5LIB"))
#  "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
# (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
# nil

No file is found because it is looking for all the paths
concatenated together with a '/Bio' at the end:

   libpath1:libpath2:libpath3/Bio

'concat' appends /Bio to pth, which is a single string holding all the
PERL5LIB paths. Shouldn't the concat instead be applied to each entry of
PERL5LIB after splitting on ':' (Unix) or ';' (Windows), and each result
then tested for existence?

for example in unix:

--- code --
(defun addbio (bio_path)
   "Append /Bio to BIO_PATH."
   (concat bio_path "/" "Bio"))

(mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
-- end code ---

This returns a list of t/nil values, one per PERL5LIB entry (BioPerl and Ensembl paths):
(t t nil t t t t t t nil nil nil ...)
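
The same split-then-test idea can be tried outside Emacs. Below is a minimal shell sketch, assuming a POSIX sh and a ':'-separated list as on Unix; find_bio_roots is a hypothetical helper name, not something from bioperl-mode:

```shell
# Hypothetical helper: print each entry of a ':'-separated path list
# (passed as $1) that contains a Bio/ subdirectory.
# The body runs in a subshell, so the IFS and globbing changes stay local.
find_bio_roots() (
    IFS=:
    set -f   # disable globbing so entries are taken literally
    # Unquoted $1 is deliberate: it is split on ':' through IFS.
    for dir in $1; do
        if [ -n "$dir" ] && [ -d "$dir/Bio" ]; then
            printf '%s\n' "$dir"
        fi
    done
)
```

Calling find_bio_roots "${PERL5LIB:-}" prints one line per entry that has a Bio directory under it, i.e. the directories above Bio.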


Regards and thanks for the modules they would be very useful.

    -Pablo

=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From maj at fortinbras.us  Thu Sep  3 02:34:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 2 Sep 2009 22:34:59 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2669F98293CC4473ADAB8B80F93351FF@NewLife>

Thanks for all this work, Pablo. Am working hard on 21
back-compat. Will attempt some mac-friendly paths
and look at the perl5lib issue-

The "No match" errors seem to stem from a failure to
find the Bio tree; there's a workaround for this on
the wiki page as of right now. This will probably
not help the 21 problems, but the next commit
(tomorrow) will likely solve these. I will post to this
thread when that happens.
cheers Mark


From maj at fortinbras.us  Thu Sep  3 04:21:14 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 00:21:14 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <203092FB050648AA9F256788068F0A16@NewLife>

Hi Pablo and all-
Try the latest revision (>=16081) with your debian/Emacs 21. Set
the variable bioperl-module-path to the directory above the
Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
again there. Tomorrow, MacOS
cheers,
Mark


From tuco at pasteur.fr  Thu Sep  3 09:56:45 2009
From: tuco at pasteur.fr (Emmanuel Quevillon)
Date: Thu, 03 Sep 2009 11:56:45 +0200
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <5009BD4ADDC94A03866AC4D4813907EB@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
	<5009BD4ADDC94A03866AC4D4813907EB@NewLife>
Message-ID: <4A9F92DD.2010701@pasteur.fr>

Mark A. Jensen wrote:
> Thanks everyone for your comments so far, on and off-list. (You're a
> terrific audience. I also code for weddings and bar mitzvahs. Tip your
> servers.)
> The howto page now has a "Known Issues" section, and
> I will be working to eliminate those in the next couple of days.
> cheers Mark

Hi Mark,

Thanks for your help. I decided to remove XEmacs :) and replace it
with Emacs. In fact, as I am running Ubuntu, it was a mess to work out
where to put the .el files etc. and how to get it working.
So I removed everything (a bit rude) and reinstalled emacs-22.

Here is what I did after that:

$ cd /usr/share/emacs
$ cd 22.2
$ cp BIOPERL-MODE/etc/* etc/
$ cd site-lisp    # a symlink to /usr/share/emacs22/site-lisp
$ sudo mkdir bioperl-mode
$ cp BIOPERL-MODE/site-lisp/* bioperl-mode
$ cd ~
$ touch .emacs
$ cat .xemacs/init.el > .emacs    # init.el has the (require 'bioperl-mode) line
$ cat .xemacs/custom.el >> .emacs # my other Emacs stuff, e.g. Template Toolkit mode

And it is all done and working perfectly!!

Thanks for this great file Mark

Regards

Emmanuel

-- 
-------------------------
Emmanuel Quevillon
Biological Software and Databases Group
Institut Pasteur
+33 1 44 38 95 98
tuco at_ pasteur dot fr
-------------------------


From maj at fortinbras.us  Thu Sep  3 11:22:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 07:22:31 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <2465B400494242AEAB5F578BD6BB5301@NewLife>

I get it now-- you're right. I'll take care of that-
cheers
MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> After setting bioperl-module-path manually, your module works OK on Linux
> Emacs 21.4 with the latest revision.
>
> About the PERL5LIB issue, sorry for not reporting the platform: the report
> was for Linux, not Mac OS X. In the wiki you have a comment about the
> Mac OS X separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multi-path
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. My message was probably misleading. I wrongly appended '/Bio'
> to the path itself instead of to a temp variable for testing with
> file-exists-p, and probably gave you the impression that the point was to
> have /Bio added to the path. Sorry about that.
>
> Instead, my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> Wouldn't this leave pth holding a *string* like
> "lib/path1:lib/path2:lib/path3" on Linux?
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' to the end of the whole string
> 'lib/path1:lib/path2:lib/path3', and that path obviously does not
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each*
> lib/path, after first splitting the pth string on ':' (Linux/OS X) or the
> equivalent on Windows?
>
> Sorry for not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpaht1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to the pth that is a string with all the PERL5LIB paths. 
>>> Should this concat rather be applied to the splited perl5lib by ':' in unix 
>>> or ';' in windows and then tested for the existence of files?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "apend /Bio to each path"
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in the list of T and F bioperl (and ensembl) paths
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>>
>>> =====================================================================
>>>                      Pablo Marin-Garcia, PhD
>>>
>>>                     \\//          (Argiope bruennichi
>>>                \/\/`(||>O:'\/\/   with stabilimentum)
>>>                     //\\
>>>
>>> Sanger Institute                |  PostDoc / Computer Biologist
>>> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
>>> Hinxton, Cambridge CB10 1HH     |  room : N333
>>> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
>>> ====================================================================
>>>
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
>>> a charity registered in England with number 1021457 and a company registered 
>>> in England with number 2742969, whose registered office is 215 Euston Road, 
>>> London, NW1 2BE. _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>
>
> =====================================================================
>                      Pablo Marin-Garcia, PhD
>
>                     \\//          (Argiope bruennichi
>                \/\/`(||>O:'\/\/   with stabilimentum)
>                     //\\
>
> Sanger Institute                |  PostDoc / Computer Biologist
> Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
> Hinxton, Cambridge CB10 1HH     |  room : N333
> United Kingdom                  |  email: pablo.marin at sanger.ac.uk
> ====================================================================
>
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a 
> charity registered in England with number 1021457 and a company registered in 
> England with number 2742969, whose registered office is 215 Euston Road, 
> London, NW1 2BE.
> 


From maj at fortinbras.us  Thu Sep  3 12:34:45 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 3 Sep 2009 08:34:45 -0400
Subject: [Bioperl-l] bioperl invades emacs
In-Reply-To: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
References: <56DB0DEEB22645DE94DE0E912A889409@NewLife>
Message-ID: <736B3399B3754D4C9B1BB66414160D95@NewLife>

Hi All, 

The following bioperl-mode issues are resolved in r16020:

- compatibility with Emacs 21
- correct parsing of PERL5LIB
- Bio module search now includes PATH components 
  (after PERL5LIB search)
- an informative error is now reported if completion is attempted
  without a valid bioperl-module-path

Thanks for your patience and your bug reports-
cheers
MAJ

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 02, 2009 12:19 AM
Subject: [Bioperl-l] bioperl invades emacs


> Hi All, 
> 
> As part of the Documentation Project, I've written a full-fledged
> minor mode for emacs, bioperl-mode. It allows 
> the user to access BP pod while coding, using keyboard
> shortcuts or menus. Pod pops up in a new view buffer,
> which is itself active for quick pod searching. You can 
> get the whole pod, pieces of pod, or even the pod headers
> of individual methods. 
> 
> The best feature (IMHO) is the completion facility. This
> not only saves typing, but allows browsing and follow-your-nose
> programming (exactly the technique I used to make bioperl-mode,
> thanks to the Extensible Self-Documenting Editor).
> 
> It's very easy to install, requires only one additional line 
> in your .emacs file, and directly infects perl-mode 
> (if you so choose) so it's available whenever you
> open .pl or .pm files.
> 
> For details, screenshots, download and install info,
> and soporific design details, see
> http://www.bioperl.org/wiki/Emacs_bioperl-mode
> 
> Send me the bugs!
> cheers, 
> MAJ


From neetisomaiya at gmail.com  Fri Sep  4 06:49:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 12:19:58 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>

Hi,

I have an input list of gene names (can get gene ids from a local db
if required).
I need to fetch sequences of these genes. Can someone please guide me
as to how this can be done using perl/bioperl?

Any help will be deeply appreciated.

Thanks.

-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Fri Sep  4 09:17:17 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 14:47:17 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
Message-ID: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>

Thanks for the link.
So I need only the following lines of code to get the sequence?

use Bio::DB::GenBank;
$db_obj = Bio::DB::GenBank->new;
$seq_obj = $db_obj->get_Seq_by_id(2);

How do I print the sequence?
$seq_obj->seq ??

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
>
> Retrieving a sequence from a database : BioPerl HOWTO
> http://bit.ly/RWIot
>
> Trust this helps,
> Khader Shameer
> NCBS - TIFR
>


From neetisomaiya at gmail.com  Fri Sep  4 10:13:58 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 15:43:58 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
Message-ID: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>

Thanks for the replies.

Getting the sequence by accession/GI worked for me. Now can anyone tell
me the easiest way to get the GI/accession of a gene from its gene ID
or gene name?

-Neeti
Even my blood says, B positive




From e.osimo at gmail.com  Fri Sep  4 12:05:48 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 4 Sep 2009 14:05:48 +0200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com> 
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com> 
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
Message-ID: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>

Try this:
http://david.abcc.ncifcrf.gov/conversion.jsp

Emanuele




From neetisomaiya at gmail.com  Fri Sep  4 12:21:19 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 17:51:19 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>

Thanks. It's an interesting tool.

But I want to do this programmatically.

I have gene IDs to start with. I can't find a method that gets the
sequence directly with a gene ID as input, so I am using the method
that gets the sequence with an accession as input, for which I first
need to know the accessions for my gene IDs. Is this the right
approach? My main aim is to get the nucleotide sequence of a gene from
its Entrez gene ID or gene name. Please guide me; I am confused.

-Neeti
Even my blood says, B positive




From paola_bisignano at yahoo.it  Fri Sep  4 12:32:02 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Fri, 4 Sep 2009 12:32:02 +0000 (GMT)
Subject: [Bioperl-l] problem parsing msf....:second part...I cannot solve
	sorry sorry
Message-ID: <330845.85818.qm@web25704.mail.ukl.yahoo.com>

I have a problem parsing an msf file... I can't find the right
Bio::SimpleAlign method for my case...
I have to identify residues (from a list) in aligned sequences... but
when I parse the alignment from the fasta file, I save it as an msf
file, where I have to identify my residue (from the list, numbered as
in the pdb file) and the residue aligned with it in the other
sequence...

this is a piece of the file...

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//

                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL

                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL

                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA

                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

for example, here I have to identify the residue that is opposite
Val 28 (which is in Sequence) in 2zhz:A (which, counting manually, is
Ile 5)....
Tyr4 -> has no residue opposite it because the alignment starts from
N23 of Sequence...
how can I enter a residue of my sequence and extract
the aligned residue from the other????
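
One way to picture the lookup, as a minimal sketch in plain Perl rather than a BioPerl recipe: walk both gapped rows column by column, count residues in each, and stop at the column where the first row reaches the residue number of interest. partner_residue() is a hypothetical helper name, and the start numbers (23 and 1) are assumptions taken from the /23-178 and /1-148 ranges in the MSF names, which may not match the PDB numbering.

```perl
use strict;
use warnings;

# partner_residue() is a hypothetical helper, not a BioPerl method.
# Arguments: gapped row A, number of its first residue, gapped row B,
# number of its first residue, and the residue number to look up in A.
# Both rows are assumed to have the same length (same alignment).
# Returns (letter, number) of the residue in B in the same column,
# ('-', undef) if B has a gap there, or an empty list if A has no
# residue with that number.
sub partner_residue {
    my ($row_a, $start_a, $row_b, $start_b, $resnum) = @_;
    my ($num_a, $num_b) = ($start_a - 1, $start_b - 1);
    for my $col (0 .. length($row_a) - 1) {
        my $res_a = substr($row_a, $col, 1);
        my $res_b = substr($row_b, $col, 1);
        my $a_is_residue = $res_a =~ /[A-Za-z]/;
        my $b_is_residue = $res_b =~ /[A-Za-z]/;
        $num_a++ if $a_is_residue;    # gaps do not advance the numbering
        $num_b++ if $b_is_residue;
        next unless $a_is_residue && $num_a == $resnum;
        return $b_is_residue ? ($res_b, $num_b) : ('-', undef);
    }
    return;
}

# First 30 columns of the alignment above, with the block spaces removed.
my $row_a = 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQ';   # Sequence, assumed to start at 23
my $row_b = 'DDARIAAIGDVDELNSQIGVL--LAEPLPD';   # 2zhz:A, assumed to start at 1

for my $resnum (27, 44) {
    my ($letter, $number) = partner_residue($row_a, 23, $row_b, 1, $resnum);
    print "Sequence $resnum -> 2zhz:A ",
          defined $number ? "$letter$number" : 'gap', "\n";
}
```

Under this numbering the valine in the fifth column counts as residue 27 of Sequence and pairs with Ile 5 of 2zhz:A; if the PDB file calls it 28, the start number passed in would need shifting by one. Within BioPerl, Bio::SimpleAlign's column_from_residue_number and Bio::LocatableSeq's location_from_column appear to cover the same ground, though their behaviour at gap columns is worth checking in the pod of the installed version.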


I wish you all the best, dear friends... I'm really in trouble with this...

Thanks for any suggestions

Paola


From neetisomaiya at gmail.com  Fri Sep  4 12:40:10 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Fri, 4 Sep 2009 18:10:10 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
Message-ID: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>

Hi,

Thanks for your reply. I saw this before and wanted to try it, but I
am unable to install the EUtilities module. When I search on CPAN,
the download option for this module gives me the entire bioperl
package. Can I not get a tar.gz file of this module alone, which I can
gunzip, untar and then build with make to install it? I don't want
to install the entire bioperl again as I am using an older version. Any
suggestions?

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Something like this?
>
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>
> chris
>


From cjfields at illinois.edu  Fri Sep  4 12:30:42 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:30:42 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>

Neeti,

Something like this?

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

chris



From cjfields at illinois.edu  Fri Sep  4 12:49:19 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 4 Sep 2009 07:49:19 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
Message-ID: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>

Neeti,

Sorry, it's a package deal (and Bio::DB::EUtilities relies on several  
other modules).  I am planning on spinning it out at some point into  
its own package, but for now the easiest way to install is via 1.6  
off CPAN or downloading the nightly build:

http://www.bioperl.org/DIST/nightly_builds/

chris



From pg4 at sanger.ac.uk  Thu Sep  3 08:01:26 2009
From: pg4 at sanger.ac.uk (Pablo Marin-Garcia)
Date: Thu, 3 Sep 2009 09:01:26 +0100 (BST)
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <203092FB050648AA9F256788068F0A16@NewLife>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org>
	<alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk>
	<203092FB050648AA9F256788068F0A16@NewLife>
Message-ID: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>

On Thu, 3 Sep 2009, Mark A. Jensen wrote:

> Hi Pablo and all-
> Try the latest revision (>=16081) with your debian/Emacs 21. Set
> the variable bioperl-module-path to the directory above the
> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
> again there. Tomorrow, MacOS
> cheers,
> Mark

Hello Mark,

after setting bioperl-module-path manually, your module works OK in 
Linux Emacs 21.4 with the latest revision.

About the PERL5LIB issue, sorry for not reporting the platform: the 
report was on Linux, not Mac OS X. In the wiki you have a comment about 
the Mac OS X separator:

[wiki] The problem Pablo was running into is definitely the Mac OS X path 
[wiki] separator issue.

Here I was referring to ':' as the 'path separator' for Linux multipath 
environment variables, not the system's directory separator [:/\].

Also from the wiki

[wiki] I think this is ok as it is, since bioperl-module-path is meant to 
[wiki] point to the directory above Bio

This is right. Probably my message was misleading. I wrongly appended 
'/Bio' to the path itself instead of to a temporary variable used only 
for testing with file-exists-p, which probably gave you the impression 
that the point was to have /Bio added to the path. Sorry about that.

Instead, my main point was about the line where you capture PERL5LIB:

[code] (if (setq pth (getenv "PERL5LIB"))

wouldn't this leave pth with a *string* like 
"lib/path1:lib/path2:lib/path3" on Linux?

Then, when you test:

[code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))

it would append '/Bio' to the end of the whole string 
'lib/path1:lib/path2:lib/path3', and this path obviously does not 
exist.

Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
lib/path, after first splitting the pth string on ':' in Linux/OS X or 
the equivalent in Windows?
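
To illustrate the split-then-test idea outside Emacs, here is a minimal sketch in Perl. It is not the bioperl-mode code, and bio_roots() is a hypothetical helper name. It splits a PERL5LIB-style string on the platform's path separator, taken from the core Config module, and keeps only the entries that contain a Bio subdirectory.

```perl
use strict;
use warnings;
use Config;        # core module; $Config{path_sep} is ':' on Unix, ';' on Windows
use File::Spec;    # core module; builds directory names portably

# bio_roots() is a hypothetical helper, not part of bioperl-mode or BioPerl.
# It takes a PERL5LIB-style string and an optional separator, and returns
# the entries that contain a Bio/ subdirectory.
sub bio_roots {
    my ($path_string, $sep) = @_;
    $sep = $Config{path_sep} unless defined $sep;
    return grep { -d File::Spec->catdir($_, 'Bio') }   # test each entry on its own
           grep { length }                             # skip empty entries
           split /\Q$sep\E/, $path_string;
}

my $perl5lib = defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '';
print "$_\n" for bio_roots($perl5lib);
```

Testing the unsplit string with '/Bio' appended can only succeed when PERL5LIB holds a single entry, which is why each component is checked separately here.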

Sorry about not being very clear in my first report.


    -Pablo


>> == bug when parsing perl5lib? ==
>> 
>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>> Bioperl paths from PERL5LIB this is not working for me in linux.
>> 
>> While debugging bioperl-init.el:
>> # (setq pth (getenv "PERL5LIB"))
>> # 
>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>> # nil
>> 
>> No file is found because it is looking for all the paths concatenated 
>> together with a '/Bio' at the end:
>>
>>   libpaht1:libpath2:libpath3/Bio
>> 
>> 'concat' adds /Bio to the pth that is a string with all the PERL5LIB paths. 
>> Should this concat rather be applied to the splited perl5lib by ':' in unix 
>> or ';' in windows and then tested for the existence of files?
>> 
>> for example in unix:
>> 
>> --- code --
>> (defun addbio (bio_path)
>>   "apend /Bio to each path"
>>   (concat bio_path "/" "Bio"))
>> 
>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>> -- end code ---
>> 
>> This would result in the list of T and F bioperl (and ensembl) paths
>> (t t nil t t t t t t nil nil nil ...)
>> 
>> 
>> Regards and thanks for the modules they would be very useful.
>>
>>    -Pablo
>> 
>
>


=====================================================================
                      Pablo Marin-Garcia, PhD

                     \\//          (Argiope bruennichi
                \/\/`(||>O:'\/\/   with stabilimentum)
                     //\\

Sanger Institute                |  PostDoc / Computer Biologist
Wellcome Trust Genome Campus    |  team : 128/108 (Human Genetics)
Hinxton, Cambridge CB10 1HH     |  room : N333
United Kingdom                  |  email: pablo.marin at sanger.ac.uk
====================================================================


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From paola.bisignano at gmail.com  Fri Sep  4 12:28:03 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Fri, 4 Sep 2009 14:28:03 +0200
Subject: [Bioperl-l] problem parsing msf file
Message-ID: <e9cf89740909040528j69e5f8e6ka9d550840a4e0f9a@mail.gmail.com>

I have a problem parsing an MSF file: I can't find the right
Bio::SimpleAlign method for my case.
I have a list of residues (numbered as in the PDB file) that I need to
locate in aligned sequences. I parse the alignment from a FASTA file
and save it as an MSF file; in that file I have to find each residue
from my list and the residue aligned with it in the other sequence.

Here is a piece of the file:

NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..

 Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
 Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00

//


                      1                                                   50
Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL


                      51                                                 100
Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL


                      101                                                150
Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA


                      151                                                200
Sequence/23-178       QQPDML
2zhz:A/1-148          GGADVL

For example, here I have to identify the residue in 2zhz:A that is
aligned with Val 28 of Sequence (counting manually, it is Ile 5).
Tyr 4 has no aligned residue, because the alignment starts from N23
of Sequence.
How can I give a residue of my sequence and extract the aligned
residue from the other one?


Best wishes to you all, dear friends; I'm really in trouble with this.
Thanks for any suggestions.

Paola
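
A minimal sketch of the lookup asked about above, in plain Perl with no
BioPerl dependency so that the gap arithmetic is visible. It is an
illustration, not a definitive method. column_of_residue and
residue_at_column are hypothetical helper names. The start numbers 23 and 1
are read from the sequence labels (Sequence/23-178, 2zhz:A/1-148), and only
the first 30 columns of the alignment are used. By those labels the Val in
column 5 is residue 27; the message calls it Val 28, so the PDB numbering
may need an extra offset, which is an assumption to check against the PDB
file.

```perl
use strict;
use warnings;

# Map a residue number in one aligned sequence to its alignment column
# (1-based). Gap characters do not consume a residue number.
# Returns undef (or an empty list) if the residue is not in the alignment.
sub column_of_residue {
    my ($aligned, $start, $resnum) = @_;
    my $n     = $start - 1;                  # number of the last residue seen
    my @chars = split //, $aligned;
    for my $col (1 .. @chars) {
        next if $chars[$col - 1] =~ /[-.]/;  # skip gaps
        $n++;
        return $col if $n == $resnum;
    }
    return;
}

# Report the residue (letter, number) found in a given column, or
# nothing if that column is a gap in this sequence.
sub residue_at_column {
    my ($aligned, $start, $col) = @_;
    my $char = substr($aligned, $col - 1, 1);
    return if $char =~ /[-.]/;
    my $before = substr($aligned, 0, $col - 1);
    my $gaps   = ($before =~ tr/-.//);       # count gaps to the left
    return ($char, $start + ($col - 1) - $gaps);
}

# First 30 columns of the MSF block quoted above, blanks removed.
my $query  = 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQ';   # Sequence/23-178
my $target = 'DDARIAAIGDVDELNSQIGVL--LAEPLPD';   # 2zhz:A/1-148

my $col = column_of_residue($query, 23, 27);      # the Val in column 5
my ($aa, $num) = residue_at_column($target, 1, $col);
print "column $col: $aa$num\n";                   # prints "column 5: I5"
```

A residue that lies before the start of the alignment (such as Tyr 4 here)
gives no column, and a column that is a gap in the other sequence gives no
residue, so both cases can be detected and reported.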


From jason at bioperl.org  Fri Sep  4 16:04:05 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 4 Sep 2009 09:04:05 -0700
Subject: [Bioperl-l] Fwd:  help parsing msf file or clustalW file reports
References: <369662.74237.qm@web25701.mail.ukl.yahoo.com>
Message-ID: <B5AEEBAD-22D3-40B6-AD06-17E268DFAFDD@bioperl.org>

Paola - it is important to continue to email the mailing list for your  
help.  I'm hoping another person on the list can help as I am swamped  
right now.
-jason

Begin forwarded message:

> From: Paola Bisignano <paola_bisignano at yahoo.it>
> Date: September 4, 2009 5:48:22 AM PDT
> To: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
>
> Hi Jason, thanks for your answer. For two days I have been
> re-studying the BioPerl synopses and object-oriented programming. I
> understand what you mean, but I still have some problems: I don't
> actually know how to start parsing this kind of file. I generated
> this MSF (or ClustalW) file by parsing a FASTA file of multiple
> paired sequences and writing out only the pairs I want, that is,
> homologous proteins that have the same ligand published in the PDB.
>
>
> --- On Tue, 1/9/09, Jason Stajich <jason at bioperl.org> wrote:
>
> From: Jason Stajich <jason at bioperl.org>
> Subject: Re: [Bioperl-l] help parsing msf file or clustalW file  
> reports
> To: "Paola Bisignano" <paola_bisignano at yahoo.it>
> Cc: bioperl-l at lists.open-bio.org
> Date: Tuesday, 1 September 2009, 17:49
>
> I think you might want to use the column_from_residue_number method  
> that is part of Bio::SimpleAlign - it lets you get the column from  
> an alignment based on the sequence residue, doing some math along  
> the way to deal with gaps. That is the residue -> alignment  
> direction.  If you are starting at the alignment and want to get the  
> residue's position you will use the location_from_column on a  
> particular sequence so
>
>     # select somehow a sequence from the alignment, e.g.
>     my $seq = $aln->get_seq_by_pos(1);
>     #$loc is undef or Bio::LocationI object
>     my $loc = $seq->location_from_column(5);
>
> -jason
>
> On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:
>
>> Hi,
>>
>> I'm trying to parse fasta files, where I have couple of  
>> alignments....I need to identify my residue in my alignment......I  
>> have separate lists that derived from ligplot parsing files.. so I  
>> have to manipulate string...but I don't now how to start..it seems  
>> complicated..
>> I used Bio::AlignIO to parse the fasta file, so I can have a parsed  
>> file in msf or clustalW forma
>>
>> here an example:
>> CLUSTAL W(1.81) multiple sequence alignment
>>
>>
>> Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
>> 2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
>>                        *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:
>>
>>
>> Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
>> 2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
>>                        ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*
>>
>> I  choose two residue for example...how can I extract  
>> them...starting from their position in the pdb file?
>> I need to walk...to my sequence
>>
>> I don't know if it is clear because I cannot explain the question  
>> correctly in english...are there any Italians?
>> could anyone help me?
>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Fri Sep  4 20:15:09 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Fri, 4 Sep 2009 16:15:09 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
Message-ID: <deaa866a0909041315y4282d811g3047ab153812014d@mail.gmail.com>

On 9/4/09, Emanuele Osimo <e.osimo at gmail.com> wrote:
> Try this:
> http://david.abcc.ncifcrf.gov/conversion.jsp
>

It may be just me, but I've tried this in both Firefox and Opera and
it will not work without Javascript enabled.  Most "intelligent" sites
now tell you that Javascript must be enabled if they require it to
work properly.  More intelligent sites (such as Google's gmail) allow
you to toggle back and forth between Javascript & non-Javascript
implementations.

Note that, IMO, running with Javascript enabled for all sites all the
time is a bad idea: potentially for security reasons, clearly for
sleep / suspend / power consumption reasons, and finally because you
have to ask whether you *really* trust that the Javascript, your DNS
provider, and the sites hosting the scripts are 100% secure.  The only
options that seem generally available at this time are to run Firefox
with NoScript, enabling Javascript for selected sites, or to run two
browser instances, one with Javascript enabled and one with it
disabled, and to use the Javascript-enabled browser only on sites with
a high probability of being secure.


From lsbrath at gmail.com  Fri Sep  4 22:12:34 2009
From: lsbrath at gmail.com (Mgavi Brathwaite)
Date: Fri, 4 Sep 2009 18:12:34 -0400
Subject: [Bioperl-l] bio:graphics
Message-ID: <69367b8f0909041512l77b2431aqb89f57f82adae1@mail.gmail.com>

Hello,

I need to grab features (source, gene, CDS, primer_bind) from a GenBank file
and add features (5' and 3' UTR, misc_feature) to generate an image. The
image has two tracks, each with multiple features. How
do I display different colors for the different features on the same track?
In my case the 5'UTR, CDS, and 3'UTR are on the same track. I want the UTRs to
have one color and the CDS another.

I also need to grab the start and end info from the primer_bind feature
based on the /note tag values, in my case 'HUF' and 'HDF'. Code:

if ( $feat->primary_tag eq 'primer_bind' && $feat->has_tag('note') ) {
    # a feature may carry several /note values, so test each one
    if ( grep { $_ eq 'HDF' } $feat->get_tag_values('note') ) {
        $pb_start = $feat->start;
        $pb_end   = $feat->end;
    }
}


I want to make sure that I am moving in the right direction.  Can someone
help me out?

M
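
On the color question above, one possible approach is sketched below under
stated assumptions. Bio::Graphics accepts a code reference in place of a
fixed option value, so a callback can pick a color per feature. The tag
names, the colors, the feature_color name, and the Local::MockFeature class
are all made up for illustration; the mock only stands in for a real
feature object so the sketch runs without Bio::Graphics installed.

```perl
use strict;
use warnings;

# Color lookup keyed on primary_tag. Tags and colors are illustrative
# assumptions, not values taken from the original post.
my %color_for = (
    "5'UTR" => 'yellow',
    "3'UTR" => 'yellow',
    CDS     => 'blue',
);

# Callback: receives the feature being drawn, returns a color name.
sub feature_color {
    my $feature = shift;
    return $color_for{ $feature->primary_tag } || 'black';
}

# With Bio::Graphics installed, the callback would be passed as, e.g.:
#   $panel->add_track(generic => \@features, -bgcolor => \&feature_color);

# Minimal stand-in for a feature object (hypothetical, for this sketch only).
package Local::MockFeature;
sub new         { my ($class, $tag) = @_; return bless { tag => $tag }, $class }
sub primary_tag { return $_[0]{tag} }

package main;
for my $tag ("5'UTR", 'CDS', "3'UTR", 'primer_bind') {
    my $f = Local::MockFeature->new($tag);
    printf "%-12s %s\n", $tag, feature_color($f);
}
```

Keeping the mapping in a hash means a new feature type needs only one more
entry, and anything unlisted falls back to a default color.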


From neetisomaiya at gmail.com  Sat Sep  5 04:52:11 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sat, 5 Sep 2009 10:22:11 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
Message-ID: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>

OK, so I reinstalled BioPerl and was able to run the EUtilities code
for my gene ID.
But I am facing two issues:

1) When I give multiple gene IDs, it still returns data for only the
first gene ID.

2) The script returns the entire entry, and I cannot figure out how to
fetch just the sequence, if possible in FASTA format.
I could not figure it out from the documentation.

Thanks.

-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 6:19 PM, Chris Fields<cjfields at illinois.edu> wrote:
> Neeti,
>
> Sorry, it's a package deal (and Bio::DB::EUtilities relies on several other
> modules).  I am planning on spinning it out at some point into it's own
> package, but for now the easiest way to install is via 1.6 off CPAN or
> downloading the nightly build:
>
> http://www.bioperl.org/DIST/nightly_builds/
>
> chris
>
> On Sep 4, 2009, at 7:40 AM, Neeti Somaiya wrote:
>
>> Hi,
>>
>> Thanks for your reply. I saw this before and wanted to try this, but I
>> am unable to install this module of EUtilities. When I search on CPAN,
>> it gives me the entire bioperl package in the download option of this
>> module. Can I not get a tar.gz file of this module alone, which I can
>> gzip, untar and then run the make and all to install it? I dont want
>> to install entire bioperl again as I am using an older version. Any
>> suggestions?
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Fri, Sep 4, 2009 at 6:00 PM, Chris Fields<cjfields at illinois.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Something like this?
>>>
>>>
>>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch
>>>
>>> chris
>>>
>>> On Sep 4, 2009, at 7:21 AM, Neeti Somaiya wrote:
>>>
>>>> Thanks. Its an interesting tool.
>>>>
>>>> But I want to do this programatically.
>>>>
>>>> I have gene ids to start with. Cant find a method to directly get
>>>> sequence with gene id as input. So using the method of getting
>>>> sequence with accession as input, for which I need to know accessions
>>>> for my gene ids first. Is this a right approach? Please guide me. My
>>>> main aim is to get the nucleotide sequence of a gene from ids entrez
>>>> gene id/gene name. PLease guide me. I am confused.
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>>>>
>>>>> Try this:
>>>>> http://david.abcc.ncifcrf.gov/conversion.jsp
>>>>>
>>>>> Emanuele
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Thanks for the replies.
>>>>>>
>>>>>> So the get seq by accession/GI worked for me. Now can anyone tell me
>>>>>> the easiest way to get the GI /Accession of a gene from the gene
>>>>>> id/gene name?
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for the link.
>>>>>>> So I need only the following lines of code to get the sequence?
>>>>>>>
>>>>>>> use Bio::DB::GenBank;
>>>>>>> $db_obj = Bio::DB::GenBank->new;
>>>>>>> $seq_obj = $db_obj->get_Seq_by_id(2);
>>>>>>>
>>>>>>> How do I print the sequence?
>>>>>>> $seq_obj->seq ??
>>>>>>>
>>>>>>> -Neeti
>>>>>>> Even my blood says, B positive
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Retrieving a sequence from a database : BioPerl HOWTO
>>>>>>>> http://bit.ly/RWIot
>>>>>>>>
>>>>>>>> Trust this helps,
>>>>>>>> Khader Shameer
>>>>>>>> NCBS - TIFR
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have an input list of gene names (can get gene ids from a local
>>>>>>>>> db
>>>>>>>>> if required).
>>>>>>>>> I need to fetch sequences of these genes. Can someone please guide
>>>>>>>>> me
>>>>>>>>> as to how this can be done using perl/bioperl?
>>>>>>>>>
>>>>>>>>> Any help will be deeply appreciated.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> -Neeti
>>>>>>>>> Even my blood says, B positive
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioperl-l mailing list
>>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From ybolo001 at student.ucr.edu  Sat Sep  5 07:37:58 2009
From: ybolo001 at student.ucr.edu (Eugene Bolotin)
Date: Sat, 5 Sep 2009 00:37:58 -0700
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <941fcc750909050037n3c0f4fc5u89fcf4f5c3e5f34d@mail.gmail.com>

Ok,
this is what I would do.
Download the database of gene names and sequences in FASTA format.
Then loop through it with BioPerl.
Store your gene names in a hash and match them against
$seq->display_id for each sequence; that should match them up with
the gene IDs.
$seq->seq gives you the sequence.
Print out the ones that match.
Good luck.
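
The steps above can be sketched in plain Perl as follows. This is a minimal
illustration, not a definitive implementation: read_fasta and
select_by_name are hypothetical helper names, the gene names and sequences
are made-up data, and matching is an exact hash lookup on the first word of
each header line. With BioPerl, Bio::SeqIO would take the place of the
hand-written parser.

```perl
use strict;
use warnings;

# Parse FASTA text into a list of [id, description, sequence] records.
sub read_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\n/, $text) {
        next unless length $line;
        if ($line =~ /^>(\S+)\s*(.*)/) {
            push @records, [$1, $2, ''];     # new record starts at each header
        }
        elsif (@records) {
            $line =~ s/\s+//g;
            $records[-1][2] .= $line;        # append to the current sequence
        }
    }
    return @records;
}

# Keep only the records whose ID is in the wanted list.
sub select_by_name {
    my ($records, @wanted) = @_;
    my %want = map { $_ => 1 } @wanted;
    return grep { $want{ $_->[0] } } @$records;
}

# Made-up example data; real input would be a downloaded FASTA file.
my $fasta = <<'END';
>geneA example one
ATGGCC
TTAA
>geneB example two
ATGCCC
>geneC example three
ATGTTT
END

my @records = read_fasta($fasta);
for my $rec (select_by_name(\@records, qw(geneA geneC))) {
    print ">$rec->[0]\n$rec->[2]\n";
}
```

A hash lookup avoids re-scanning the list of names for every sequence,
which matters when the downloaded file is large.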

On Thu, Sep 3, 2009 at 11:49 PM, Neeti Somaiya<neetisomaiya at gmail.com> wrote:
> Hi,
>
> I have an input list of gene names (can get gene ids from a local db
> if required).
> I need to fetch sequences of these genes. Can someone please guide me
> as to how this can be done using perl/bioperl?
>
> Any help will be deeply appreciated.
>
> Thanks.
>
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
Eugene Bolotin
Ph.D. candidate
Genetics Genomics and Bioinformatics
University of California Riverside
ybolo001 at student.ucr.edu
Dr. Frances Sladek Lab


From maj at fortinbras.us  Sat Sep  5 12:53:12 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 5 Sep 2009 08:53:12 -0400
Subject: [Bioperl-l] bioperl invades emacs -- bug report?
In-Reply-To: <alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
References: <mailman.25.1251907209.22450.bioperl-l@lists.open-bio.org><alpine.DEB.1.10.0909022007510.16229@deskpro17122.dynamic.sanger.ac.uk><203092FB050648AA9F256788068F0A16@NewLife>
	<alpine.DEB.1.10.0909030814320.16229@deskpro17122.dynamic.sanger.ac.uk>
Message-ID: <E63F6D209AF1432C9B9CAFF6F6182F9C@NewLife>

Hi Pablo-- You're right about the PERL5LIB issue; I had
not set up the module path to handle multiple paths as you
describe. I am working hard on an implementation that can
handle multiple paths; I hope to have it out next week --cheers MAJ
----- Original Message ----- 
From: "Pablo Marin-Garcia" <pg4 at sanger.ac.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 03, 2009 4:01 AM
Subject: Re: [Bioperl-l] bioperl invades emacs -- bug report?


> On Thu, 3 Sep 2009, Mark A. Jensen wrote:
>
>> Hi Pablo and all-
>> Try the latest revision (>=16081) with your debian/Emacs 21. Set
>> the variable bioperl-module-path to the directory above the
>> Bio directory (same idea as ' use lib "./bioperl-live"; ' ), and try
>> again there. Tomorrow, MacOS
>> cheers,
>> Mark
>
> Hello Mark,
>
> after setting bioperl-module-path manually, your module works ok in linux 
> emacs 21.4 with latest revision.
>
> About the perl5lib issue, sorry about not reporting the platform: the report 
> was on linux not in mac os X. In the wiki you have a comment about mac OS X 
> separator:
>
> [wiki] The problem Pablo was running into is definitely the Mac OS X path 
> [wiki] separator issue.
>
> Here I was referring to ':' as the 'path separator' for Linux multipath 
> environment variables, not the system's directory separator [:/\].
>
> Also from the wiki
>
> [wiki] I think this is ok as it is, since bioperl-module-path is meant to 
> [wiki] point to the directory above Bio
>
> This is right. Probably my message was misleading. I wrongly appended '/Bio' 
> to the path instead of to a temp variable for testing with file-exists-p, 
> which probably gave you the impression that the point was to have the /Bio 
> added to the path. Sorry about that.
>
> Instead my main point was about the line where you capture PERL5LIB:
>
> [code] (if (setq pth (getenv "PERL5LIB"))
>
> wouldn't this leave pth with a *string* like "lib/path1:lib/path2:lib/path3" 
> in linux?
>
> Then, when you test:
>
> [code] (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))))
>
> it would append '/Bio' at the end of the whole string 
> 'lib/path1:lib/path2:lib/path3', and this string path obviously does not 
> exist.
>
> Am I missing something? Shouldn't the 'concat /Bio' be applied to *each* 
> lib/path, after first splitting the pth string on ':' (Linux/OS X) or the 
> equivalent on Windows?
>
> Sorry about not being very clear in my first report.
>
>
>    -Pablo
>
>
>
>>> == bug when parsing perl5lib? ==
>>>
>>> Please correct me if I am wrong but in bioperl-init.el when extracting the 
>>> Bioperl paths from PERL5LIB this is not working for me in linux.
>>>
>>> While debugging bioperl-init.el:
>>> # (setq pth (getenv "PERL5LIB"))
>>> # 
>>> "/nfs/home/pmg/ensembl-api/ensembl-compara/modules:...:/nfs/home/pmg/bioperl-live:..."
>>> # (setq pth (if (file-exists-p (concat pth "/" "Bio")) pth nil))
>>> # nil
>>>
>>> No file is found because it is looking for all the paths concatenated 
>>> together with a '/Bio' at the end:
>>>
>>>   libpath1:libpath2:libpath3/Bio
>>>
>>> 'concat' adds /Bio to pth, which is a single string holding all the PERL5LIB 
>>> paths. Should this concat rather be applied to PERL5LIB after splitting it on 
>>> ':' in unix or ';' in windows, and each result then tested for existence?
>>>
>>> for example in unix:
>>>
>>> --- code --
>>> (defun addbio (bio_path)
>>>   "append /Bio to each path"
>>>   (concat bio_path "/" "Bio"))
>>>
>>> (mapcar 'file-exists-p (mapcar 'addbio (split-string pth ":")))
>>> -- end code ---
>>>
>>> This would result in the list of T and F bioperl (and ensembl) paths
>>> (t t nil t t t t t t nil nil nil ...)
>>>
>>>
>>> Regards and thanks for the modules they would be very useful.
>>>
>>>    -Pablo
>>
>>
>


From cjfields at illinois.edu  Sat Sep  5 13:44:54 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 5 Sep 2009 08:44:54 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
Message-ID: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>

On Sep 4, 2009, at 11:52 PM, Neeti Somaiya wrote:

> Ok, so I reinstalled bioperl and was able to run the EUtilities code
> for my gene id.
> But I am facing two issues :-
>
> 1) When I give multiple gene ids, it still returns data of only the
> first gene id

This sounds like it's not iterating correctly.  You'll need to post  
your version of the script.

> 2) The script returns the entire entry, and I am not able to figure
> out how to just fetch the sequence, and if possible, in FASTA format.
> I could not figure it out from the documentation.

I recall this working last time I used it (I think June or July).   
Could you post the script you are using?

(realize this is a holiday weekend in the states, so you might have a  
delayed response from me or others)

> Thanks.
>
> -Neeti
> Even my blood says, B positive

chris


From neetisomaiya at gmail.com  Sun Sep  6 16:15:09 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Sun, 6 Sep 2009 21:45:09 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
	<8CCCFE4D-84A4-47A4-A627-ADC6C0329686@illinois.edu>
	<764978cf0909040540n531ea4d3o42f28a7e1578ad82@mail.gmail.com>
	<4D83853D-90C3-4048-AFAB-FF6E2402C7AA@illinois.edu>
	<764978cf0909042152v2ae26ee5q6c668c498ead605e@mail.gmail.com>
	<218A1F91-F492-43E6-814D-A31546E0FEB1@illinois.edu>
Message-ID: <764978cf0909060915t7a2e6e45v4bb194b9cad18e18@mail.gmail.com>

Hi,

Thanks for the reply.

I am using the script exactly as it is given here :

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#esummary_-.3E_efetch

-Neeti
Even my blood says, B positive




From Russell.Smithies at agresearch.co.nz  Sun Sep  6 23:00:24 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 7 Sep 2009 11:00:24 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<42006.192.168.1.1.1252051273.squirrel@mail.ncbs.res.in>
	<764978cf0909040217k7f4382d6pcec65f8adfd66164@mail.gmail.com>
	<764978cf0909040313r48ff5660x3210324dc14bf966@mail.gmail.com>
	<2ac05d0f0909040505w75793434ifc25237edbabad5a@mail.gmail.com>
	<764978cf0909040521p7aa6af97hffc0c08381a04222@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C50D0@exchsth.agresearch.co.nz>

Grab the gene2accession list from here and do lookups.
Probably the fastest and easiest way.
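
A rough sketch of what such a lookup could look like (this is not code
from the thread). It assumes the gene2accession file is tab-delimited
and starts with a '#'-prefixed header line carrying the column names
used below; the sample rows and accessions are made up for
illustration, and a real run would open the downloaded file instead.

```perl
use strict;
use warnings;

# Build a GeneID -> list of genomic locations from gene2accession-style rows.
# Column positions are read from the header line, so only the column names
# (assumed to match the file's header) are hard-coded.
sub index_gene2accession {
    my ($fh) = @_;
    my (%col, %by_gene);
    while (my $line = <$fh>) {
        chomp $line;
        my @f = split /\t/, $line;
        if ($line =~ /^#/) {                  # header line starts with '#'
            $f[0] =~ s/^#\s*//;
            @col{@f} = (0 .. $#f);
            next;
        }
        my $acc = $f[ $col{'genomic_nucleotide_accession.version'} ];
        next if !defined $acc || $acc eq '-'; # '-' marks an empty field
        push @{ $by_gene{ $f[ $col{'GeneID'} ] } }, {
            acc   => $acc,
            start => $f[ $col{'start_position_on_the_genomic_accession'} ],
            end   => $f[ $col{'end_position_on_the_genomic_accession'} ],
        };
    }
    return \%by_gene;
}

# Hypothetical sample data held in memory.
my $sample = join "\n",
    join("\t", '#tax_id', 'GeneID',
               'genomic_nucleotide_accession.version',
               'start_position_on_the_genomic_accession',
               'end_position_on_the_genomic_accession'),
    join("\t", 9606, 2, 'NC_000000.1', 100, 200),
    join("\t", 9606, 2, '-', '-', '-'),
    join("\t", 9606, 3, 'NC_000000.2', 5, 50);

open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $index = index_gene2accession($fh);
close $fh;

# Print every genomic location recorded for gene id 2.
for my $hit (@{ $index->{2} || [] }) {
    print join("\t", 2, @{$hit}{qw(acc start end)}), "\n";
}
```

The accession and coordinates found this way could then be passed to
whatever sequence fetcher is already working for you.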


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz

Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T  +64 3 489 3809
F  +64 3 489 9174
www.agresearch.co.nz 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> Sent: Saturday, 5 September 2009 12:21 a.m.
> To: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> Thanks. Its an interesting tool.
> 
> But I want to do this programatically.
> 
> I have gene ids to start with. Cant find a method to directly get
> sequence with gene id as input. So using the method of getting
> sequence with accession as input, for which I need to know accessions
> for my gene ids first. Is this a right approach? Please guide me. My
> main aim is to get the nucleotide sequence of a gene from ids entrez
> gene id/gene name. PLease guide me. I am confused.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Fri, Sep 4, 2009 at 5:35 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> > Try this:
> > http://david.abcc.ncifcrf.gov/conversion.jsp
> >
> > Emanuele
> >
> >
> > On Fri, Sep 4, 2009 at 12:13, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >>
> >> Thanks for the replies.
> >>
> >> So the get seq by accession/GI worked for me. Now can anyone tell me
> >> the easiest way to get the GI /Accession of a gene from the gene
> >> id/gene name?
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 2:47 PM, Neeti Somaiya<neetisomaiya at gmail.com>
> >> wrote:
> >> > Thanks for the link.
> >> > So I need only the following lines of code to get the sequence?
> >> >
> >> > use Bio::DB::GenBank;
> >> > $db_obj = Bio::DB::GenBank->new;
> >> > $seq_obj = $db_obj->get_Seq_by_id(2);
> >> >
> >> > How do I print the sequence?
> >> > $seq_obj->seq ??
> >> >
> >> > -Neeti
> >> > Even my blood says, B positive
> >> >
> >> >
> >> >
> >> > On Fri, Sep 4, 2009 at 1:31 PM, K. Shameer<shameer at ncbs.res.in> wrote:
> >> >>
> >> >> Retrieving a sequence from a database : BioPerl HOWTO
> >> >> http://bit.ly/RWIot
> >> >>
> >> >> Trust this helps,
> >> >> Khader Shameer
> >> >> NCBS - TIFR
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I have an input list of gene names (can get gene ids from a local db
> >> >>> if required).
> >> >>> I need to fetch sequences of these genes. Can someone please guide me
> >> >>> as to how this can be done using perl/bioperl?
> >> >>>
> >> >>> Any help will be deeply appreciated.
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> -Neeti
> >> >>> Even my blood says, B positive
> >> >>> _______________________________________________
> >> >>> Bioperl-l mailing list
> >> >>> Bioperl-l at lists.open-bio.org
> >> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From bnbowman at gmail.com  Mon Sep  7 08:17:25 2009
From: bnbowman at gmail.com (Brett Bowman)
Date: Mon, 7 Sep 2009 01:17:25 -0700
Subject: [Bioperl-l] Protein Sequence QSARs
Message-ID: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>

I've been working on a script, for my own edification, that annotates
protein sequences for QSARs as described in the paper below, because I
didn't see anything in BioPerl that does it.  Essentially, it converts a
protein sequence of length N into a 3-by-N numerical matrix by
substitution, and then calculates the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full-blown module, but I wanted to ask A) whether it has been done before
and I just missed it, and B) whether anyone other than me would find such
a module useful.
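
To make the description concrete, here is a minimal sketch of that kind
of transform (not Brett's script). The three-value descriptors are
placeholder numbers for a toy alphabet, not the published scales, and
the normalisation by (N - lag) is one common form that I am assuming
here; no centering or scaling is applied.

```perl
use strict;
use warnings;

# Toy 3-value descriptors per residue. These numbers are placeholders for
# illustration only, NOT the published scales from the paper.
my %DESC = (
    A => [ 1, 0, -1 ],
    G => [ 0, 2,  1 ],
    K => [ 2, 1,  0 ],
);

# Substitute each residue by its descriptor triple: N rows of 3 values.
sub encode {
    my ($seq) = @_;
    my @rows;
    for my $res (split //, uc $seq) {
        my $d = $DESC{$res} or die "no descriptor for '$res'\n";
        push @rows, $d;
    }
    return \@rows;
}

# Auto- (j == k) and cross- (j != k) terms for a single lag:
#   acc[j][k] = sum over i of z[i][j] * z[i+lag][k], divided by (N - lag)
sub acc_for_lag {
    my ($rows, $lag) = @_;
    my $n = @$rows;
    die "lag must be smaller than the sequence length\n" if $lag >= $n;
    my @acc;
    for my $j (0 .. 2) {
        for my $k (0 .. 2) {
            my $sum = 0;
            for my $i (0 .. $n - $lag - 1) {
                $sum += $rows->[$i][$j] * $rows->[ $i + $lag ][$k];
            }
            $acc[$j][$k] = $sum / ($n - $lag);
        }
    }
    return \@acc;
}

# 3x3 matrix of terms for the toy sequence at lag 1.
my $acc = acc_for_lag(encode('AGKA'), 1);
for my $row (@$acc) {
    print join(' ', map { sprintf '%.3f', $_ } @$row), "\n";
}
```

Repeating acc_for_lag for lags 1..L and concatenating the results would
give a fixed-length vector per sequence regardless of N.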

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics


From neetisomaiya at gmail.com  Mon Sep  7 10:04:06 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 7 Sep 2009 15:34:06 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
Message-ID: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>

I tried using EntrezGene instead of GenBank, as is given in the link
that you sent :

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html

use Bio::DB::EntrezGene;

    my $db = Bio::DB::EntrezGene->new;

    my $seq = $db->get_Seq_by_id(2); # Gene id

    # or ...

    my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
    while ( my $seq = $seqio->next_seq ) {
        print "id is ", $seq->display_id, "\n";
    }

This doesn't seem to work.


-Neeti
Even my blood says, B positive


On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> Hello,
> have you tried this?
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>
> Emanuele
>
> On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> I have an input list of gene names (can get gene ids from a local db
>> if required).
>> I need to fetch sequences of these genes. Can someone please guide me
>> as to how this can be done using perl/bioperl?
>>
>> Any help will be deeply appreciated.
>>
>> Thanks.
>>
>> -Neeti
>> Even my blood says, B positive
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From Russell.Smithies at agresearch.co.nz  Mon Sep  7 20:26:04 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 08:26:04 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>

This example code from the wiki _definitely_ works:
http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
=========================================

use strict;
use warnings;
use Bio::DB::EntrezGene;

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
$db->verbose(1); # print debugging messages

my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig, $from, $to) = $ann->url =~
            /contig=([^&]+).+from=(\d+)&to=(\d+)/
            or next; # skip links that carry no coordinates
        print "$contig\t$from\t$to\n";
    }
}

======================================
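
For anyone unsure what the pattern match in that loop is doing, this
small stand-alone sketch applies the same regex to a URL string. The
URL and the helper name parse_coords are made up for illustration; real
dblink URLs may look different.

```perl
use strict;
use warnings;

# Pull contig, from and to out of a URL query string using the same
# pattern as the wiki script. Returns an empty list when nothing matches.
sub parse_coords {
    my ($url) = @_;
    my ($contig, $from, $to) =
        $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/
        or return;
    return ($contig, $from, $to);
}

# Hypothetical URL in the shape the pattern expects.
my $url = 'http://example.org/viewer?contig=NT_000000.1&from=1200&to=3400&x=1';

if (my @coords = parse_coords($url)) {
    print join("\t", @coords), "\n";
}
else {
    warn "URL did not match the expected pattern: $url\n";
}
```

Checking that the match succeeded before using the captures avoids
printing undefined values when a link has a different layout.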

So if it doesn't work for you, there are a few things you need to check:
* what version of BioPerl are you using?
* are you behind a firewall?
* are you using a proxy?
* do you need to submit a username/password for either of the 2 above?
* turn on 'verbose' messages; it may help you debug


If you're still having problems, get back to me and I'll see if I can help.

--Russell




From cjfields at illinois.edu  Mon Sep  7 20:56:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 15:56:03 -0500
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
Message-ID: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>

All,

I have updated the Changes file in bioperl-live in preparation for
1.6.1.  The initial release will be an alpha, 1.6.0_1 (probably
landing about mid-week); depending on CPAN test results, etc., the
final 1.6.1 release should follow next week.  I'll start merging
changes over from trunk tonight, fixing last-minute bugs, etc.  I'm
doing my work with perl 5.10.1 (64-bit) on Mac and will likely also
run the tests remotely on our local linux cluster.  Windows test
reports are gladly welcome (this should work on Strawberry Perl now).

I highly suggest that Mark, Jason, and any others (Lincoln, Scott,
Chase, Robert Buels, Jay Hannah, Heikki, Sendu come to mind) look over
the file and update it.  There are a few weak spots in there where I
didn't make the code changes or additions myself, or where a particular
bug was fixed but not mentioned.  In particular:

1) Google Summer of Code work from Chase (Mark, Chase)
2) GMOD-related fixes (Lincoln, Scott)
3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
4) Tiling, Restriction refactors (Mark)

Also, please make changes to AUTHORS, etc as needed.

Thanks!

chris


From maj at fortinbras.us  Mon Sep  7 21:21:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 7 Sep 2009 17:21:04 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <29B3F9DC91A1422A89629790DD8CC313@NewLife>

aye-aye skipper--- 


From cjfields at illinois.edu  Tue Sep  8 04:23:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:23:26 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
Message-ID: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>

All,

I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
installed by default), and the latest from Bio::Phylo svn:

https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl

At this moment only the Bio::Phylo code from svn is working with  
BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
1.7_RC9 has some kind of versioning issue (again, all tests fail).   
The problem: CPAN will always install 1.6 (the others are RC, so they  
won't be installed unless the full path is used).  Even so, nothing on  
CPAN even works; one must use the latest Bio::Phylo SVN code.

ATM I'm just not seeing how this can be released with 1.6.1 right now,  
unless one of the following occurs:

1) Rutger V. drops a quick non-RC release to CPAN,
2) check for the minimal working Bio::Phylo version and safely skip  
any Nexml-related tests unless proper version is present (not easy  
with a $VERSION like '1.7_RC9'),
3) push Nexml into its own distribution (something we were planning
on anyway with a number of modules)

As for #3 above, I think it probably belongs in a larger bioperl-phylo  
as Mark had previously proposed.  I'm open to just about any solution.
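
As a sketch of what option 2 might involve, the following compares
version strings of the form quoted above ('1.6', '1.7_RC9') without
treating them as plain numbers. The helper names and the threshold in
the comment are hypothetical, and other version formats are
deliberately rejected rather than guessed at.

```perl
use strict;
use warnings;

# Split '1.6' or '1.7_RC9' into [major, minor, rc]. A plain release gets a
# very large rc value so that 1.7 sorts after every 1.7_RCn.
sub parse_version {
    my ($v) = @_;
    my ($major, $minor, $rc) = $v =~ /^(\d+)\.(\d+)(?:_RC(\d+))?$/
        or return;
    return [ $major, $minor, defined $rc ? $rc : 1e9 ];
}

# True if $have is the same as or newer than $want; false if either
# string cannot be parsed.
sub version_at_least {
    my $have = parse_version($_[0]);
    my $want = parse_version($_[1]);
    return 0 unless $have && $want;
    for my $i (0 .. 2) {
        next if $have->[$i] == $want->[$i];
        return $have->[$i] > $want->[$i] ? 1 : 0;
    }
    return 1;    # identical versions
}

# In a test file one might then write something like (threshold is a
# placeholder, not a known-good version):
#   plan skip_all => 'Bio::Phylo too old'
#       unless version_at_least($Bio::Phylo::VERSION, '1.7_RC9');

for my $have ('1.6', '1.7_RC9', '1.7_RC10', '1.7') {
    printf "%-9s >= 1.7_RC9 ? %s\n", $have,
        version_at_least($have, '1.7_RC9') ? 'yes' : 'no';
}
```

Returning false for unparseable strings means an unexpected version
format leads to skipped tests rather than failing ones.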

chris


From neetisomaiya at gmail.com  Tue Sep  8 04:27:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 8 Sep 2009 09:57:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>

I actually want the nucleotide sequence of the gene. I thought
Bio::DB::EntrezGene would give me a seq_obj for an Entrez gene id, and
that calling $seq_obj->seq() would then give me the actual genomic
nucleotide sequence of the gene. But this doesn't happen. I am able to
print the gene symbol using $seq_obj->display_id and to do other
things, but what I wanted was the gene's nucleotide sequence.

-Neeti
Even my blood says, B positive




From Russell.Smithies at agresearch.co.nz  Tue Sep  8 04:41:47 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 8 Sep 2009 16:41:47 +1200
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>

That bit of code gave you the accession, start and end for the sequence, so you just needed to download it.
Bio::DB::EUtilities can do that for you.

Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences ?


--Russell

==================
#!perl

use strict;
use warnings;
use Bio::DB::EntrezGene;
use Bio::DB::EUtilities;

no warnings 'deprecated';

my $id = shift or die "Id?\n"; # use a Gene id

my $db = Bio::DB::EntrezGene->new;
#$db->verbose(1);
my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($acc, $from, $to) = $ann->url =~
            /contig=([^&]+).+from=(\d+)&to=(\d+)/
            or next; # no coordinates in this link
        print "$acc\t$from\t$to\n";

        # retrieve just that region of the sequence in FASTA format
        my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                               -db      => 'nucleotide',
                                               -rettype => 'fasta');
        $fetcher->set_parameters(-id        => $acc,
                                 -seq_start => $from,
                                 -seq_stop  => $to,
                                 -strand    => 1);
        my $fasta = $fetcher->get_Response->content;
        print $fasta;
    }
}

======================
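
For reference, the efetch call in the script roughly corresponds to an
HTTP request of the shape built below. This sketch only constructs the
URL and does not contact NCBI; the base URL is the current public
E-utilities address as far as I know, and the accession and helper name
efetch_url are placeholders.

```perl
use strict;
use warnings;

# Build (but do not send) an efetch request URL for a sub-range of a record.
sub efetch_url {
    my (%p) = @_;
    my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
    my @pairs;
    for my $key (qw(db id rettype seq_start seq_stop strand)) {
        next unless defined $p{$key};
        my $val = $p{$key};
        # percent-encode anything outside the unreserved URL characters
        $val =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
        push @pairs, "$key=$val";
    }
    return $base . '?' . join '&', @pairs;
}

# Hypothetical accession and coordinates.
print efetch_url(
    db        => 'nucleotide',
    id        => 'NT_000000.1',
    rettype   => 'fasta',
    seq_start => 1200,
    seq_stop  => 3400,
    strand    => 1,
), "\n";
```

Seeing the request spelled out can help when debugging proxy or
firewall problems, since the same URL can be tried in a browser.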

> -----Original Message-----
> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
> Sent: Tuesday, 8 September 2009 4:28 p.m.
> To: Smithies, Russell
> Cc: Emanuele Osimo; bioperl-l
> Subject: Re: [Bioperl-l] need help urgently
> 
> I actually want the nucleotide sequence of the gene. I thought the
> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
> then the seq method on that $seq_obj->seq() will give me the actual
> genomic nucleotide sequence of the gene. But this doesnt happen. I am
> able to print gene symbol using $seq_obj->display_id and able to do
> other things, but I wanted the gene nucleotide sequence.
> 
> -Neeti
> Even my blood says, B positive
> 
> 
> 
> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
> Russell<Russell.Smithies at agresearch.co.nz> wrote:
> > This example code from the wiki _definitely_ works:
> >
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::Entrez
> Gene_to_get_genomic_coordinates
> > =========================================
> >
> > use strict;
> > use Bio::DB::EntrezGene;
> >
> > my $id = shift or die "Id?\n"; # use a Gene id
> >
> > my $db = new Bio::DB::EntrezGene;
> > $db->verbose(1); ###
> >
> > my $seq = $db->get_Seq_by_id($id);
> >
> > my $ac = $seq->annotation;
> >
> > for my $ann ($ac->get_Annotations('dblink')) {
> >        if ($ann->database eq "Evidence Viewer") {
> >                # get the sequence identifier, the start, and the stop
> >                my ($contig,$from,$to) = $ann->url =~
> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
> >                print "$contig\t$from\t$to\n";
> >        }
> > }
> >
> > ======================================
> >
> > So if it doesn't work for you, there are a few things you need to check:
> > * what version of BioPerl are you using?
> > * are you behind a firewall?
> > * are you using a proxy?
> > * do you need to submit username/password for either of the 2 above
> > * turn on 'verbose' messages, it may help you debug
> >
> >
> > If you're still having problems, get back to me and I'll see if I can help.
> >
> > --Russell
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
> >> Sent: Monday, 7 September 2009 10:04 p.m.
> >> To: Emanuele Osimo; bioperl-l
> >> Subject: Re: [Bioperl-l] need help urgently
> >>
> >> I tried using EntrezGene instead of GenBank, as is given in the link
> >> that you sent :
> >>
> >>
> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_datab
> >> ase
> >>
> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-
> >> live/Bio/DB/EntrezGene.html
> >>
> >> use Bio::DB::EntrezGene;
> >>
> >>     my $db = Bio::DB::EntrezGene->new;
> >>
> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
> >>
> >>     # or ...
> >>
> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
> >>     while ( my $seq = $seqio->next_seq ) {
> >>           print "id is ", $seq->display_id, "\n";
> >>     }
> >>
> >> This doesnt seem to work.
> >>
> >>
> >> -Neeti
> >> Even my blood says, B positive
> >>
> >>
> >>
> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
> >> > Hello,
> >> > have you tried this?
> >> >
> >>
> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBan
> >> k_when_you_have_genomic_coordinates
> >> >
> >> > Emanuele
> >> >
> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I have an input list of gene names (can get gene ids from a local db
> >> >> if required).
> >> >> I need to fetch sequences of these genes. Can someone please guide me
> >> >> as to how this can be done using perl/bioperl?
> >> >>
> >> >> Any help will be deeply appreciated.
> >> >>
> >> >> Thanks.
> >> >>
> >> >> -Neeti
> >> >> Even my blood says, B positive
> >> >> _______________________________________________
> >> >> Bioperl-l mailing list
> >> >> Bioperl-l at lists.open-bio.org
> >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >
> >> >
> > =======================================================================
> > Attention: The information contained in this message and/or attachments
> > from AgResearch Limited is intended only for the persons or entities
> > to which it is addressed and may contain confidential and/or privileged
> > material. Any review, retransmission, dissemination or other use of, or
> > taking of any action in reliance upon, this information by persons or
> > entities other than the intended recipients is prohibited by AgResearch
> > Limited. If you have received this message in error, please notify the
> > sender immediately.
> > =======================================================================
> >


From cjfields at illinois.edu  Tue Sep  8 04:50:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 7 Sep 2009 23:50:01 -0500
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <76A4757A-80C5-400E-8D3B-C68E968FF581@illinois.edu>

Russell,

Any reason you're using "no warnings 'deprecated'" there?  The  
pseudohash warnings should no longer be showing up with EntrezGene  
stuff.  Or is it something else?

chris

On Sep 7, 2009, at 11:41 PM, Smithies, Russell wrote:

> That bit of code gave you the accession, start and end for the  
> sequence so you just needed to download it.
> Bio::DB::EUtilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>     if ($ann->database eq "Evidence Viewer") {
>         # get the sequence identifier, the start, and the stop
>         my ($acc,$from,$to) = $ann->url =~
>           /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>         print "$acc\t$from\t$to\n";
>
>         # retrieve the sequence
>         my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>                                                -db      => 'nucleotide',
>                                                -rettype => 'fasta');
>         $fetcher->set_parameters(-id        => $acc,
>                                  -seq_start => $from,
>                                  -seq_stop  => $to,
>                                  -strand    => 1);
>         my $seq = $fetcher->get_Response->content;
>         print $seq;
>
>     }
> }
>
> ======================
>
>> -----Original Message-----
>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>> To: Smithies, Russell
>> Cc: Emanuele Osimo; bioperl-l
>> Subject: Re: [Bioperl-l] need help urgently
>>
>> I actually want the nucleotide sequence of the gene. I thought the
>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>> then the seq method on that $seq_obj->seq() will give me the actual
>> genomic nucleotide sequence of the gene. But this doesn't happen. I am
>> able to print gene symbol using $seq_obj->display_id and able to do
>> other things, but I wanted the gene nucleotide sequence.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>>> This example code from the wiki _definitely_ works:
>>>
>>> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> =========================================
>>>
>>> use strict;
>>> use Bio::DB::EntrezGene;
>>>
>>> my $id = shift or die "Id?\n"; # use a Gene id
>>>
>>> my $db = new Bio::DB::EntrezGene;
>>> $db->verbose(1); ###
>>>
>>> my $seq = $db->get_Seq_by_id($id);
>>>
>>> my $ac = $seq->annotation;
>>>
>>> for my $ann ($ac->get_Annotations('dblink')) {
>>>       if ($ann->database eq "Evidence Viewer") {
>>>               # get the sequence identifier, the start, and the stop
>>>               my ($contig,$from,$to) = $ann->url =~
>>>                 /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>>               print "$contig\t$from\t$to\n";
>>>       }
>>> }
>>>
>>> ======================================
>>>
>>> So if it doesn't work for you, there are a few things you need to  
>>> check:
>>> * what version of BioPerl are you using?
>>> * are you behind a firewall?
>>> * are you using a proxy?
>>> * do you need to submit username/password for either of the 2 above
>>> * turn on 'verbose' messages, it may help you debug
>>>
>>>
>>> If you're still having problems, get back to me and I'll see if I  
>>> can help.
>>>
>>> --Russell
>>>
>>>
>>>> -----Original Message-----
>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>>> bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>>>> Sent: Monday, 7 September 2009 10:04 p.m.
>>>> To: Emanuele Osimo; bioperl-l
>>>> Subject: Re: [Bioperl-l] need help urgently
>>>>
>>>> I tried using EntrezGene instead of GenBank, as is given in the  
>>>> link
>>>> that you sent :
>>>>
>>>>
>>>> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>>>>
>>>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>>>>
>>>> use Bio::DB::EntrezGene;
>>>>
>>>>    my $db = Bio::DB::EntrezGene->new;
>>>>
>>>>    my $seq = $db->get_Seq_by_id(2); # Gene id
>>>>
>>>>    # or ...
>>>>
>>>>    my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>>>>    while ( my $seq = $seqio->next_seq ) {
>>>>          print "id is ", $seq->display_id, "\n";
>>>>    }
>>>>
>>>> This doesnt seem to work.
>>>>
>>>>
>>>> -Neeti
>>>> Even my blood says, B positive
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com>  
>>>> wrote:
>>>>> Hello,
>>>>> have you tried this?
>>>>>
>>>>
>>>>> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>>>>>
>>>>> Emanuele
>>>>>
>>>>> On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have an input list of gene names (can get gene ids from a  
>>>>>> local db
>>>>>> if required).
>>>>>> I need to fetch sequences of these genes. Can someone please  
>>>>>> guide me
>>>>>> as to how this can be done using perl/bioperl?
>>>>>>
>>>>>> Any help will be deeply appreciated.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -Neeti
>>>>>> Even my blood says, B positive
>>>
>


From paola_bisignano at yahoo.it  Tue Sep  8 08:55:21 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Tue, 8 Sep 2009 08:55:21 +0000 (GMT)
Subject: [Bioperl-l] problem parsing pdb
Message-ID: <741671.67508.qm@web25705.mail.ukl.yahoo.com>

Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to extract the chain id and residue id, but I found that in some PDB files the residue number is followed by a letter, probably because the residue was added by the crystallographers and they didn't want to change the residue numbering of the sequence. An example is 1PXX.pdb, which I parsed with my script below. I didn't find any useful suggestion about this in the BioPerl tutorial or in the online BioPerl documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


# download the PDB entry and keep a local copy
my $urlpdb = "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
die "could not download $urlpdb\n" unless defined $content;
my $pdb_file = qq{1pxx.pdb};
open my $f, '>', $pdb_file or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


# write one "chain<TAB>residue id" line per residue
my $structio = Bio::Structure::IO->new(-file => $pdb_file);
my $struc = $structio->next_structure;
open my $out, '>>', '1pxx.parsed' or die $!;   # opened once, not once per residue
for my $chain ($struc->get_chains) {
    my $chainid = $chain->id;
    for my $res ($struc->get_residues($chain)) {
        my $resid = $res->id;
        my @atoms = $struc->get_atoms($res);   # not used yet
        print $out "$chainid\t$resid\n";
    }
}
close $out;


but the output file has a problem at ILE 105A and ILE 2105C, because there a letter follows the residue number. Can I solve that problem without writing intermediate files?
I need to have the residue id as 105A, not 105.A, so
 A          ILE-105A
without a dot between the number and the letter.


Thank you all,

Paola
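
A minimal sketch of one way to get that format, assuming the residue ids really come back as strings such as ILE-105.A, as described above. The helper name format_resid is made up for illustration and is not part of Bio::Structure; it only rewrites the string in memory, so no intermediate file is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): remove the dot that
# separates the residue number from a trailing insertion code,
# e.g. "ILE-105.A" becomes "ILE-105A". Ids without an insertion
# code are returned unchanged.
sub format_resid {
    my ($resid) = @_;
    (my $clean = $resid) =~ s/\.([A-Za-z])$/$1/;
    return $clean;
}

# Sample ids, assumed to look like the ones in the report above
for my $resid ('ILE-105', 'ILE-105.A', 'ILE-2105.C') {
    print "A\t", format_resid($resid), "\n";
}
```

In the script above this would be called as format_resid($res->id) just before the print.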


From lengjingmao at gmail.com  Tue Sep  8 10:13:05 2009
From: lengjingmao at gmail.com (shaohua.fan)
Date: Tue, 8 Sep 2009 12:13:05 +0200
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
Message-ID: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>

Dear all ,

After reading the documentation and source code of Bio::Tools::RepeatMasker
in the BioPerl 1.6.0 documentation, I have a question about updating this module.

The current RepeatMasker output (.out) provides more information than the
module exposes, for example query (left), repeat (left), perc div, perc del
and perc ins. These may be useful for some users.

I think it would be good to update this module in the latest BioPerl version.

shaohua
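
To make the request concrete, here is a rough sketch of pulling those extra columns out of one line of a .out file. It is plain Perl and not the Bio::Tools::RepeatMasker API; the sub name and hash keys are made up for illustration, the column order is the .out layout as I understand it (on the complement strand the repeat columns are assumed to read left, end, begin), and the sample line is in the style of the RepeatMasker documentation.

```perl
use strict;
use warnings;

# Hypothetical parser for a single RepeatMasker .out data line.
# Returns a hash ref, or undef for header and blank lines.
sub parse_rm_line {
    my ($line) = @_;
    $line =~ s/^\s+//;
    my @f = split /\s+/, $line;
    return unless @f >= 14 && $f[0] =~ /^\d+$/;

    my %hit = (
        sw_score     => $f[0],
        perc_div     => $f[1],
        perc_del     => $f[2],
        perc_ins     => $f[3],
        query        => $f[4],
        query_begin  => $f[5],
        query_end    => $f[6],
        query_left   => $f[7],
        strand       => ($f[8] eq 'C' ? -1 : 1),
        repeat       => $f[9],
        repeat_class => $f[10],
    );

    # Assumed layout: "+" hits list begin, end, (left);
    # complement hits list (left), end, begin.
    if ($hit{strand} < 0) {
        @hit{qw(repeat_left repeat_end repeat_begin)} = @f[11 .. 13];
    }
    else {
        @hit{qw(repeat_begin repeat_end repeat_left)} = @f[11 .. 13];
    }

    # the "(left)" columns are written in parentheses; strip them
    tr/()//d for @hit{qw(query_left repeat_left)};
    return \%hit;
}

my $sample = ' 1306 15.6  6.2  0.0 HSU08988  6563  6781  (22462) C  MER7A    DNA/MER2_type    (0)   336   103';
my $hit = parse_rm_line($sample);
print join("\t", map { "$_=$hit->{$_}" }
    qw(perc_div perc_del perc_ins query_left repeat_left)), "\n";
```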


From maj at fortinbras.us  Tue Sep  8 11:00:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 07:00:31 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <AD2517BD451A403D9FF258B9A07569F2@NewLife>

Chris - 
I would like to vote for option #1, since working on Bio::Nexml with
Chase gave me the opp'y to patch Bio::Phylo some (including fixing
an old "fix" of mine), so (IMO) the CPAN version of Bio::Phylo 
would benefit too. Option #2 is ok, since Bio::Nexml has to be
essentially optional for the user anyway, dependent on whether
the user is willing to install Bio::Phylo, a fairly major commitment
 (nexml.t already skips if Bio::Phylo is unavailable); I think it's 
no problem if we make that dependency more stringent. We could
have nexml.t check the svn revision directly, rather than $VERSION,
as a kludge.
cheers MAJ 
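
A rough sketch of what the $VERSION check in option #2 could look like, assuming release candidates are always written as <number>_RC<n> (as in '1.7_RC9') and that an RC sorts before the final release with the same number. The sub names are made up for illustration and the minimum used in the loop is a placeholder, not a tested value; nothing here is part of Bio::Phylo or nexml.t.

```perl
use strict;
use warnings;

my $FINAL = 1e9;   # pseudo RC number for a final (non-RC) release

# Split '1.6' or '1.7_RC9' into (dotted number, RC number).
# Returns an empty list if the string has an unexpected shape.
sub parse_version {
    my ($v) = @_;
    my ($num, $rc) = $v =~ /^(\d+(?:\.\d+)*)(?:_RC(\d+))?$/
        or return;
    return ($num, defined $rc ? $rc : $FINAL);
}

# Three-way comparison (-1, 0, 1), like <=>
sub cmp_versions {
    my ($x, $y) = @_;
    my ($xn, $xrc) = parse_version($x) or die "bad version '$x'\n";
    my ($yn, $yrc) = parse_version($y) or die "bad version '$y'\n";
    my @xp = split /\./, $xn;
    my @yp = split /\./, $yn;
    # compare the dotted parts numerically, so 1.10 sorts after 1.9
    while (@xp || @yp) {
        my $xi = @xp ? shift @xp : 0;
        my $yi = @yp ? shift @yp : 0;
        return $xi <=> $yi if $xi != $yi;
    }
    return $xrc <=> $yrc;
}

my $min = '1.7';   # placeholder, not a verified minimum
for my $have ('1.6', '1.7_RC9', '1.7') {
    my $status = cmp_versions($have, $min) >= 0 ? 'ok' : 'too old';
    print "$have\t$status\n";
}
```

nexml.t could then skip its tests whenever the comparison against the installed $Bio::Phylo::VERSION comes out negative, once we know which minimum actually works.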
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 08, 2009 12:23 AM
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> All,
> 
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of Bio::Phylo;  
> the default CPAN installation (1.6), the latest CPAN RC (1.7_RC9, not  
> installed by default), and the latest from Bio::Phylo svn:
> 
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
> 
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so they  
> won't be installed unless the full path is used).  Even so, nothing on  
> CPAN even works; one must use the latest Bio::Phylo SVN code.
> 
> ATM I'm just not seeing how this can be released with 1.6.1 right now,  
> unless one of the following occurs:
> 
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
> 
> As for #3 above, I think it probably belongs in a larger bioperl-phylo  
> as Mark had previously proposed.  I'm open to just about any solution.
> 
> chris


From hlapp at gmx.net  Tue Sep  8 12:16:12 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 8 Sep 2009 08:16:12 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
Message-ID: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>

I'd suspect that the latest Bio::Phylo changes have been due for CPAN  
release anyway, so unless those are unstable that seems like the  
easiest fix to me.

If the Nexml code works against not yet stable updates to Bio::Phylo,  
it shouldn't be in a BioPerl stable release, right?

	-hilmar

On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:

> All,
>
> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
> Nexml code.  In particular, I have tried three versions of  
> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
> svn:
>
> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>
> At this moment only the Bio::Phylo code from svn is working with  
> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6 appears  
> to be missing Bio::Phylo::Factory (all Nexml tests fail), whereas  
> 1.7_RC9 has some kind of versioning issue (again, all tests fail).   
> The problem: CPAN will always install 1.6 (the others are RC, so  
> they won't be installed unless the full path is used).  Even so,  
> nothing on CPAN even works; one must use the latest Bio::Phylo SVN  
> code.
>
> ATM I'm just not seeing how this can be released with 1.6.1 right  
> now, unless one of the following occurs:
>
> 1) Rutger V. drops a quick non-RC release to CPAN,
> 2) check for the minimal working Bio::Phylo version and safely skip  
> any Nexml-related tests unless proper version is present (not easy  
> with a $VERSION like '1.7_RC9'),
> 3) push Nexml into it's own distribution (something we were planning  
> on anyway with a number of modules)
>
> As for #3 above, I think it probably belongs in a larger bioperl- 
> phylo as Mark had previously proposed.  I'm open to just about any  
> solution.
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep  8 12:02:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 07:02:53 -0500
Subject: [Bioperl-l] Bio::Tools::RepeatMasker update?
In-Reply-To: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
References: <517072a20909080313g5ec3380bo42e1871c3a6f4aab@mail.gmail.com>
Message-ID: <74B85419-6A37-46CE-AAF3-F33013F4A058@illinois.edu>

Patches are welcome for this (or you can submit an enhancement request  
via bugzilla):

http://bugzilla.open-bio.org/

This won't be in the next point release, sorry.

chris

On Sep 8, 2009, at 5:13 AM, shaohua.fan wrote:

> Dear all ,
>
> After reading the document and original code of  
> Bio::Tools::RepeatMasker on
> bioperl document 1.6.0, I have a question about this module's update.
>
> The current repeatmasker's output(  .out) provide more information
> than which have not listed in the module, for example, query(left) ,  
> repeat
> (left), perc div, perc del, perc ins. these maybe useful for some  
> users.
>
> I think it is better to update this module in the lastest Bioperl  
> version.
>
> shaohua


From cjfields at illinois.edu  Tue Sep  8 13:15:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 08:15:31 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu>
	<CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
Message-ID: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>

On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:

> I'd suspect that the latest Bio::Phylo changes have been due for  
> CPAN release anyway, so unless those are unstable that seems like  
> the easiest fix to me.

My thought as well, just not sure how stable that code is right now.   
Bio::Phylo has been in RC for a while now, correct?

> If the Nexml code works against not yet stable updates to  
> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?

Right.  That should be sorted out first.

I can wait a bit longer for Rutger to respond; there are a few other
odds and ends that can be worked on in the meantime.  I would like
to get the alpha out soon and 1.6.1 in the next week or so though.

chris

> 	-hilmar
>
> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>
>> All,
>>
>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  
>> Nexml code.  In particular, I have tried three versions of  
>> Bio::Phylo; the default CPAN installation (1.6), the latest CPAN RC  
>> (1.7_RC9, not installed by default), and the latest from Bio::Phylo  
>> svn:
>>
>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>
>> At this moment only the Bio::Phylo code from svn is working with  
>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6  
>> appears to be missing Bio::Phylo::Factory (all Nexml tests fail),  
>> whereas 1.7_RC9 has some kind of versioning issue (again, all tests  
>> fail).  The problem: CPAN will always install 1.6 (the others are  
>> RC, so they won't be installed unless the full path is used).  Even  
>> so, nothing on CPAN even works; one must use the latest Bio::Phylo  
>> SVN code.
>>
>> ATM I'm just not seeing how this can be released with 1.6.1 right  
>> now, unless one of the following occurs:
>>
>> 1) Rutger V. drops a quick non-RC release to CPAN,
>> 2) check for the minimal working Bio::Phylo version and safely skip  
>> any Nexml-related tests unless proper version is present (not easy  
>> with a $VERSION like '1.7_RC9'),
>> 3) push Nexml into it's own distribution (something we were  
>> planning on anyway with a number of modules)
>>
>> As for #3 above, I think it probably belongs in a larger bioperl- 
>> phylo as Mark had previously proposed.  I'm open to just about any  
>> solution.
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>


From maj at fortinbras.us  Tue Sep  8 14:39:07 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 8 Sep 2009 10:39:07 -0400
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
Message-ID: <1CF993D6D3AC435CA77127466D6C072A@NewLife>

I agree with Hilmar-- I have no problem keeping it in the trunk for a while
longer, as I have an addition for dealing with arbitrary non-seq
data using the Population API sitting in bioperl-dev that's nearly
ready, but prob. not before cjf wants to get the release out.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Hilmar Lapp" <hlapp at gmx.net>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" 
<rutgeraldo at gmail.com>
Sent: Tuesday, September 08, 2009 9:15 AM
Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml


> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>
>> I'd suspect that the latest Bio::Phylo changes have been due for  CPAN 
>> release anyway, so unless those are unstable that seems like  the easiest fix 
>> to me.
>
> My thought as well, just not sure how stable that code is right now. 
> Bio::Phylo has been in RC for a while now, correct?
>
>> If the Nexml code works against not yet stable updates to  Bio::Phylo, it 
>> shouldn't be in a BioPerl stable release, right?
>
> Right.  That should be sorted out first.
>
> I can wait a bit longer for Rutger to respond; there are a few other  odds and 
> ends that can been worked on in the meantime.  I would like  to get the alpha 
> out soon and 1.6.1 in the next week or so though.
>
> chris
>
>> -hilmar
>>
>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>
>>> All,
>>>
>>> I'm running into a pretty significant blocker for 1.6.1 re: Chase's  Nexml 
>>> code.  In particular, I have tried three versions of  Bio::Phylo; the 
>>> default CPAN installation (1.6), the latest CPAN RC  (1.7_RC9, not installed 
>>> by default), and the latest from Bio::Phylo  svn:
>>>
>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>
>>> At this moment only the Bio::Phylo code from svn is working with  BioPerl's 
>>> Nexml modules.  From my local tests Bio::Phylo 1.6  appears to be missing 
>>> Bio::Phylo::Factory (all Nexml tests fail),  whereas 1.7_RC9 has some kind 
>>> of versioning issue (again, all tests  fail).  The problem: CPAN will always 
>>> install 1.6 (the others are  RC, so they won't be installed unless the full 
>>> path is used).  Even  so, nothing on CPAN even works; one must use the 
>>> latest Bio::Phylo  SVN code.
>>>
>>> ATM I'm just not seeing how this can be released with 1.6.1 right  now, 
>>> unless one of the following occurs:
>>>
>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>> 2) check for the minimal working Bio::Phylo version and safely skip  any 
>>> Nexml-related tests unless proper version is present (not easy  with a 
>>> $VERSION like '1.7_RC9'),
>>> 3) push Nexml into it's own distribution (something we were  planning on 
>>> anyway with a number of modules)
>>>
>>> As for #3 above, I think it probably belongs in a larger bioperl- phylo as 
>>> Mark had previously proposed.  I'm open to just about any  solution.
>>>
>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>


From lincoln.stein at gmail.com  Tue Sep  8 14:58:25 2009
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 8 Sep 2009 10:58:25 -0400
Subject: [Bioperl-l] Prepping for 1.6.1 (finally!)
In-Reply-To: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
References: <35CC277D-F0B6-45D0-A578-10A00B7A9C57@illinois.edu>
Message-ID: <6dce9a0b0909080758q7334a7b2yc69bc86b96118927@mail.gmail.com>

Will do.

Lincoln

On Mon, Sep 7, 2009 at 4:56 PM, Chris Fields <cjfields at illinois.edu> wrote:

> All,
>
> I have updated the Changes file in bioperl-live in preparation for 1.6.1.
>  The initial release will be an alpha, 1.6.0_1 (probably landing about
> mid-week), and based on CPAN tests, etc the final 1.6.1 release next week.
>  I'll start merging changes over from trunk tonight, fixing last-minute
> bugs, etc.  I'm running my work using perl 5.10.1 (64-bit) on Mac and will
> likely run these remotely on our local linux cluster.  Win tests are gladly
> welcome (this should work on Strawberry Perl now).
>
> I highly suggest Mark, Jason, and any others (Lincoln, Scott, Chase, Robert
> Buels, Jay Hannah, Heikki, Sendu come to mind) look over the file to update
> it.  There are a few weak spots in there where I didn't make the code change
> or additions, or where a particular bug was fixed but not mentioned.  In
> particular:
>
> 1) Google Summer of Code work from Chase (Mark, Chase)
> 2) GMOD-related fixes (Lincoln, Scott)
> 3) YAPC Hackathon bug fixes (Robert, Jay, Bruno)
> 4) Tiling, Restriction refactors (Mark)
>
> Also, please make changes to AUTHORS, etc as needed.
>
> Thanks!
>
> chris


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


From cjfields at illinois.edu  Tue Sep  8 15:43:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 10:43:29 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <1CF993D6D3AC435CA77127466D6C072A@NewLife>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
Message-ID: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>

Mark

We can hold it in trunk until the next point release or we start  
splitting things off (whichever is first).

I have a little more time, though, and I'm thinking it would be a good  
idea to get the Nexml code into the wild (sooner than later) for users  
to test out.  Let's see if Rutger responds.

chris

On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:

> I agree with Hilmar-- I have no problem keeping it in the trunk for  
> a while
> longer, as I have an addition for dealing with arbitrary non-seq
> data using the Population API sitting in bioperl-dev that's nearly
> ready, but prob. not before cjf wants to get the release out.
> ----- Original Message -----
> From: "Chris Fields" <cjfields at illinois.edu>
> To: "Hilmar Lapp" <hlapp at gmx.net>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com>
> Sent: Tuesday, September 08, 2009 9:15 AM
> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>
>
>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>
>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>> CPAN release anyway, so unless those are unstable that seems like   
>>> the easiest fix to me.
>>
>> My thought as well, just not sure how stable that code is right  
>> now. Bio::Phylo has been in RC for a while now, correct?
>>
>>> If the Nexml code works against not yet stable updates to   
>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>
>> Right.  That should be sorted out first.
>>
>> I can wait a bit longer for Rutger to respond; there are a few  
>> other  odds and ends that can been worked on in the meantime.  I  
>> would like  to get the alpha out soon and 1.6.1 in the next week or  
>> so though.
>>
>> chris
>>
>>> -hilmar
>>>
>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>
>>>> All,
>>>>
>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest from  
>>>> Bio::Phylo  svn:
>>>>
>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>
>>>> At this moment only the Bio::Phylo code from svn is working with   
>>>> BioPerl's Nexml modules.  From my local tests Bio::Phylo 1.6   
>>>> appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue (again,  
>>>> all tests  fail).  The problem: CPAN will always install 1.6 (the  
>>>> others are  RC, so they won't be installed unless the full path  
>>>> is used).  Even  so, nothing on CPAN even works; one must use the  
>>>> latest Bio::Phylo  SVN code.
>>>>
>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>> right  now, unless one of the following occurs:
>>>>
>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>> skip  any Nexml-related tests unless proper version is present  
>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>> 3) push Nexml into it's own distribution (something we were   
>>>> planning on anyway with a number of modules)
>>>>
>>>> As for #3 above, I think it probably belongs in a larger bioperl-  
>>>> phylo as Mark had previously proposed.  I'm open to just about  
>>>> any  solution.
>>>>
>>>> chris
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>


From jason at bioperl.org  Tue Sep  8 19:43:39 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 8 Sep 2009 12:43:39 -0700
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
Message-ID: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>

Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
perfectly fine to write with Bio::SeqIO::fasta but not for any of the  
rich-seq writers.
Do we think this is a bug or a feature?  The solution is to write the  
PrimarySeq wrapped in a Bio::Seq object.

See this gist -- I would imagine this as additional test lines in
t/LocalDB/DBFasta.t but I don't know what we really expect?
http://gist.github.com/183169

I also notice that $seq->description & $seq->display_id don't allow  
'set' option - which probably makes sense since this is a read-only  
object that came from the DB, but it basically silently ignores set.   
I often do this if I pull seqs from a DB::Fasta db and re-format the  
IDs or description line.  So I end up making a new object and copying  
the data over.  I *think* this is really a feature not a bug, just  
wanted to bring it up.

-jason
--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep  8 20:20:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 8 Sep 2009 15:20:32 -0500
Subject: [Bioperl-l] Bio::DB::Fasta + Bio::SeqIO
In-Reply-To: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
References: <9858D52F-7580-44C9-A78E-4B1F1BF1B6ED@bioperl.org>
Message-ID: <AEE10370-F2B3-4723-9B79-23A5EBF86A51@illinois.edu>

On Sep 8, 2009, at 2:43 PM, Jason Stajich wrote:

> Bio::DB::Fasta returns Bio::PrimarySeq::Fasta objects which are  
> perfectly fine to write with Bio::SeqIO::fasta but not for any of  
> the rich-seq writers.
> Do we think this is a bug or a feature?  The solution is to write the  
> PrimarySeq wrapped in a Bio::Seq object.

I think SeqIO requires any SeqI but doesn't specify anything for a  
simpler PrimarySeqI.  We could add some kind of general convenience  
wrapper in Bio::SeqIO to convert any PrimarySeqI to a requested SeqI  
class and just delegate to write_seq():

   # get a PrimarySeq somehow $seq, $out is Bio::SeqIO
   $out->write_PrimarySeq($seq); # or somesuch
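
A wrapper along those lines might look roughly like the sketch below. The
helper name promote_to_seq is hypothetical (it is not a Bio::SeqIO method),
and the FASTA file is a scratch file created only for the demonstration. The
sketch simply copies the ID, description and residues into a new Bio::Seq,
which the rich-sequence writers accept and which is mutable.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Bio::DB::Fasta;
use Bio::Seq;
use Bio::SeqIO;

# Hypothetical helper (not a BioPerl method): copy a PrimarySeqI into a
# fresh, mutable Bio::Seq, optionally overriding the display ID.
sub promote_to_seq {
    my ($pseq, $new_id) = @_;
    return Bio::Seq->new(
        -display_id => defined $new_id ? $new_id : $pseq->display_id,
        -desc       => $pseq->description,
        -seq        => $pseq->seq,
    );
}

# Scratch FASTA file so the example is self-contained.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.fa";
open my $fh, '>', $file or die "cannot write $file: $!";
print {$fh} ">seq1 demo sequence\nACGTACGTACGT\n";
close $fh;

my $db   = Bio::DB::Fasta->new($file);
my $pseq = $db->get_Seq_by_id('seq1');
my $rich = promote_to_seq($pseq, 'seq1_renamed');

# Write the wrapped object with a rich-sequence writer.
my $out = Bio::SeqIO->new(-format => 'genbank', -fh => \*STDOUT);
$out->write_seq($rich);
```

Copying into a new object sidesteps the read-only accessors entirely, at the
cost of one extra object per sequence.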

> See this gist -- I would imagine this as additional test lines in
> t/LocalDB/DBFasta.t but I don't know what we really expect?
> http://gist.github.com/183169
>
> I also notice that $seq->description & $seq->display_id don't allow  
> 'set' option - which probably makes sense since this is a read-only  
> object that came from the DB, but it basically silently ignores  
> set.  I often do this if I pull seqs from a DB::Fasta db and re- 
> format the IDs or description line.  So I end up making a new object  
> and copying the data over.  I *think* this is really a feature not a  
> bug, just wanted to bring it up.
>
> -jason
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org

One can already cheat and do a few things.  For instance:

$seq->{id} = 'Foo';
print $seq->display_id; # should be 'Foo'

Won't work for all of them, though, such as description().   
Personally, if one made clear that such changes aren't retained in the  
database but must be redirected as output to another file then I don't  
see a problem (other PrimarySeqI are mutable, so why not these?).

Would there be any real performance hit from making those get/set  
accessors instead of ro getters?  The class is fairly small.

chris


From lelbourn at science.mq.edu.au  Mon Sep  7 07:52:04 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Mon, 7 Sep 2009 17:52:04 +1000
Subject: [Bioperl-l] subsection of genbank file
Message-ID: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>

Hi All,

Is there a method or methodology that will produce a fully fledged Seq  
object with all the associated metadata given a start and end  
position? To clarify, I create a sequence object from a genbank file:


****
my $io  = Bio::SeqIO->new(as per usual);

my $seqobj = $io->next_seq();
****
I now want:

my $sub_seqobj = $seqobj between 300 and 2000

where $sub_seqobj is a Seq object (which I appreciate is an  
'aggregate' of objects) too. The "trunc" method only returns a  
PrimarySeq object which lacks all the annotation etc. I've previously  
done this task by iterating through feature by feature and parsing out  
what I needed, but thought there might be a more elegant approach...
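
One possibility worth trying is Bio::SeqUtils->trunc_with_features, which
truncates like trunc() but carries overlapping features and annotations
along with adjusted coordinates. A minimal sketch, assuming a BioPerl
release that ships this method; the sequence and feature below are made-up
stand-ins for a real GenBank record, and the range 5..25 stands in for
300..2000.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqUtils;

# Made-up 40 bp sequence with one feature, standing in for a GenBank entry.
my $seqobj = Bio::Seq->new(
    -display_id => 'demo',
    -seq        => 'ACGT' x 10,
);
$seqobj->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start       => 10,
        -end         => 20,
        -strand      => 1,
        -primary_tag => 'gene',
    )
);

# Truncate to positions 5..25 while keeping overlapping features.
my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 5, 25);

print "length: ", $sub_seqobj->length, "\n";
for my $feat ($sub_seqobj->get_SeqFeatures) {
    printf "%s %d..%d\n", $feat->primary_tag, $feat->start, $feat->end;
}
```

How features that only partly overlap the range are handled is worth
checking against real data before relying on this.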


Regards,
Liam Elbourne.


From alpapan at googlemail.com  Thu Sep 10 21:14:11 2009
From: alpapan at googlemail.com (Alexie Papanicolaou)
Date: Thu, 10 Sep 2009 22:14:11 +0100
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable
 end is float
Message-ID: <1252617251.6680.16.camel@alexie-desktop>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090910/f222c627/attachment.ksh>

From maj at fortinbras.us  Fri Sep 11 03:52:27 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 10 Sep 2009 23:52:27 -0400
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln ->
	Bio::Locatable end is float
In-Reply-To: <1252617251.6680.16.camel@alexie-desktop>
References: <1252617251.6680.16.camel@alexie-desktop>
Message-ID: <D2C2357D7A81478B965996CF6DDD4AF2@NewLife>

Hi Alexie--
I am either responsible for this weirdness, or have fixed it in
an unreleased version. Anyway,  can you please make a bug
report at http://bugzilla.bioperl.org, and include some relevant
code and real data, and I will have a look.
Thanks a lot- Mark
----- Original Message ----- 
From: "Alexie Papanicolaou" <alpapan at googlemail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 10, 2009 5:14 PM
Subject: [Bioperl-l] Bio::Search::HSP::FastaHSP -> get_aln -> Bio::Locatable end 
is float


> Hello all,
>
> I get the following warning when parsing a fasty34 HSP using Bio::Search
> and then trying to get the alignment using get_aln
>
> MSG: In sequence CONTIG residue count gives end value
> 565.333333333333.
> Overriding value [565] with value 565.333333333333 for
> Bio::LocatableSeq::end().
> MAEMFKIGDLVWAKMKGFSPWPGLVSNPTKDLKRPTSKKSAQQ/C/VFFLGTNNYAWIEEANIKPYFEYRDRLVKSNKSGAFKDALDAIEEYIKNNGAKFDDPDAEFNRLRESLAEKKESKPKQRKEKRPAHDDNSAKSPKKVRTNSVEADKESVRADSPILSNHSPRKGPASTLLERPTTIVRPLDDSQD
> STACK
> Bio::LocatableSeq::end /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:196
> STACK
> Bio::LocatableSeq::new /usr/local/share/perl/5.8.8/Bio/LocatableSeq.pm:140
> STACK
> Bio::Search::HSP::FastaHSP::get_aln 
> /usr/local/share/perl/5.8.8/Bio/Search/HSP/FastaHSP.pm:174
>
> The frameshifts (/ and \ ) are causing this recalculation of length to a
> float (which is a bit weird) but is not fatal for my program. Is this
> intentional?
>
> My immediate problem is actually the warning message itself - which is
> quite annoying if you have hundreds of such sequences... any way to turn
> them off short of commenting out the line at LocatableSeq.pm ?
> (redirecting STDERR wouldn't be desirable for a production script).
>
> many thanks
> alexie
>
>
> -- 
> Alexie Papanicolaou
> Richard ffrench-Constant group
> CEC-Biology
> Univ. Exeter in Cornwall
> Penryn
> TR10 9EZ
> United Kingdom
>


From gmodhelp at googlemail.com  Fri Sep 11 16:40:43 2009
From: gmodhelp at googlemail.com (Dave Clements, GMOD Help Desk)
Date: Fri, 11 Sep 2009 09:40:43 -0700
Subject: [Bioperl-l] CVS to SVN Conversion, 2009/09/15
In-Reply-To: <71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
References: <71ee57c70908241615w6f82abb6p25b0744e8f5fb006@mail.gmail.com>
	<71ee57c70909110935w2147628cq6e6984feb544e6b9@mail.gmail.com>
	<71ee57c70909110936g34612cf0g5a9d83aeee4e0efd@mail.gmail.com>
	<71ee57c70909110937m4a2598abv6a0a5aaa1e656fcc@mail.gmail.com>
Message-ID: <71ee57c70909110940y921b1dxfec278422d31be7f@mail.gmail.com>

Hello all,

This is a heads up that GMOD (in the form of Rob Buels) will be moving
its SourceForge source code repository from CVS to SVN on September
15, 2009.

If you have checked out and modified any code from that repository,
please commit your updates before 3am, Eastern US, on September 15.

Some important bits:
* All projects will be frozen in CVS and will remain available from CVS.
* No new updates will be allowed in CVS.
* All projects will be moved to Subversion.
* Inactive projects will be moved to a separate archival directory.

See http://gmod.org/wiki/CVS_to_Subversion_Conversion for full details
and a list of active and inactive projects.

Thanks,

Dave C
--
* Please keep responses on the list!
* Was this helpful? Let us know at http://gmod.org/wiki/Help_Desk_Feedback


From jayoung at fhcrc.org  Sat Sep 12 01:11:00 2009
From: jayoung at fhcrc.org (Janet Young)
Date: Fri, 11 Sep 2009 18:11:00 -0700
Subject: [Bioperl-l] tree splice remove nodes
Message-ID: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>

Hi,

I'm having a problem in a script that I'm hoping someone can help me  
figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
object, and it looks like it worked fine.

However, I'm also trying to keep a separate copy of the original  
(unpruned) tree in a different object but that second object seems to  
get pruned as well.

Here's my tree, stored in a file called testtree2.nwk:

(((A,(B,b)),C),D,E);

---------------------------------------
Here's my script:

#!/usr/bin/perl

use warnings;
use strict;
use Bio::TreeIO;

my $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
while (my $tree = $treeIO->next_tree() ) {

       print "\nfound a tree\n\n";
       my @originalleaves = $tree -> get_leaf_nodes();
       foreach my $originalleaf (@originalleaves) {print "original tree has node with id " . $originalleaf->id() . "\n";}

       my $tree2 = $tree;

       my @remove = ("D","E");
       print "\nremoving nodes @remove\n\n";

       $tree2 -> splice(-remove_id => \@remove);
       my @leaves2 = $tree2 -> get_leaf_nodes();
       foreach my $leaf2 (@leaves2) {print "after removing tree2 has node with id " . $leaf2->id() . "\n";}

       print "\n";

       my @originalleavesafter = $tree -> get_leaf_nodes();
       foreach my $leaf3 (@originalleavesafter) {print "after removing original tree has node with id " . $leaf3->id() . "\n";}

}

---------------------------------------


And here's my output:

found a tree

original tree has node with id A
original tree has node with id B
original tree has node with id b
original tree has node with id C
original tree has node with id D
original tree has node with id E

removing nodes D E

after removing tree2 has node with id A
after removing tree2 has node with id B
after removing tree2 has node with id b
after removing tree2 has node with id C

after removing original tree has node with id A
after removing original tree has node with id B
after removing original tree has node with id b
after removing original tree has node with id C


-------------------------

I want to splice the specified nodes out of $tree2 and leave $tree  
untouched, but both $tree and $tree2 seem to be affected by the splice  
operation. Am I failing to understand something about references/ 
dereferencing?   I'm not sure if I just haven't figured this out right  
or if it's a bug.  If it looks like a bug let me know and I'll post it  
to bugzilla.

thanks in advance for any advice,

Janet

-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung  ...at...  fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From maj at fortinbras.us  Sat Sep 12 02:00:53 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 11 Sep 2009 22:00:53 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
Message-ID: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>

Hi Janet-
The trouble here is that 
$tree2 = $tree
doesn't create an independent copy of the entire
tree data structure. So, $tree2 and $tree
essentially point to the same thing. 
The easiest way to get two independent copies 
is probably to read the file twice:

$treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
$tree = $treeIO->next_tree;
$treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
$tree2 = $treeIO->next_tree;

which will create two copies. This is a little kludgy, but 
unfortunately, there doesn't seem to be any easy way to 
rewind the TreeIO object. 
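
The aliasing described above can be seen with plain Perl data. This is only
a sketch of reference semantics: the nested hash is a made-up stand-in for a
tree, and Storable's dclone is used here on plain data only (real BioPerl
tree objects may not deep-copy cleanly with a generic cloner).

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A plain nested structure standing in for a tree object.
my $tree = { leaves => [qw(A B b C D E)] };

# Plain assignment copies the reference, not the data behind it.
my $alias = $tree;

# dclone makes an independent deep copy of this plain structure.
my $copy = dclone($tree);

# Remove D and E through the alias; the "original" changes as well.
$alias->{leaves} = [ grep { !/^[DE]$/ } @{ $alias->{leaves} } ];

print "original: @{ $tree->{leaves} }\n";
print "copy:     @{ $copy->{leaves} }\n";
```

The original and the alias lose D and E together, while the deep copy keeps
all six leaves.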

When you want a copy of a complex object, generally 
you need to "clone" it, and there are a variety of modules
you can use to create clones. [It's probably worth adding 
a clone() method to TreeFunctionsI--maybe I'll do that.]
Get the module Clone from CPAN and do

use Clone qw(clone);
....
$tree2 = clone($tree);
...

hope this helps- cheers 
MAJ
----- Original Message ----- 
From: "Janet Young" <jayoung at fhcrc.org>
To: <bioperl-l at lists.open-bio.org>
Sent: Friday, September 11, 2009 9:11 PM
Subject: [Bioperl-l] tree splice remove nodes


> Hi,
> 
> I'm having a problem in a script that I'm hoping someone can help me  
> figure out.  I'm using splice(-remove_id) to prune a Bio::Tree::Tree  
> object, and it looks like it worked fine.
> 
> However, I'm also trying to keep a separate copy of the original  
> (unpruned) tree in a different object but that second object seems to  
> get pruned as well.
> 
> Here's my tree, stored in a file called testtree2.nwk:
> 
> (((A,(B,b)),C),D,E);
> 
> ---------------------------------------
> Here's my script:
> 
> #!/usr/bin/perl
> 
> use warnings;
> use strict;
> use Bio::TreeIO;
> 
> my $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", - 
> format=>'newick');
> while (my $tree = $treeIO->next_tree() ) {
> 
>       print "\nfound a tree\n\n";
>       my @originalleaves = $tree -> get_leaf_nodes();
>       foreach my $originalleaf (@originalleaves) {print "original  
> tree has node with id " . $originalleaf->id() . "\n";}
> 
>       my $tree2 = $tree;
> 
>       my @remove = ("D","E");
>       print "\nremoving nodes @remove\n\n";
> 
>       $tree2 -> splice(-remove_id => \@remove);
>       my @leaves2 = $tree2 -> get_leaf_nodes();
>       foreach my $leaf2 (@leaves2) {print "after removing tree2 has  
> node with id " . $leaf2->id() . "\n";}
> 
>       print "\n";
> 
>       my @originalleavesafter = $tree -> get_leaf_nodes();
>       foreach my $leaf3 (@originalleavesafter) {print "after removing  
> original tree has node with id " . $leaf3->id() . "\n";}
> 
> }
> 
> ---------------------------------------
> 
> 
> And here's my output:
> 
> found a tree
> 
> original tree has node with id A
> original tree has node with id B
> original tree has node with id b
> original tree has node with id C
> original tree has node with id D
> original tree has node with id E
> 
> removing nodes D E
> 
> after removing tree2 has node with id A
> after removing tree2 has node with id B
> after removing tree2 has node with id b
> after removing tree2 has node with id C
> 
> after removing original tree has node with id A
> after removing original tree has node with id B
> after removing original tree has node with id b
> after removing original tree has node with id C
> 
> 
> -------------------------
> 
> I want to splice the specified nodes out of $tree2 and leave $tree  
> untouched, but both $tree and $tree2 seem to be affected by the splice  
> operation. Am I failing to understand something about references/ 
> dereferencing?   I'm not sure if I just haven't figured this out right  
> or if it's a bug.  If it looks like a bug let me know and I'll post it  
> to bugzilla.
> 
> thanks in advance for any advice,
> 
> Janet
> 
> -------------------------------------------------------------------
> 
> Dr. Janet Young (Trask lab)
> 
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Avenue N., C3-168,
> P.O. Box 19024, Seattle, WA 98109-1024, USA.
> 
> tel: (206) 667 1471 fax: (206) 667 6524
> email: jayoung  ...at...  fhcrc.org
> 
> http://www.fhcrc.org/labs/trask/
> 
> -------------------------------------------------------------------
> 


From cjfields at illinois.edu  Sat Sep 12 04:12:06 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 11 Sep 2009 23:12:06 -0500
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <5BE22FC3-06F3-4D31-BB73-8F2C49D46A03@illinois.edu>

On Sep 11, 2009, at 9:00 PM, Mark A. Jensen wrote:

> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two  
> independent copies is probably to read the file twice:
>
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
>
> which will create two copies. This is a little kludgy, but  
> unfortunately, there doesn't seem to be any easy way to rewind the  
> TreeIO object.

You can rewind the filehandle if it's seekable:

my $fh = $treeio->_fh;
seek($fh,0,0); # or something like that...

Don't use sysseek (doesn't work with buffered IO).
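
The rewind itself is ordinary Perl. A minimal sketch on a plain filehandle,
not going through TreeIO (whether a TreeIO object keeps buffered state of
its own is not checked here); the scratch file is created just for the
example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file holding the tree from this thread.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "(((A,(B,b)),C),D,E);\n";
close $tmp_fh;

open my $fh, '<', $path or die "cannot open $path: $!";
my $first = <$fh>;

# Rewind to the start of the file (offset 0, whence 0 = SEEK_SET).
seek($fh, 0, 0) or die "cannot seek: $!";
my $second = <$fh>;
close $fh;

print "same line read twice\n" if $first eq $second;
```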

>  When you want a copy of a complex object, generally you need to  
> "clone" it, and there are variety of modules
> you can use to create clones. [It's probably worth adding a clone()  
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do
>
> use Clone qw(clone);
> ....
> $tree2 = clone($tree);
> ...
>
> hope this helps- cheers MAJ

This normally works with bioperl objects, just not sure about Tree  
(might be worth testing out).

chris


From bix at sendu.me.uk  Sat Sep 12 08:33:22 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 12 Sep 2009 09:33:22 +0100
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
Message-ID: <4AAB5CD2.1040903@sendu.me.uk>

Mark A. Jensen wrote:
> Hi Janet-
> The trouble here is that $tree2 = $tree
> doesn't create an independent copy of the entire
> tree data structure. So, $tree2 and $tree
> essentially point to the same thing. The easiest way to get two 
> independent copies is probably to read the file twice:
> 
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree = $treeIO->next_tree;
> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
> $tree2 = $treeIO->next_tree;
> 
> which will create two copies. This is a little kludgy, but 
> unfortunately, there doesn't seem to be any easy way to rewind the 
> TreeIO object.
> When you want a copy of a complex object, generally you need to "clone" 
> it, and there are variety of modules
> you can use to create clones. [It's probably worth adding a clone() 
> method to TreeFunctionsI--maybe I'll do that.]
> Get the module Clone from CPAN and do

 From my comments in Bio/Tree/TreeFunctionsI.pm:

Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
faults at end of script.

TreeFunctionsI.pm already has the _clone() method. I suppose you could 
add some POD for it, rename it clone() and update the methods that call 
the private method to call the public version instead, Mark.

Janet: just clone your tree object with:
my $tree2 = $tree->_clone();
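
Applied to the script from this thread, that could look like the sketch
below. It assumes a BioPerl whose tree objects provide _clone() (or a public
clone(), which is tried first), and it reads the Newick string from an
in-memory filehandle rather than from testtree2.nwk.

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Read the tree from an in-memory filehandle instead of testtree2.nwk.
my $newick = "(((A,(B,b)),C),D,E);\n";
open my $fh, '<', \$newick or die "cannot open in-memory handle: $!";
my $treeio = Bio::TreeIO->new(-fh => $fh, -format => 'newick');
my $tree   = $treeio->next_tree;

# Prefer a public clone() if this BioPerl has one, else the private _clone().
my $tree2 = $tree->can('clone') ? $tree->clone : $tree->_clone;

# Prune the copy only.
$tree2->splice(-remove_id => [qw(D E)]);

my @orig   = $tree->get_leaf_nodes;
my @pruned = $tree2->get_leaf_nodes;
printf "original: %d leaves\n", scalar @orig;
printf "pruned:   %d leaves\n", scalar @pruned;
```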


From maj at fortinbras.us  Sat Sep 12 11:37:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 07:37:37 -0400
Subject: [Bioperl-l] tree splice remove nodes
In-Reply-To: <4AAB5CD2.1040903@sendu.me.uk>
References: <BE5181C0-6BAF-42A8-A6A0-BC699FE640B0@fhcrc.org>
	<C8DF4B9CC00E4FAA8D55F43E787E3F38@NewLife>
	<4AAB5CD2.1040903@sendu.me.uk>
Message-ID: <1A0B867B64B347A3B23A2F19EAA2A720@NewLife>

Done-- thanks Sendu. I made _clone alias clone, to keep 
from rocking anyone's boat. 
Janet- definitely do  $tree2 = $tree->_clone.

----- Original Message ----- 
From: "Sendu Bala" <bix at sendu.me.uk>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Janet Young" <jayoung at fhcrc.org>; <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 4:33 AM
Subject: Re: [Bioperl-l] tree splice remove nodes


> Mark A. Jensen wrote:
>> Hi Janet-
>> The trouble here is that $tree2 = $tree
>> doesn't create an independent copy of the entire
>> tree data structure. So, $tree2 and $tree
>> essentially point to the same thing. The easiest way to get two 
>> independent copies is probably to read the file twice:
>> 
>> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
>> $tree = $treeIO->next_tree;
>> $treeIO = new Bio::TreeIO(-file => "testtree2.nwk", -format=>'newick');
>> $tree2 = $treeIO->next_tree;
>> 
>> which will create two copies. This is a little kludgy, but 
>> unfortunately, there doesn't seem to be any easy way to rewind the 
>> TreeIO object.
>> When you want a copy of a complex object, generally you need to "clone" 
>> it, and there are variety of modules
>> you can use to create clones. [It's probably worth adding a clone() 
>> method to TreeFunctionsI--maybe I'll do that.]
>> Get the module Clone from CPAN and do
> 
> From my comments in Bio/Tree/TreeFunctionsI.pm:
> 
> Clone.pm clone() seg faults and fails to make the clone, whilst Storable 
> dclone needs $self->{_root_cleanup_methods} deleted (code ref) and seg 
> faults at end of script.
> 
> TreeFunctionsI.pm already has the _clone() method. I suppose you could 
> add some POD for it, rename it clone() and update the methods that call 
> the private method to call the public version instead, Mark.
> 
> Janet: just clone your tree object with:
> my $tree2 = $tree->_clone();
> 
>


From adlai at refenestration.com  Sat Sep 12 15:18:02 2009
From: adlai at refenestration.com (adlai burman)
Date: Sat, 12 Sep 2009 17:18:02 +0200
Subject: [Bioperl-l] Servers
Message-ID: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>

Can anyone suggest a hosting or server provider that actually has  
Bioperl installed?

Thanks,

Adlai


From maj at fortinbras.us  Sat Sep 12 16:45:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 12:45:35 -0400
Subject: [Bioperl-l] Servers
In-Reply-To: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
References: <7667775E-09F9-4F20-B76C-2297DE629CF3@refenestration.com>
Message-ID: <127343EFA5EF4F7CB756586A1B0B210E@NewLife>

I have a public Amazon machine; see http://fortinbras.us/bioperl-max
cheers MAJ
----- Original Message ----- 
From: "adlai burman" <adlai at refenestration.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 11:18 AM
Subject: [Bioperl-l] Servers


> Can anyone suggest a hosting or server provider that actually has  
> Bioperl installed?
> 
> Thanks,
> 
> Adlai


From hartzell at alerce.com  Sun Sep 13 01:35:44 2009
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 12 Sep 2009 18:35:44 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
Message-ID: <19116.19568.26115.542911@already.dhcp.gene.com>


It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
functionally identical.  They seem to trickle down to the same place
and walking through these two requests yields almost identical http
requests: 

  $db->get_Seq_by_version('J00522.1')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n

  $db->get_Seq_by_acc('J00522')
  GET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n

The only difference that I can see is that they index into different
sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
sections contain the same information.

I'd like a general purpose tool that does The Right Thing whether
there's a .1 on the end of an identifier or not, and am just trying to
make sure I'm not doing something troublesome.

Am I correct about the above?

While I'm at it, I think that the comment

  # note that get_Stream_by_version is not implemented

in Bio::DB::GenBank was made obsolete by whoever commented out the

  $self->throw(...)

in get_Stream_by_version in Bio::WebDBSeqI.pm.

I'll happily commit the trivial doc fix if no one shoots down the
idea. (can't help big, might as well help small...).

Thanks,

g.


From maj at fortinbras.us  Sun Sep 13 03:14:06 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sat, 12 Sep 2009 23:14:06 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
Message-ID: <DBDF390336FB4D8D935E2395E48207D7@NewLife>

Hi All--

[Future announcements/updates will be made on the wiki-
 http://bioperl.org/wiki/Emacs_bioperl-mode --
 put it on your watchlist...see the page for features and install
 info ]

Bioperl-mode (tar r16070) is improved:
- fancy syntax and header highlighting for pod views
- jump to .pm source from pod view (just press 'f')
- full support for multiple paths
  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
  the completion flattens the paths; if you wind up having to 
  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
  and mytweaks/Bio/Seq.pm), completion will let you choose
  the path at the prompt.
- BPMODE_PATH convenience environment 
  variable is read for the search paths
- other stuff I can't remember
- there is a unit test suite under test.el of Wang Liang
  in the dev path

To do this stuff, I've backed off Emacs 21 compatibility; 
it'll bork (nicely) if you have 21. If there are "enough" complaints,
I will relent, but 22 is cool for people like me with the 
elisp disease.

Other technical issues remain; let me know and 
I'll do my best. My goal is to make this something
you can't live without. (And if you're not using
Emacs, are you really living?)

 M-x thanks

Mark


From bill at genenformics.com  Sun Sep 13 15:47:57 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Sun, 13 Sep 2009 08:47:57 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <19116.19568.26115.542911@already.dhcp.gene.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
Message-ID: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>


I would like to make a few comments about get_Seq_by_version and
get_Seq_by_acc. Although both functions use the same NCBI eUtils API, they
are interpreted differently for a Seq_id with version or without version.

1. If the Seq_id has a version, the GenBank ID server will locate the
corresponding GI and emit the correct sequence.
2. If the Seq_id does not have a version, GBDataLoader will try to find
the latest version number for that Seq_id, which is relatively slow, and
the version number the ID server finds may NOT always be the latest.

IMHO, for both efficiency and consistency,
get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
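
A general-purpose front end following that ranking could be sketched as
below. The helper fetch_method_for is hypothetical, the patterns are a
simple assumption about what GI numbers and versioned accessions look like,
and 12345 is a made-up GI used only to exercise the first branch.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the Bio::DB::GenBank method name for an identifier.
sub fetch_method_for {
    my ($id) = @_;
    return 'get_Seq_by_gi'      if $id =~ /^\d+$/;        # all digits: GI number
    return 'get_Seq_by_version' if $id =~ /^\w+\.\d+$/;   # accession.version
    return 'get_Seq_by_acc';                              # bare accession
}

for my $id (qw(J00522.1 J00522 12345)) {
    printf "%-10s -> %s\n", $id, fetch_method_for($id);
}

# With network access, the result can be used for dynamic dispatch:
#   use Bio::DB::GenBank;
#   my $db     = Bio::DB::GenBank->new;
#   my $method = fetch_method_for('J00522.1');
#   my $seq    = $db->$method('J00522.1');
```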

Bill


>
> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
> functionally identical.  They seem to trickle down to the same place
> and walking through these two requests yields almost identical http
> requests:
>
>   $db->get_Seq_by_version('J00522.1')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>
>   $db->get_Seq_by_acc('J00522')
>   GET
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>
> The only difference that I can see is that they index into different
> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
> sections contain the same information.
>
> I'd like a general purpose tool that does The Right Thing whether
> there's a .1 on the end of an identifier or not, and am just trying to
> make sure I'm not doing something troublesome.
>
> Am I correct about the above?
>
> While I'm at it, I think that the comment
>
>   # note that get_Stream_by_version is not implemented
>
> in Bio::DB::GenBank was made obsolete by whoever commented out the
>
>   $self->throw(...)
>
> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>
> I'll happily commit the trivial doc fix if no one shoots down the
> idea. (can't help big, might as well help small...).
>
> Thanks,
>
> g.


From maj at fortinbras.us  Mon Sep 14 01:26:57 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 13 Sep 2009 21:26:57 -0400
Subject: [Bioperl-l] Emacs bioperl-mode improved release
In-Reply-To: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
References: <DBDF390336FB4D8D935E2395E48207D7@NewLife>
Message-ID: <CCFD820881654749B1EA479B45A7EA28@NewLife>

Sorry-- just one more tweak--
the latest tar (r16073) eliminates the dependency on pod2text
entirely; source is now parsed for pod directly by an elisp function.
cheers MAJ 
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Saturday, September 12, 2009 11:14 PM
Subject: [Bioperl-l] Emacs bioperl-mode improved release


> Hi All--
> 
> [Future announcements/updates will be made on the wiki-
> http://bioperl.org/wiki/Emacs_bioperl-mode --
> put it on your watchlist...see the page for features and install
> info ]
> 
> Bioperl-mode (tar r16070) is improved:
> - fancy syntax and header highlighting for pod views
> - jump to .pm source from pod view (just press 'f')
> - full support for multiple paths
>  (e.g. "/usr/local/src/bioperl-live:/usr/local/src/bioperl-run"):
>  the completion flattens the paths; if you wind up having to 
>  make a choice (between, e.g., site-perl/5.10/Bio/Seq.pm
>  and mytweaks/Bio/Seq.pm), completion will let you choose
>  the path at the prompt.
> - BPMODE_PATH convenience environment 
>  variable is read for the search paths
> - other stuff I can't remember
> - there is a unit test suite under test.el of Wang Liang
>  in the dev path
> 
> To do this stuff, I've backed off Emacs 21 compatibility; 
> it'll bork (nicely) if you have 21. If there are "enough" complaints,
> I will relent, but 22 is cool for people like me with the 
> elisp disease.
> 
> Other technical issues remain; let me know and 
> I'll do my best. My goal is to make this something
> you can't live without. (And if you're not using
> Emacs, are you really living?)
> 
> M-x thanks
> 
> Mark
> 


From neetisomaiya at gmail.com  Mon Sep 14 08:22:43 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Mon, 14 Sep 2009 13:52:43 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
Message-ID: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>

Thanks a lot. This works for me.

I need help with one more thing. Can you point me to where exactly the
FASTA sequence that we are retrieving through this code is linked from
the gene's actual entry in Entrez Gene on the NCBI website
(http://www.ncbi.nlm.nih.gov/sites/entrez)?

-Neeti
Even my blood says, B positive


On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> That bit of code gave you the accession, start and end for the sequence, so you just needed to download it.
> Bio::DB::EUtilities can do that for you.
>
> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>
>
>
> --Russell
>
> ==================
> #!perl -w
>
> use strict;
> use Bio::DB::EntrezGene;
> use Bio::DB::EUtilities;
>
> no warnings 'deprecated';
>
> my $id = shift or die "Id?\n"; # use a Gene id
>
> my $db = new Bio::DB::EntrezGene;
> #$db->verbose(1);
> my $seq = $db->get_Seq_by_id($id);
>
> my $ac = $seq->annotation;
>
> for my $ann ($ac->get_Annotations('dblink')) {
>        if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
>                my ($acc,$from,$to) = $ann->url =~
>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>                print "$acc\t$from\t$to\n";
>
>                # retrieve the sequence
>                my $fetcher = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>                                                       -db      => 'nucleotide',
>                                                       -rettype => 'fasta');
>                $fetcher->set_parameters(-id        => $acc,
>                                         -seq_start => $from,
>                                         -seq_stop  => $to,
>                                         -strand    => 1);
>                my $seq = $fetcher->get_Response->content;
>                print $seq;
>
>        }
> }
>
> ======================
>
>> -----Original Message-----
>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>> To: Smithies, Russell
>> Cc: Emanuele Osimo; bioperl-l
>> Subject: Re: [Bioperl-l] need help urgently
>>
>> I actually want the nucleotide sequence of the gene. I thought the
>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>> then the seq method on that $seq_obj->seq() will give me the actual
>> genomic nucleotide sequence of the gene. But this doesn't happen. I am
>> able to print gene symbol using $seq_obj->display_id and able to do
>> other things, but I wanted the gene nucleotide sequence.
>>
>> -Neeti
>> Even my blood says, B positive
>>
>>
>>
>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>> > This example code from the wiki _definitely_ works:
>> >
>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>> > =========================================
>> >
>> > use strict;
>> > use Bio::DB::EntrezGene;
>> >
>> > my $id = shift or die "Id?\n"; # use a Gene id
>> >
>> > my $db = new Bio::DB::EntrezGene;
>> > $db->verbose(1); ###
>> >
>> > my $seq = $db->get_Seq_by_id($id);
>> >
>> > my $ac = $seq->annotation;
>> >
>> > for my $ann ($ac->get_Annotations('dblink')) {
>> >        if ($ann->database eq "Evidence Viewer") {
>> >                # get the sequence identifier, the start, and the stop
>> >                my ($contig,$from,$to) = $ann->url =~
>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>> >                print "$contig\t$from\t$to\n";
>> >        }
>> > }
>> >
>> > ======================================
>> >
>> > So if it doesn't work for you, there are a few things you need to check:
>> > * what version of BioPerl are you using?
>> > * are you behind a firewall?
>> > * are you using a proxy?
>> > * do you need to submit username/password for either of the 2 above
>> > * turn on 'verbose' messages, it may help you debug
>> >
>> >
>> > If you're still having problems, get back to me and I'll see if I can help.
>> >
>> > --Russell
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>> >> To: Emanuele Osimo; bioperl-l
>> >> Subject: Re: [Bioperl-l] need help urgently
>> >>
>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>> >> that you sent :
>> >>
>> >>
>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>> >>
>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>> >>
>> >> use Bio::DB::EntrezGene;
>> >>
>> >>     my $db = Bio::DB::EntrezGene->new;
>> >>
>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>> >>
>> >>     # or ...
>> >>
>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>> >>     while ( my $seq = $seqio->next_seq ) {
>> >>           print "id is ", $seq->display_id, "\n";
>> >>     }
>> >>
>> >> This doesn't seem to work.
>> >>
>> >>
>> >> -Neeti
>> >> Even my blood says, B positive
>> >>
>> >>
>> >>
>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>> >> > Hello,
>> >> > have you tried this?
>> >> >
>> >>
>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>> >> >
>> >> > Emanuele
>> >> >
>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I have an input list of gene names (can get gene ids from a local db
>> >> >> if required).
>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>> >> >> as to how this can be done using perl/bioperl?
>> >> >>
>> >> >> Any help will be deeply appreciated.
>> >> >>
>> >> >> Thanks.
>> >> >>
>> >> >> -Neeti
>> >> >> Even my blood says, B positive


From cavin.wardcaviness at gmail.com  Mon Sep 14 02:25:51 2009
From: cavin.wardcaviness at gmail.com (Cavin Ward-Caviness)
Date: Sun, 13 Sep 2009 22:25:51 -0400
Subject: [Bioperl-l] Beginner Script Error
Message-ID: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>

I am very new to perl and bioperl and figured I'd start learning by trying
to run a simple script to get BLAST data.  Here is the code I am trying to
run

use Bio::Perl;

$seq = get_sequence('swiss',"ROA1_HUMAN");

# uses the default database - nr in this case
$blast_result = blast_sequence($seq);

write_blast(">roa1.blast",$blast_result);

Instead of creating a file of the blast results I get the following error
message
Bio::SeqIO: swiss cannot be found.
Exception
Msg: Failed to load module Bio::SeqIO:swiss

It seems as though I may simply be missing the proper module.  I am running
bioperl 1.5.9_4 installed using the Perl Package Manager from the
instructions on the bioperl wiki page.  If I am simply missing a module
please let me know which one it is - and any other helpful modules that
someone in the bioinformatics field is likely to use.

Thanks,
Cavin


From joseguillin at hotmail.com  Mon Sep 14 12:48:28 2009
From: joseguillin at hotmail.com (Jose .)
Date: Mon, 14 Sep 2009 13:48:28 +0100
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print
	$jcmatrix->print_matrix; 
Message-ID: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules, such as Bio::SimpleAlign, do work.


My code is basically a modification of the code I found at http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, and is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                            -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats->distance(-align  => $aln,
                                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

>A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

>B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

>C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

>D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
          print $jcmatrix;
instead of
          print $jcmatrix->print_matrix;
I get the memory reference, as expected: ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.



From maj at fortinbras.us  Mon Sep 14 17:00:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 14 Sep 2009 13:00:24 -0400
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
Message-ID: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
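
A minimal sketch, not taken from the original script, of how one might check what a variable holds before calling a method on it. It uses only the core Scalar::Util module; describe_value and My::Matrix are illustrative names and are not part of BioPerl.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Describe what kind of value a variable holds before calling a method on it.
sub describe_value {
    my ($value) = @_;
    return 'undef'        unless defined $value;
    return 'plain scalar' unless ref $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return 'unblessed ' . reftype($value) . ' reference';
}

# Stand-in for a distance matrix returned as a plain array of arrays.
my $matrix = [ [ 0, 0.1 ], [ 0.1, 0 ] ];
print describe_value($matrix), "\n";

# Stand-in for a real object; My::Matrix is a made-up class name.
print describe_value(bless({}, 'My::Matrix')), "\n";
```

If a check like this reports an unblessed ARRAY reference for $jcmatrix, the distance call returned a plain array rather than a matrix object, which is consistent with the error message.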
MAJ


From jason at bioperl.org  Mon Sep 14 17:54:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 14 Sep 2009 10:54:55 -0700
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html
	print$jcmatrix->print_matrix; 
In-Reply-To: <7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
Message-ID: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>

Yeah it seems like more of a bioperl problem -- possible that the  
older code didn't recognize 'jukes-cantor' but you can try the  
abbreviation 'jc' -- better to just upgrade tho!
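
For reference, a rough sketch of the textbook Jukes-Cantor correction, d = -3/4 ln(1 - 4/3 p), where p is the observed proportion of differing sites. This is plain Perl and an assumption about the general method only; it is not the code inside Bio::Align::DNAStatistics, and the gap handling here (skip any column with a gap) may differ from what the module does. jc_distance is a made-up name.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned sequences of equal length.
# Columns where either sequence has a gap ('-') are skipped.
sub jc_distance {
    my ($seq_a, $seq_b) = @_;
    die "aligned sequences must have equal length\n"
        unless length($seq_a) == length($seq_b);

    my ($compared, $mismatches) = (0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $x = uc substr($seq_a, $i, 1);
        my $y = uc substr($seq_b, $i, 1);
        next if $x eq '-' or $y eq '-';
        $compared++;
        $mismatches++ if $x ne $y;
    }
    return undef unless $compared;

    my $p = $mismatches / $compared;    # observed proportion of differences
    return undef if $p >= 0.75;         # the logarithm is undefined at saturation
    return -0.75 * log(1 - (4 / 3) * $p);
}

# First ten columns of sequences A and B from the posted alignment.
printf "%.4f\n", jc_distance('GGTTATCTCA', 'GGATATCTCG');
```

Comparing a hand calculation like this against the module's output for one pair can help confirm that the method name was recognized.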

This isn't the cause of the problem, but I would also encourage use of
Bio::Matrix::IO for printing the matrix (use the 'write_matrix'
function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

> Hi Jose--
> I don't get any problem with your script as written. You should  
> upgrade to
> BioPerl 1.6 and try again.
> The "unblessed reference" is $jcmatrix. It may be undef for some  
> reason.
> MAJ

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From robert.bradbury at gmail.com  Mon Sep 14 19:34:52 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Mon, 14 Sep 2009 15:34:52 -0400
Subject: [Bioperl-l] Beginner Script Error
In-Reply-To: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
References: <f39d52e60909131925ye176745qad0a0a16d4353a17@mail.gmail.com>
Message-ID: <deaa866a0909141234p55341bcbhd4f713551180fed4@mail.gmail.com>

On 9/13/09, Cavin Ward-Caviness <cavin.wardcaviness at gmail.com> wrote:

> $seq = get_sequence('swiss',"ROA1_HUMAN");

Well, I haven't looked at the documentation or the source, but the
working code I've got that does something similar is:
            # database options include: Swissprot, EMBL, GenBank and RefSeq
            $seq_object = get_sequence('swissprot', $seqname);

I think the database names have to match specific strings but may not
need to be case specific.  The sequence names also tend to be specific
to each database's format, so my "general" fetch function catches
exceptions and then tries other databases if, for example, the name
looks like a PDB identifier.  I'm not sure whether there is a library
function which fetches a "general" sequence based on the sequence name
format.  Presumably one could do something like this with some kind of
"prioritized" list of databases to go through, e.g. GenBank, EMBL,
SwissProt, RefSeq, PDB, JDB, JGI, Broad, NCBI, C. elegans, Drosophila,
Yeast, and other organism-specific databases.

It might be nice if there were a "general" BioPerl function that would
do this based on sequence name format, locality (fetch from the nearest
database), and how up to date each database is.  Ultimately one might
like to have a kind of sequence "rsync" function of the form
UpdateSequence(SeqName, prefDb, last-update-date, update-size,
update-md5sum, ...) which would perform inexpensive network-based
updates for gene sets of interest.  I'm presuming that many sequence
entries in active databases undergo periodic updates, and thus one
might be interested in weekly or monthly "local" db updates.
Robert


From robert.bradbury at gmail.com  Tue Sep 15 08:05:22 2009
From: robert.bradbury at gmail.com (Robert Bradbury)
Date: Tue, 15 Sep 2009 04:05:22 -0400
Subject: [Bioperl-l] Genome scanning questions/strategies
Message-ID: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>

I have several applications which require scanning multiple genomes, in some
cases I can get away with scanning the protein sequences, in other cases I
need to scan the mRNA, or in the worst case the DNA sequences themselves.  I
have most of the available genomes on my hard drive but in cases where they
are not complete or undergo frequent revisions, I may need to interface
through the Genbank | Ensembl | JGI (or other?) databases.

Some of the applications are basic counting statistics:
1) How many proteins?
2) How many amino acids in the proteins?
3) What are the species-specific codon frequencies?
4) What fraction of the genome is ncRNA, junk DNA, etc.?
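
As a rough illustration of the counting statistics in the list above, here is a plain-Perl sketch of codon counting for a single in-frame coding sequence. The function names are made up, and it assumes the input is already a clean CDS starting in frame; reading real genome files would need a parser such as Bio::SeqIO on top of this.

```perl
use strict;
use warnings;

# Count codons in an in-frame coding sequence.
# Codons containing anything other than A, C, G or T are skipped,
# and a trailing partial codon is ignored.
sub codon_counts {
    my ($cds) = @_;
    $cds = uc $cds;
    my %count;
    for (my $i = 0; $i + 3 <= length($cds); $i += 3) {
        my $codon = substr($cds, $i, 3);
        next unless $codon =~ /^[ACGT]{3}$/;
        $count{$codon}++;
    }
    return \%count;
}

# Convert raw counts into relative frequencies that sum to 1.
sub codon_frequencies {
    my ($counts) = @_;
    my $total = 0;
    $total += $_ for values %$counts;
    my %freq;
    $freq{$_} = $counts->{$_} / $total for keys %$counts;
    return \%freq;
}

my $freq = codon_frequencies(codon_counts('ATGGCCGCCTAA'));
printf "%s\t%.2f\n", $_, $freq->{$_} for sort keys %$freq;
```

Summing the per-sequence counts over every CDS in a genome before converting to frequencies would give the species-level table.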

Other applications involve some functional analysis, e.g. find all specified
protein domains of interest (presumably some HMM matching or equivalent),
find all signal sequences (nuclear targeting, mitochondrial targeting, ER
targeting, etc.), find all mRNA restriction enzyme cut sites, etc..

Questions are:
1) Are there "remote" functions that use genome center "supercomputers"
(other than say Remote Blast) that can be used for some of these purposes
and are interfaced in some way to BioPerl?
2) Will I incur genome center wrath by running all my queries "remotely"
(i.e. I do the computing, but they handle the database retrieval & network
distribution)?  If not, what is a good "max query frequency"? [I'm on a DSL
line, so I can't push most servers very hard from an I/O standpoint.]

Finally, is there any "archive of experience" documenting the
information-systems limitations of various bioinformatics applications?
I.e., in terms of I/O and/or CPU requirements, is it true that BLAST <
HMM-domain-searching < Inter-genome-signal-scanning/matching?  This relates
to the question of when home-based bioinformaticians need to begin
considering a switch from DSL to Cable to FIOS, and whether 1/3/4/6/8-core
machines/clusters can handle the workload.

Thank you,
Robert Bradbury


From neetisomaiya at gmail.com  Tue Sep 15 08:29:02 2009
From: neetisomaiya at gmail.com (Neeti Somaiya)
Date: Tue, 15 Sep 2009 13:59:02 +0530
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
References: <764978cf0909032349t5d8683f3h3b2ccc506392c984@mail.gmail.com>
	<2ac05d0f0909040039v4d6fb77fw8793b43add632e3a@mail.gmail.com>
	<764978cf0909070304w598d4bb5m51ad4e66f57cc1cf@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C53A3@exchsth.agresearch.co.nz>
	<764978cf0909072127n830d4e8x95d15a758fa919db@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B624C5607@exchsth.agresearch.co.nz>
	<764978cf0909140122h3fe74b80lec7118e3edde24f9@mail.gmail.com>
Message-ID: <764978cf0909150129s69817921j82a9ca112aefe7ae@mail.gmail.com>

When I use Bio::DB::EntrezGene and EUtilities, the accession and
sequence returned for a gene are those of the second accession
listed under "Genome Reference Consortium Human Build 37 Primary
Assembly". For example, for Entrez Gene id 3630, the code returns
accession NT_009237.18. But I actually want the sequence of
the first accession, i.e. NC_000011.9.

Please let me know how I could get that. Any help would be great.
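
To make the question concrete, here is a rough sketch of the selection step I have in mind: parse each link with the regex from the earlier code and prefer a chromosome-level (NC_) accession over a contig (NT_) one. The URLs and coordinates below are made up for illustration, parse_region and pick_region are hypothetical names, and I do not know whether the Entrez Gene annotations actually contain an NC_ link.

```perl
use strict;
use warnings;

# Pull accession, start and stop out of a link using the regex from the thread.
sub parse_region {
    my ($url) = @_;
    my ($acc, $from, $to) = $url =~ /contig=([^&]+).+from=(\d+)&to=(\d+)/;
    return unless defined $acc;
    return { acc => $acc, from => $from, to => $to };
}

# Prefer a chromosome-level (NC_) record; fall back to the first region found.
sub pick_region {
    my @regions = grep { defined $_ } map { parse_region($_) } @_;
    my ($chromosome) = grep { $_->{acc} =~ /^NC_/ } @regions;
    return $chromosome || $regions[0];
}

# Hypothetical URLs; only the two accessions come from the real entry.
my @urls = (
    'http://example.org/view?contig=NT_009237.18&x=1&from=100&to=200',
    'http://example.org/view?contig=NC_000011.9&x=1&from=5000&to=5100',
);
my $best = pick_region(@urls);
print join("\t", @{$best}{qw(acc from to)}), "\n";
```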

-Neeti
Even my blood says, B positive


On Mon, Sep 14, 2009 at 1:52 PM, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
> Thanks a lot. This works for me.
>
> I need one more help, can you point me to where exactly can we find
> the link to this FASTA sequence, that we are retrieving here through
> the code, in its actual entry in Entrez Gene in the NCBI website
> (http://www.ncbi.nlm.nih.gov/sites/entrez)
>
> -Neeti
> Even my blood says, B positive
>
>
>
> On Tue, Sep 8, 2009 at 10:11 AM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
>> That bit of code gave you the accession, start and end for the sequence, so you just needed to download it.
>> Bio::DB::EUtilities can do that for you.
>>
>> Did you take a look at http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
>>
>>
>>
>> --Russell
>>
>> ==================
>> #!perl -w
>>
>> use strict;
>> use Bio::DB::EntrezGene;
>> use Bio::DB::EUtilities;
>>
>> no warnings 'deprecated';
>>
>> my $id = shift or die "Id?\n"; # use a Gene id
>>
>> my $db = new Bio::DB::EntrezGene;
>> #$db->verbose(1);
>> my $seq = $db->get_Seq_by_id($id);
>>
>> my $ac = $seq->annotation;
>>
>> for my $ann ($ac->get_Annotations('dblink')) {
>>        if ($ann->database eq "Evidence Viewer") {
>>                # get the sequence identifier, the start, and the stop
>>                my ($acc,$from,$to) = $ann->url =~
>>                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>                print "$acc\t$from\t$to\n";
>>
>>                # retrieve the sequence
>>                my $fetcher = Bio::DB::EUtilities->new(-eutil => 'efetch',
>>                                           -db    => 'nucleotide',
>>                                           -rettype => 'fasta');
>>            $fetcher->set_parameters(-id => $acc,
>>                                                -seq_start => $from,
>>                                                -seq_stop  => $to,
>>                                                -strand    => 1);
>>            my $seq = $fetcher->get_Response->content;
>>            print $seq;
>>
>>        }
>> }
>>
>> ======================
>>
>>> -----Original Message-----
>>> From: Neeti Somaiya [mailto:neetisomaiya at gmail.com]
>>> Sent: Tuesday, 8 September 2009 4:28 p.m.
>>> To: Smithies, Russell
>>> Cc: Emanuele Osimo; bioperl-l
>>> Subject: Re: [Bioperl-l] need help urgently
>>>
>>> I actually want the nucleotide sequence of the gene. I thought the
>>> Bio::DB::EntrezGene would give me a seq_obj for an entrez gene id and
>>> then the seq method on that $seq_obj->seq() will give me the actual
>>> genomic nucleotide sequence of the gene. But this doesn't happen. I am
>>> able to print gene symbol using $seq_obj->display_id and able to do
>>> other things, but I wanted the gene nucleotide sequence.
>>>
>>> -Neeti
>>> Even my blood says, B positive
>>>
>>>
>>>
>>> On Tue, Sep 8, 2009 at 1:56 AM, Smithies,
>>> Russell<Russell.Smithies at agresearch.co.nz> wrote:
>>> > This example code from the wiki _definitely_ works:
>>> >
>>> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> > =========================================
>>> >
>>> > use strict;
>>> > use Bio::DB::EntrezGene;
>>> >
>>> > my $id = shift or die "Id?\n"; # use a Gene id
>>> >
>>> > my $db = new Bio::DB::EntrezGene;
>>> > $db->verbose(1); ###
>>> >
>>> > my $seq = $db->get_Seq_by_id($id);
>>> >
>>> > my $ac = $seq->annotation;
>>> >
>>> > for my $ann ($ac->get_Annotations('dblink')) {
>>> >        if ($ann->database eq "Evidence Viewer") {
>>> >                # get the sequence identifier, the start, and the stop
>>> >                my ($contig,$from,$to) = $ann->url =~
>>> >                  /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>>> >                print "$contig\t$from\t$to\n";
>>> >        }
>>> > }
>>> >
>>> > ======================================
>>> >
>>> > So if it doesn't work for you, there are a few things you need to check:
>>> > * what version of BioPerl are you using?
>>> > * are you behind a firewall?
>>> > * are you using a proxy?
>>> > * do you need to submit username/password for either of the 2 above
>>> > * turn on 'verbose' messages, it may help you debug
>>> >
>>> >
>>> > If you're still having problems, get back to me and I'll see if I can help.
>>> >
>>> > --Russell
>>> >
>>> >
>>> >> -----Original Message-----
>>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Neeti Somaiya
>>> >> Sent: Monday, 7 September 2009 10:04 p.m.
>>> >> To: Emanuele Osimo; bioperl-l
>>> >> Subject: Re: [Bioperl-l] need help urgently
>>> >>
>>> >> I tried using EntrezGene instead of GenBank, as is given in the link
>>> >> that you sent :
>>> >>
>>> >>
>>> >> http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database
>>> >>
>>> >> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/DB/EntrezGene.html
>>> >>
>>> >> use Bio::DB::EntrezGene;
>>> >>
>>> >>     my $db = Bio::DB::EntrezGene->new;
>>> >>
>>> >>     my $seq = $db->get_Seq_by_id(2); # Gene id
>>> >>
>>> >>     # or ...
>>> >>
>>> >>     my $seqio = $db->get_Stream_by_id([2, 4693, 3064]); # Gene ids
>>> >>     while ( my $seq = $seqio->next_seq ) {
>>> >>           print "id is ", $seq->display_id, "\n";
>>> >>     }
>>> >>
>>> >> This doesn't seem to work.
>>> >>
>>> >>
>>> >> -Neeti
>>> >> Even my blood says, B positive
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Sep 4, 2009 at 1:09 PM, Emanuele Osimo<e.osimo at gmail.com> wrote:
>>> >> > Hello,
>>> >> > have you tried this?
>>> >> >
>>> >>
>>> >> > http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates
>>> >> >
>>> >> > Emanuele
>>> >> >
>>> >> > On Fri, Sep 4, 2009 at 08:49, Neeti Somaiya <neetisomaiya at gmail.com> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I have an input list of gene names (can get gene ids from a local db
>>> >> >> if required).
>>> >> >> I need to fetch sequences of these genes. Can someone please guide me
>>> >> >> as to how this can be done using perl/bioperl?
>>> >> >>
>>> >> >> Any help will be deeply appreciated.
>>> >> >>
>>> >> >> Thanks.
>>> >> >>
>>> >> >> -Neeti
>>> >> >> Even my blood says, B positive
>>> >> >> _______________________________________________
>>> >> >> Bioperl-l mailing list
>>> >> >> Bioperl-l at lists.open-bio.org
>>> >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> >> >
>>> >> >
>>> >
>>
>


From cjfields at illinois.edu  Tue Sep 15 19:07:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 15 Sep 2009 14:07:40 -0500
Subject: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
In-Reply-To: <4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
References: <E5D7B830-6D19-47D2-8D5E-716B4CF84F0B@illinois.edu><CB38C203-7253-4AEE-A6E3-922243B290D9@gmx.net>
	<3163670B-51E3-419F-835B-304BB52E1037@illinois.edu>
	<1CF993D6D3AC435CA77127466D6C072A@NewLife>
	<4415308D-81DC-4F68-A6CF-E08FD03D1D6E@illinois.edu>
Message-ID: <DE7BC2E3-F983-447F-86AD-34BFEA3B232A@illinois.edu>

I don't see an update to Bio::Phylo on CPAN yet, so I'm assuming we  
will leave Nexml off the 1.6.1 alpha for now.  I'll likely be  
releasing it later today or tomorrow to CPAN.
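
For the version-check option from earlier in this thread (quoted below), here is a rough sketch of how an RC-style $VERSION such as '1.7_RC9' might be compared. The helper names parse_rc_version and compare_rc_versions are made up for illustration. It assumes the numeric part is a plain decimal and that an RC sorts before the final release with the same number; it is not what Bio::Phylo or the BioPerl test suite actually does.

```perl
use strict;
use warnings;

# Hypothetical helper: split '1.7_RC9' into (1.7, 9); a plain '1.6'
# gives (1.6, undef). Returns an empty list if the string does not fit.
sub parse_rc_version {
    my ($v) = @_;
    return unless defined $v && $v =~ /^(\d+(?:\.\d+)?)(?:_RC(\d+))?$/;
    return ($1, $2);
}

# Hypothetical helper: returns -1, 0 or 1, like <=>.
# Assumption: an RC sorts before the final release with the same number.
sub compare_rc_versions {
    my ($x, $y) = @_;
    my ($xn, $xrc) = parse_rc_version($x) or die "cannot parse '$x'\n";
    my ($yn, $yrc) = parse_rc_version($y) or die "cannot parse '$y'\n";
    return $xn <=> $yn if $xn != $yn;
    return 0  if !defined $xrc && !defined $yrc;
    return 1  if !defined $xrc;      # final release beats any RC
    return -1 if !defined $yrc;
    return $xrc <=> $yrc;
}

# Is the installed version older than what the tests would need?
print "skip Nexml tests\n" if compare_rc_versions('1.6', '1.7_RC9') < 0;
```

A test file could use a comparison like this to skip the Nexml tests cleanly instead of failing when an older Bio::Phylo is installed.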

chris

On Sep 8, 2009, at 10:43 AM, Chris Fields wrote:

> Mark
>
> We can hold it in trunk until the next point release or we start  
> splitting things off (whichever is first).
>
> I have a little more time, though, and I'm thinking it would be a  
> good idea to get the Nexml code into the wild (sooner than later)  
> for users to test out.  Let's see if Rutger responds.
>
> chris
>
> On Sep 8, 2009, at 9:39 AM, Mark A. Jensen wrote:
>
>> I agree with Hilmar-- I have no problem keeping it in the trunk for  
>> a while
>> longer, as I have an addition for dealing with arbitrary non-seq
>> data using the Population API sitting in bioperl-dev that's nearly
>> ready, but prob. not before cjf wants to get the release out.
>> ----- Original Message -----
>> From: "Chris Fields" <cjfields at illinois.edu>
>> To: "Hilmar Lapp" <hlapp at gmx.net>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Rutger A. Vos" <rutgeraldo at gmail.com>
>> Sent: Tuesday, September 08, 2009 9:15 AM
>> Subject: Re: [Bioperl-l] Significant blocker for 1.6.1 : Nexml
>>
>>
>>> On Sep 8, 2009, at 7:16 AM, Hilmar Lapp wrote:
>>>
>>>> I'd suspect that the latest Bio::Phylo changes have been due for   
>>>> CPAN release anyway, so unless those are unstable that seems  
>>>> like  the easiest fix to me.
>>>
>>> My thought as well, just not sure how stable that code is right  
>>> now. Bio::Phylo has been in RC for a while now, correct?
>>>
>>>> If the Nexml code works against not yet stable updates to   
>>>> Bio::Phylo, it shouldn't be in a BioPerl stable release, right?
>>>
>>> Right.  That should be sorted out first.
>>>
>>> I can wait a bit longer for Rutger to respond; there are a few
>>> other odds and ends that can be worked on in the meantime.  I
>>> would like to get the alpha out soon and 1.6.1 in the next week
>>> or so though.
>>>
>>> chris
>>>
>>>> -hilmar
>>>>
>>>> On Sep 8, 2009, at 12:23 AM, Chris Fields wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I'm running into a pretty significant blocker for 1.6.1 re:  
>>>>> Chase's  Nexml code.  In particular, I have tried three versions  
>>>>> of  Bio::Phylo; the default CPAN installation (1.6), the latest  
>>>>> CPAN RC  (1.7_RC9, not installed by default), and the latest  
>>>>> from Bio::Phylo  svn:
>>>>>
>>>>> https://nexml.svn.sourceforge.net/svnroot/nexml/trunk/nexml/perl
>>>>>
>>>>> At this moment only the Bio::Phylo code from svn is working  
>>>>> with  BioPerl's Nexml modules.  From my local tests Bio::Phylo  
>>>>> 1.6  appears to be missing Bio::Phylo::Factory (all Nexml tests  
>>>>> fail),  whereas 1.7_RC9 has some kind of versioning issue  
>>>>> (again, all tests  fail).  The problem: CPAN will always install  
>>>>> 1.6 (the others are  RC, so they won't be installed unless the  
>>>>> full path is used).  Even  so, nothing on CPAN even works; one  
>>>>> must use the latest Bio::Phylo  SVN code.
>>>>>
>>>>> ATM I'm just not seeing how this can be released with 1.6.1  
>>>>> right  now, unless one of the following occurs:
>>>>>
>>>>> 1) Rutger V. drops a quick non-RC release to CPAN,
>>>>> 2) check for the minimal working Bio::Phylo version and safely  
>>>>> skip  any Nexml-related tests unless proper version is present  
>>>>> (not easy  with a $VERSION like '1.7_RC9'),
>>>>> 3) push Nexml into its own distribution (something we were
>>>>> planning on anyway with a number of modules)
>>>>>
>>>>> As for #3 above, I think it probably belongs in a larger
>>>>> bioperl-phylo as Mark had previously proposed.  I'm open to
>>>>> just about any solution.
>>>>>
>>>>> chris
>>>>
>>>> -- 
>>>> ===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>>
>>>>
>>>>
>>>
>>>
>>
>


From cjfields at illinois.edu  Wed Sep 16 12:55:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 07:55:56 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
Message-ID: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>

Bill, George,

It's worth clarifying the docs on these and adding a TODO for them  
(and test cases!), but I tend to agree.  I believe, re: version, we  
can possibly use Bio::DB::SeqVersion to grab the right one, but it'll  
need further investigation.

As for generic accession w/o version, efetch does support it but it  
does have problems (pulling up more than one sequence in rare cases,  
for instance).
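
A general-purpose wrapper along the lines George asked about could look roughly like the following. It is only a sketch: split_accession and fetch_seq are made-up names, and $db is assumed to be any object that provides get_Seq_by_version and get_Seq_by_acc (as Bio::DB::GenBank does). It does nothing about the multiple-sequence problem mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: split 'J00522.1' into ('J00522', 1);
# an unversioned 'J00522' gives ('J00522', undef).
sub split_accession {
    my ($id) = @_;
    my ($acc, $ver) = $id =~ /^([^.\s]+)(?:\.(\d+))?$/
        or die "unexpected identifier '$id'\n";
    return ($acc, $ver);
}

# Hypothetical wrapper: use the versioned lookup when a version is
# present, and fall back to the plain accession lookup otherwise.
sub fetch_seq {
    my ($db, $id) = @_;
    my (undef, $ver) = split_accession($id);
    return defined $ver ? $db->get_Seq_by_version($id)
                        : $db->get_Seq_by_acc($id);
}
```

With a live connection, $db would be something like Bio::DB::GenBank->new, and fetch_seq($db, 'J00522.1') would then go through get_Seq_by_version.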

chris

On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:

> I would like to make a few comments about get_Seq_by_version and
> get_Seq_by_acc. Although both functions use the same NCBI eUtils  
> API, they
> are interpreted differently for a Seq_id with version or without  
> version.
>
> 1. If the Seq_id has a version, GenBank ID server will locate
> corresponding GI and emit the correct sequence.
> 2. If the Seq_id does not have a version, GBDataLoader will try to find
> the latest version number for that Seq_id, which is relatively slower, and
> the version number the ID server finds may NOT always be the latest.
>
> IMHO, for both efficiency and consistency,
> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>
> Bill
>
>
>>
>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>> functionally identical.  They seem to trickle down to the same place
>> and walking through these two requests yields almost identical http
>> requests:
>>
>>  $db->get_Seq_by_version('J00522.1')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>
>>  $db->get_Seq_by_acc('J00522')
>>  GET
>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>
>> The only difference that I can see is that they index into different
>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>> sections contain the same information.
>>
>> I'd like a general purpose tool that does The Right Thing whether
>> there's a .1 on the end of an identifier or not, and am just trying  
>> to
>> make sure I'm not doing something troublesome.
>>
>> Am I correct about the above?
>>
>> While I'm at it, I think that the comment
>>
>>  # note that get_Stream_by_version is not implemented
>>
>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>
>>  $self->throw(...)
>>
>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>
>> I'll happily commit the trivial doc fix if no one shoots down the
>> idea. (can't help big, might as well help small...).
>>
>> Thanks,
>>
>> g.
>>
>
>


From cjfields at illinois.edu  Wed Sep 16 13:22:00 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 08:22:00 -0500
Subject: [Bioperl-l] Genome scanning questions/strategies
In-Reply-To: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
References: <deaa866a0909150105wcc651c5n4a50033d0392bbda@mail.gmail.com>
Message-ID: <8674BA8B-ACCC-4C7D-989E-3532C0659A3F@illinois.edu>

On Sep 15, 2009, at 3:05 AM, Robert Bradbury wrote:

> I have several applications which require scanning multiple genomes,  
> in some
> cases I can get away with scanning the protein sequences, in other  
> cases I
> need to scan the mRNA, or in the worst case the DNA sequences  
> themselves.  I
> have most of the available genomes on my hard drive but in cases  
> where they
> are not complete or undergo frequent revisions, I may need to  
> interface
> through the Genbank | Ensembl | JGI (or other?) databases.
>
> Some of the applications are basic counting statistics:
> 1) How many proteins?
> 2) How many amino acids in the proteins?
> 3) What are the species specific codon frequencies in the codons?
> 4) What fraction of the genome is ncRNA, junk DNA, etc.?
>
> Other applications involve some functional analysis, e.g. find all  
> specified
> protein domains of interest (presumably some HMM matching or  
> equivalent),
> find all signal sequences (nuclear targeting, mitochondrial  
> targeting, ER
> targeting, etc.), find all mRNA restriction enzyme cut sites, etc..
>
> Questions are:
> 1) Are there "remote" functions that use genome center  
> "supercomputers"
> (other than say Remote Blast) that can be used for some of these  
> purposes
> and are interfaced in some way to BioPerl?

Re: remote tasks, there are a few tools for that.  See  
Bio::Tools::Analysis modules for ones that access remote servers, or  
the HOWTO:

http://www.bioperl.org/wiki/HOWTO:Simple_web_analysis

Setting up modules for these services can be risky, though, as we have  
no control over the continued evolution of the remote servers in  
question.  For instance, we had a set of Pise modules (around 100 I  
think) for remotely accessing services at any Pise server; however,  
these are now obsolete in favor of Mobyle.  I have long thought of  
setting something up to interface with either that service or Galaxy  
(which may be a more stable alternative), just haven't had the time.

Re databases: we have access to NCBI, EMBL, UniProt, and many others.   
NCBI eutils are available via Bio::DB::EUtilities.  You can use the  
Ensembl perl API for accessing Ensembl (including Compara and others),  
and Mark Jensen added Bio::DB::HIV for accessing HIV database  
information at LANL HIV Sequence Database.  These were all working  
with bioperl 1.6 last I tried (ensembl's API is separate and available  
from their website).

We don't have much beyond that, primarily b/c most other centers are  
very particular when queried remotely and will block IPs that spam  
their servers w/o an adequate timeout.  That's completely  
understandable from a webadmin perspective (think: possible denial of  
service attack).

> 2) Will I incur genome center wrath by running all my queries  
> "remotely"
> (i.e. I do the computing, but they handle the database retrieval &
> network
> distribution)?  If not, what is a good "max query frequency"? [I'm  
> on a DSL
> line, so I can't push most servers very hard from an I/O standpoint.]

You may if you abuse a specified timeout.  UCSC and NCBI both have  
been known to block IPs, but the timeout is quite different between  
the two (NCBI just reduced theirs to three queries per second, whereas  
I last heard UCSC was once per 30 seconds).

The best thing to do is check the documentation for the site in  
question or contact the webadmin to see if there is a requested  
timeout period.
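
To stay under a limit like that, a simple client-side throttle is usually enough. Below is a minimal sketch using only core Perl; make_throttle is a made-up helper, the identifiers are placeholders, and the interval should be whatever the site in question actually asks for.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: returns a closure that blocks until at least
# $min_interval seconds have passed since the previous call.
sub make_throttle {
    my ($min_interval) = @_;
    my $last;
    return sub {
        if (defined $last) {
            my $wait = $min_interval - (time() - $last);
            sleep($wait) if $wait > 0;
        }
        $last = time();
        return;
    };
}

# Example: at most about three requests per second.
my $pause = make_throttle(1 / 3);
for my $id (qw(id_1 id_2 id_3)) {      # placeholder identifiers
    $pause->();
    print "would fetch $id now\n";     # replace with the real request
}
```

For a server that asks for one query per 30 seconds, the same closure would be created with make_throttle(30).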

> Finally, is there any "archive of experience" documenting the various
> information systems limitations on various bioinformatics  
> applications?
> I.e. for I/O requirements and/or CPU requirements, is: BLAST <
> HMM-domain-searching < Inter-genome-signal-scanning/matching?   
> Relates to
> the question of when home based bioinformaticians need to begin considering
> switching from DSL to Cable to FIOS and/or 1/3/4/6/8 core machines/clusters
> can handle the workload.
>
> Thank you,
> Robert Bradbury

On that I'm not sure, but I would tend to think they don't want you  
taxing their local servers so there probably is some prioritization of  
tasks.

 From my perspective, if I were a home-based bioinformatician I would  
look seriously at cloud computing for most high-end tasks (Mark has  
even set up one for bioperl, bioperl-max).  It has a cost but it's  
very reasonable considering the cost of setting up a local cluster,  
maintenance and repairs, etc.  In fact, we have been putting serious  
thought into testing that direction instead of putting money into  
another high-cost local cluster, which is obsolete in, say, 3-4 years,  
or when we're getting Blue Waters in a couple years.

chris


From jajams at utu.fi  Wed Sep 16 10:04:18 2009
From: jajams at utu.fi ("Joonas Jämsen")
Date: Wed, 16 Sep 2009 13:04:18 +0300
Subject: [Bioperl-l] problem with a script
Message-ID: <fb44a91e1ccd0.4ab0e252@utu.fi>

Hi,

I'm trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."


#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use Bio::SearchIO
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $in->next_result ) {
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp-evalue<=10 ) {
             while ( my $hsp = $hit->next_hsp ) {
                 print $hit->accession(), "\n";
         }
     }
 }

Could someone tell me what is wrong?

Thanks.


From maj at fortinbras.us  Wed Sep 16 15:18:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 11:18:26 -0400
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <A9C32C43FB5C46FD9DC320A4DD325104@NewLife>

Hi Joonas-- 

Put a semicolon after "use Bio::SearchIO" in line 2.
If that doesn't work, then the error suggests that $searchio is undefined 
because the parser failed for some reason.
You could try
 my $searchio = Bio::SearchIO->new(-format  => 'hmmer',
                                   -file    => '/wrk/xxxx/hmm/hmmsearch_nr.out',
                                   -verbose => 1);
to get more detailed error messages, they may direct you to the issue.

cheers MAJ

----- Original Message ----- 
From: "Joonas Jämsen" <jajams at utu.fi>
To: "bioperl list" <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 16, 2009 6:04 AM
Subject: [Bioperl-l] problem with a script


> Hi,
>
> Im trying to run the script below and I get an error: "Can't call method 
> "next_result" on an undefined value at parser.pl line 5."
>
>
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp-evalue<=10 ) {
>             while ( my $hsp = $hit->next_hsp ) {
>                 print $hit->accession(), "\n";
>         }
>     }
> }
>
> Could someone tell me what is wrong?
>
> Thanks.
>
>
>
> 


From Kevin.M.Brown at asu.edu  Wed Sep 16 15:16:51 2009
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 16 Sep 2009 08:16:51 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <1A4207F8295607498283FE9E93B775B4063D4CB6@EX02.asurite.ad.asu.edu>

That's because the variable $in isn't defined, just like the error says. You are setting $searchio to be your input object, but not using it.

#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
use strict; #<-- this helps to find those pesky undeclared variables
use Bio::SearchIO;
my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
while ( my $result = $searchio->next_result ) { # <-- changed this line
     while ( my $hit = $result->next_hit ) {
         while ( my $hsp = $hit->next_hsp ) {
             next unless $hsp->evalue <= 10; # <-- e-value test moved inside the HSP loop
             print $hit->accession(), "\n";
         }
     }
 }


Kevin Brown
Center for Innovations in Medicine
Biodesign Institute
Arizona State University  

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> "Joonas Jämsen"
> Sent: Wednesday, September 16, 2009 3:04 AM
> To: bioperl list
> Subject: [Bioperl-l] problem with a script
> 
> Hi,
> 
> Im trying to run the script below and I get an error: "Can't 
> call method "next_result" on an undefined value at parser.pl line 5."
> 
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   
> => '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>      while ( my $hit = $result->next_hit ) {
>          while ( my $hsp-evalue<=10 ) {
>              while ( my $hsp = $hit->next_hsp ) {
>                  print $hit->accession(), "\n";
>          }
>      }
>  }
> 
> Could someone tell me what is wrong?
> 
> Thanks.
> 
> 
> 


From rmb32 at cornell.edu  Wed Sep 16 15:05:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 08:05:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <fb44a91e1ccd0.4ab0e252@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi>
Message-ID: <4AB0FEAC.50104@cornell.edu>

1.) You need to use strict.  Always have use strict at the top of your 
code.  That would have caught this error.
2.) The proximate problem here is that your searchio object is called 
$searchio, while you are calling $in->next_result.  You want 
$searchio->next_result instead.
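
To illustrate point 1, here is a small self-contained sketch (plain Perl, no BioPerl needed). The fragment is compiled from a string only so that the failure can be captured and printed; the stand-in value for $searchio is an assumption made for the demo.

```perl
use strict;
use warnings;

# The buggy fragment, compiled from a string so that the failure can be
# captured instead of aborting this script.
my $fragment = <<'END';
use strict;
my $searchio = 1;    # stand-in for the Bio::SearchIO object
while ( my $result = $in->next_result ) { }    # typo: $in was never declared
1;
END

my $ok  = eval $fragment;
my $err = $@;
print $ok ? "compiled\n" : "strict says: $err";
```

Under strict the fragment never runs at all; Perl refuses to compile it and names the undeclared variable, which is a much clearer hint than the run-time "undefined value" message.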

Rob

Joonas Jämsen wrote:
> Hi,
> 
> Im trying to run the script below and I get an error: "Can't call method "next_result" on an undefined value at parser.pl line 5."
> 
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> use Bio::SearchIO
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => '/wrk/xxxx/hmm/hmmsearch_nr.out');
> while ( my $result = $in->next_result ) {
>      while ( my $hit = $result->next_hit ) {
>          while ( my $hsp-evalue<=10 ) {
>              while ( my $hsp = $hit->next_hsp ) {
>                  print $hit->accession(), "\n";
>          }
>      }
>  }
> 
> Could someone tell me what is wrong?
> 
> Thanks.
> 
> 
> 


From bill at genenformics.com  Wed Sep 16 17:22:56 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Wed, 16 Sep 2009 10:22:56 -0700
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
Message-ID: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>


>
> As for generic accession w/o version, efetch does support it but it
> does have problems (pulling up more than one sequence in rare cases,
> for instance).
>

This is probably because NCBI ID servers are not completely synchronized
or are in the process of synchronization. get_Seq_by_acc is not as safe as
other functions.

Bill

>
> On Sep 13, 2009, at 10:47 AM, bill at genenformics.com wrote:
>
>> I would like to make a few comments about get_Seq_by_version and
>> get_Seq_by_acc. Although both functions use the same NCBI eUtils
>> API, they
>> are interpreted differently for a Seq_id with version or without
>> version.
>>
>> 1. If the Seq_id has a version, GenBank ID server will locate
>> corresponding GI and emit the correct sequence.
>> 2. If the Seq_id does not have a version, GBDataLoader will try to find
>> the latest version number for that Seq_id, which is relatively slower, and
>> the version number the ID server finds may NOT always be the latest.
>>
>> IMHO, for both efficiency and consistency,
>> get_Seq_by_gi > get_Seq_by_version >> get_Seq_by_acc
>>
>> Bill
>>
>>
>>>
>>> It looks like Bio::DB::GenBank::get_Seq_by_{version,acc} are
>>> functionally identical.  They seem to trickle down to the same place
>>> and walking through these two requests yields almost identical http
>>> requests:
>>>
>>>  $db->get_Seq_by_version('J00522.1')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522.1&usehistory=n
>>>
>>>  $db->get_Seq_by_acc('J00522')
>>>  GET
>>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&rettype=gbwithparts&db=nucleotide&tool=bioperl&id=J00522&usehistory=n
>>>
>>> The only difference that I can see is that they index into different
>>> sections of %PARAMSTRING defined in Bio::DB::GenBank, but those
>>> sections contain the same information.
>>>
>>> I'd like a general purpose tool that does The Right Thing whether
>>> there's a .1 on the end of an identifier or not, and am just trying
>>> to
>>> make sure I'm not doing something troublesome.
>>>
>>> Am I correct about the above?
>>>
>>> While I'm at it, I think that the comment
>>>
>>>  # note that get_Stream_by_version is not implemented
>>>
>>> in Bio::DB::GenBank was made obsolete by whoever commented out the
>>>
>>>  $self->throw(...)
>>>
>>> in get_Stream_by_version in Bio::WebDBSeqI.pm.
>>>
>>> I'll happily commit the trivial doc fix if no one shoots down the
>>> idea. (can't help big, might as well help small...).
>>>
>>> Thanks,
>>>
>>> g.
>>>
>>
>>
>
>


From cjfields at illinois.edu  Wed Sep 16 17:29:40 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 12:29:40 -0500
Subject: [Bioperl-l] Bio::DB::GenBank question (acc vs. version)
In-Reply-To: <6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
References: <19116.19568.26115.542911@already.dhcp.gene.com>
	<02cbfb3dfbb309f0b62cecd122bb5c2c.squirrel@mail.dreamhost.com>
	<0B8829A4-03EE-4BA0-8CF8-218782ED2630@illinois.edu>
	<6785fd2ac57ff4389dcbcd6b0e0861ae.squirrel@mail.dreamhost.com>
Message-ID: <B293F929-5714-4840-8FAD-7366F7C36137@illinois.edu>


On Sep 16, 2009, at 12:22 PM, bill at genenformics.com wrote:

>
>>
>> As for generic accession w/o version, efetch does support it but it
>> does have problems (pulling up more than one sequence in rare cases,
>> for instance).
>>
>
> This is probably because NCBI ID servers are not completely  
> synchronized
> or are in the process of synchronization. get_Seq_by_acc is not as  
> safe as
> other functions.
>
> Bill

Right, but unfortunately it's necessary as the default in most cases  
is to grab/display the accession, not the UID.  For instance, BLAST  
output must be specifically flagged to display the GI.

This is an instance where documentation would be a good idea to  
indicate the problem.  I think I have done that but I'll double-check.

chris


From rmb32 at cornell.edu  Wed Sep 16 19:04:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:04:16 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB1356D.4050307@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi>
Message-ID: <4AB136B0.6050304@cornell.edu>

You should also 'use warnings' at the top of all code.  That would have 
caught THIS error.

You are missing a comma after ....nr.out'
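
Here is why the message says "Could not open 0". Without the comma, Perl treats the minus sign as subtraction between the file name and the word 'verbose'. The sketch below reproduces that with a placeholder path and plain lists, so it runs without BioPerl.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $path = '/tmp/example.out';    # placeholder file name

# Missing comma: Perl reads this as  $path - 'verbose'  => 1
my @broken = (-file => $path -verbose => 1);

# With the comma, the list has the intended key/value pairs.
my @fixed  = (-file => $path, -verbose => 1);

print "broken: @broken\n";
print "fixed:  @fixed\n";
print "warning: $_" for @warnings;
```

Both strings are non-numeric, so the subtraction yields 0 and -file ends up set to 0. With warnings enabled, Perl complains that the arguments are not numeric, which points straight at the line in question.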

Rob

Joonas Jämsen wrote:
> Thanks. I'm still getting errors. I have no idea what the error means. It 
> says:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Could not open 0: No such file or directory
> STACK: Error::throw
> STACK: Bio::Root::Root::throw 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/Root.pm:357 
> 
> STACK: Bio::Root::IO::_initialize_io 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:310 
> 
> STACK: Bio::Root::IO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/Root/IO.pm:223 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:145 
> 
> STACK: Bio::SearchIO::new 
> /v/linux26_x86_64/appl/molbio/bioperl/perl/lib/site_perl/5.8.9/Bio/SearchIO.pm:177 
> 
> STACK: parser.pl:7
> -----------------------------------------------------------
> 
> And the code I'm using seems OK now:
> 
> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
> 
> use strict;
> use Bio::SearchIO;
> 
> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file => 
> '/wrk/xxxx/hmm/hmmsearch_nr.out' -verbose=>1);
> while ( my $result = $searchio->next_result ) {
>     while ( my $hit = $result->next_hit ) {
>         while ( my $hsp = $hit->evalue<=10 ) {
>                 while ( my $hsp = $hit->next_hsp ) {
>                         print $hit->accession(), "\n";
>             }
>         }
>     }
> }
> 
> -J.
> 
> Robert Buels wrote:
>> 1.) You need to use strict.  Always have use strict at the top of your 
>> code.  That would have caught this error.
>> 2.) The proximate problem here is that your searchio object is called 
>> $searchio, while you are calling $in->next_result.  You want 
>> $searchio->next_result instead.
>>
>> Rob
>>
>> Joonas Jämsen wrote:
>>> Hi,
>>>
>>> Im trying to run the script below and I get an error: "Can't call 
>>> method "next_result" on an undefined value at parser.pl line 5."
>>>
>>>
>>> #!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/
>>> use Bio::SearchIO
>>> my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file   => 
>>> '/wrk/xxxx/hmm/hmmsearch_nr.out');
>>> while ( my $result = $in->next_result ) {
>>>      while ( my $hit = $result->next_hit ) {
>>>          while ( my $hsp-evalue<=10 ) {
>>>              while ( my $hsp = $hit->next_hsp ) {
>>>                  print $hit->accession(), "\n";
>>>          }
>>>      }
>>>  }
>>>
>>> Could someone tell me what is wrong?
>>>
>>> Thanks.
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>


-- 
Robert Buels
Bioinformatics Analyst, Sol Genomics Network
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca, NY  14853
Tel: 503-889-8539
rmb32 at cornell.edu
http://www.sgn.cornell.edu


From rmb32 at cornell.edu  Wed Sep 16 19:23:27 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Wed, 16 Sep 2009 12:23:27 -0700
Subject: [Bioperl-l] problem with a script
In-Reply-To: <4AB13864.6070707@utu.fi>
References: <fb44a91e1ccd0.4ab0e252@utu.fi> <4AB0FEAC.50104@cornell.edu>
	<4AB1356D.4050307@utu.fi> <4AB136B0.6050304@cornell.edu>
	<4AB13864.6070707@utu.fi>
Message-ID: <4AB13B2F.5060502@cornell.edu>

Your report may not have accessions, try using name() instead of 
accession().
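
For reference, a minimal sketch of the loop with the points from this thread applied (use strict, one consistently named $searchio object, a comma after the -file value, and a fallback to name() when the report has no accessions). It is untested against your report. The names hit_label and print_hits are made up for this sketch, and filtering on the hit's significance() is an assumption about what the "evalue<=10" loop was meant to do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefer the accession; fall back to the hit name when the report has none.
sub hit_label {
    my ($hit) = @_;
    my $acc = $hit->accession;
    return (defined $acc && length $acc) ? $acc : $hit->name;
}

# Print one label per hit whose significance is at or below $max_evalue.
sub print_hits {
    my ($file, $max_evalue) = @_;
    require Bio::SearchIO;    # loaded here so hit_label() can be tried without BioPerl
    my $searchio = Bio::SearchIO->new(-format => 'hmmer', -file => $file);
    while (my $result = $searchio->next_result) {
        while (my $hit = $result->next_hit) {
            next if $hit->significance > $max_evalue;
            print hit_label($hit), "\n";
        }
    }
}

# Usage: perl parser.pl /path/to/hmmsearch.out
print_hits($ARGV[0], 10) if @ARGV;
```

A filter written as "next if ..." replaces the extra while loop, which would never terminate for a hit that passes the test.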


From abhishek.vit at gmail.com  Wed Sep 16 20:13:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 16:13:33 -0400
Subject: [Bioperl-l] About FASTQ parser
Message-ID: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>

Hi Chris

I remember seeing a recent email about the new BioPerl FASTQ parser. Is it
part of the BioPerl 1.6 distribution? I installed it, and based on the doc
here (http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
I am a bit lost.

I see two approaches there: using Bio::SeqIO::fastq and using
Bio::Seq::Quality. Do both return the same data, with the latter just
being faster?

No offense meant to any developer, but small examples in the HOWTOs
help a lot.

The current example (copied below) is not working. I guess it is based
on a previous version of the code.

# grabs the FASTQ parser, specifies the Illumina variant
  my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
                           -file      => 'mydata.fq');


My basic requirement is to read each record in a FASTQ file and split it
into header, read, and quality.


Thanks,
-Abhi


From abhishek.vit at gmail.com  Wed Sep 16 21:41:50 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 17:41:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>

Hi All

I can't think of a smart way to do sequence matching that allows a
user-defined number of mismatches.

For example:

The sequence AGCT should be considered a match to the reference if any
one position (1, 2, 3, or 4) has a mismatch, i.e. is any of [ACGTN]. So
the possible matches would be:

For position 1:
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each other position.

Is there a nice regular expression for this? One way I could think of is
to generate all the possible tags for a given sequence and then do the
matching, but that would be computationally expensive for a large dataset.
Is there a neater method?

Thanks,
-Abhi


From maj at fortinbras.us  Wed Sep 16 22:33:00 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 22:33:00 +0000
Subject: [Bioperl-l] Allowing One error in Sequence matching
Message-ID: <W403682148491321253140380@webmail21>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ




From Russell.Smithies at agresearch.co.nz  Wed Sep 16 23:06:45 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 11:06:45 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>

How about chunk it into overlapping words, skip if >2 N, then regex?

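use strict;
use warnings;

my $seq = "CGATCGNATGNCGTCTAGCTGACANGTTGACTCTAGCTGATCGATCGATCGTACGTANNCGTAGTCGTACNTACGATCTNACGCACGNATGCTACGTACG";

my $motif = "ACGT";
# Let every motif base also be N: [AN][CN][GN][TN]
my $w  = join '', map { "[${_}N]" } split //, $motif;
my $re = qr/$w/;

# The lookahead captures every overlapping 4-letter word.
foreach ($seq =~ /(?=(\w{4}))/g) {
  next if tr/N/N/ >= 2;       # skip words with two or more Ns
  print "$_\n" if /$re/;
}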


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From maj at fortinbras.us  Wed Sep 16 22:30:50 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 16 Sep 2009 18:30:50 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
Message-ID: <1B8182A0898B452D80EA6035A178B7CE@NewLife>

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ


From abhishek.vit at gmail.com  Thu Sep 17 01:39:13 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 21:39:13 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>

Hi Russell

Thanks for the quick reply. However, I am not following the code or
the reasoning behind it.

Will this work for matching AGCT to ACCT | ANCT | AACT? It didn't give
me the expected output when I ran it. I am more interested in
understanding the logic.

It would be great if you could expand a bit more.


Also, if I do it the brute-force way, as a friend suggested to me, how
will that scale?

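use strict;
use warnings;

# Two equal-length sequences; taking them from the command line is an assumption.
my ($seq1, $seq2) = @ARGV;
my @dna1 = split //, $seq1;
my @dna2 = split //, $seq2;

# Count the positions at which the two sequences differ.
my $mismatches = 0;
for my $i (0 .. $#dna1) {
        $mismatches++ if $dna1[$i] ne $dna2[$i];
}

if ($mismatches <= 1) {
        print "RESULT: your sequence is true\n";
}
else {
        print "RESULT: your sequence is false\n";
}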

Thanks,
-Abhi
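
On the scalability question: a per-position comparison is linear in the sequence length, so the main cost is Perl-level overhead per character. A sketch of the same comparison done with a string XOR, which keeps the inner loop inside the interpreter's C code. The names count_mismatches and approx_positions are made up for this sketch, and it assumes equal-length, same-case strings. Note that an N counts as a mismatch against anything except another N.

```perl
use strict;
use warnings;

# Number of positions at which two equal-length strings differ.
sub count_mismatches {
    my ($s, $t) = @_;
    die "sequences differ in length\n" unless length($s) == length($t);
    my $xor = $s ^ $t;            # "\0" wherever the two characters agree
    return ($xor =~ tr/\0//c);    # count the bytes that are not "\0"
}

# 0-based start offsets in $ref where $query matches with at most $max_mm mismatches.
sub approx_positions {
    my ($query, $ref, $max_mm) = @_;
    my $len = length $query;
    return grep { count_mismatches($query, substr($ref, $_, $len)) <= $max_mm }
           0 .. length($ref) - $len;
}

print join(',', approx_positions('AGCT', 'TTACCTGGAGCTAA', 1)), "\n";
```

The number of allowed mismatches is just a parameter here, so the same code covers 1, 2, 3 or more errors without generating every variant of the query.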




From Russell.Smithies at agresearch.co.nz  Thu Sep 17 01:46:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Sep 2009 13:46:54 +1200
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>

I misread your question. My example will match NGCT, ANCT, AGNT, or AGCN with 1 mismatch (or NGNT, NGCN, ANNT, ANCN, etc. with 2 mismatches).
The check is just a regex match against the pattern string built from the motif: "[AN][GN][CN][TN]"
If your word size is short and you're not using too many mismatches, brute-forcing it with a compiled regex would probably work.
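
A sketch of what brute-forcing with a compiled regex might look like for the one-mismatch case: one alternative per position, each with that position opened up to any base or N. The name one_mismatch_regex is made up for this sketch, the alphabet [ACGTN] is taken from the original question, and the motif is assumed to contain only letters.

```perl
use strict;
use warnings;

# Build a compiled regex matching $motif with at most one substituted position.
sub one_mismatch_regex {
    my ($motif) = @_;
    my @alts;
    for my $i (0 .. length($motif) - 1) {
        my $alt = $motif;
        substr($alt, $i, 1) = '[ACGTN]';    # let position $i be any base or N
        push @alts, $alt;
    }
    my $pattern = join '|', @alts;
    return qr/(?:$pattern)/;
}

# Pattern body for AGCT: [ACGTN]GCT|A[ACGTN]CT|AG[ACGTN]T|AGC[ACGTN]
my $re  = one_mismatch_regex('AGCT');
my $seq = 'TTACCTGGAGCTAA';

# The lookahead reports overlapping hits as well; pos() is the 0-based start.
while ($seq =~ /(?=($re))/g) {
    print pos($seq), "\t$1\n";
}
```

The number of alternatives grows quickly once two or more mismatches are allowed, which is where this approach stops being attractive.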




From abhishek.vit at gmail.com  Thu Sep 17 03:12:20 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Wed, 16 Sep 2009 23:12:20 -0400
Subject: [Bioperl-l] Allowing One error in Sequence matching
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
References: <be9b52410909161441w1ce271c4r1e518f7fd1ea7339@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985946@exchsth.agresearch.co.nz>
	<be9b52410909161839k2dd86c57o63cc149057b6af99@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62985A72@exchsth.agresearch.co.nz>
Message-ID: <be9b52410909162012m5b18bc78u477e15957c88a45d@mail.gmail.com>

Thanks Russell.

I think having an "approximate matching" method in BioPerl would help,
especially with NGS data, where matching reads with 1/2/3/4 errors is
sometimes needed.

Cheers,
-Abhi




From cjfields at illinois.edu  Thu Sep 17 04:39:03 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 16 Sep 2009 23:39:03 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
Message-ID: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>

Abhi,

The FASTQ parser hasn't been released to CPAN yet.  It is available  
via bioperl-live.  We haven't added any code yet to the HOWTO's, but  
the SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get  
you started.

Bio::Seq::Quality is the object returned via next_seq(); it can be  
queried for PHRED qual scores and other bits.  If you want to split  
things up you should call next_seq(), then generate a FASTQ output  
stream in the variant you want:

# $in is the FASTQ input stream opened as in your example
my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger',
                               -file   => '>fasta.file');
my $outqual  = Bio::SeqIO->new(-format => 'fastq-sanger',
                               -file   => '>qual.file');

while (my $seq = $in->next_seq) {
    $outfasta->write_fasta($seq);
    $outqual->write_qual($seq);
}

Note I haven't tested that yet, but it should work.  Let me know if it  
doesn't.
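
If the goal is only to get at header, read and quality for each record, the sequence object can also be queried directly. A sketch along those lines, equally untested against real data: split_record is a name made up here, it relies on the object providing id(), desc(), seq() and qual() (the last returning an array reference of scores), and the 'fastq-illumina' format string is taken from the example in the question.

```perl
use strict;
use warnings;

# Pull header, bases and quality scores out of one sequence object.
# Works on anything that provides id(), desc(), seq() and qual().
sub split_record {
    my ($seq) = @_;
    my $header = $seq->id;
    my $desc   = $seq->desc;
    $header .= " $desc" if defined $desc && length $desc;
    return ($header, $seq->seq, join(' ', @{ $seq->qual }));
}

if (@ARGV) {
    require Bio::SeqIO;    # loaded only when a file is actually given
    my $in = Bio::SeqIO->new(-format => 'fastq-illumina', -file => $ARGV[0]);
    while (my $seq = $in->next_seq) {
        print join("\t", split_record($seq)), "\n";
    }
}
```

This prints one tab-separated line per record with the quality as space-separated numeric scores rather than the original encoded string.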

chris



From abhishek.vit at gmail.com  Thu Sep 17 04:44:28 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 00:44:28 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909162144g3177f718nf239327e98bd30c2@mail.gmail.com>

Thanks for the quick info Chris.

Cheers,
-Abhi



From amackey at virginia.edu  Thu Sep 17 10:52:31 2009
From: amackey at virginia.edu (Aaron Mackey)
Date: Thu, 17 Sep 2009 06:52:31 -0400
Subject: [Bioperl-l] Question concerning IUPAC.pm
In-Reply-To: <4AB203EF.6030107@agrar.hu-berlin.de>
References: <4AB203EF.6030107@agrar.hu-berlin.de>
Message-ID: <24c96eca0909170352h34b6a20t8648d4e097d57e1e@mail.gmail.com>

Dear Armin,

Please ask such questions on the BioPerl mailing list.

The Bio::Tools::IUPAC module does the opposite of what you want -- it takes
a sequence containing ambiguous codes (e.g. "Y") and generates all possible
combinations of unambiguous sequences (thus one sequence containing a "C"
instead of "Y", and a second sequence containing a "T" instead of "Y").

However, you can do this:

  my %lookup = Bio::Tools::IUPAC->iupac_rev_iub();

%lookup will now contain the following Perl hash:

A    => 'A',
T    => 'T',
C    => 'C',
G    => 'G',
AC   => 'M',
AG   => 'R',
AT   => 'W',
CG   => 'S',
CT   => 'Y',
GT   => 'K',
ACG  => 'V',
ACT  => 'H',
AGT  => 'D',
CGT  => 'B',
ACGT => 'N',
N    => 'N'
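
A sketch of how that hash could be used to go from a set of bases to an ambiguity symbol: the keys are the bases in alphabetical order, so sort the distinct bases, join them, and look the result up. The name ambiguity_code is made up for this sketch, and the hash is typed in from the listing above as a stand-in so that the example runs without BioPerl; with BioPerl installed it would come from iupac_rev_iub() instead.

```perl
use strict;
use warnings;

# Stand-in for the hash returned by Bio::Tools::IUPAC->iupac_rev_iub().
my %lookup = (
    A   => 'A', T   => 'T', C    => 'C', G => 'G',
    AC  => 'M', AG  => 'R', AT   => 'W',
    CG  => 'S', CT  => 'Y', GT   => 'K',
    ACG => 'V', ACT => 'H', AGT  => 'D',
    CGT => 'B', ACGT => 'N', N   => 'N',
);

# Collapse a set of bases into its ambiguity symbol, e.g. ('C', 'T') gives 'Y'.
sub ambiguity_code {
    my @bases = @_;
    my %seen  = map { uc($_) => 1 } @bases;    # distinct bases, upper-cased
    my $key   = join '', sort keys %seen;      # keys in the table are alphabetical
    return $lookup{$key};                      # undef if the combination is unknown
}

print ambiguity_code('C', 'T'), "\n";
```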

-Aaron


On Thu, Sep 17, 2009 at 5:39 AM, Armin Schmitt <
armin.schmitt at agrar.hu-berlin.de> wrote:
>
> Dear Aaron,
>
> can I use your module IUPAC.pm to create
> ambiguity symbols?
>
> I.e. Input C,T -> output Y
>
> If yes, how can I do this? A little piece
> of code would be helpful. Otherwise,
> is there another perl module for this
> purpose?
>
> Thank you very much
>
> Armin Schmitt
>
>
> --
> Dr. Armin Schmitt
> Humboldt-Universität zu Berlin
> Department for Crop and Animal Sciences
> Invalidenstraße 42
> 10115 Berlin
> Tel.:   +49-30-2093-9074
> Fax:    +49-30-2093-6397
> E-mail: armin.schmitt at agrar.hu-berlin.de
>
>


From abhishek.vit at gmail.com  Thu Sep 17 18:16:33 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 17 Sep 2009 14:16:33 -0400
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
Message-ID: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>

Hi Chris

I am just wondering if the following is intentionally excluded from a
FASTQ record or is a bug.

After reading in each record from a FASTQ file, the output of the
same record (  $out->write_seq($seq)  ) has the text after the + sign
missing.


Eg:

@HWI-EAS397:1:1:11:252#NNNTNN/1
NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
+
DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB


PS: In our case we need the exact record to be printed out as we need
to split the fastq file into multiple fastq files based on the read
index in the @ Line. So exact output is needed to avoid conflicts with
downstream processing pipelines.

Thanks,
-Abhi
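
If byte-for-byte output matters, one option that sidesteps the parser entirely is to copy the four lines of each record through unchanged. A sketch under stated assumptions: records are strictly four lines each (no wrapped sequence or quality lines), and the read index is the part of the @ line between '#' and '/', as in the record above. The names read_fastq_record and index_of are made up here, and the index_<tag>.fq output file names are only placeholders.

```perl
use strict;
use warnings;

# Read one four-line FASTQ record; returns the raw lines, or nothing at
# end of file (a truncated final record is dropped).
sub read_fastq_record {
    my ($fh) = @_;
    my @lines;
    for (1 .. 4) {
        my $line = <$fh>;
        return unless defined $line;
        push @lines, $line;
    }
    die "not a FASTQ record: $lines[0]"
        unless $lines[0] =~ /^\@/ && $lines[2] =~ /^\+/;
    return @lines;
}

# Index tag from a header such as "@HWI-EAS397:1:1:11:252#NNNTNN/1".
sub index_of {
    my ($header) = @_;
    return $header =~ /#([^\/\s]+)/ ? $1 : 'unknown';
}

if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    my %out;    # one output handle per index tag
    while (my @rec = read_fastq_record($in)) {
        my $tag = index_of($rec[0]);
        unless ($out{$tag}) {
            open $out{$tag}, '>', "index_$tag.fq"
                or die "cannot write index_$tag.fq: $!";
        }
        print { $out{$tag} } @rec;    # lines are written back unchanged
    }
    close $_ for values %out;
}
```

Because the lines are never reinterpreted, whatever follows the + sign in the input is preserved in the output. With a very large number of distinct index tags the number of open file handles could become a limit.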



From cjfields at illinois.edu  Thu Sep 17 20:54:20 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 15:54:20 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 1 released
Message-ID: <358B9E70-84C7-42DC-A473-C2AACC18A211@illinois.edu>

All,

Just a quick note that I have released the first alpha for the 1.6.1  
point release.  I uploaded it to CPAN, so it should be migrating to  
the various servers in the next few hours or so.  In the meantime, the  
alpha can be directly downloaded using the following links (pick your  
format):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.bz2
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.tar.gz
http://bioperl.org/DIST/RC/BioPerl-1.6.0_1.zip

If everything goes well, I'll have a more formalized release ready for  
the weekend.  I will also be attempting (hopefully with some success)  
getting a Windows PPM for the latest ActiveState Perl going over the  
next few days.  Feedback from users trying to install BioPerl using  
the latest Strawberry Perl would also be greatly appreciated.

Thanks!

chris


From cjfields at illinois.edu  Thu Sep 17 21:38:31 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 16:38:31 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
Message-ID: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>

After uploading the latest bioperl alpha to CPAN I noticed the size of  
the distribution archive has jumped up from ~7 MB to just over 10 MB.   
It looks like a majority of this is attributable to three data files  
for testing in t/data added after the 1.6.0 release:

gmap_f9-multiple_results.txt  (3 MB)
withrefm.906                  (2.5 MB)
1ZZ19XR301R-Alignment.tblastn (2 MB)

I'm not sure there is an easy way around the problem.  We could  
attempt to reduce the file size down, but I'm not convinced that's a  
long-term solution (the test data will only get larger as more test  
cases come up).

Any ideas?  Should we try to have a common biodata repo again?

chris


From rmb32 at cornell.edu  Thu Sep 17 22:04:47 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 15:04:47 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <4AB2B27F.8050800@cornell.edu>

Chris Fields wrote:
 > Any ideas?  Should we try to have a common biodata repo again?

Beyond encouraging people to keep the test data smaller (I would think 
that multiple MB in a test data file is quite excessive!), I don't think 
it's worth worrying about that much.  The stuff in bioperl needs a 
significant amount of test data, and I think that's fine.

This problem is also addressed by the ongoing effort to break things up 
into more distros, I think that will help a lot.

Rob


From hlapp at gmx.net  Thu Sep 17 22:33:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Sep 2009 18:33:34 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2B27F.8050800@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
Message-ID: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>


On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:

> I don't think it's worth worrying about that much.  The stuff in  
> bioperl needs a significant amount of test data, and I think that's  
> fine.


I'd agree with that. Storage is cheap these days. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Thu Sep 17 23:26:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:25 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>

On Sep 17, 2009, at 5:33 PM, Hilmar Lapp wrote:

> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  
>> bioperl needs a significant amount of test data, and I think that's  
>> fine.
>
> I'd agree with that. Storage is cheap these days. -hilmar

Kind of my thought as well, just a bit of a shock to see the dist.  
increase by 65% between point releases for just three test data  
files.  I may try paring those down a tad.

chris


From cjfields at illinois.edu  Thu Sep 17 23:26:52 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 18:26:52 -0500
Subject: [Bioperl-l] About FASTQ parser
In-Reply-To: <be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
References: <be9b52410909161313uab30d9cn24d7080eb1684de7@mail.gmail.com>
	<32FBD592-4822-478C-BCAE-33F71E1857FC@illinois.edu>
	<be9b52410909171116l3284d7b6pd80689a81d46efc1@mail.gmail.com>
Message-ID: <06B0378C-312F-4F43-A99A-6F6CC1C88F61@illinois.edu>

By default, most FASTQ writers leave the repeated header off the '+'
line (it increases the file size substantially).  You can add it back
by setting quality_header():

my $out = Bio::SeqIO->new(-format => 'fastq', -file => $file,
                          -quality_header => 1);

Again, let me know if that works okay.
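
If the records have to come out byte-for-byte identical, one option is
to skip the parser for the splitting step and copy the four lines of
each record verbatim.  A minimal sketch in plain Perl, not tested
against real Illumina output: it assumes strict four-line records (no
wrapped sequence lines) and that the read index is the text between
'#' and '/' in the @ line, as in the example quoted below.  The sub
name split_fastq_by_index is made up for illustration.

```perl
use strict;
use warnings;

# Group FASTQ records by the index found in the '@' header line.
# Takes an open input filehandle and returns a hash ref:
#   index => text of all records carrying that index, copied verbatim.
# Assumes four lines per record; this is not a general FASTQ parser.
sub split_fastq_by_index {
    my ($in) = @_;
    my %by_index;
    while (defined(my $header = <$in>)) {
        # Read the sequence, '+' line and quality line of this record.
        my @rest = map { scalar <$in> } 1 .. 3;
        die "Truncated FASTQ record at: $header"
            if grep { !defined } @rest;
        # Index assumed to sit between '#' and '/', e.g. ...#NNNTNN/1
        my ($index) = $header =~ m{#([^/\s]+)/};
        $index = 'unknown' unless defined $index;
        $by_index{$index} .= join '', $header, @rest;
    }
    return \%by_index;
}

# Small in-memory demonstration; a real script would open the FASTQ file.
my $demo = join '',
    "\@read1#ACGTAC/1\n", "NACA\n", "+\n",               "DNXP\n",
    "\@read2#TTTTTT/1\n", "GGCC\n", "+read2#TTTTTT/1\n", "BBBB\n";
open my $in, '<', \$demo or die "Cannot open in-memory data: $!";
my $records = split_fastq_by_index($in);
close $in;

for my $index (sort keys %$records) {
    # One output file per index could be written here instead of printing.
    print "== $index ==\n", $records->{$index};
}
```

Because nothing is re-formatted, a '+' line comes out exactly as it
went in, with or without the repeated header.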

chris

On Sep 17, 2009, at 1:16 PM, Abhishek Pratap wrote:

> Hi Chris
>
> I am just wondering if the following is intentionally excluded from a
> FASTQ record or is a bug.
>
> After reading in each record from a FASTQ file, the output of the
> same record  (  $out->write_seq($seq)  )  has the text missing after
> the + sign.
>
>
>
> Eg:
>
> @HWI-EAS397:1:1:11:252#NNNTNN/1
> NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
> +
> DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB
>
>
> PS: In our case we need the exact record to be printed out as we need
> to split the fastq file into multiple fastq files based on the read
> index in the @ Line. So exact output is needed to avoid conflicts with
> downstream processing pipelines.
>
> Thanks,
> -Abhi
>
> On Thu, Sep 17, 2009 at 12:39 AM, Chris Fields  
> <cjfields at illinois.edu> wrote:
>> Abhi,
>>
>> The FASTQ parser hasn't been released to CPAN yet.  It is available  
>> via
>> bioperl-live.  We haven't added any code yet to the HOWTO's, but the
>> SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get you
>> started.
>>
>> Bio::Seq::Quality is the object returned via next_seq(); it can be  
>> queried
>> for PHRED qual scores and other bits.  If you want to split things  
>> up you
>> should call next_seq(), then generate a FASTQ output stream in the  
>> variant
>> you want:
>>
>> my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
>> '>fasta.file');
>> my $outqual = Bio::SeqIO->new(-format => 'fastq-sanger', -file =>
>> '>qual.file');
>>
>> while (my $seq = $in->next_seq) {
>>   $outfasta->write_fasta($seq);
>>   $outqual->write_qual($seq);
>> }
>>
>> Note I haven't tested that yet, but it should work.  Let me know if  
>> it
>> doesn't.
>>
>> chris
>>
>> On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:
>>
>>> Hi Chris
>>>
>>> I remember seeing a recent email about new bioperl fastq parser.  
>>> Is it
>>> part of bioperl 1.6 dist. I installed one and based on the doc
>>>
>>> here(http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html)
>>> I am a bit lost.
>>>
>>> I see two methods there : using Bio::SeqIO::fastq and
>>> Bio::Seq::Quality. Are both same in terms of data returned and  
>>> latter
>>> giving a scale up in speed ?
>>>
>>> This is not to offend any developer but small example/s on the  
>>> HOWTO's
>>> helps a lot.
>>>
>>> The current example (copied below) is not working. I guess it is  
>>> based
>>> on a previous version of code.
>>>
>>> # grabs the FASTQ parser, specifies the Illumina variant
>>> my $in = Bio::SeqIO->new(-format    => 'fastq-illumina',
>>>                         -file      => 'mydata.fq');
>>>
>>>
>>> My basic requirement is to read each read in fastq record and  
>>> split it
>>> into header: read: quality.
>>>
>>>
>>> Thanks,
>>> -Abhi
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From rmb32 at cornell.edu  Thu Sep 17 23:30:16 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 17 Sep 2009 16:30:16 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
Message-ID: <4AB2C688.2030602@cornell.edu>

Chris Fields wrote:
> Kind of my thought as well, just a bit of a shock to see the dist. 
> increase by 65% between point releases for just three test data files.  
> I may try paring those down a tad.

Yes, those individual files are certainly excessive.


From maj at fortinbras.us  Thu Sep 17 23:36:09 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 19:36:09 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu><4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
Message-ID: <EC73E6B6BD3D468E8C29D138AF64D483@NewLife>

Two of those files are mine. The withrefm file is probably best kept in its
entirety, since it contains all the weird extra-site restrictions that the
Bio::Restriction refactor was meant to handle. The other is a tiling test
file that I could probably replace (or at least edit down).
----- Original Message ----- 
From: "Hilmar Lapp" <hlapp at gmx.net>
To: "Robert Buels" <rmb32 at cornell.edu>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Thursday, September 17, 2009 6:33 PM
Subject: Re: [Bioperl-l] Size of BioPerl distribution


>
> On Sep 17, 2009, at 6:04 PM, Robert Buels wrote:
>
>> I don't think it's worth worrying about that much.  The stuff in  bioperl 
>> needs a significant amount of test data, and I think that's  fine.
>
>
> I'd agree with that. Storage is cheap these days. -hilmar
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Fri Sep 18 02:13:37 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 17 Sep 2009 22:13:37 -0400
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
Message-ID: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>

t/data compresses from 21M to 9M. We could ship with 

$ tar -czf data.tar.gz data
$ rm -rf data

and do the following in Bio::Root::Test, if we're willing to expect 
Archive::Tar and IO::Zlib :

use vars qw( $ARCHIVE );
$ARCHIVE = "data.tar.gz";
...

sub test_input_file {
    # if it's there, fine
    my $fn =  File::Spec->catfile('t', 'data', @_);
    return $fn if -e $fn;
    # if it's not, expand the archive
    my $arch = File::Spec->catfile('t', $ARCHIVE);
    Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
    my $tar = Archive::Tar->new($arch);
    Bio::Root::Root->throw ("Can't extract test data archive") unless $tar;
    $tar->extract;
    return $fn if -e $fn;
    return;
}


----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 17, 2009 5:38 PM
Subject: [Bioperl-l] Size of BioPerl distribution


> After uploading the latest bioperl alpha to CPAN I noticed the size of  
> the distribution archive has jumped up from ~7 MB to just over 10 MB.   
> It looks like a majority of this is attributable to three data files  
> for testing in t/data added after the 1.6.0 release:
> 
> gmap_f9-multiple_results.txt  (3 MB)
> withrefm.906                  (2.5 MB)
> 1ZZ19XR301R-Alignment.tblastn (2 MB)
> 
> I'm not sure there is an easy way around the problem.  We could  
> attempt to reduce the file size down, but I'm not convinced that's a  
> long-term solution (the test data will only get larger as more test  
> cases come up).
> 
> Any ideas?  Should we try to have a common biodata repo again?
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From cjfields at illinois.edu  Fri Sep 18 02:53:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 21:53:09 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<F9FBB3236FA446BCAA504BBC62E3194F@NewLife>
Message-ID: <04BEE5E7-79C6-45DE-9EC7-D72AE9E881E5@illinois.edu>

Maybe attempt trimming them down a bit first, if that's possible.  If  
not, no worries (breaking up the distribution will help as Robert  
said).  Archive::Tar and IO::Zlib were added in core after 5.8  
(5.009003 to be exact), so I would rather not have to worry about any  
test-specific dependencies.
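
If the archive route were taken anyway, the test helper could check
for the two modules at run time and fall back (or skip) when they are
absent, rather than making them hard requirements.  A minimal sketch
of that check; missing_modules is a made-up helper name, not something
in Bio::Root::Test:

```perl
use strict;
use warnings;

# Return the names of the modules in @_ that cannot be loaded.
sub missing_modules {
    my @names = @_;
    my @missing;
    for my $name (@names) {
        # Turn Some::Module into Some/Module.pm, the form require expects
        # when it is given a string rather than a bareword.
        (my $file = "$name.pm") =~ s{::}{/}g;
        push @missing, $name unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(qw(Archive::Tar IO::Zlib));
if (@missing) {
    print "Cannot unpack test data, missing: @missing\n";
}
else {
    print "Archive::Tar and IO::Zlib are available\n";
}
```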

Anyway, we've got a little more time.  I'm getting a META.yml issue  
popping up (though everything appears to pass here).  Will look into  
it; may be related to a previously reported bug, but I would like to  
see some CPANPLUS tests coming in first.  That's what an alpha is for!

chris

On Sep 17, 2009, at 9:13 PM, Mark A. Jensen wrote:

> t/data compresses from 21M to 9M. We could ship with
> $ tar -czf data.tar.gz data
> $ rm -rf data
>
> and do the following in Bio::Root::Test, if we're willing to expect  
> Archive::Tar and IO::Zlib :
>
> use vars qw( $ARCHIVE );
> $ARCHIVE = "data.tar.gz";
> ...
>
> sub test_input_file {
>   # if it's there, fine
>   my $fn =  File::Spec->catfile('t', 'data', @_);
>   return $fn if -e $fn;
>   # if it's not, expand the archive
>   my $arch = File::Spec->catfile('t', $ARCHIVE);
>   Bio::Root::Root->throw("Test data archive not present") unless (-e $arch);
>   my $tar = Archive::Tar->new($arch);
>   Bio::Root::Root->throw ("Can't extract test data archive") unless $tar;
>   $tar->extract;
>   return $fn if -e $fn;
>   return;
> }
>
>
> ----- Original Message -----
> From: "Chris Fields" <cjfields at illinois.edu>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Thursday, September 17, 2009 5:38 PM
> Subject: [Bioperl-l] Size of BioPerl distribution
>
>
>> After uploading the latest bioperl alpha to CPAN I noticed the size  
>> of  the distribution archive has jumped up from ~7 MB to just over  
>> 10 MB.   It looks like a majority of this is attributable to three  
>> data files  for testing in t/data added after the 1.6.0 release:
>> gmap_f9-multiple_results.txt  (3 MB)
>> withrefm.906                  (2.5 MB)
>> 1ZZ19XR301R-Alignment.tblastn (2 MB)
>> I'm not sure there is an easy way around the problem.  We could   
>> attempt to reduce the file size down, but I'm not convinced that's  
>> a  long-term solution (the test data will only get larger as more  
>> test  cases come up).
>> Any ideas?  Should we try to have a common biodata repo again?
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Fri Sep 18 03:48:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 17 Sep 2009 22:48:13 -0500
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <19123.504.682683.996798@already.dhcp.gene.com>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
	<19123.504.682683.996798@already.dhcp.gene.com>
Message-ID: <B1B941EE-8F1E-426C-82DC-D89B3A13AD3D@illinois.edu>

On Sep 17, 2009, at 10:43 PM, George Hartzell wrote:

> Robert Buels writes:
>> Chris Fields wrote:
>>> Kind of my thought as well, just a bit of a shock to see the dist.
>>> increase by 65% between point releases for just three test data  
>>> files.
>>> I may try paring those down a tad.
>>
>> Yes, those individual files are certainly excessive.
>
> Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
> side of embarrassment.  Rats.
>
> I'll see about making a smaller test for the gmap_f9 parser, while
> still using real data.
>
> Is there existing support in the searchio infrastructure for reading
> [gb]zip'ed files?
>
> Can it wait a day or three?
>
> g.

Yes, certainly.  I'll be working on a separate issue this weekend  
dealing with the META.yml that CPAN/CPANPLUS appear to be choking on,  
so I'll push back the release until early next week.

chris


From hartzell at alerce.com  Fri Sep 18 03:43:52 2009
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 17 Sep 2009 20:43:52 -0700
Subject: [Bioperl-l] Size of BioPerl distribution
In-Reply-To: <4AB2C688.2030602@cornell.edu>
References: <75D16BEE-4851-4709-94D3-4D4A217B6E8D@illinois.edu>
	<4AB2B27F.8050800@cornell.edu>
	<C84FCD2C-3CB7-498F-8977-3C52D194F110@gmx.net>
	<2404AC8D-2095-415B-B1F3-CF79C4D24525@illinois.edu>
	<4AB2C688.2030602@cornell.edu>
Message-ID: <19123.504.682683.996798@already.dhcp.gene.com>

Robert Buels writes:
 > Chris Fields wrote:
 > > Kind of my thought as well, just a bit of a shock to see the dist. 
 > > increase by 65% between point releases for just three test data files.  
 > > I may try paring those down a tad.
 > 
 > Yes, those individual files are certainly excessive.

Woo hoo.  Fame and fortune.  Or at least fame.  Or something just this
side of embarrassment.  Rats.

I'll see about making a smaller test for the gmap_f9 parser, while
still using real data.

Is there existing support in the searchio infrastructure for reading
[gb]zip'ed files?

Can it wait a day or three?

g.


From roy.chaudhuri at gmail.com  Fri Sep 18 10:43:29 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 18 Sep 2009 11:43:29 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
Message-ID: <4AB36451.3030207@gmail.com>

Hi Liam,

I just discovered your message, which has not yet been replied to. What 
you require has been discussed in a recent thread:
http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html

Try using trunc_with_features from Bio::SeqUtils:

my $sub_seqobj = Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);

Cheers.
Roy.

Liam Elbourne wrote:
> Hi All,
> 
> Is there a method or methodology that will produce a fully fledged Seq  
> object with all the associated metadata given a start and end  
> position? To clarify, I create a sequence object from a genbank file:
> 
> 
> ****
> my $io  = Bio::SeqIO->new(as per usual);
> 
> my $seqobj = $io->next_seq();
> ****
> I now want:
> 
> my $sub_seqobj = $seqobj between 300 and 2000
> 
> where $sub_seqobj is a Seq object (which I appreciate is an  
> 'aggregate' of objects) too. The "trunc" method only returns a  
> PrimarySeq object which lacks all the annotation etc. I've previously  
> done this task by iterating through feature by feature and parsing out  
> what I needed, but thought there might be a more elegant approach...
> 
> 
> Regards,
> Liam Elbourne.

-- 
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


From maj at fortinbras.us  Fri Sep 18 12:11:11 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 08:11:11 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>

Hi Paola--
I will look at this. Stay tuned-
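
One thing that might work in the meantime, without intermediate
files, is to post-process the ID string inside your loop.  A minimal
sketch, assuming the residue ID comes back as a string like
'ILE-105.A' (name, dash, number, dot, insertion letter), as your
message suggests; compact_resid is a made-up helper name, not part of
Bio::Structure:

```perl
use strict;
use warnings;

# Remove the dot that separates the residue number from a one-letter
# insertion code, e.g. 'ILE-105.A' -> 'ILE-105A'.
# IDs without an insertion code are returned unchanged.
sub compact_resid {
    my ($resid) = @_;
    (my $compact = $resid) =~ s/(\d)\.([A-Za-z])$/$1$2/;
    return $compact;
}

for my $resid ('ILE-105', 'ILE-105.A', 'ILE-2105.C') {
    print "A\t", compact_resid($resid), "\n";
}
```

In your script that would be my $resid = compact_resid($res->id);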
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a little trouble because I need to parse a PDB file exactly, to extract
the chain ID and residue ID, but I found that in some PDB files the residue
number is followed by a letter, probably because the residue was added by the
crystallographers and they didn't want to change the residue numbering in the
sequence. For example, I parsed 1PXX.pdb with my script below. I didn't find
any useful suggestion about this in the BioPerl tutorial or the online BioPerl
documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
my $pdb_file = qq{1pxx.pdb};
open my $f, ">$pdb_file" or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
my $struc=$structio->next_structure;
for my $chain ($struc->get_chains)
{
my $chainid = $chain->id ;
for my $res ($struc->get_residues($chain))
{
my $resid=$res-> id;
my $atoms= $struc->get_atoms($res);
open my $f, ">> 1pxx.parsed";
print $f "$chainid\t$resid\n";
close $f;
}
}


But the output file is wrong for residues such as ILE 105A and ILE 2105C,
because they have a letter following the residue number. Can I solve that
problem without writing intermediate files? I need the residue ID as 105A,
not 105.A, so:
A ILE-105A
without a dot between the number and the letter.


Thank you all,

Paola


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From scott at scottcain.net  Fri Sep 18 14:11:23 2009
From: scott at scottcain.net (Scott Cain)
Date: Fri, 18 Sep 2009 10:11:23 -0400
Subject: [Bioperl-l] test failures in main trunk
Message-ID: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>

With Chris trying to get a release out, I wanted to report these test  
failures from a fairly fresh Ubuntu Server 8.04 system.

Scott


t/SeqIO/raw.t ................................ 1/24 Can't locate  
Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 / 
usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 / 
usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
# Looks like you planned 24 tests but ran 1.
# Looks like your test exited with 2 just after 1.
t/SeqIO/raw.t ................................ Dubious, test returned  
2 (wstat 512, 0x200)

t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
@INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/gmod/ 
bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/ 
lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/ 
perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
site_perl .) at t/SeqTools/Backtranslate.t line 9.
BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
# Looks like your test exited with 2 before it could output anything.
t/SeqTools/Backtranslate.t ................... Dubious, test returned  
2 (wstat 512, 0x200)
Failed 8/8 subtests

t/SeqTools/SeqPattern.t ...................... 1/28
#   Failed test 'use Bio::Tools::SeqPattern;'
#   at t/SeqTools/SeqPattern.t line 12.
#     Tried to use 'Bio::Tools::SeqPattern'.
#     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains: t/ 
lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/blib/ 
arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/5.8.8 /usr/ 
local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/ 
5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at Bio/Tools/ 
SeqPattern/Backtranslate.pm line 22.
# BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
Backtranslate.pm line 22.
# Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
# Compilation failed in require at (eval 17) line 2.
# BEGIN failed--compilation aborted at (eval 17) line 2.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 25.
#          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
#     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 431.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 432.

#   Failed test at t/SeqTools/SeqPattern.t line 31.
#          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
#     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 371.
Use of uninitialized value in concatenation (.) or string at Bio/Tools/ 
SeqPattern.pm line 372.

#   Failed test at t/SeqTools/SeqPattern.t line 38.
#          got: 'A[][]H'
#     expected: 'A[EQ][DN]H'
"_reverse_translate_motif" is not exported by the  
Bio::Tools::SeqPattern::Backtranslate module
Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
# Looks like you planned 28 tests but ran 9.
# Looks like you failed 4 tests of 9 run.
# Looks like your test exited with 255 just after 9.
t/SeqTools/SeqPattern.t ...................... Dubious, test returned  
255 (wstat 65280, 0xff00)
Failed 23/28 subtests


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research


From dan.bolser at gmail.com  Fri Sep 18 14:11:30 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:11:30 +0100
Subject: [Bioperl-l] construct chromosome sequences from bac sequences
In-Reply-To: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
References: <dac81b0d0812300702x652813cel733eb9eaa82a408d@mail.gmail.com>
Message-ID: <2c8757af0909180711t7212f5aak9bc3c7f4e8d16120@mail.gmail.com>

Did you try loading the sequences into an alignment or an assembly object?

As far as I know BioPerl won't call a consensus for you, but you can
post process the alignment or assembly to do that.
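
Without going through an alignment object, the layout described in the
quoted message below can also be sketched in plain Perl: pre-fill the
chromosome with N, then lay each BAC down at its start coordinate.
This is only a sketch under loud assumptions: that the FPC start
values are real 1-based base positions (they may well be estimates),
and that for overlaps the base placed first simply wins, which is an
arbitrary policy and not a consensus call.  build_chromosome is a
made-up name.

```perl
use strict;
use warnings;

# Lay BAC sequences onto a chromosome string pre-filled with 'N'.
# $length : chromosome length in bases
# @bacs   : [ start, sequence ] pairs, start is 1-based
# Where BACs overlap, the base placed first is kept.
sub build_chromosome {
    my ($length, @bacs) = @_;
    my $chr = 'N' x $length;
    for my $bac (sort { $a->[0] <=> $b->[0] } @bacs) {
        my ($start, $seq) = @$bac;
        for my $i (0 .. length($seq) - 1) {
            my $pos = $start - 1 + $i;
            last if $pos >= $length;
            # Only fill positions that still hold 'N'.
            substr($chr, $pos, 1) = substr($seq, $i, 1)
                if substr($chr, $pos, 1) eq 'N';
        }
    }
    return $chr;
}

# Toy data: two overlapping BACs and a gap before a third.
my $chr1 = build_chromosome(
    20,
    [ 1,  'ACGTAC' ],
    [ 5,  'TTGG'   ],   # overlaps the first BAC at positions 5-6
    [ 15, 'CCC'    ],
);
print ">chr1\n$chr1\n";
```

The per-base loop is slow for real chromosomes; it is written that way
only to make the overlap rule easy to see and to swap out.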

Can an alignment hold sequences with qualities?


Sorry for the late reply, I'm just trawling the list for potential
answers to the question I'm about to post ;-)

Dan.


2008/12/30 Alper Yilmaz <alperyilmaz at gmail.com>:
> Hi,
>
> I have FPC report and BAC sequences in hand. I was wondering what is the
> most practical way to build chromosomes from these available information.
>
> I HAVE:
> FPC file:
> accession   chr  chr_start  chr_end  contig  contig_start  contig_end
> aaaaaaaaaa  1    14700      215600   ctg1    14700         215600
> bbbbbbbbbb  1    196000     362600   ctg1    196000        362600
> cccccccccc  1    352800     524300   ctg1    352800        524300
> .
> .
>
> BAC fasta file:
> >aaaaaaaaaa
> GATCGATCAGCATCGACTACGACT...
> >bbbbbbbbbb
> AGTAGCAGTAGCTAGCACTACGAC...
> >cccccccccc
> ACGATCAGCATCAGCATCGACTAC...
> .
> .
> .
>
> I WANT:
> >chr1
> GACGACTAGCTACGACTAC...
> >chr2
> AGCTGATCACGATCACGAC...
>
> In theory a sequence object called "Chr1" can be created and then according
> to start and end locations of each BAC in FPC file, subsequences of Chr1 can
> be retrieved. However, there are two facts which might prevent using
> standard sequence objects.
> 1) There will be gaps in chromosomes. Is there a function to convert
> unassigned locations to N?
> 2) There are overlaps between BAC sequences. If the overlapping sequences
> are exactly same, it won't be problem, but if there are discrepancies
> between them, a decision has to be made as to which sequence to use in final
> Chr1 sequence.
>
> thanks,
>
> Alper Yilmaz
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Fri Sep 18 14:27:27 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:27:27 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
Message-ID: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>

2009/1/6 Chris Fields <cjfields at illinois.edu>:
> Could you archive the files and attach them to a bug report (you can mark it
> as an enhancement request).  We can take a look.
>
> http://bugzilla.open-bio.org/

Out of interest, has this been added? Where is it documented?

Cheers,
Dan.


> chris
>
> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>
>> Chris et al. -
>>
>> A student and I have written code to do this - write ace files as well as
>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>> was
>> in 1.5, we ran into problems with large ace files containing many entries
>> because of file handle limit issues with the inherited implementation
>> DB_File.  Our implementation simply reads one contig at a time instead of
>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to someone,
>> could they help me get it into bioperl?  It may not be perfect either, but
>> it should be a good start.
>>
>> Josh


From bosborne11 at verizon.net  Fri Sep 18 13:48:55 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 18 Sep 2009 09:48:55 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <DBEE748776B74A7988A942A7BBE13AA3@NewLife>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
	<DBEE748776B74A7988A942A7BBE13AA3@NewLife>
Message-ID: <AC62DAB3-3334-44A6-8172-753519B083FF@verizon.net>

Mark,

There was an interesting exchange about Bio::Structure::IO::pdb a few  
years ago:

http://portal.open-bio.org/pipermail/bioperl-l/2006-September/022990.html

I don't think anyone has actually worked on this code since then and I  
also don't know if Paola's question relates to the content of the  
thread, but it's a good overview.

Brian O.


On Sep 18, 2009, at 8:11 AM, Mark A. Jensen wrote:

> Hi Paola--
> I will look at this. Stay tuned-
> Mark
> ----- Original Message -----
> From: "Paola Bisignano" <paola_bisignano at yahoo.it>
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 08, 2009 4:55 AM
> Subject: [Bioperl-l] problem parsing pdb
>
>
> Hi,
>
> I'm in a little trouble because I need to parse a PDB file exactly, to  
> extract the chain ID and residue ID, but I found that in some PDB  
> files the residue number is followed by a letter, probably because  
> the residue was added by the crystallographers and they didn't want  
> to change the residue numbering in the sequence. For example, I  
> parsed 1PXX.pdb with my script below. I didn't find any useful  
> suggestion about this in the BioPerl tutorial or the online BioPerl  
> documentation.
>
> #!/usr/local/bin/perl
> use strict;
> use warnings;
> use Bio::Structure::IO;
> use LWP::Simple;
>
>
>
> my $urlpdb= "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
> my $content = get($urlpdb);
> my $pdb_file = qq{1pxx.pdb};
> open my $f, ">$pdb_file" or die $!;
> binmode $f;
> print $f $content;
> print qq{$pdb_file\n};
> close $f;
>
>
>
> my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
> my $struc=$structio->next_structure;
> for my $chain ($struc->get_chains)
> {
> my $chainid = $chain->id ;
> for my $res ($struc->get_residues($chain))
> {
> my $resid=$res-> id;
> my $atoms= $struc->get_atoms($res);
> open my $f, ">> 1pxx.parsed";
> print $f "$chainid\t$resid\n";
> close $f;
> }
> }
>
>
>
> but it gives me a file with an error at ILE 105A and ILE 2105C because
> they have a letter following the residue number. Can I solve that
> problem without writing intermediate files? I need the residue ID as
> 105A, not 105.A, so:
> A ILE-105A
> without a point between the number and the letter.
>
>
>
>
> Thank you all,
>
> Paola
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Fri Sep 18 14:55:57 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 15:55:57 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
Message-ID: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>

Dear Perl Monkeys,

I wrote a little demo script for Bio::Assembly::IO here:

http://www.bioperl.org/wiki/Module:Bio::Assembly::IO


I would very much appreciate comments, criticisms and corrections on
that script (please just edit the wiki). For a newbie it's always the
same question: am I doing it right?

In particular, I read about the 4 possible coordinates of a read in an
assembly. My script only retrieves two (?) of the possible four. How
should it be adjusted to print all four coordinates for each read?

Additionally, I'm not sure how to distinguish between the trimmed read
vs. the full length read and/or the aligned portion of the read vs.
the full length read.

What I *really* want is the coordinates of the aligned portion of the
read in gapped read and gapped consensus space, along with the quality
trimmed range of the read.

The ACE file in question is produced by the gsMapper program, which is
part of Newbler from Roche (454), so it has some small
'peculiarities', but I don't think they are critical for the task at
hand.


Thanks very much for any help you can provide on any of the above issues.

Sincerely,
Dan.


From maj at fortinbras.us  Fri Sep 18 15:11:05 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:11:05 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
Message-ID: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>

Dan -- I don't know much about Assembly, so can't help there. But can I  
encourage you and perhaps one or two others (steganographic content: fangly) 
to create a HOWTO stub out of this? Would be excellent-
cheers MAJ
----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 18, 2009 10:55 AM
Subject: [Bioperl-l] Getting read position information from an ACE file?


> Dear Perl Monkeys,
> 
> I wrote a little demo script for Bio::Assembly::IO here:
> 
> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
> 
> 
> I would very much appreciate comments, criticisms and corrections on
> that script (please just edit the wiki). For a newbie it's always the
> same question: am I doing it right?
> 
> In particular, I read about the 4 possible coordinates of a read in an
> assembly. My script only retrieves two (?) of the possible four. How
> should it be adjusted to print all four coordinates for each read?
> 
> Additionally, I'm not sure how to distinguish between the trimmed read
> vs. the full length read and/or the aligned portion of the read vs.
> the full length read.
> 
> What I *really* want is the coordinates of the aligned portion of the
> read in gapped read and gapped consensus space, along with the quality
> trimmed range of the read.
> 
> The ACE file in question is produced by the gsMapper program, which is
> part of Newbler from Roche (454), so it has some small
> 'peculiarities', but I don't think they are critical for the task at
> hand.
> 
> 
> Thanks very much for any help you can provide on any of the above issues.
> 
> Sincerely,
> Dan.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From anupam.contact at gmail.com  Fri Sep 18 15:20:03 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Fri, 18 Sep 2009 20:50:03 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
Message-ID: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>

Dear all,
I have installed BioPerl-1.6.0.tar.gz and Bioperl-run-1.6.0.tar.gz on a
Fedora 7 system. I am trying to run the /usr/bin/bp_pairwise_kaks.pl script
but keep getting this error:

Must have bioperl-run pkg installed to run this script at
/usr/bin/bp_pairwise_kaks.pl line 69.

I have installed the run package from BioPerl, though. Can anyone help me
out? Thanks in advance.


Regards,


Anupam Sinha
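
One way to narrow this down is to check whether the perl that runs the
script can actually find the bioperl-run modules, and where it loads them
from. The following is a minimal sketch, not taken from the script itself:
the module names are assumptions about what the script requires (check the
require lines near line 69 of bp_pairwise_kaks.pl for the exact list), and
module_path is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the file a module was loaded from,
# or undef if this perl cannot load it from @INC.
sub module_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

# Assumed list: one core BioPerl module and two bioperl-run modules.
for my $module (qw(
    Bio::Root::Version
    Bio::Tools::Run::Alignment::Clustalw
    Bio::Tools::Run::Phylo::PAML::Codeml
)) {
    my $path = module_path($module);
    printf "%-40s %s\n", $module, defined $path ? $path : 'NOT FOUND';
}

# Show the directories this perl searched.
print "Searched: @INC\n";
```

If the core module is found but the bioperl-run ones are not, the two
packages were probably installed for different perls or into different
library directories.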


From cjfields at illinois.edu  Fri Sep 18 15:59:11 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 10:59:11 -0500
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
Message-ID: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>

Interesting, will look into those.  The first one is troubling (that's  
set up to skip for Algorithm::Diff), the others should be a bit more  
straightforward.

Will have to see why List::MoreUtils is being used, but if it's  
necessary it's an additional dep.

chris
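
For reference, a minimal sketch of how a test can skip cleanly when an
optional module such as Algorithm::Diff is absent, using only Test::More.
It illustrates the general pattern and is not the actual code in
t/SeqIO/raw.t; have_module is a hypothetical helper. The point is that the
optional module is loaded with a run-time require inside a guard, rather
than with a compile-time use, which dies before any skip logic can run.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if the named module can be loaded at run time.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

ok( 1, 'a test that needs no optional module' );

SKIP: {
    # Skip the one test in this block if the optional module is missing.
    skip 'Algorithm::Diff not installed', 1
        unless have_module('Algorithm::Diff');
    ok( defined &Algorithm::Diff::diff, 'Algorithm::Diff::diff is available' );
}
```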

On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:

> With Chris trying to get a release out, I wanted to report these  
> test failures from a fairly virgin system Ubuntu server 8.04.
>
> Scott
>
>
>
> t/SeqIO/raw.t ................................ 1/24 Can't locate  
> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
> # Looks like you planned 24 tests but ran 1.
> # Looks like your test exited with 2 just after 1.
> t/SeqIO/raw.t ................................ Dubious, test  
> returned 2 (wstat 512, 0x200)
>
> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
> site_perl .) at t/SeqTools/Backtranslate.t line 9.
> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
> 9.
> # Looks like your test exited with 2 before it could output anything.
> t/SeqTools/Backtranslate.t ................... Dubious, test  
> returned 2 (wstat 512, 0x200)
> Failed 8/8 subtests
>
> t/SeqTools/SeqPattern.t ...................... 1/28
> #   Failed test 'use Bio::Tools::SeqPattern;'
> #   at t/SeqTools/SeqPattern.t line 12.
> #     Tried to use 'Bio::Tools::SeqPattern'.
> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
> Backtranslate.pm line 22.
> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
> # Compilation failed in require at (eval 17) line 2.
> # BEGIN failed--compilation aborted at (eval 17) line 2.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 25.
> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 431.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 432.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 31.
> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 371.
> Use of uninitialized value in concatenation (.) or string at Bio/ 
> Tools/SeqPattern.pm line 372.
>
> #   Failed test at t/SeqTools/SeqPattern.t line 38.
> #          got: 'A[][]H'
> #     expected: 'A[EQ][DN]H'
> "_reverse_translate_motif" is not exported by the  
> Bio::Tools::SeqPattern::Backtranslate module
> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
> # Looks like you planned 28 tests but ran 9.
> # Looks like you failed 4 tests of 9 run.
> # Looks like your test exited with 255 just after 9.
> t/SeqTools/SeqPattern.t ...................... Dubious, test  
> returned 255 (wstat 65280, 0xff00)
> Failed 23/28 subtests
>
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Fri Sep 18 16:09:26 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 11:09:26 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
Message-ID: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>

Dan,

No, it hasn't made it in.  Currently, the problem is it doesn't have  
any tests attached, but that could be easily fixed if anyone wanted to  
donate a little time to getting them running.  My hands are a bit full  
with other stuff for the release.

We should have some ace files all ready to go in t/data somewhere if one  
were so inclined to do that, BTW  ;>

chris

On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:

> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>> Could you archive the files and attach them to a bug report (you  
>> can mark it
>> as an enhancement request).  We can take a look.
>>
>> http://bugzilla.open-bio.org/
>
> Out of interest, has this been added? Where is it documented?
>
> Cheers,
> Dan.
>
>
>> chris
>>
>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>
>>> Chris et al. -
>>>
>>> A student and I have written code to do this - write ace files as  
>>> well as
>>> parse them one entry at a time.  In trying to use the Assembly::IO  
>>> as it
>>> was
>>> in 1.5, we ran into problems with large ace files containing many  
>>> entries
>>> because of file handle limit issues with the inherited  
>>> implementation
>>> DB_File.  Our implementation simply reads one contig at a time  
>>> instead of
>>> first trying to slurp the whole ace into memory.  I'm happy to add  
>>> it to
>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to  
>>> someone,
>>> could they help me get it into bioperl?  It may not be perfect  
>>> either, but
>>> it should be a good start.
>>>
>>> Josh
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Fri Sep 18 16:20:22 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 12:20:22 -0400
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
Message-ID: <E019D53941DD48E4B3294E113771B711@NewLife>


> Will have to see why List::MoreUtils is being used, but if it's  
> necessary it's an additional dep.

I didn't do it, officer....

> 
> chris
> 
> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
> 
>> With Chris trying to get a release out, I wanted to report these  
>> test failures from a fairly virgin system Ubuntu server 8.04.
>>
>> Scott
>>
>>
>>
>> t/SeqIO/raw.t ................................ 1/24 Can't locate  
>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl- 
>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- 
>> live /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 
>> 5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ 
>> perl/5.8 /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>> # Looks like you planned 24 tests but ran 1.
>> # Looks like your test exited with 2 just after 1.
>> t/SeqIO/raw.t ................................ Dubious, test  
>> returned 2 (wstat 512, 0x200)
>>
>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in  
>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/ 
>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/ 
>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/ 
>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ 
>> site_perl .) at t/SeqTools/Backtranslate.t line 9.
>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line  
>> 9.
>> # Looks like your test exited with 2 before it could output anything.
>> t/SeqTools/Backtranslate.t ................... Dubious, test  
>> returned 2 (wstat 512, 0x200)
>> Failed 8/8 subtests
>>
>> t/SeqTools/SeqPattern.t ...................... 1/28
>> #   Failed test 'use Bio::Tools::SeqPattern;'
>> #   at t/SeqTools/SeqPattern.t line 12.
>> #     Tried to use 'Bio::Tools::SeqPattern'.
>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:  
>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ 
>> blib/arch /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 
>> 5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 / 
>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at  
>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/ 
>> Backtranslate.pm line 22.
>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>> # Compilation failed in require at (eval 17) line 2.
>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 431.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 432.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 371.
>> Use of uninitialized value in concatenation (.) or string at Bio/ 
>> Tools/SeqPattern.pm line 372.
>>
>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>> #          got: 'A[][]H'
>> #     expected: 'A[EQ][DN]H'
>> "_reverse_translate_motif" is not exported by the  
>> Bio::Tools::SeqPattern::Backtranslate module
>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>> # Looks like you planned 28 tests but ran 9.
>> # Looks like you failed 4 tests of 9 run.
>> # Looks like your test exited with 255 just after 9.
>> t/SeqTools/SeqPattern.t ...................... Dubious, test  
>> returned 255 (wstat 65280, 0xff00)
>> Failed 23/28 subtests
>>
>>
>> -----------------------------------------------------------------------
>> Scott Cain, Ph. D. scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>> Ontario Institute for Cancer Research
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From maj at fortinbras.us  Fri Sep 18 15:55:47 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 18 Sep 2009 11:55:47 -0400
Subject: [Bioperl-l] problem parsing pdb
In-Reply-To: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
References: <741671.67508.qm@web25705.mail.ukl.yahoo.com>
Message-ID: <72DA6CA1499D4F67909197901218A9FF@NewLife>

Hi Paola--
My research reveals that this is a "standard kludge" in the PDB format. A letter
following a residue number is called an "insertion code" or "icode", and my
understanding is that it allows residues to be inserted without upsetting the
rest of the coordinates. (This is a feature, and not laziness, since people very
quickly begin to refer to amino acid coordinates based on a reference sequence
in an interesting region, and you can't easily say to the community, "hey,
that's 22 now, not 20...")

Since it's standard, you should expect it. Bio::Structure handles the icode by 
creating the residue id as follows:

   #my $res_name_num = $resname."-".$resseq;
   my $res_name_num = $resname."-".$resseq;
   $res_name_num .= '.'.$icode if $icode;

so you can get back the residue's 3-letter name, its numerical position, and
its insertion code by doing

 my ($name, $number, $icode) = $res->id =~ /(.*?)-([0-9]+)\.?([A-Z]?)/;

In this case, if the icode is not present, then $icode eq '' (not undef).
Hope this helps-
Mark
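
To get the "ILE-105A" form Paola asked for, the pieces can be split and
rejoined without the dot. The following is a minimal sketch, assuming
residue IDs look like NAME-NUMBER or NAME-NUMBER.ICODE as built by the
snippet quoted above. The helpers split_res_id and compact_res_id are
hypothetical, not BioPerl methods. The pattern is a little stricter than
the one-liner: it is anchored, and it allows a negative residue number,
which some PDB files use.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): split a residue id such as
# "ILE-105.A" into (name, number, icode). Returns an empty list if the id
# does not look like NAME-NUMBER[.ICODE].
sub split_res_id {
    my ($id) = @_;
    my ($name, $number, $icode) =
        $id =~ /^(.+?)-(-?[0-9]+)(?:\.([A-Za-z]))?$/
        or return;
    $icode = '' unless defined $icode;   # no insertion code present
    return ($name, $number, $icode);
}

# Hypothetical helper: rebuild the id without the dot, e.g. "ILE-105A".
sub compact_res_id {
    my ($id) = @_;
    my ($name, $number, $icode) = split_res_id($id)
        or return $id;                   # leave unrecognised ids untouched
    return "$name-$number$icode";
}

print compact_res_id($_), "\n" for qw(ILE-105.A ILE-2105.C GLY-106);
```

In the loop from the original script, this would replace the plain
$res->id in the print statement with compact_res_id($res->id), so no
intermediate file is needed.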

----- Original Message ----- 
From: "Paola Bisignano" <paola_bisignano at yahoo.it>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 08, 2009 4:55 AM
Subject: [Bioperl-l] problem parsing pdb


Hi,

I'm in a bit of trouble because I need to parse a PDB file exactly, to extract
the chain ID and residue ID, but I found that in some PDB files the residue
number is followed by a letter, probably because the residue was added by the
crystallographers and they didn't want to change the residue numbering in the
sequence. For example, I parsed 1PXX.pdb with my script below. I didn't find any
useful suggestion about this in the BioPerl tutorial or the online BioPerl
documentation.

#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Structure::IO;
use LWP::Simple;


my $urlpdb= 
"http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1PXX";
my $content = get($urlpdb);
my $pdb_file = qq{1pxx.pdb};
open my $f, ">$pdb_file" or die $!;
binmode $f;
print $f $content;
print qq{$pdb_file\n};
close $f;


my $structio=Bio::Structure::IO->new (-file=>$pdb_file);
my $struc=$structio->next_structure;
for my $chain ($struc->get_chains)
{
my $chainid = $chain->id ;
for my $res ($struc->get_residues($chain))
{
my $resid=$res-> id;
my $atoms= $struc->get_atoms($res);
open my $f, ">> 1pxx.parsed";
print $f "$chainid\t$resid\n";
close $f;
}
}


but it gives me a file with an error at ILE 105A and ILE 2105C because they have
a letter following the residue number. Can I solve that problem without writing
intermediate files? I need the residue ID as 105A, not 105.A, so:
A ILE-105A
without a point between the number and the letter.


Thank you all,

Paola


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From abhishek.vit at gmail.com  Fri Sep 18 16:31:00 2009
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Fri, 18 Sep 2009 12:31:00 -0400
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <be9b52410909180931w2951318eqfa01c109a032bf9d@mail.gmail.com>

I have negligible experience with ace but will be happy to do some
testing. Please let me know, though, what code and functionality need
to be checked.

Cheers,
-Abhi

On Fri, Sep 18, 2009 at 12:09 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From vecchi.b at gmail.com  Fri Sep 18 16:44:37 2009
From: vecchi.b at gmail.com (Bruno Vecchi)
Date: Fri, 18 Sep 2009 09:44:37 -0700
Subject: [Bioperl-l] test failures in main trunk
In-Reply-To: <E019D53941DD48E4B3294E113771B711@NewLife>
References: <2DEEE102-8F58-4BBF-BEAD-97A1AA364787@scottcain.net>
	<1D99E2C1-F484-4E05-8E02-0E948DBBCC6F@illinois.edu>
	<E019D53941DD48E4B3294E113771B711@NewLife>
Message-ID: <1a0c1b750909180944p55b226cbi18e3c608f401d951@mail.gmail.com>

The second test ("Can't locate ok.pm in @INC...") can be fixed by
using use_ok('My::Module') instead of use ok 'My::Module' in the test
files.

I've had a few of those in the past, and that fix did the trick.

Cheers,

Bruno.
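
For illustration, a minimal sketch of the use_ok form, with the core
module List::Util standing in for the module under test (an assumption
for the example; the BioPerl tests load Bio::* modules). The
"use ok 'Module'" form needs a separate ok.pm file to be installed, which
is what the error above complains about, whereas use_ok is provided by
Test::More itself.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Instead of:   use ok 'List::Util';    # needs ok.pm in @INC
# call use_ok inside BEGIN so the import happens at compile time:
BEGIN { use_ok( 'List::Util', 'sum' ) }

# The imported function is now available to the rest of the test file.
is( sum( 1, 2, 3 ), 6, 'imported function is callable' );
```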


2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>
>> Will have to see why List::MoreUtils is being used, but if it's necessary
>> it's an additional dep.
>
> I didn't do it, officer....
>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:11 AM, Scott Cain wrote:
>>
>>> With Chris trying to get a release out, I wanted to report these test
>>> failures from a fairly virgin system Ubuntu server 8.04.
>>>
>>> Scott
>>>
>>>
>>>
>>> t/SeqIO/raw.t ................................ 1/24 Can't locate
>>> Algorithm/Diff.pm in @INC (@INC contains: t/lib . /home/gmod/bioperl-
>>> live/blib/lib /home/gmod/bioperl-live/blib/arch /home/gmod/bioperl- live
>>> /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/ 5.8.8
>>> /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/ perl/5.8
>>> /usr/local/lib/site_perl) at t/SeqIO/raw.t line 72.
>>> BEGIN failed--compilation aborted at t/SeqIO/raw.t line 72.
>>> # Looks like you planned 24 tests but ran 1.
>>> # Looks like your test exited with 2 just after 1.
>>> t/SeqIO/raw.t ................................ Dubious, test returned 2
>>> (wstat 512, 0x200)
>>>
>>> t/SeqTools/Backtranslate.t ................... Can't locate ok.pm in
>>> @INC (@INC contains: t/lib /home/gmod/bioperl-live/blib/lib /home/
>>> gmod/bioperl-live/blib/arch /home/gmod/bioperl-live /etc/perl /usr/
>>> local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/
>>> share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/ site_perl
>>> .) at t/SeqTools/Backtranslate.t line 9.
>>> BEGIN failed--compilation aborted at t/SeqTools/Backtranslate.t line 9.
>>> # Looks like your test exited with 2 before it could output anything.
>>> t/SeqTools/Backtranslate.t ................... Dubious, test returned 2
>>> (wstat 512, 0x200)
>>> Failed 8/8 subtests
>>>
>>> t/SeqTools/SeqPattern.t ...................... 1/28
>>> #   Failed test 'use Bio::Tools::SeqPattern;'
>>> #   at t/SeqTools/SeqPattern.t line 12.
>>> #     Tried to use 'Bio::Tools::SeqPattern'.
>>> #     Error:  Can't locate List/MoreUtils.pm in @INC (@INC contains:
>>> t/lib . /home/gmod/bioperl-live/blib/lib /home/gmod/bioperl-live/ blib/arch
>>> /home/gmod/bioperl-live /etc/perl /usr/local/lib/perl/ 5.8.8
>>> /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /
>>> usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl) at
>>> Bio/Tools/SeqPattern/Backtranslate.pm line 22.
>>> # BEGIN failed--compilation aborted at Bio/Tools/SeqPattern/
>>> Backtranslate.pm line 22.
>>> # Compilation failed in require at Bio/Tools/SeqPattern.pm line 212.
>>> # Compilation failed in require at (eval 17) line 2.
>>> # BEGIN failed--compilation aborted at (eval 17) line 2.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 25.
>>> #          got: '(CT).{1,80}(C[[]]CT).(AGGGG){1,200}'
>>> #     expected: '(CT).{1,80}(C[GA][GA]CT).(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 431.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 432.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 31.
>>> #          got: '(CT).(C[][]CT){1,80}.(AGGGG){1,200}'
>>> #     expected: '(CT).(C[AG][AG]CT){1,80}.(AGGGG){1,200}'
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 371.
>>> Use of uninitialized value in concatenation (.) or string at Bio/
>>> Tools/SeqPattern.pm line 372.
>>>
>>> #   Failed test at t/SeqTools/SeqPattern.t line 38.
>>> #          got: 'A[][]H'
>>> #     expected: 'A[EQ][DN]H'
>>> "_reverse_translate_motif" is not exported by the
>>> Bio::Tools::SeqPattern::Backtranslate module
>>> Can't continue after import errors at Bio/Tools/SeqPattern.pm line 539
>>> # Looks like you planned 28 tests but ran 9.
>>> # Looks like you failed 4 tests of 9 run.
>>> # Looks like your test exited with 255 just after 9.
>>> t/SeqTools/SeqPattern.t ...................... Dubious, test returned
>>> 255 (wstat 65280, 0xff00)
>>> Failed 23/28 subtests
>>>
>>>
>>> -----------------------------------------------------------------------
>>> Scott Cain, Ph. D. scott at scottcain dot net
>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>> Ontario Institute for Cancer Research
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Fri Sep 18 16:54:36 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 17:54:36 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
Message-ID: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>

Please can you link to the bug that includes the code?


2009/9/18 Chris Fields <cjfields at illinois.edu>:
> Dan,
>
> No, it hasn't made it in.  Currently, the problem is it doesn't have any
> tests attached, but that could be easily fixed if anyone wanted to donate a
> little time to getting them running.  My hands are a bit full with other
> stuff for the release.
>
> We should have some ace files already to go in t/data somewhere if one were
> so inclined to do that, BTW  ;>
>
> chris
>
> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>
>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>
>>> Could you archive the files and attach them to a bug report (you can mark
>>> it
>>> as an enhancement request).  We can take a look.
>>>
>>> http://bugzilla.open-bio.org/
>>
>> Out of interest, has this been added? Where is it documented?
>>
>> Cheers,
>> Dan.
>>
>>
>>> chris
>>>
>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>
>>>> Chris et al. -
>>>>
>>>> A student and I have written code to do this - write ace files as well
>>>> as
>>>> parse them one entry at a time.  In trying to use the Assembly::IO as it
>>>> was
>>>> in 1.5, we ran into problems with large ace files containing many
>>>> entries
>>>> because of file handle limit issues with the inherited implementation
>>>> DB_File.  Our implementation simply reads one contig at a time instead
>>>> of
>>>> first trying to slurp the whole ace into memory.  I'm happy to add it to
>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>> someone,
>>>> could they help me get it into bioperl?  It may not be perfect either,
>>>> but
>>>> it should be a good start.
>>>>
>>>> Josh
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
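
The contig-at-a-time reading that Josh describes in the quoted message can be sketched as follows. This is only an illustration in Python, not the code attached to the bug report; the function name iter_contigs and the sample data are made up for the example. It assumes each contig record in an ACE file starts with a line beginning "CO ".

```python
import io


def iter_contigs(handle):
    """Yield (contig_name, lines) for one contig at a time.

    Only the lines of the current contig are held in memory.
    Assumption: every contig record starts with a "CO <name> ..." line.
    Anything after the last contig (e.g. trailing tags) stays attached
    to that last contig in this simple sketch.
    """
    name, block = None, []
    for line in handle:
        if line.startswith("CO "):
            if name is not None:
                yield name, block
            name, block = line.split()[1], [line]
        elif name is not None:
            block.append(line)
    if name is not None:
        yield name, block


# Tiny hand-made ACE-like sample, not real assembler output.
sample = io.StringIO(
    "AS 2 2\n"
    "\n"
    "CO contig1 4 1 1 U\n"
    "ACGT\n"
    "\n"
    "AF read1 U 1\n"
    "\n"
    "CO contig2 3 1 1 U\n"
    "GGA\n"
    "\n"
    "AF read2 C 1\n"
)

for contig_name, contig_lines in iter_contigs(sample):
    print(contig_name, len(contig_lines))
```

Because the generator never keeps more than one contig, memory use depends on the largest contig rather than on the size of the whole file.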


From dan.bolser at gmail.com  Fri Sep 18 17:09:09 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Fri, 18 Sep 2009 18:09:09 +0100
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>

2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
> Dan -- I don't know much about Assembly, so can't help there. But can I
>  encourage you and perhaps one or two others (steganographic content:
> fangly) to create a HOWTO stub out of this? Would be excellent-

I'd love to. ACE is pretty ubiquitous, so any additional info on how
to work with them using BioPerl should help a lot of people.

The problem is that I'm one of those people ;-)


I'm working on an 'ace2tab.plx' script that should encompass this
info. I'm finding that some 'read ids' have the .range format, e.g.
"read123455.23-239". However, some do not, e.g. "read123456". I'm not
sure where this ID comes from, but I think it's telling me something
about partially aligned reads. The problem is that the coordinates I'm
seeing don't reflect that (they are just the start and the end point
of the full read).

A 'proper' ace2tab script would be very nice.
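
As a starting point, the two ID styles could be told apart with something like the sketch below (Python, for illustration only). The helper name split_read_id is made up, and reading the ".23-239" suffix as a start-end range on the read is an assumption, not something confirmed for gsMapper output.

```python
import re

# Read name, optionally followed by ".<start>-<end>".
# Assumption: the suffix is a 1-based range on the read.
_READ_ID = re.compile(r"^(?P<name>.+?)(?:\.(?P<start>\d+)-(?P<end>\d+))?$")


def split_read_id(read_id):
    """Return (name, start, end); start and end are None without a suffix."""
    match = _READ_ID.match(read_id)
    if match is None:
        raise ValueError("empty or malformed read id: %r" % (read_id,))
    if match.group("start") is None:
        return match.group("name"), None, None
    return match.group("name"), int(match.group("start")), int(match.group("end"))


print(split_read_id("read123455.23-239"))
print(split_read_id("read123456"))
```

Keeping the bare name separate from the range makes it possible to group all segments that come from the same read.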


> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE file?
>
>
>> Dear Perl Monkeys,
>>
>> I wrote a little demo script for Bio::Assembly::IO here:
>>
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>>
>>
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>>
>> In particular, I read about the 4 possible coordinates of a read in an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>>
>> Additionally, I'm not sure how to distinguish between the trimmed read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>>
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the quality
>> trimmed range of the read.
>>
>> The ACE file in question is produced by the gsMapper program, which is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>>
>>
>> Thanks very much for any help you can provide on any of the above issues.
>>
>> Sincerely,
>> Dan.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
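
For the coordinates asked about in the quoted message, one option is to read them straight from the ACE records rather than through Bio::Assembly. The Python sketch below is only an illustration under these assumptions: "AF <read> <U|C> <padded start>" gives where the read starts in padded (gapped) consensus space, "RD <read> <padded length> ..." starts the read record, and the "QA" line after it holds the quality-clip and align-clip ranges as 1-based positions on the padded read as oriented in the assembly. The function name read_coordinates and the sample lines are made up, and the gsMapper peculiarities mentioned above are not handled.

```python
def read_coordinates(ace_lines):
    """Collect per-read coordinates from AF, RD and QA lines of one contig."""
    reads = {}
    current = None  # read whose RD record we are inside
    for line in ace_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "AF" and len(fields) == 4:
            info = reads.setdefault(fields[1], {})
            info["strand"] = fields[2]
            info["padded_start"] = int(fields[3])
        elif fields[0] == "RD" and len(fields) == 5:
            current = fields[1]
            reads.setdefault(current, {})["padded_length"] = int(fields[2])
        elif fields[0] == "QA" and len(fields) == 5 and current in reads:
            q1, q2, a1, a2 = (int(x) for x in fields[1:])
            info = reads[current]
            info["quality_clip"] = (q1, q2)
            info["align_in_read"] = (a1, a2)
            start = info.get("padded_start")
            if start is not None and a1 > 0:
                # Assumed mapping: shift read positions by the AF start.
                info["align_in_consensus"] = (start + a1 - 1, start + a2 - 1)
    return reads


# Hand-made example lines, not real assembler output.
example = [
    "AF read1 U 5",
    "RD read1 10 0 0",
    "ACGT*ACGTA",
    "QA 2 9 3 8",
]
coords = read_coordinates(example)
print(coords["read1"])
```

With these assumptions, the full read spans padded_start to padded_start + padded_length - 1 in consensus space, while align_in_consensus gives only the aligned portion.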


From cjfields at illinois.edu  Fri Sep 18 18:00:17 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:00:17 -0500
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>
	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
Message-ID: <DCEC55AD-5B4E-42E6-9A7E-FB52E19EADA5@illinois.edu>

Agreed, and it may spur others to get involved, fix bugs, donate code,  
etc.

chris

On Sep 18, 2009, at 10:11 AM, Mark A. Jensen wrote:

> Dan -- I don't know much about Assembly, so can't help there. But  
> can I  encourage you and perhaps one or two others (steganographic  
> content: fangly) to create a HOWTO stub out of this? Would be  
> excellent-
> cheers MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 18, 2009 10:55 AM
> Subject: [Bioperl-l] Getting read position information from an ACE  
> file?
>
>
>> Dear Perl Monkeys,
>> I wrote a little demo script for Bio::Assembly::IO here:
>> http://www.bioperl.org/wiki/Module:Bio::Assembly::IO
>> I would very much appreciate comments, criticisms and corrections on
>> that script (please just edit the wiki). For a newbie it's always the
>> same question, am I doing it right?
>> In particular, I read about the 4 possible coordinates of a read in  
>> an
>> assembly. My script only retrieves two (?) of the possible four. How
>> should it be adjusted to print all four coordinates for each read?
>> Additionally, I'm not sure how to distinguish between the trimmed  
>> read
>> vs. the full length read and/or the aligned portion of the read vs.
>> the full length read.
>> What I *really* want is the coordinates of the aligned portion of the
>> read in gapped read and gapped consensus space, along with the  
>> quality
>> trimmed range of the read.
>> The ACE file in question is produced by the gsMapper program, which  
>> is
>> part of Newbler from Roche (454), so it has some small
>> 'peculiarities', but I don't think they are critical for the task at
>> hand.
>> Thanks very much for any help you can provide on any of the above  
>> issues.
>> Sincerely,
>> Dan.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Fri Sep 18 18:03:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 18 Sep 2009 13:03:13 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
Message-ID: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>

Bug 2726

http://bugzilla.open-bio.org/show_bug.cgi?id=2726

chris

On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:

> Please can you link to the bug that includes the code?
>
>
> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>> Dan,
>>
>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>> have any
>> tests attached, but that could be easily fixed if anyone wanted to  
>> donate a
>> little time to getting them running.  My hands are a bit full with  
>> other
>> stuff for the release.
>>
>> We should have some ace files already to go in t/data somewhere if  
>> one were
>> so inclined to do that, BTW  ;>
>>
>> chris
>>
>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>
>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Could you archive the files and attach them to a bug report (you  
>>>> can mark
>>>> it
>>>> as an enhancement request).  We can take a look.
>>>>
>>>> http://bugzilla.open-bio.org/
>>>
>>> Out of interest, has this been added? Where is it documented?
>>>
>>> Cheers,
>>> Dan.
>>>
>>>
>>>> chris
>>>>
>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>
>>>>> Chris et al. -
>>>>>
>>>>> A student and I have written code to do this - write ace files  
>>>>> as well
>>>>> as
>>>>> parse them one entry at a time.  In trying to use the  
>>>>> Assembly::IO as it
>>>>> was
>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>> entries
>>>>> because of file handle limit issues with the inherited  
>>>>> implementation
>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>> instead
>>>>> of
>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>> add it to
>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>> someone,
>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>> either,
>>>>> but
>>>>> it should be a good start.
>>>>>
>>>>> Josh
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>


From e.osimo at gmail.com  Fri Sep 18 22:33:22 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sat, 19 Sep 2009 00:33:22 +0200
Subject: [Bioperl-l] Getting all annotations
Message-ID: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>

Hello,
I was trying to figure out how to get from the Entrez database all the
reference annotation for a given genomic zone.
For example: I want to know which genes, transcripts, microRNAs etc are
present in chr 6 from 100kbp to 200kbp.
Is there a database that is arranged as a continuum (by sequence) instead of
by feature (gene, transcript etc)?

Thanks
Emanuele


From florent.angly at gmail.com  Sun Sep 20 02:20:31 2009
From: florent.angly at gmail.com (Florent Angly)
Date: Sat, 19 Sep 2009 19:20:31 -0700
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
Message-ID: <4AB5916F.1090104@gmail.com>

I suppose it is a good idea to wait until bioperl-live 1.6.1 is out 
before doing any significant work on the sequence assembly module.
Also, remember the assembly-related todo list: 
http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
Florent


Chris Fields wrote:
> Bug 2726
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>
> chris
>
> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>
>> Please can you link to the bug that includes the code?
>>
>>
>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>> Dan,
>>>
>>> No, it hasn't made it in.  Currently, the problem is it doesn't have 
>>> any
>>> tests attached, but that could be easily fixed if anyone wanted to 
>>> donate a
>>> little time to getting them running.  My hands are a bit full with 
>>> other
>>> stuff for the release.
>>>
>>> We should have some ace files already to go in t/data somewhere if 
>>> one were
>>> so inclined to do that, BTW  ;>
>>>
>>> chris
>>>
>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>
>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>
>>>>> Could you archive the files and attach them to a bug report (you 
>>>>> can mark
>>>>> it
>>>>> as an enhancement request).  We can take a look.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>
>>>> Out of interest, has this been added? Where is it documented?
>>>>
>>>> Cheers,
>>>> Dan.
>>>>
>>>>
>>>>> chris
>>>>>
>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>
>>>>>> Chris et al. -
>>>>>>
>>>>>> A student and I have written code to do this - write ace files as 
>>>>>> well
>>>>>> as
>>>>>> parse them one entry at a time.  In trying to use the 
>>>>>> Assembly::IO as it
>>>>>> was
>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>> entries
>>>>>> because of file handle limit issues with the inherited 
>>>>>> implementation
>>>>>> DB_File.  Our implementation simply reads one contig at a time 
>>>>>> instead
>>>>>> of
>>>>>> first trying to slurp the whole ace into memory.  I'm happy to 
>>>>>> add it to
>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>> someone,
>>>>>> could they help me get it into bioperl?  It may not be perfect 
>>>>>> either,
>>>>>> but
>>>>>> it should be a good start.
>>>>>>
>>>>>> Josh
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From dan.bolser at gmail.com  Sun Sep 20 12:26:06 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 13:26:06 +0100
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>
	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>
	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>
	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>
	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>
	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>
	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>
	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <2c8757af0909200526u3bb1766eo5d316dc5d7a2e1a5@mail.gmail.com>

2009/9/20 Florent Angly <florent.angly at gmail.com>:

...

> Also, remember the assembly-related todo list:
> http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related

Thanks for that link Florent. It's great to see the wiki being put to
such good use in the context of OSS development! I need to make a
mental note - before posting, check the mailing list archives _and_
the wiki!

Cheers,
Dan.


> Florent
>
>
> Chris Fields wrote:
>>
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't have any
>>>> tests attached, but that could be easily fixed if anyone wanted to
>>>> donate a
>>>> little time to getting them running.  My hands are a bit full with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere if one
>>>> were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report (you can
>>>>>> mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files as
>>>>>>> well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the Assembly::IO as
>>>>>>> it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to add it
>>>>>>> to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>


From cjfields at illinois.edu  Sun Sep 20 14:34:08 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 20 Sep 2009 09:34:08 -0500
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
In-Reply-To: <4AB5916F.1090104@gmail.com>
References: <be9b52410901052142p2809652h68e6a05b3ae156eb@mail.gmail.com>	<18DF7D20DFEC044098A1062202F5FFF31A69523F20@exchsth.agresearch.co.nz>	<be9b52410901061243t576fcc1eg94928360b8e0f57b@mail.gmail.com>	<B6BFD3C2-D5D0-4732-B3E2-C2DC9DD029F1@illinois.edu>	<52cea20c0901061513x593acb44o641b87e35b8ff6fe@mail.gmail.com>	<835D79AC-0D2A-40BE-87F1-0591F69C036A@illinois.edu>	<2c8757af0909180727r5a71a41fmee71eff92a49a888@mail.gmail.com>	<124536CE-407B-4E2E-98B7-940DA4286CC8@illinois.edu>	<2c8757af0909180954ia4fecc3we72574d8ae8acd97@mail.gmail.com>
	<88BA1216-B8C6-478B-A295-4153D041F549@illinois.edu>
	<4AB5916F.1090104@gmail.com>
Message-ID: <F25C4EA4-1DB4-44F3-AB66-F58E6A90E302@illinois.edu>

Never hurts to get started, just make sure that there is a note  
indicating the status of Bio::Assembly.  In fact, the discussion page  
for it might make a good spot for Bio::Assembly design.

chris
On Sep 19, 2009, at 9:20 PM, Florent Angly wrote:

> I suppose it is a good idea to wait until bioperl-live 1.6.1 is out  
> before doing any significant work on the sequence assembly module.
> Also, remember the assembly-related todo list: http://www.bioperl.org/wiki/Align_Refactor#Bio::Assembly-related
> Florent
>
>
> Chris Fields wrote:
>> Bug 2726
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2726
>>
>> chris
>>
>> On Sep 18, 2009, at 11:54 AM, Dan Bolser wrote:
>>
>>> Please can you link to the bug that includes the code?
>>>
>>>
>>> 2009/9/18 Chris Fields <cjfields at illinois.edu>:
>>>> Dan,
>>>>
>>>> No, it hasn't made it in.  Currently, the problem is it doesn't  
>>>> have any
>>>> tests attached, but that could be easily fixed if anyone wanted  
>>>> to donate a
>>>> little time to getting them running.  My hands are a bit full  
>>>> with other
>>>> stuff for the release.
>>>>
>>>> We should have some ace files already to go in t/data somewhere  
>>>> if one were
>>>> so inclined to do that, BTW  ;>
>>>>
>>>> chris
>>>>
>>>> On Sep 18, 2009, at 9:27 AM, Dan Bolser wrote:
>>>>
>>>>> 2009/1/6 Chris Fields <cjfields at illinois.edu>:
>>>>>>
>>>>>> Could you archive the files and attach them to a bug report  
>>>>>> (you can mark
>>>>>> it
>>>>>> as an enhancement request).  We can take a look.
>>>>>>
>>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> Out of interest, has this been added? Where is it documented?
>>>>>
>>>>> Cheers,
>>>>> Dan.
>>>>>
>>>>>
>>>>>> chris
>>>>>>
>>>>>> On Jan 6, 2009, at 5:13 PM, Joshua Udall wrote:
>>>>>>
>>>>>>> Chris et al. -
>>>>>>>
>>>>>>> A student and I have written code to do this - write ace files  
>>>>>>> as well
>>>>>>> as
>>>>>>> parse them one entry at a time.  In trying to use the  
>>>>>>> Assembly::IO as it
>>>>>>> was
>>>>>>> in 1.5, we ran into problems with large ace files containing  
>>>>>>> many
>>>>>>> entries
>>>>>>> because of file handle limit issues with the inherited  
>>>>>>> implementation
>>>>>>> DB_File.  Our implementation simply reads one contig at a time  
>>>>>>> instead
>>>>>>> of
>>>>>>> first trying to slurp the whole ace into memory.  I'm happy to  
>>>>>>> add it to
>>>>>>> Bioperl, but I am not sure how to do it.  If I sent *.pm files  
>>>>>>> to
>>>>>>> someone,
>>>>>>> could they help me get it into bioperl?  It may not be perfect  
>>>>>>> either,
>>>>>>> but
>>>>>>> it should be a good start.
>>>>>>>
>>>>>>> Josh
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From dan.bolser at gmail.com  Sun Sep 20 15:09:19 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Sun, 20 Sep 2009 16:09:19 +0100
Subject: [Bioperl-l] Getting all annotations
In-Reply-To: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
References: <2ac05d0f0909181533u1e5d5d89l5c2c468950a9cef@mail.gmail.com>
Message-ID: <2c8757af0909200809g1f6c41eeyabfc8bdaac1fc19f@mail.gmail.com>

Hi Emanuele,

I guess you were Emos in irc://irc.freenode.net/#bioperl ?


I think the answer to your question can be found here:

http://www.biodas.org


All the best,
Dan.

2009/9/18 Emanuele Osimo <e.osimo at gmail.com>:
> Hello,
> I was trying to figure out how to get from the Entrez database all the
> reference annotation for a given genomic zone.
> For example: I want to know which genes, transcripts, microRNAs etc are
> present in chr 6 from 100kbp to 200kbp.
> Is there a database that is arranged as a continuum (by sequence) instead of
> by feature (gene, transcript etc)?
>
> Thanks
> Emanuele


From maj at fortinbras.us  Mon Sep 21 04:22:54 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 00:22:54 -0400
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>

Hello all,

As Brian articulated so well for many of us, 
the wiki main page is, well, butt-ugly.
Please check out the Main Page Beta at
http://www.bioperl.org/wiki/Main_Page_Beta
and respond to this thread or on the discussion 
page. 

cheers and thanks, 
MAJ


From bix at sendu.me.uk  Mon Sep 21 06:25:04 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 21 Sep 2009 07:25:04 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <4AB71C40.10902@sendu.me.uk>

Mark A. Jensen wrote:
> Hello all,
> 
> As Brian articulated so well for many of us, 
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion 
> page. 

I never thought the main page was 'butt-ugly' (rather, what I expect 
from a wiki), but, to put it bluntly, the graphical flourishes in your 
proposal are cringe-worthy. I couldn't do any better. I think for 
graphical things you'd need a professional graphics designer or similar.

The actual content and organisation of your version is probably an 
improvement though.


From rmb32 at cornell.edu  Mon Sep 21 07:40:31 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Mon, 21 Sep 2009 00:40:31 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB71C40.10902@sendu.me.uk>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk>
Message-ID: <4AB72DEF.2010008@cornell.edu>

Sendu Bala wrote:
> from a wiki), but, to put it bluntly, the graphical flourishes in your 
> proposal are cringe-worthy. I couldn't do any better. I think for 

I think what Sendu was trying to say is that he didn't like the gradient 
section heads?  There are only two graphical things on that page, and 
the other one is an enlargement of the existing logo, so I suppose 
that's what he means.

They're not my absolute favorite either, but I certainly wouldn't 
describe them as cringe-worthy!  :-P

Rob


From biopython at maubp.freeserve.co.uk  Mon Sep 21 09:45:48 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 10:45:48 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
Message-ID: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>

On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads?  There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter


From maj at fortinbras.us  Mon Sep 21 11:20:31 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:20:31 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB72DEF.2010008@cornell.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
Message-ID: <22244A89D06E4F9B8D5F70A833E1C0DE@NewLife>

Hey, if Sendu cringed, he cringed. If I had one, I'd keep my 
day job. In the meantime, the graphics are removed. 
MAJ
----- Original Message ----- 
From: "Robert Buels" <rmb32 at cornell.edu>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 3:40 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> Sendu Bala wrote:
>> from a wiki), but, to put it bluntly, the graphical flourishes in your 
>> proposal are cringe-worthy. I couldn't do any better. I think for 
> 
> I think what Sendu was trying to say is that he didn't like the gradient 
> section heads?  There are only two graphical things on that page, and 
> the other one is an enlargement of the existing logo, so I suppose 
> that's what he means.
> 
> They're not my absolute favorite either, but I certainly wouldn't 
> describe them as cringe-worthy!  :-P
> 
> Rob
> 
> 


From e.osimo at gmail.com  Mon Sep 21 11:35:00 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Mon, 21 Sep 2009 13:35:00 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <2ac05d0f0909210435k66bd0ed3x9fd13d9f4ec44634@mail.gmail.com>

I can say that, for a neophyte, the contents are a great improvement.
You can find with a lot more ease what you are searching for.

Emanuele

On Mon, Sep 21, 2009 at 06:22, Mark A. Jensen <maj at fortinbras.us> wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ


From maj at fortinbras.us  Mon Sep 21 11:32:08 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 07:32:08 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
Message-ID: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>

I'd appreciate those links, Peter- thanks
MAJ
----- Original Message ----- 
From: "Peter" <biopython at maubp.freeserve.co.uk>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Monday, September 21, 2009 5:45 AM
Subject: Re: [Bioperl-l] a Main Page proposal


On Mon, Sep 21, 2009 at 8:40 AM, Robert Buels <rmb32 at cornell.edu> wrote:
>
> I think what Sendu was trying to say is that he didn't like the gradient
> section heads? There are only two graphical things on that page, and the
> other one is an enlargement of the existing logo, so I suppose that's what
> he means.

On my browser the gradient section headers on that draft
suddenly change to grey for the section title text background
(Linux, Firefox 3.0.14).

Personally, I would also say that even this proposal is still
far too heavy (in terms of text content).

We had some similar discussions about the Biopython wiki
based homepage - although our old one was nowhere near
as busy as the current BioPerl main page, it was still not as
welcoming as our current version *tries* to be.

Old:
http://biopython.org/w/index.php?title=Biopython&oldid=2527

New:
http://biopython.org/wiki/Main_Page

It would be easy for you to embed the BioPerl OBF blog
headlines into the main page like we did.

I can dig out links to our mailing list archive if anyone is
interested in the discussion.

Peter



From pmiguel at purdue.edu  Mon Sep 21 12:01:03 2009
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 21 Sep 2009 08:01:03 -0400
Subject: [Bioperl-l] Getting read position information from an ACE file?
In-Reply-To: <2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
References: <2c8757af0909180755u2e2ca178h9ce921f9bb22c7a3@mail.gmail.com>	<FCD85C18EC5744269CEAB127F4D1D5C4@NewLife>
	<2c8757af0909181009w310bc69r3d9efa3d9a12d41b@mail.gmail.com>
Message-ID: <4AB76AFF.7050902@purdue.edu>

Dan Bolser wrote:
> 2009/9/18 Mark A. Jensen <maj at fortinbras.us>:
>   
>> Dan -- I don't know much about Assembly, so can't help there. But can I
>>  encourage you and perhaps one or two others (steganographic content:
>> fangly) to create a HOWTO stub out of this? Would be excellent-
>>     
>
> I'd love to. ACE is pretty ubiquitous, so any additional info on how
> to work with them using BioPerl should help a lot of people.
>   
> The problem is that I'm one of those people ;-)
>
>
> I'm working on an 'ace2tab.plx' script that should encompass this
> info. I'm finding that some 'read ids' have the .range format. i.e.
> "read123455.23-239". However, some do not. i.e. "read123456". Not sure
> where this ID comes from, but I think it's telling me something about
> partially aligned reads. 

I think you are right. I have heard that Newbler (the 454 assembler) 
does this insane thing, where it will rip reads apart into segments and 
cluster parts of reads in different contigs.
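
A minimal sketch of how the two ID styles quoted above could be told
apart in a script such as ace2tab.plx. It assumes the suffix is always
".start-end", which is inferred only from the two example IDs in this
thread; split_read_id is a hypothetical helper, not a BioPerl method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a read ID into its base name and, if present, the trailing
# ".start-end" range. The suffix layout is an assumption taken from
# the two example IDs in this thread, not from assembler documentation.
sub split_read_id {
    my ($id) = @_;
    if ( $id =~ /^(.+)\.(\d+)-(\d+)$/ ) {
        return { name => $1, start => $2, end => $3 };
    }
    return { name => $id, start => undef, end => undef };
}

for my $id (qw(read123455.23-239 read123456)) {
    my $part = split_read_id($id);
    if ( defined $part->{start} ) {
        printf "%s\t%d\t%d\n", $part->{name}, $part->{start}, $part->{end};
    }
    else {
        printf "%s\t(no range)\n", $part->{name};
    }
}
```

Grouping the rows by base name afterwards would show which reads have
been split across more than one contig.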

> The problem is that the coordinates I'm
> seeing don't reflect that (they are just the start and the end point
> of the full read).
>   

That sounds similar to how phrap/consed handle "chimeric" reads. But my 
experience is that phrap is pretty parsimonious with numbers of 
chimerics it will allow.  (That isn't entirely fair to Newbler -- I've 
never been able to get phrap to consistently assemble ESTs. Phrap seems 
tuned to assemble BAC shotgun reads. ESTs seem to drive it a little 
crazy. It will create contigs from a set of reads that have essentially 
no similarity to each other, nor to the consensus sequence phrap creates 
for them.)

-- 
Phillip


From hlapp at gmx.net  Mon Sep 21 12:22:34 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 21 Sep 2009 08:22:34 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <03B93F96-E28D-45CF-BD94-AD33634476AA@gmx.net>

What's probably worth looking at as an example is the gmod.org home  
page. Stylistically, one thing you want to get out of the way is the  
auto-generated TOC.

	-hilmar

On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Mon Sep 21 12:28:28 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 13:28:28 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
Message-ID: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>

Peter wrote:
>> We had some similar discussions about the Biopython wiki
>> based homepage - although our old one was nowhere near
>> as busy as the current BioPerl main page, it was still not as
>> welcoming as our current version *tries* to be.
>> ...
>> I can dig out links to our mailing list archive if anyone is
>> interested in the discussion.

On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>
> I'd appreciate those links, Peter- thanks
> MAJ

OK, here you are - this was most of it, I'd have to dig through
my old emails to see what else I can find:
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html

Remember Biopython went from a very minimal home page, to
something aiming to be more newcomer friendly. BioPerl on the
other hand seems to want to move away from the current very
text heavy information rich page to something more focused and
newcomer friendly. To me at least the current page is too dense,
intimidating, and the important bits get lost in all the content.

[My apologies if any of this feedback comes across too blunt.]

If you haven't already looked at them, you should check out the
other OBF project pages for ideas. The BioJava homepage is
also using the wiki - in my opinion it is a bit cluttered, but is
still more accessible than the current BioPerl page. Also,
the BioRuby page is very nice - although not wiki based.

Regards,

Peter


From mwachholtz at unomaha.edu  Fri Sep 18 00:31:13 2009
From: mwachholtz at unomaha.edu (Michael UNO)
Date: Thu, 17 Sep 2009 17:31:13 -0700 (PDT)
Subject: [Bioperl-l]  Genome Scanning Question
Message-ID: <25497856.post@talk.nabble.com>


What objects & methods could be used if I wanted to determine if a gene is
located at a specific location within a genome at the Ensembl database. For
example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
method that will simply tell me "yes, there is a gene at this location". And
can it tell what gene(s) are located at this coordinate?


From sdavis2 at mail.nih.gov  Mon Sep 21 13:04:36 2009
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 21 Sep 2009 09:04:36 -0400
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <264855a00909210604o826871dr7121e3f26c0e34aa@mail.gmail.com>

On Thu, Sep 17, 2009 at 8:31 PM, Michael UNO <mwachholtz at unomaha.edu> wrote:

>
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location".
> And
> can it tell what gene(s) are located at this coordinate?
>

There are a number of ways to go about this.

If you want to go with perl, object-oriented, and ensembl, check out:

http://www.ensembl.org/info/docs/api/core/core_tutorial.html

If you want to start with tab-delimited text files, check out downloading
the text files from the UCSC genome browser.
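
As a rough sketch of the text-file route: the lookup comes down to an
interval test per row. The four-column layout (chrom, start, end,
name), the gene names and the coordinates below are all made up for
illustration; a real UCSC table has more columns, so the split would
need adjusting. genes_at is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names of all genes whose interval contains $pos.
# Each row is "chrom<TAB>start<TAB>end<TAB>name"; this layout is
# hypothetical. Starts are treated as 0-based and ends as 1-based,
# the convention UCSC tables use.
sub genes_at {
    my ( $rows, $chrom, $pos ) = @_;    # $pos is a 1-based coordinate
    my @hits;
    for my $row (@$rows) {
        my ( $c, $start, $end, $name ) = split /\t/, $row;
        next unless $c eq $chrom;
        push @hits, $name if $start < $pos && $pos <= $end;
    }
    return @hits;
}

# Made-up rows for illustration only; real rows would be read (and
# chomped) from a downloaded table.
my @rows = (
    "chr15\t66499000\t66512000\tGENE_A",
    "chr15\t70000000\t70010000\tGENE_B",
);

my @found = genes_at( \@rows, 'chr15', 66_500_123 );
my $msg = @found ? "yes: @found\n" : "no gene at this position\n";
print $msg;
```

A linear scan like this is fine for a handful of queries; for many
lookups you would want to sort or index the rows by chromosome first.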

Sean




From cjfields at illinois.edu  Mon Sep 21 13:05:25 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 08:05:25 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
Message-ID: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>


On Sep 21, 2009, at 7:28 AM, Peter wrote:

> Peter wrote:
>>> We had some similar discussions about the Biopython wiki
>>> based homepage - although our old one was nowhere near
>>> as busy as the current BioPerl main page, it was still not as
>>> welcoming as our current version *tries* to be.
>>> ...
>>> I can dig out links to our mailing list archive if anyone is
>>> interested in the discussion.
>
> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>
>> I'd appreciate those links, Peter- thanks
>> MAJ
>
> OK, here you are - this was most of it, I'd have to dig through
> my old emails to see what else I can find:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>
> Remember Biopython went from a very minimal home page, to
> something aiming to be more newcomer friendly. BioPerl on the
> other hand seems to want to move away from the current very
> text heavy information rich page to something more focused and
> newcomer friendly. To me at least the current page is too dense,
> intimidating, and the important bits get lost in all the content.
>
> [My apologies if any of this feedback comes across too blunt.]

Not at all; I'm thinking the same thing.

> If you haven't already looked at them, you should check out the
> other OBF project pages for ideas. The BioJava homepage is
> also using the wiki - in my opinion it is a bit cluttered, but is
> still more accessible than the current BioPerl page. Also,
> the BioRuby page is very nice - although not wiki based.
>
> Regards,
>
> Peter

I think the Biopython layout is very nice and focused.  Maybe a bit  
too minimal, but then again I don't like scrolling up and down the  
page to find the relevant bits, so less may be better.

Reminds me of the simplified design on the perl6 main page (just don't  
stare at the hallucinogenic butterfly too long):

http://www.perl6.org/

So, maybe a structured layout with the most important links, and  
additional links on a separate page.

chris


From maj at fortinbras.us  Mon Sep 21 13:22:35 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 09:22:35 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <0F980234804C4B3EA08E810E043F2537@NewLife>

Ah! I don't need a degree in design, just a dose of whatever Madame Butterfly 
was taking!
(Erdos had it right...)



From biopython at maubp.freeserve.co.uk  Mon Sep 21 13:58:21 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Sep 2009 14:58:21 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<4AB71C40.10902@sendu.me.uk> <4AB72DEF.2010008@cornell.edu>
	<320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com>
	<3C8F39ACAD954917ACDEFD863EC99B16@NewLife>
	<320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <320fb6e00909210658n70f96727g1eb190579a746cfa@mail.gmail.com>

On Mon, Sep 21, 2009 at 2:05 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I think the Biopython layout is very nice and focused. Maybe
> a bit too minimal, but then again I don't like scrolling up and
> down the page to find the relevant bits, so less may be better.

Yes, trying to get everything on one screen was deliberate
(and works for most screen sizes).

> Reminds me of the simplified design on the perl6 main page
> (just don't stare at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links,
> and additional links on a separate page.

Butterflies aside, yes - that is what we tried to do on the
Biopython page - just provide an "abstract", and links to
get people to the main content.

Peter


From ak at ebi.ac.uk  Mon Sep 21 14:06:44 2009
From: ak at ebi.ac.uk (Andreas Kähäri)
Date: Mon, 21 Sep 2009 15:06:44 +0100
Subject: [Bioperl-l] Genome Scanning Question
In-Reply-To: <25497856.post@talk.nabble.com>
References: <25497856.post@talk.nabble.com>
Message-ID: <20090921140644.GB12734@qux.windows.ebi.ac.uk>

On Thu, Sep 17, 2009 at 05:31:13PM -0700, Michael UNO wrote:
> 
> What objects & methods could be used if I wanted to determine if a gene is
> located at a specific location within a genome at the Ensembl database. For
> example, if given a coordinate (e.g. Canine Chr15:66,500,123) is there a
> method that will simply tell me "yes, there is a gene at this location". And
> can it tell what gene(s) are located at this coordinate?

Here's a basic script to do something like what you want to do, for a
specific species, chromosome, and region:

#!/usr/bin/perl -w

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  '-host' => 'ensembldb.ensembl.org',
  '-user' => 'anonymous'
);

my $species = 'Dog';

my ( $chrname, $chrstart, $chrend ) = ( '13', 40_500_000, 41_000_000 );

my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );

my $slice =
  $slice_adaptor->fetch_by_region( 'Chromosome', $chrname, $chrstart,
  $chrend );

my @genes = @{ $slice->get_all_Genes() };

if ( !@genes ) {
  print("No genes on that interval\n");
} else {
  printf( "%d genes on the interval:\n", scalar(@genes) );
  foreach my $gene (@genes) {
    printf(
      "%s (%s) [%s,%s,%s]\n",
      $gene->stable_id(), $gene->external_name() || 'No external name',
      $gene->start(), $gene->end(), $gene->strand() );
  }
}


Are you aware of the ensembl-dev mailing list and of the ensembl
helpdesk at helpdesk at ensembl.org (or via the "he!p" button in the genome
browser itself)?


Regards,
Andreas


-- 
Andreas Kähäri, Ensembl Software Developer            ()[]()[]
European Bioinformatics Institute (EMBL-EBI)          []()[]()
Wellcome Trust Genome Campus, Hinxton                 ()[]()[]
Cambridge CB10 1SD, United Kingdom                    []()[]()


From bosborne11 at verizon.net  Mon Sep 21 13:15:03 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 21 Sep 2009 09:15:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7E8EC05A-ED60-4F70-850D-16DD7E037281@verizon.net>

Mark,

That's nice! I wonder if we can move some content up-top, on the  
right, for less scrolling. I will play with this later today...

Brian O.


On Sep 21, 2009, at 12:22 AM, Mark A. Jensen wrote:

> Hello all,
>
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
>
> cheers and thanks,
> MAJ


From anupam.contact at gmail.com  Mon Sep 21 14:18:52 2009
From: anupam.contact at gmail.com (anupam sinha)
Date: Mon, 21 Sep 2009 19:48:52 +0530
Subject: [Bioperl-l] Problems with Bioperl-run pkg
In-Reply-To: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
References: <82ec54570909180820t7981d230l48d8e4823bb2303f@mail.gmail.com>
Message-ID: <82ec54570909210718v180f604btc835d88f2a9ec2fd@mail.gmail.com>

On Fri, Sep 18, 2009 at 8:50 PM, anupam sinha <anupam.contact at gmail.com> wrote:

> Dear all,
>                  I have installed the BioPerl-1.6.0.tar.gz and
> Bioperl-run-1.6.0.tar.gz on a Fedora 7 system. I am trying to run *
> /usr/bin/bp_pairwise_kaks.pl* script but keep on getting this error :
>
> *Must have bioperl-run pkg installed to run this script at
> /usr/bin/bp_pairwise_kaks.pl line 69*.
>
> Though I have installed the run package from Bioperl. Can anyone help me
> out? Thanks in advance.
>
>
>
> Regards,
>
>
> Anupam Sinha
>


From maj at fortinbras.us  Mon Sep 21 14:49:25 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 10:49:25 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
Message-ID: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>

Please view the latest 
http://www.bioperl.org/wiki/Main_Page_Beta
No graphics. I incline towards more text, but you
already knew that.
MAJ


From David.Messina at sbc.su.se  Mon Sep 21 17:03:56 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 21 Sep 2009 19:03:56 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
Message-ID: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>

Hi Mark,
Thanks for taking on this (much needed) refresh.

I think your current version is substantially better than what we have now.
Still, I'd argue that something much more concise like the Biopython page
would make a bigger impact on visitors' ability to find what they're looking
for.

It's not that the details you have under each section shouldn't be
available, but rather that they could be clicked through to instead of being
on the front page.

The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, and that section has the most valuable real estate on the page.
Those who don't know and are curious will probably be able to find it (the
word About on the front page of a website has become an idiom for "click here
to read the details about this").


Dave


From cjfields at illinois.edu  Mon Sep 21 17:42:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 12:42:10 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife>
	<7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <5C240DA6-6B3D-4E64-A8BC-1FBC90FFA471@illinois.edu>

On Sep 21, 2009, at 12:03 PM, Dave Messina wrote:

> Hi Mark,
> Thanks for taking on this (much needed) refresh.
>
> I think your current version is substantially better than what we  
> have now.
> Still, I'd argue that something much more concise like the Biopython  
> page
> would make a bigger impact on visitors' ability to find what they're  
> looking
> for.
>
> It's not that the details you have under each section shouldn't be
> available, but rather that they could be clicked through to instead  
> of being
> on the front page.
>
> The About section is a good example. I would bet most visitors to the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, and that section has the most valuable real estate on  
> the page.
> Those who don't know and are curious will probably be able to find  
> it (the
> word About on the front page of a website has become an idiom for  
> "click here
> to read the details about this").
>
>
>
> Dave

How about this version (it's on my talk page):

http://www.bioperl.org/wiki/User_talk:Cjfields

chris


From maj at fortinbras.us  Mon Sep 21 17:45:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 21 Sep 2009 13:45:03 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
Message-ID: <42FBB964C0EA44FABCB50364C567A009@NewLife>

A nearly completely minimal solution is at Main Page Beta


From armendarez77 at hotmail.com  Mon Sep 21 21:01:12 2009
From: armendarez77 at hotmail.com (armendarez77 at hotmail.com)
Date: Mon, 21 Sep 2009 14:01:12 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
Message-ID: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>


Hello,

Is there a function to blast one query sequence against multiple blast databases?  For example, I want to blast a sequence against all Microbial Genomes.  Currently, I can do it by placing multiple Microbial databases (eg. Microbial/100226, Microbial/101510, etc) into an array and iterate through them using a foreach loop.  Each individual database is placed in the '-data' parameter and the blast is performed.

Example Code:

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;

my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690 Microbial/1063);
my $prog  = 'blastn';    # assumption: the BLAST program is not shown in this excerpt
my $e_val = '1e-3';

# Submit every query sequence to each database in turn
foreach my $db (@microbDbs) {
  my @params = ( '-prog'       => $prog,
                 '-data'       => $db,
                 '-expect'     => $e_val,
                 '-readmethod' => 'xml' );

  my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
  my $v = 1;
  my $str = Bio::SeqIO->new( '-file' => 'test.fa', '-format' => 'fasta' );
  while ( my $input = $str->next_seq() ) {
    my $r = $factory->submit_blast($input);

    #Code continues...

  }
}

Is there a more efficient way to accomplish this?

If this topic has been discussed please point the way.

Thank you,

Veronica

 		 	   		  


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 22:10:56 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:10:56 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>

You may need to set up blast locally (not a big job) as I don't think you can blast against multiple databases with B:T:R:RemoteBlast. 
Or you could do it manually on NCBI's site, where you can filter results by Entrez query (e.g. 1239[taxid] for Firmicutes): http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query 

--Russell




From bill at genenformics.com  Mon Sep 21 22:21:26 2009
From: bill at genenformics.com (bill at genenformics.com)
Date: Mon, 21 Sep 2009 15:21:26 -0700
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
Message-ID: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>

BLAST DBs can be concatenated into a single target (.nal or .pal) file.

Check this out:

http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
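
For what it's worth, an alias file is just a short text file, so it
can be generated from a script. The sketch below prints one to STDOUT.
The database names are placeholders for databases already formatted
on a local machine, and alias_file_text is a hypothetical helper, not
part of BioPerl or the BLAST tools.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the text of a BLAST alias file (.nal for nucleotide, .pal for
# protein) that groups several locally formatted databases under one
# name, using the TITLE and DBLIST keywords.
sub alias_file_text {
    my ( $title, @dbs ) = @_;
    return "TITLE $title\nDBLIST " . join( ' ', @dbs ) . "\n";
}

# Placeholder names; replace them with the base names of databases
# that already exist on your machine.
my @dbs = qw(microbial_100226 microbial_101510 microbial_103690);

print alias_file_text( 'Combined microbial genomes', @dbs );
```

Saving the output as, say, all_microbial.nal next to the database
files should let a local search use all_microbial as the database
name (for example with the -d option of blastall).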

Bill

> You may need to set up blast locally (not a big job) as I don't think you
> can blast against multiple databases with B:T:R:RemoteBlast.
> Or you could do it manually on NCBI's site where you can filter results by
> entrez query (eg. 1239[taxid] for Firmicutes)
> http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
>
> --Russell
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of armendarez77 at hotmail.com
>> Sent: Tuesday, 22 September 2009 9:01 a.m.
>> To: bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
>>
>>
>>
>>
>>
>>
>>
>> Hello,
>>
>> Is there a function to blast one query sequence against multiple blast
>> databases?  For example, I want to blast a sequence against all
>> Microbial
>> Genomes.  Currently, I can do it by placing multiple Microbial databases
>> (eg.
>> Microbial/100226, Microbial/101510, etc) into an array and iterate
>> through
>> them using a foreach loop.  Each individual database is placed in the
>> '-data'
>> parameter and the blast is performed.
>>
>> Example Code:
>>
>> use strict;
>> use Bio::Tools::Run::RemoteBlast;
>>
>> my @microbDbs = qw(Microbial/100226 Microbial/101510 Microbial/103690
>> Microbial/1063);
>> my $e_val= '1e-3';
>>
>> foreach my $db(@microbDbs){
>>   my @params = ( '-prog' => $prog,
>>                          '-data' => $db,
>>                          '-expect' => $e_val,
>>                          '-readmethod' => 'xml' );
>>
>>   my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>>   my $v = 1;
>>   my $str = Bio::SeqIO->new(-file=>'test.fa' , '-format' => 'fasta' );
>>   while (my $input = $str->next_seq()){
>>     my $r = $factory->submit_blast($input);
>>
>>     #Code continues...
>>
>> }
>>
>> Is there a more efficient way to accomplish this?
>>
>> If this topic has been discussed please point the way.
>>
>> Thank you,
>>
>> Veronica
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> =======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> =======================================================================
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 22:48:26 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 10:48:26 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>

That doesn't work with remote databases, though.
B:T:R:RemoteBlast uses the QBlast API (I think), so you're limited to the prebuilt databases NCBI offers.
http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html

Another thing to try is space-separating your db list - I know it works with local blasts.
You could also bypass RemoteBlast and send the request to the URL API yourself.

This seems to work with multiple databases, but you'd need to experiment:

http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?QUERY=257700677&DATABASE=%22Microbial/100226%20Microbial/101510%20Microbial/103690%22&HITLIST_SIZE=10&FILTER=L&EXPECT=10&FORMAT_TYPE=HTML&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides&CMD=Put
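
A URL like the one above can also be assembled in Perl rather than by hand. The following is only a sketch using core Perl: url_escape and multi_db_put_url are hypothetical helper names, the parameter set is a subset of the URL above, and whether NCBI accepts a space-separated DATABASE value is exactly the part that still needs experimenting.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
sub url_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Build a QBlast "Put" URL naming several databases in one DATABASE value.
# (Hypothetical helper; parameter names are copied from the URL above.)
sub multi_db_put_url {
    my ( $query, @dbs ) = @_;
    my @params = (
        QUERY        => $query,
        DATABASE     => '"' . join( ' ', @dbs ) . '"',   # quoted, space-separated list
        HITLIST_SIZE => 10,
        FILTER       => 'L',
        EXPECT       => 10,
        FORMAT_TYPE  => 'HTML',
        PROGRAM      => 'blastn',
        CMD          => 'Put',
    );
    my @pairs;
    while ( my ( $key, $value ) = splice( @params, 0, 2 ) ) {
        push @pairs, "$key=" . url_escape($value);
    }
    return 'http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?' . join( '&', @pairs );
}

print multi_db_put_url( '257700677',
    qw(Microbial/100226 Microbial/101510 Microbial/103690) ), "\n";
```

The helper escapes "/" as %2F, which differs cosmetically from the hand-written URL but decodes to the same value.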


--Russell


> -----Original Message-----
> From: bill at genenformics.com [mailto:bill at genenformics.com]
> Sent: Tuesday, 22 September 2009 10:21 a.m.
> To: Smithies, Russell
> Cc: 'armendarez77 at hotmail.com'; 'bioperl-l at lists.open-bio.org'
> Subject: Re: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
> 
> BLAST DBs can be concatenated into a single target (.nal or .pal) file.
> 
> Check this out:
> 
> http://www.ncbi.nlm.nih.gov/Web/Newsltr/Winter00/blastlab.html
> 
> Bill
> 


From Russell.Smithies at agresearch.co.nz  Mon Sep 21 23:04:54 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 11:04:54 +1200
Subject: [Bioperl-l] New to Bio::Tools::Run::RemoteBlast
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
References: <SNT119-W38149FD1B34EE5CA92BFBED2DD0@phx.gbl>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B9D0@exchsth.agresearch.co.nz>
	<4a1b887d0770ac557b0a2578aefdce18.squirrel@mail.dreamhost.com>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6BA02@exchsth.agresearch.co.nz>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6BA19@exchsth.agresearch.co.nz>

If you want to "manually" use Perl and QBlast, here's some example code.
I don't remember where it came from but it works well  :-)

**Ignore the UserAgent stuff, our firewall is fairly well tied down.

--Russell

============================

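#!perl -w
use strict;
use warnings;

$| = 1;    # unbuffered output, so progress messages appear immediately

use LWP::UserAgent;
use HTTP::Request::Common 'POST';

my $ua = LWP::UserAgent->new;
push @{ $ua->requests_redirectable }, 'POST';   # LWP doesn't follow redirects for POST by default
$ua->agent('Mozilla/5.0');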

#$ua->proxy( [ 'http', 'ftp' ] => 'http://username:password@your.proxy.if.required:8080' );

my $verbose = 1;
my $seq     = getSequence();
my ( $blast, $taxonomy ) = queryQBlast($seq);
$verbose && print "saving result\n";
saveToFile( $blast,    "blast.txt" );
saveToFile( $taxonomy, "taxonomy.html" );
$verbose && print "Done.\n";

sub queryQBlast {
  my ($seq) = @_;
  $seq =~ s/[\d\n\W]//g;
  my $sleepTime          = 0;
  my $sleepTimeIncrement = 5;
  my $totalSleepTime     = 0;
  my $maxSleepTime       = 600;    # 10 min
  my ( $rid, $rtoe ) = startQBlast($seq);
  my ( $blast, $taxonomy );

  while ( !$blast ) {
    $verbose && printf "wait %3d seconds\n", $sleepTime;
    sleep $sleepTime;
    $totalSleepTime += $sleepTime;    # count the time actually slept
    ( $blast, $taxonomy ) = retrieveQBlastResult($rid);
    $sleepTime += $sleepTimeIncrement unless ( $sleepTime > 100 );
    last if ( $totalSleepTime > $maxSleepTime );
  }
  return ( $blast, $taxonomy );
}

sub startQBlast {
  my ($sequence) = @_;
  my ( $expect, $wsize, $filter, $mega );
  my $hitList = 100;
  if ( length($sequence) <= 20 ) {
    $expect = 1000;
    $wsize  = 7;
    $mega   = "on";
    $filter = "";
  }
  else {
    $expect = 10;
    $wsize  = 28;
    $mega   = "on";
    $filter = "L";    # Low complexity
  }
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL . "QUERY=$sequence";
  $url .=
"&DATABASE=nr&HITLIST_SIZE=${hitList}&FILTER=${filter}&EXPECT=${expect}&FORMAT_TYPE=Text";
  $url .=
    "&PROGRAM=blastn&CLIENT=web&SERVICE=plain&NCBI_GI=on&PAGE=Nucleotides";
  $url .= "&SHOW_OVERVIEW=&WORD_SIZE=${wsize}&MEGABLAST=${mega}&CMD=Put";
  my $req = HTTP::Request->new( GET => $url );
  my $content = $ua->request($req)->content;
  $content =~ s/\s+/ /g;
  my ( $rid, $rtoe ) = $content =~
    /QBlastInfoBegin RID = ([\d\-\.\w]+) RTOE = (\d+) QBlastInfoEnd/;
  if ( !$rid ) { die "ERROR: no RID found in the QBlast response\n"; }
  $verbose && print "RID $rid\n";
  return ( $rid, $rtoe );
}

sub retrieveQBlastResult {
  my ($rid)     = @_;
  my $qblastURL = "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?";
  my $url       = $qblastURL
    . "RID=$rid&CMD=Get&SHOW_OVERVIEW=&SHOW_LINKOUT=&FORMAT_TYPE=Text";
  my ( $blast, $taxonomy, $req );
  $req = HTTP::Request->new( GET => $url );
  $blast = $ua->request($req)->content;
  if ( $blast =~ /\s+Status=WAITING/ ) {
    $blast = "";
  }
  elsif ( $blast =~ /\s+Status=UNKNOWN/ ) {
    print "Error in processing\nRID $rid\n";
    exit;
  }
  else {
    $verbose && print "got blast result\n";
    $verbose && print "retrieving taxonomy data\n";
    $url = $qblastURL . "CMD=Get&RID=$rid&FORMAT_OBJECT=TaxBlast&NCBI_GI=on";
    $req = HTTP::Request->new( GET => $url );
    $taxonomy = $ua->request($req)->content;
    $taxonomy = "" if ( $taxonomy =~ /No valid taxids found in the alignment/ );
  }
  return ( $blast, $taxonomy );
}

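sub saveToFile {
  my ( $data, $file ) = @_;
  # three-argument open with a lexical filehandle; fail loudly if the file can't be written
  open( my $out, '>', $file ) or die "Cannot write to $file: $!";
  print {$out} ( defined $data ? $data : '' );
  close $out or die "Cannot close $file: $!";
}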

sub getSequence {
  return qq{
AAAGGATTTATTGACGATGCGAACTACTCCGTTGGCCTGTTGGATGAAGGAACAAA
CCTTGGAAATGTTATTGATAACTATGTTTATGAACATACCCTGACAGGAAAAAATGCAT
TTTTTGTGGGGGATCTTGGGAAGATCGTGAAGAAGCACAGTCAGTGGCAGACCGTGGTG
GCTCAGATAAAGCCGTTTTACACGGTGAAGTGCAACTCCACTCCAGCCGTGCTTGAGAT
CTTGGCAGCTCTTGGAACTGGGTTTGCTTGTTCCAGCAAAAATGAAATGGCTTTAGTGC
AAGAATTGGGTGTATCTCCAGAAAACATCATTTTCACAAGTCCTTGTAAGCAAGTGTCT
CAGATAAAGTATGCAGCAAAAGTTGGAGTAAATATTATGACATGTGACAATGAGATTGA
ATTAAAGAAAATTGCAAGGAATCACCCAAATGCCAAGGTCTTACTACATATTGCAACAG
AAGATAATATTGGAGGTGAAGATGGTAACATGAAGTTTGGCACTACACTGAAGAATTGT
AGGCATCTTTTGGAATGTGCCAAGGAACTTGATGTCCAAATAATTGGGGTTAAATTTCA
TGTTTCAAGTGCTTGCAAAGAATATCAAGTATATGTACATGCCCTGTCTGATGCTCGAT
GTGTGTTTGACATGGCTGGAGAGTTTGGCTTTACAATGAACATGTTAGACATCGGTGGA
GGCTTCACAGGAACTGAAATTCAGTTGGAAGAGGTTAATCATGTTATCAGTCCTCTGTT
GGATATTTACTTCCCTGAAGGATCTGGCATTCAGATAATTTCAGAACCTGGAAGCTACT
ATGTATCTTCTGCGTTTACACTTGCAGTCAATATTATTGCTAAGAAAGTTGTTGAAAAT
GATAAATTTTCCTCTGGAGTAGAAAAAAATGGGAGTGATGAGCCAGCCTTCGTGTATTA
CATGAATGATGGTGTTTATGGTTCTTTTGCGAGTAAGCTTTCTGAGGACTTAAATACCA
TTCCAGAGGTTCACAAGAAATACAAGGAAGATGAGCCTCTGTTTACAAGCAGCCTTTGG
GGTCCATCCTGTGATGAGCTTGATCAAATTGTGGAAAGCTGTCTTCTTCCTGAGCTGAA
TGTGGGAGATTGGCTTATCTTTGATAACATGGGAGCAGATTCTTTCCACGAACCATCTG
CTTTTAATGATTTTCAGAGGCCAGCTATTTATTTCATGATGTCATTCAGTGATTGGTAT
GAGATGCAAGATGCTGGAATTACTTCAGATGCAATGATGAAAAACTTCTTCTTTGCACC
CTCTTGTATTCAGCTGAGCCAAGAAGACAGCTTTTCCACTGAAGCT};
}

================================
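
The polling loop in queryQBlast sleeps a little longer before each retry. As a rough illustration of what such a schedule looks like, here is a small core-Perl sketch; poll_schedule and its parameters ($step, $cap, $budget) are made-up names for this example and are not part of QBlast or the script above.

```perl
use strict;
use warnings;

# Return the list of waits (in seconds) a poller would sleep before giving up.
# Each wait grows by $step until it reaches $cap; polling stops once the
# accumulated wait exceeds $budget.
sub poll_schedule {
    my ( $step, $cap, $budget ) = @_;
    my ( $wait, $total ) = ( 0, 0 );
    my @waits;
    while ( $total <= $budget ) {
        push @waits, $wait;
        $total += $wait;
        $wait += $step if $wait < $cap;
    }
    return @waits;
}

my @waits = poll_schedule( 5, 100, 600 );
printf "%d polls, last wait %d s\n", scalar @waits, $waits[-1];
```

Printing the schedule before running a long job is a cheap way to check that the total wait really matches the intended time limit.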



From Russell.Smithies at agresearch.co.nz  Mon Sep 21 20:51:51 2009
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 22 Sep 2009 08:51:51 +1200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>

Here's a few comments to ignore at will :-)

How about using a different default skin so it doesn't look like all the other installations of MediaWiki?
I've attached a screenshot of one of my wikis using the "Daddio" skin but a bit of crafty CSS can do wonders.
Also, there's a lot of duplication with most of the links on Mediawiki:Sidebar also appearing on the main page content.
The "Treeview" is a nice extension as well for tidying up complex menus http://semeb.com/dpldemo/index.php?title=Treeview_extension 

I've got a bit of experience with wikis and extensions (we use LOTS of extensions) so let me know if there's anything you need.

--Russell


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
> Sent: Monday, 21 September 2009 4:23 p.m.
> To: BioPerl List
> Subject: [Bioperl-l] a Main Page proposal
> 
> Hello all,
> 
> As Brian articulated so well for many of us,
> the wiki main page is, well, butt-ugly.
> Please check out the Main Page Beta at
> http://www.bioperl.org/wiki/Main_Page_Beta
> and respond to this thread or on the discussion
> page.
> 
> cheers and thanks,
> MAJ
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-------------- next part --------------
A non-text attachment was scrubbed...
Name: daddio.png
Type: image/png
Size: 51263 bytes
Desc: daddio.png
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20090922/643d7f79/attachment-0004.png>

From cjfields at illinois.edu  Tue Sep 22 03:38:18 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:38:18 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
References: <~B4ab702db0000.4ab7e0410000.0001.mml.2798180807@NewLife>
	<18DF7D20DFEC044098A1062202F5FFF32B62A6B938@exchsth.agresearch.co.nz>
Message-ID: <B9C1E8A4-BDE0-45E7-858B-8BFABA1D2480@illinois.edu>

Russell, Mark,

It would be nice to change the background, just don't want it to be  
too distracting.

Also (I mentioned this to Mark off-list), I think the sidebar could be  
cleaned up considerably, but not until this becomes the default.  I  
also like the TreeView extension, very nice!  Does anyone have  
privs on the wiki to test it out?

chris

On Sep 21, 2009, at 3:51 PM, Smithies, Russell wrote:

> Here's a few comments to ignore at will :-)
>
> How about using a different default skin so it doesn't look like all  
> the other installations of MediaWiki?
> I've attached a screenshot of one of my wikis using the "Daddio"  
> skin but a bit of crafty CSS can do wonders.
> Also, there's a lot of duplication with most of the links on  
> Mediawiki:Sidebar also appearing on the main page content.
> The "Treeview" is a nice extension as well for tidying up complex  
> menus http://semeb.com/dpldemo/index.php?title=Treeview_extension
>
> I've got a bit of experience with wikis and extensions (we use LOTS  
> of extensions) so let me know if there's anything you need.
>
> --Russell
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Mark A. Jensen
>> Sent: Monday, 21 September 2009 4:23 p.m.
>> To: BioPerl List
>> Subject: [Bioperl-l] a Main Page proposal
>>
>> Hello all,
>>
>> As Brian articulated so well for many of us,
>> the wiki main page is, well, butt-ugly.
>> Please check out the Main Page Beta at
>> http://www.bioperl.org/wiki/Main_Page_Beta
>> and respond to this thread or on the discussion
>> page.
>>
>> cheers and thanks,
>> MAJ
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> <daddio.png>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Tue Sep 22 03:56:58 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 21 Sep 2009 22:56:58 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
Message-ID: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>

Just a note that the second alpha is out and propagating its way  
around the intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This should address the bugs reported by Scott from the last release.   
Just a note, but I am seeing a warning popping up with 64-bit perl  
5.10.1 on Mac with PopGen tests (I think it's a floating point  
addition issue).  Let me know if this is popping up elsewhere.

Enjoy!

chris


From jcline at ieee.org  Tue Sep 22 03:59:09 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Mon, 21 Sep 2009 22:59:09 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
Message-ID: <4AB84B8D.5080005@ieee.org>

Throwing this out there:

- there should be a screenshot section (whatever that means for bioperl)

- the grammar of the beta page should be more correct.

"Welcome to BioPerl, a community effort to produce Perl code which is
useful in biology. "
==> "Welcome to BioPerl, a community effort to produce Perl code serving
as a useful tool in the field of biology."

>>The About section is a good example. I would bet most visitors to the
BioPerl website skip over the About section because they already know what
BioPerl is, ...  Dave<<


Most good software front pages say, in a couple sentences, "what it is
and what it's for", including pictures (as screenshots).

I would bet a ton of visitors don't know what bioperl is, or what it is
used for, or how it can benefit.  There is likely a metric for this (web
stats) as the ratio of new page visits that bounce away vs. new
clickthrus from the front page to the download or docs section.   i.e. a
visitor found the page and didn't continue reading.  I don't really know
all the things bioperl is good for and I've been reading about it here &
there for a while.

I like the following from the About and I believe it fits well on a
front page, expanding "toolkit" to "software library":

"What is Bioperl? It is an open source bioinformatics software library
used by researchers all over the world. If you're looking for a script
built to fit your exact needs you probably won't find it in Bioperl.
What you will find is a diverse set of Perl modules that will enable you
to write your own script, and a community of people who are willing to
help you. "

The old school definition of software library is something like: "useful
routines which can be used by an application (& not itself an
application)" which is basically the description above.

I also like the intro from wikipedia, which I found more informative
about bioperl, and would be good for a front page:

"BioPerl [1] is a collection of Perl modules that facilitate the
development of Perl scripts for bioinformatics applications. It has
played an integral role in the Human Genome Project.[2]  It is an active
open source software project supported by the Open Bioinformatics
Foundation.  In order to take advantage of BioPerl, the user needs a
basic understanding of the Perl programming language including an
understanding of how to use Perl references, modules, objects and methods."

The screenshots could also include pics of books on bioperl or perl+bio,
that would be neat.  (Tisdall's book comes to mind here)


## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From lelbourn at science.mq.edu.au  Tue Sep 22 05:05:28 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 15:05:28 +1000
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB36451.3030207@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
Message-ID: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>

Hi Roy,

Thanks for that, works well, but there are no _gsf_tag_hash values?  
I'm particularly interested in the locus id. Obviously the translation  
could be problematic if the whole gene is not included after  
truncation, but things like the note, product, protein_id would be  
good. I had a look at the code for the method and couldn't see any  
obvious reason why those values didn't make it across. Should I submit  
this as a bug, or is there something I'm missing?


Regards,
Liam.


On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:

> Hi Liam,
>
> I just discovered your message, which has not yet been replied to.  
> What you require has been discussed in a recent thread:
> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>
> Try using trunc_with_features from Bio::SeqUtils:
>
> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
> Cheers.
> Roy.
>
> Liam Elbourne wrote:
>> Hi All,
>> Is there a method or methodology that will produce a fully fledged  
>> Seq  object with all the associated metadata given a start and end   
>> position? To clarify, I create a sequence object from a genbank file:
>> ****
>> my $io  = Bio::SeqIO->new(as per usual);
>> my $seqobj = $io->next_seq();
>> ****
>> I now want:
>> my $sub_seqobj = $seqobj between 300 and 2000
>> where $sub_seqobj is a Seq object (which I appreciate is an   
>> 'aggregate' of objects) too. The "trunc" method only returns a   
>> PrimarySeq object which lacks all the annotation etc. I've  
>> previously  done this task by iterating through feature by feature  
>> and parsing out  what I needed, but thought there might be a more  
>> elegant approach...
>> Regards,
>> Liam Elbourne.
>
> -- 
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________

Dr Liam Elbourne
Research Fellow (Bioinformatics)
Paulsen Laboratory
Macquarie University
Sydney
Australia.

http://www2.oxfam.org.au/trailwalker/Sydney/team/228


From roy.chaudhuri at gmail.com  Tue Sep 22 07:17:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 08:17:26 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>
	<4AB36451.3030207@gmail.com>
	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
Message-ID: <4AB87A06.4000209@gmail.com>

Hi Liam,

Yes, that is a bug - I think it is to do with the Feature Annotation 
rollback from 1.6; it works fine with 1.5.2. It looks like the tests I 
wrote only check the coordinates of the feature, not the presence of 
tags, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
take a look when I get a chance.

Cheers.
Roy.

Liam Elbourne wrote:
> Hi Roy,
> 
> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
> particularly interested in the locus id. Obviously the translation could 
> be problematic if the whole gene is not included after truncation, but 
> things like the note, product, protein_id would be good. I had a look at 
> the code for the method and couldn't see any obvious reason why those 
> values didn't make it across. Should I submit this as a bug, or is there 
> something I'm missing?
> 
> 
> Regards,
> Liam.
> 
> 
> 
> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
> 
>> Hi Liam,
>>
>> I just discovered your message, which has not yet been replied to. 
>> What you require has been discussed in a recent thread:
>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>
>> Try using trunc_with_features from Bio::SeqUtils:
>>
>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>> Cheers.
>> Roy.
>>
>> Liam Elbourne wrote:
>>> Hi All,
>>> Is there a method or methodology that will produce a fully fledged 
>>> Seq  object with all the associated metadata given a start and end 
>>>  position? To clarify, I create a sequence object from a genbank file:
>>> ****
>>> my $io  = Bio::SeqIO->new(as per usual);
>>> my $seqobj = $io->next_seq();
>>> ****
>>> I now want:
>>> my $sub_seqobj = $seqobj between 300 and 2000
>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>> previously  done this task by iterating through feature by feature 
>>> and parsing out  what I needed, but thought there might be a more 
>>> elegant approach...
>>> Regards,
>>> Liam Elbourne.
>>
>> -- 
>> Dr. Roy Chaudhuri
>> Department of Veterinary Medicine
>> University of Cambridge, U.K.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> ______________________________
> 
> Dr Liam Elbourne
> Research Fellow (Bioinformatics)
> Paulsen Laboratory
> Macquarie University
> Sydney
> Australia.
> 
> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
> 
> 
> 


From lelbourn at science.mq.edu.au  Tue Sep 22 07:14:44 2009
From: lelbourn at science.mq.edu.au (Liam Elbourne)
Date: Tue, 22 Sep 2009 17:14:44 +1000
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
Message-ID: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>

So I also had no problem running the code as written by Jose (Bioperl  
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.  
Work is underway to correct them, but do not expect this code to give  
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it  
does produce different numbers to $stats->distance(-).
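
For anyone wanting to compare the two sets of numbers, the Jukes-Cantor
distance can be checked by hand. What follows is only a minimal sketch,
assuming the usual formula d = -3/4 * ln(1 - 4p/3), where p is the
proportion of differing sites, and assuming that columns with a gap in
either sequence are skipped. dnadist and DNAStatistics may treat gaps and
ambiguity codes differently, which is one possible source of a
discrepancy. The name jc_distance is made up for illustration and is not
a BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: Jukes-Cantor distance between
# two aligned sequences of equal length. Columns with a gap in either
# sequence are skipped (an assumption; other tools may differ).
sub jc_distance {
    my ($s1, $s2) = @_;
    die "sequences must be the same length\n"
        unless length($s1) == length($s2);
    my ($sites, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my $x = uc substr($s1, $i, 1);
        my $y = uc substr($s2, $i, 1);
        next if $x eq '-' or $y eq '-';   # skip gapped columns
        $sites++;
        $diffs++ if $x ne $y;
    }
    return undef unless $sites;           # nothing to compare
    my $p = $diffs / $sites;              # proportion of differing sites
    return undef if $p >= 0.75;           # distance undefined (saturated)
    return -0.75 * log(1 - 4 * $p / 3);
}

# One difference in ten ungapped sites, i.e. p = 0.1
printf "%.4f\n", jc_distance('ACGTACGTAC', 'ACGTACGTAA');
```

Running the same pair of sequences through both tools and through a
hand check like this should narrow down where the numbers diverge.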

I tried write_matrix from Bio::Matrix::IO - got a message saying it  
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change  
your life....
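
For anyone who has not used it, here is a minimal sketch of what
Data::Dumper (together with blessed from Scalar::Util) shows for a plain
array reference versus an object. The class name My::Matrix is invented
for the example; it only stands in for whatever object distance() is
expected to return.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(blessed);

$Data::Dumper::Indent = 1;    # more compact nested output

# A plain (unblessed) array reference, like the ARRAY(0x...) Jose saw
my $plain  = [ [ 0, 0.12 ], [ 0.12, 0 ] ];

# The same data wrapped in an object of a hypothetical class
my $object = bless { values => $plain }, 'My::Matrix';

for my $thing ($plain, $object) {
    if (blessed $thing) {
        print "object of class ", ref($thing), "\n";
    }
    else {
        print "unblessed ", ref($thing), " reference; method calls will die\n";
    }
    print Dumper($thing);
}
```

Checking blessed() before calling a method is a cheap way to turn the
"unblessed reference" error into a more informative message.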

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  
> older code didn't recognize 'jukes-cantor' but you can try the  
> abbreviation 'jc' -- better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  
> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  
> function) rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  
>> upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  
>> reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix- 
>> >print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  
>> following message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html 
>> , as it is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  
>> line 32, <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________


From maj at fortinbras.us  Tue Sep 22 11:12:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 07:12:38 -0400
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>

Thanks Liam-- I think the discrepancy between dnadist and the
module is worth making a bug report for- can you do that and
include the data (or part of it) you were using?
Jason, is that work really underway, or should someone pick up
that ball?
----- Original Message ----- 
From: "Liam Elbourne" <lelbourn at science.mq.edu.au>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>; "Jose ." 
<joseguillin at hotmail.com>
Sent: Tuesday, September 22, 2009 3:14 AM
Subject: [Bioperl-l] dnastatistics


So I also had no problem running the code as written by Jose (Bioperl
1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:

"The routines are not well tested and do contain errors at this point.
Work is underway to correct them, but do not expect this code to give
you the right answer currently!"!

So I'm using dnadist (as I think the documentation recommends), and it
does produce different numbers to $stats->distance(-).

I tried write_matrix from Bio::Matrix::IO - got a message saying it
hasn't been implemented yet?

And if Jose hasn't already found it, try Data::Dumper; it will change
your life....

Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

> Yeah it seems like more of a bioperl problem -- possible that the  older code 
> didn't recognize 'jukes-cantor' but you can try the  abbreviation 'jc' --  
> better to just upgrade tho!
>
> This isn't the cause of the problem but I would also encourage use  of 
> Bio::Matrix::IO for printing the matrix (use the 'write_matrix'  function) 
> rather than print_matrix on the matrix itself.
>
> -jason
> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>
>> Hi Jose--
>> I don't get any problem with your script as written. You should  upgrade to
>> BioPerl 1.6 and try again.
>> The "unblessed reference" is $jcmatrix. It may be undef for some  reason.
>> MAJ
>> ----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
>> To: <bioperl-l at bioperl.org>
>> Sent: Monday, September 14, 2009 8:48 AM
>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix-
>> >print_matrix;
>>
>>
>>
>>
>>
>> Hello,
>>
>> I'm trying to use Bio::Align::DNAStatistics, but I get the  following 
>> message:
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Other modules do work, such as Bio::SimpleAlign;
>>
>>
>>
>>
>> My code is basically a modification of the code I found in 
>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html , 
>> as it is as follows:
>>
>> use strict;
>> use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>>
>>
>> my $stats = Bio::Align::DNAStatistics->new();
>>
>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>                          -format => 'fasta');
>> my $aln = $alignin->next_aln;
>>
>> my $jcmatrix = $stats-> distance (-align => $aln,
>>                -method => 'Jukes-Cantor');
>>
>> print $jcmatrix->print_matrix;
>>
>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>
>>> A
>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>> GTCAGCGTAGGCCTAGACGGCT
>>
>>> B
>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>> ATAAGAGTAGGTCGGGATGGCA
>>
>>> C
>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>> GTATGTGCAGGTCGAAACGAGT
>>
>>> D
>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>> CTAAGAGTAGCTCGACACGGCT
>>
>>
>>
>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>
>> Moreover, if I write
>>        print $jcmatrix;
>> instead of
>>        print $jcmatrix->print_matrix;
>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>
>> So my question is:
>>
>> Why do I have an unblessed reference?
>>
>> Can't call method "print_matrix" on unblessed reference at Tree.pl  line 32, 
>> <GEN0> line 44.
>>
>> Thank you very much in advance.
>>
>> Jose G.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

______________________________


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From dan.bolser at gmail.com  Tue Sep 22 13:09:50 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 14:09:50 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
Message-ID: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>

Hi all,

I'm reading in a blasttable format blast result, filtering, and hoping
to write out a similarly formatted result. Based on experience with
SeqIO, I expected to do something like the following:

use Bio::SearchIO;

## Open the sequence search report
my $seqI = Bio::SearchIO->
  new( -file   => $file,
       -format => $format,
     );

## Open the output report
my $seqO = Bio::SearchIO->
  new( -file   => ">OUTPUT",
       -format => $format,
     );

while( my $result = $seqI->next_result ) {
  ## Do some filtering...

  $seqO->write_result( $result );
}


However, the above method does not work here. Is this for some deep
reason, or could the above method (based on the way SeqIO works) be
made to work? I'm guessing that the SearchIO object conversion is
simply harder to do than with SeqIO?

So now I'm trying to use the correct method, via
Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
1 to 1 correspondence between the fields in the blasttable and the
columns provided by the writer. So far I have something like this:

blasttable ->		HSPTableWriter

(result) query_name	query_name
(hit) name		hit_name
(hsp) frac_identical	frac_identical_query?
			frac_identical_hit?
(hsp) hsp_length	length_aln_query?
			length_aln_hit?
(?) mismatches		?
(hsp) gaps		?
			gaps_query?
			gaps_hit?
			gaps_total?
(hsp) start('query')	start_query
(hsp) end('query')	end_query
(hsp) start('hit')	start_hit
(hsp) end('hit')	end_hit
(hsp) significance	expect
(hsp) bits		bits


For (hsp) frac_identical, it seems as if the (undocumented)
frac_identical_total column is giving the right value; however, it's
hard to be certain because the format of the value is different
(the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
change the output format of HSPTableWriter?

Is there any improvement on the above mapping? It seems strange that I
can read in a blasttable, but I can't write one out (using a generic
object interface). For example, where do I get the hsp length from
(which column)?

I'm sure this has come up before, so apologies for not being able to
track down the appropriate docs.


Thanks for any help,
Dan.

P.S. when dumping a blasttable from a blasttable using HSP methods,
how should I calculate the number of mismatches? Currently I'm trying:

      my $len = $hsp->length;
      my $match = $len * $hsp->frac_identical;
      my $mismatch = $len - $match;

but the resulting values differ from those in the original blasttable.
I have the feeling this is a FAQ ...


From cjfields at illinois.edu  Tue Sep 22 14:00:44 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:00:44 -0500
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
Message-ID: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>

On Sep 22, 2009, at 8:09 AM, Dan Bolser wrote:

> Hi all,
>
> I'm reading in a blasttable format blast result, filtering, and hoping
> to write out a similarly formatted result. Based on experience with
> SeqIO, I expected to do something like the following:
>
> use Bio::SearchIO;
>
> ## Open the sequence search report
> my $seqI = Bio::SearchIO->
> new( -file   => $file,
>      -format => $format,
>    );
>
> ## Open the output report
> my $seqO = Bio::SearchIO->
> new( -file   => ">OUTPUT",
>      -format => $format,
>    );
>
> while( my $result = $seqI->next_result ) {
> ## Do some filtering...
>
> $seqO->write_result( $result );
> }
>
>
> However, the above method does not work here. Is this for some deep
> reason, or could the above method (based on the way SeqIO works) be
> made to work? I'm guessing that the SearchIO object conversion is
> simply harder to do than with SeqIO?

This is something Jason could probably speak up on, but from my  
perspective it comes down to 'why?'.  This opens up a very hard-to- 
implement door (converting to and from, for instance, BLAST to HMMER),  
which doesn't make sense from the end-user perspective.  What most  
users want out of those formats is getting at the data in an easily  
accessible way, to further process them (filter, to GFF, etc), or to  
have them summarized.  The Writer classes take care of the latter.

There is a very generic, all-purpose write_result in Bio::SearchIO  
that just calls a ResultWriter object (and dies if it isn't  
present).  Note that this expects a ResultWriter, not a Hit/HSPWriter;  
it is write_result() after all. I think this kind of goes against the  
well-established API that exists with the other write_foo  
implementations for the IO classes, where the input/output format  
should match, but there you have it.

> So now I'm trying to use the correct method, via
> Bio::SearchIO::Writer::HSPTableWriter. The problem is, I can't find a
> 1 to 1 correspondence between the fields in the blasttable and the
> columns provided by the writer. So far I have something like this:
>
> blasttable ->		HSPTableWriter
>
> (result) query_name	query_name
> (hit) name		hit_name
> (hsp) frac_identical	frac_identical_query?
> 			frac_identical_hit?
> (hsp) hsp_length	length_aln_query?
> 			length_aln_hit?
> (?) mismatches		?
> (hsp) gaps		?
> 			gaps_query?
> 			gaps_hit?
> 			gaps_total?
> (hsp) start('query')	start_query
> (hsp) end('query')	end_query
> (hsp) start('hit')	start_hit
> (hsp) end('hit')	end_hit
> (hsp) significance	expect
> (hsp) bits		bits
>
>
> For (hsp) frac_identical, it seems as if the (undocumented)
> frac_identical_total column is giving the right value; however, it's
> hard to be certain because the format of the value is different
> (the blasttable says 93.51 while HSPTableWriter says 0.94). How can I
> change the output format of HSPTableWriter?

Not sure but it appears hard-coded.  This could probably be rewritten  
to spit out certain data attributes by name (e.g. you could ask for  
percent_identity), but I'm not sure.

> Is there any improvement on the above mapping? It seems strange that I
> can read in a blasttable, but I can't write one out (using a generic
> object interface). For example, where do I get the hsp length from
> (which column)?
>
> I'm sure this has come up before, so apologies for not being able to
> track down the appropriate docs.

 From the POD:

'Here are the columns that can be specified in the -columns
parameter when creating a HSPTableWriter object.  If a -columns
parameter is not specified, this list, in this order, will be used
as the default.'

In other words, you keep track of the columns (which appear 1-based).

> Thanks for any help,
> Dan.
> P.S. when dumping a blasttable from a blasttable using HSP methods,
> how should I calculate the number of mismatches? Currently I'm trying:
>
>     my $len = $hsp->length;
>     my $match = $len * $hsp->frac_identical;
>     my $mismatch = $len - $match;
>
> but the resulting values differ from those in the original blasttable.
> I have the feeling this is a FAQ ...

Maybe use seq_inds instead?

BTW, HSP length() defaults to the 'total' length (includes gaps).  The  
above calculation doesn't account for that.

With seq_inds, 'mismatch' are residue-only (no gaps); 'no_match' is  
mismatched residues + gaps (you have to also indicate whether this is  
based on the query or hit).

Also note that seq_inds deals with (1) mapping differences, e.g. any  
query that requires translation, and (2) frameshifts, such as from  
FASTX/Y output (again translated sequence output).  If you are dealing  
with a translated sequence you will want to account for those bits as  
well.
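
The distinction above can be illustrated without BioPerl at all. This is
only a minimal sketch, assuming the aligned query and hit strings are
already in hand and that '-' marks a gap; it ignores translation and
frameshifts entirely. The name aln_counts is invented for illustration
and is not a BioPerl method. It also converts the identity fraction to
the percentage style seen in the blasttable.

```perl
use strict;
use warnings;

# Hypothetical helper: tally column types in a pairwise alignment.
sub aln_counts {
    my ($query, $hit) = @_;
    die "aligned strings differ in length\n"
        unless length($query) == length($hit);
    my %n = (identical => 0, mismatch => 0, gap => 0);
    for my $i (0 .. length($query) - 1) {
        my $q = uc substr($query, $i, 1);
        my $h = uc substr($hit,   $i, 1);
        if    ($q eq '-' or $h eq '-') { $n{gap}++ }        # gap in either
        elsif ($q eq $h)               { $n{identical}++ }
        else                           { $n{mismatch}++ }   # residue vs residue
    }
    $n{length} = length($query);    # total length, gaps included
    return \%n;
}

my $c = aln_counts('ACGT-ACGTA', 'ACGTTACCTA');

# Mismatches are residue-only; "no match" would be mismatch + gap.
printf "len=%d ident=%d mismatch=%d gaps=%d pct_ident=%.2f\n",
    @{$c}{qw(length identical mismatch gap)},
    100 * $c->{identical} / $c->{length};
```

In this toy alignment, total length minus identities is 2, while the
mismatch count is 1, because one of the non-identical columns is a gap.
That is the same kind of difference the length * frac_identical
calculation runs into.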

chris


From cjfields at illinois.edu  Tue Sep 22 14:20:47 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 09:20:47 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4AB84B8D.5080005@ieee.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
Message-ID: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>

On Sep 21, 2009, at 10:59 PM, Jonathan Cline wrote:

> Throwing this out there:
>
> - there should be a screenshot section (whatever that means for  
> bioperl)

The only area that would apply is for Gbrowse/Bio::Graphics.  For much  
of the rest that's a bit trickier, but it's possible.

> - the grammar of the beta page should be more correct.
>
> "Welcome to BioPerl, a community effort to produce Perl code which is
> useful in biology. "
> ==> "Welcome to BioPerl, a community effort to produce Perl code
> serving as a useful tool in the field of biology."
>
>>> The About section is a good example. I would bet most visitors to  
>>> the
> BioPerl website skip over the About section because they already  
> know what
> BioPerl is, ...  Dave<<
>
> Most good software front pages say, in a couple sentences, "what it is
> and what it's for", including pictures (as screenshots).

Right.

> I would bet a ton of visitors don't know what bioperl is, or what it  
> is
> used for, or how it can benefit.  There is likely a metric for this  
> (web
> stats) as the ratio of new page visits that bounce away vs. new
> clickthrus from the front page to the download or docs section.    
> i.e. a
> visitor found the page and didn't continue reading.  I don't really  
> know
> all the things bioperl is good for and I've been reading about it  
> here &
> there for a while.
>
> I like the following from the About and I believe it fits well on a
> front page, expanding "toolkit" to "software library":
>
> "What is Bioperl? It is an open source bioinformatics software library
> used by researchers all over the world. If you're looking for a script
> built to fit your exact needs you probably won't find it in Bioperl.
> What you will find is a diverse set of Perl modules that will enable  
> you
> to write your own script, and a community of people who are willing to
> help you. "
>
> The old school definition of software library is something like:  
> "useful
> routines which can be used by an application (& not itself an
> application)" which is basically the description above.
>
> I also like the intro from wikipedia, which I found more informative
> about bioperl, and would be good for a front page:
>
> 'BioPerl [1] is a collection of Perl modules that facilitate the
> development of Perl scripts for bioinformatics applications. It has
> played an integral role in the Human Genome Project.[2]  It is an  
> active
> open source software project supported by the Open Bioinformatics
> Foundation.  In order to take advantage of BioPerl, the user needs a
> basic understanding of the Perl programming language including an
> understanding of how to use Perl references, modules, objects and  
> methods."
>
> The screenshots could also include pics of books on bioperl or perl 
> +bio,
> that would be neat.  (Tisdall's book comes to mind here)

I tend to agree here, but Tisdall only discusses BioPerl in detail in  
the second book (Mastering Perl for Bioinformatics).  I think we're  
safe as long as we indicate that, just don't want to run into a  
situation like the recent issue that some users had with Gentleman's  
'R for Bioinformatics' book released last year.

I don't think it was intentional, but a lot of users purchased it  
thinking it would be a BioConductor book, mainly b/c it was advertised  
on the BioConductor website.  Unfortunately it had very little to do  
with BioC (or bioinformatics, really), and the reviews of the book  
reflect that.  It's unfortunate, as I found it to be a pretty good  
book on R.

-c

> ## Jonathan Cline
> ## jcline at ieee.org
> ## Mobile: +1-805-617-0223
> ########################


From cjfields at illinois.edu  Tue Sep 22 15:53:13 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 10:53:13 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 2 released
In-Reply-To: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
References: <2736FAB1-3728-465F-A07B-A8FFA790FC4C@illinois.edu>
Message-ID: <2ED641E3-F69E-4513-B261-0949FDE35EBB@illinois.edu>

And just as quickly, I'm getting back lots of reports from CPAN
Testers indicating more problems.  Some can be ignored (they appear to
be due to the local perl testing environment, so are local to the
tester).  The following are the most significant; it appears a
hard-coded SeqFeature_SQLite.t got bundled in somehow, so I'll drop an
alpha 3 shortly.

chris

#   Failed test 'use Bio::SeqFeature::Annotated;'
#   at t/Annotation/Annotation.t line 23.
#     Tried to use 'Bio::SeqFeature::Annotated'.
#     Error:  Can't locate URI/Escape.pm in @INC (@INC contains: t/ 
lib . /Users/david/cpantesting/perl-5.10.1/.cpan/build/ 
BioPerl-1.6.0._2-QVXU9n/blib/lib /Users/david/cpantesting/ 
perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n/blib/arch /Users/david/ 
cpantesting/perl-5.10.1/.cpan/build/BioPerl-1.6.0._2-QVXU9n /sw/lib/ 
perl5 /sw/lib/perl5/darwin /Users/david/cpantesting/perl-5.10.1/lib/ 
5.10.1/darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/ 
lib/5.10.1 /Users/david/cpantesting/perl-5.10.1/lib/site_perl/5.10.1/ 
darwin-thread-multi-2level /Users/david/cpantesting/perl-5.10.1/lib/ 
site_perl/5.10.1) at Bio/SeqFeature/Annotated.pm line 100.
# BEGIN failed--compilation aborted at Bio/SeqFeature/Annotated.pm  
line 100.
# Compilation failed in require at (eval 60) line 2.
# BEGIN failed--compilation aborted at (eval 60) line 2.
# Looks like you failed 1 test of 159.
t/Annotation/Annotation.t ....................
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/159 subtests
	(less 12 skipped subtests: 146 okay)


t/LocalDB/SeqFeature.t ....................... ok
DBD::SQLite::db prepare_cached failed: near "INDEXED": syntax error(1)  
at dbdimp.c line 271 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1678.

-------------------- EXCEPTION --------------------
MSG: near "INDEXED": syntax error(1) at dbdimp.c line 271
STACK Bio::DB::SeqFeature::Store::DBI::mysql::_prepare Bio/DB/ 
SeqFeature/Store/DBI/mysql.pm:1678
STACK Bio::DB::SeqFeature::Store::DBI::SQLite::_features Bio/DB/ 
SeqFeature/Store/DBI/SQLite.pm:665
STACK Bio::DB::SeqFeature::Store::get_features_by_attribute Bio/DB/ 
SeqFeature/Store.pm:961
STACK toplevel t/LocalDB/SeqFeature.t:135
-------------------------------------------
# Looks like you planned 69 tests but only ran 40.
# Looks like your test died just after 40.
t/LocalDB/SeqFeature_SQLite.t ................
Failed 29/69 subtests


On Sep 21, 2009, at 10:56 PM, Chris Fields wrote:

> Just a note that the second alpha is out and propagating its way  
> around the intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This should address the bugs reported by Scott from the last  
> release.  Just a note, but I am seeing a warning popping up with 64- 
> bit perl 5.10.1 on Mac with PopGen tests (I think it's a floating  
> point addition issue).  Let me know if this is popping up elsewhere.
>
> Enjoy!
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Tue Sep 22 16:01:51 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:01:51 -0700
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl><7AD546C5A6BE4B66BF9705BC885E08B1@NewLife><8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
	<39991E8FD29E4A43B8098C0BA6740C9C@NewLife>
Message-ID: <1027EFFB-18B5-446B-A5B0-9DA628EEEF08@bioperl.org>

someone should pick up the ball.

On Sep 22, 2009, at 4:12 AM, Mark A. Jensen wrote:

> Thanks Liam-- I think the discrepancy between dnadist and the
> module is worth making a bug report for- can you do that and
> include the data (or part of it) you were using?
> Jason, is that work really underway, or should someone pick up
> that ball?
> ----- Original Message ----- From: "Liam Elbourne" <lelbourn at science.mq.edu.au 
> >
> To: "Jason Stajich" <jason at bioperl.org>
> Cc: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at bioperl.org>;  
> "Jose ." <joseguillin at hotmail.com>
> Sent: Tuesday, September 22, 2009 3:14 AM
> Subject: [Bioperl-l] dnastatistics
>
>
> So I also had no problem running the code as written by Jose (Bioperl
> 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
>
> "The routines are not well tested and do contain errors at this point.
> Work is underway to correct them, but do not expect this code to give
> you the right answer currently!"!
>
> So I'm using dnadist (as I think the documentation recommends), and it
> does produce different numbers to $stats->distance(-).
>
> I tried write_matrix from Bio::Matrix::IO - got a message saying it
> hasn't been implemented yet?
>
> And if Jose hasn't already found it, try Data::Dumper; it will change
> your life....
>
> Regards,
> Liam.
>
> On 15/09/2009, at 3:54 AM, Jason Stajich wrote:
>
>> Yeah it seems like more of a bioperl problem -- possible that the   
>> older code didn't recognize 'jukes-cantor' but you can try the   
>> abbreviation 'jc' --  better to just upgrade tho!
>>
>> This isn't the cause of the problem but I would also encourage use   
>> of Bio::Matrix::IO for printing the matrix (use the 'write_matrix'   
>> function) rather than print_matrix on the matrix itself.
>>
>> -jason
>> On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:
>>
>>> Hi Jose--
>>> I don't get any problem with your script as written. You should   
>>> upgrade to
>>> BioPerl 1.6 and try again.
>>> The "unblessed reference" is $jcmatrix. It may be undef for some   
>>> reason.
>>> MAJ
>>> ----- Original Message ----- From: "Jose ."  
>>> <joseguillin at hotmail.com>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Monday, September 14, 2009 8:48 AM
>>> Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;
>>>
>>>
>>>
>>>
>>>
>>> Hello,
>>>
>>> I'm trying to use Bio::Align::DNAStatistics, but I get the   
>>> following message:
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Other modules do work, such as Bio::SimpleAlign;
>>>
>>>
>>>
>>>
>>> My code is basically a modification of the code I found at
>>> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html
>>> and is as follows:
>>>
>>> use strict;
>>> use Bio::AlignIO;
>>> use Bio::Align::DNAStatistics;
>>>
>>>
>>> my $stats = Bio::Align::DNAStatistics->new();
>>>
>>> my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
>>>                         -format => 'fasta');
>>> my $aln = $alignin->next_aln;
>>>
>>> my $jcmatrix = $stats-> distance (-align => $aln,
>>>               -method => 'Jukes-Cantor');
>>>
>>> print $jcmatrix->print_matrix;
>>>
>>> And the file 'e1_output_uno_solo.fas' has the following sequences:
>>>
>>>> A
>>> GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
>>> AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
>>> GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
>>> TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
>>> ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
>>> ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
>>> CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
>>> TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
>>> GTCAGCGTAGGCCTAGACGGCT
>>>
>>>> B
>>> GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
>>> AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
>>> GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
>>> TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
>>> ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
>>> AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
>>> CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
>>> TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
>>> ATAAGAGTAGGTCGGGATGGCA
>>>
>>>> C
>>> GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
>>> ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
>>> GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
>>> TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
>>> ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
>>> ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
>>> CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
>>> TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
>>> GTATGTGCAGGTCGAAACGAGT
>>>
>>>> D
>>> CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
>>> AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
>>> GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
>>> TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
>>> AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
>>> AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
>>> CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
>>> AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
>>> CTAAGAGTAGCTCGACACGGCT
>>>
>>>
>>>
>>> I think the $aln object is OK, as I can use it with SimpleAlign.
>>>
>>> Moreover, if I write
>>>       print $jcmatrix;
>>> instead of
>>>       print $jcmatrix->print_matrix;
>>> I get the memory reference, as normal===> ARRAY(0x859f08)
>>>
>>> So my question is:
>>>
>>> Why do I have an unblessed reference?
>>>
>>> Can't call method "print_matrix" on unblessed reference at  
>>> Tree.pl  line 32, <GEN0> line 44.
>>>
>>> Thank you very much in advance.
>>>
>>> Jose G.
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Tue Sep 22 16:07:14 2009
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Sep 2009 09:07:14 -0700
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
Message-ID: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>

>>
>>
>> However, the above method does not work here. Is this for some deep
>> reason, or could the above method (based on the way SeqIO works) be
>> made to work? I'm guessing that the SearchIO object conversion is
>> simply harder to do than with SeqIO?
>
> This is something Jason could probably speak up on, but from my  
> perspective it comes down to 'why?'.  This opens up a very hard-to- 
> implement door (converting to and from, for instance, BLAST to  
> HMMER), which doesn't make sense from the end-user perspective.   
> What most users want out of those formats is getting at the data in  
> an easily accessible way, to further process them (filter, to GFF,  
> etc), or to have them summarized.  the Writer classes take care of  
> the latter.
>


> There is a very generic, all-purpose write_result in Bio::SearchIO  
> that just calls a ResultWriter object (and dies if it isn't  
> present).  Note that this expects a ResultWriter, not a Hit/ 
> HSPWriter; it is write_result() after all. I think this kind of goes  
> against the well-established API that exists with the other  
> write_foo implementations for the IO classes, where the input/output  
> format should match, but there you have it.
>

Dan -
I'm confused about what you are trying to do or what is broken - are  
you just annoyed that the API isn't the same style as Bio::SeqIO?


--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From shalabh.sharma7 at gmail.com  Tue Sep 22 16:48:39 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 12:48:39 -0400
Subject: [Bioperl-l] Stockholm to fasta
Message-ID: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>

Hi All,

I am trying to convert Stockholm to FASTA format. I am using "sreformat"
for this purpose. I get a FASTA file, but the problem is that I want the
header information from the Stockholm file in my FASTA file.
Like:
# STOCKHOLM 1.0

#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
- - - - - - - - - -  - - - -
- - - - - - - - - - - -- -
- - - - - - -- - - - - -
#=GF RL   J Biol Chem 2001;276:21476-21481.
#=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
#=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
#=GF CC   base pairs with the 5' splice junction, thus defining the 5'
#=GF CC   donor site of an intron.
#=GF CC   There are significant differences in sequence and secondary
#=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
#=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
#=GF CC   human). Nevertheless, secondary structure predictions suggest
#=GF CC   that all U1 snRNAs share a 'common core' consisting of helices I,
#=GF CC   II, the proximal region of III, and IV [1].
#=GF CC   This family does not contain the larger yeast sequences.
#=GF SQ   100


X63783.1/2024-2186
UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/1394-1556
UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161
..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X63783.1/596-756
UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
M29062.1/238-387
UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.

As output I am just getting a FASTA file with headers like
"X63783.1/2024-2186", but I want the headers to include some
information like U1 or U1 spliceosomal RNA from the Stockholm headers.

I would really appreciate it if anyone could help me out.

Thanks
Shalabh


From roy.chaudhuri at gmail.com  Tue Sep 22 16:44:47 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 22 Sep 2009 17:44:47 +0100
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB87A06.4000209@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com>
Message-ID: <4AB8FEFF.6060408@gmail.com>

Hi Liam,

My mistake, it looks like the bug had already been reported and fixed, 
which means I get to go home earlier. I've marked your bug as a 
duplicate of bug 2810.

You can get the patched version by installing bioperl-live (just 
downloading the bioperl-live SeqUtils.pm and putting it in the correct 
place on your system would probably also work).
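
For reference, the behaviour you were expecting (tags carried across, coordinates shifted into the new frame, features that straddle an edge clipped) can be sketched on plain hashes. This is only an illustration of the idea, not the Bio::SeqUtils implementation: the sub name truncate_features, the hash layout, the 'partial' flag and the clipping policy are all assumptions made for the example.

```perl
use strict;
use warnings;

# Each feature is a plain hash: start/end are 1-based and inclusive,
# tags is a hash of array refs. Hypothetical layout, not BioPerl's.
sub truncate_features {
    my ($features, $from, $to) = @_;
    my @kept;
    for my $f (@$features) {
        next if $f->{end} < $from || $f->{start} > $to;   # no overlap
        my $start = $f->{start} < $from ? $from : $f->{start};
        my $end   = $f->{end}   > $to   ? $to   : $f->{end};
        push @kept, {
            primary_tag => $f->{primary_tag},
            start       => $start - $from + 1,   # shift to new coordinates
            end         => $end   - $from + 1,
            # flag features that lost part of their extent
            partial     => ($start != $f->{start} || $end != $f->{end}) ? 1 : 0,
            # copy every tag so note, product, protein_id etc. survive
            tags        => { map { ($_ => [ @{ $f->{tags}{$_} } ]) }
                             keys %{ $f->{tags} } },
        };
    }
    return \@kept;
}

# Made-up features purely for demonstration.
my $features = [
    { primary_tag => 'CDS', start => 100,  end => 500,
      tags => { locus_tag => ['abc_0001'], product => ['hypothetical protein'] } },
    { primary_tag => 'CDS', start => 2100, end => 2200,
      tags => { locus_tag => ['abc_0002'] } },
];
for my $f (@{ truncate_features($features, 300, 2000) }) {
    print join("\t", @{$f}{qw(primary_tag start end partial)},
               $f->{tags}{locus_tag}[0]), "\n";
}
```

A test along these lines, checking tag values as well as coordinates, is the kind of check that would have caught the missing tags.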

Cheers.
Roy.

Roy Chaudhuri wrote:
> Hi Liam,
> 
> Yes, that is a bug - I think it is to do with the Feature Annotation 
> rollback from 1.6, it works fine with 1.5.2. Looks like the tests I 
> wrote don't check for the presence of tags, just the coordinates of the 
> feature, so this hasn't been picked up. Submit it to Bugzilla, and I'll 
> take a look when I get a chance.
> 
> Cheers.
> Roy.
> 
> Liam Elbourne wrote:
>> Hi Roy,
>>
>> Thanks for that, works well, but there are no _gsf_tag_hash values? I'm 
>> particularly interested in the locus id, obviously the translation could 
>> be problematic if the whole gene is not included after truncation, but 
>> things like the note, product, protein_id would be good. I had a look at 
>> the code for the method and couldn't see any obvious reason why those values 
>> didn't make it across. Should I submit this as a bug, or is there 
>> something I'm missing?
>>
>>
>> Regards,
>> Liam.
>>
>>
>>
>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>
>>> Hi Liam,
>>>
>>> I just discovered your message, which has not yet been replied to. 
>>> What you require has been discussed in a recent thread:
>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>
>>> Try using trunc_with_features from Bio::SeqUtils:
>>>
>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300, 2000);
>>> Cheers.
>>> Roy.
>>>
>>> Liam Elbourne wrote:
>>>> Hi All,
>>>> Is there a method or methodology that will produce a fully fledged 
>>>> Seq  object with all the associated metadata given a start and end 
>>>>  position? To clarify, I create a sequence object from a genbank file:
>>>> ****
>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>> my $seqobj = $io->next_seq();
>>>> ****
>>>> I now want:
>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>> where $sub_seqobj is a Seq object (which I appreciate is an 
>>>>  'aggregate' of objects) too. The "trunc" method only returns a 
>>>>  PrimarySeq object which lacks all the annotation etc. I've 
>>>> previously  done this task by iterating through feature by feature 
>>>> and parsing out  what I needed, but thought there might be a more 
>>>> elegant approach...
>>>> Regards,
>>>> Liam Elbourne.
>>> -- 
>>> Dr. Roy Chaudhuri
>>> Department of Veterinary Medicine
>>> University of Cambridge, U.K.
>> ______________________________
>>
>> Dr Liam Elbourne
>> Research Fellow (Bioinformatics)
>> Paulsen Laboratory
>> Macquarie University
>> Sydney
>> Australia.
>>
>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>
>>
>>
> 


From cjfields at illinois.edu  Tue Sep 22 17:12:10 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:12:10 -0500
Subject: [Bioperl-l] subsection of genbank file
In-Reply-To: <4AB8FEFF.6060408@gmail.com>
References: <997B4CA2-D80B-4512-AA3E-74CB45DD7064@science.mq.edu.au>	<4AB36451.3030207@gmail.com>	<3B0EF953-BF79-4384-964D-A992DFBDB609@science.mq.edu.au>
	<4AB87A06.4000209@gmail.com> <4AB8FEFF.6060408@gmail.com>
Message-ID: <1F043B63-3DD1-49DD-86F3-B2FB9AD34725@illinois.edu>

That should be out in the latest alpha on CPAN as well (the final  
1.6.1 should be out this week).

chris

On Sep 22, 2009, at 11:44 AM, Roy Chaudhuri wrote:

> Hi Liam,
>
> My mistake, it looks like the bug had already been reported and  
> fixed, which means I get to go home earlier. I've marked your bug as  
> a duplicate of bug 2810.
>
> You can get the patched version by installing bioperl-live (just  
> downloading the bioperl-live SeqUtils.pm and putting it in the  
> correct place on your system would probably also work).
>
> Cheers.
> Roy.
>
> Roy Chaudhuri wrote:
>> Hi Liam,
>> Yes, that is a bug - I think it is to do with the Feature  
>> Annotation rollback from 1.6, it works fine with 1.5.2. Looks like  
>> the tests I wrote don't check for the presence of tags, just the  
>> coordinates of the feature, so this hasn't been picked up. Submit  
>> it to Bugzilla, and I'll take a look when I get a chance.
>> Cheers.
>> Roy.
>> Liam Elbourne wrote:
>>> Hi Roy,
>>>
>>> Thanks for that, works well, but there are no _gsf_tag_hash  
>>> values? I'm particularly interested in the locus id, obviously the  
>>> translation could be problematic if the whole gene is not included  
>>> after truncation, but things like the note, product, protein_id  
>>> would be good. I had a look at the code for the method and  
>>> couldn't see any obvious reason why those values didn't make it across.  
>>> Should I submit this as a bug, or is there something I'm missing?
>>>
>>>
>>> Regards,
>>> Liam.
>>>
>>>
>>>
>>> On 18/09/2009, at 8:43 PM, Roy Chaudhuri wrote:
>>>
>>>> Hi Liam,
>>>>
>>>> I just discovered your message, which has not yet been replied  
>>>> to. What you require has been discussed in a recent thread:
>>>> http://bioperl.org/pipermail/bioperl-l/2009-August/031071.html
>>>>
>>>> Try using trunc_with_features from Bio::SeqUtils:
>>>>
>>>> my $sub_seqobj=Bio::SeqUtils->trunc_with_features($seqobj, 300,  
>>>> 2000);
>>>> Cheers.
>>>> Roy.
>>>>
>>>> Liam Elbourne wrote:
>>>>> Hi All,
>>>>> Is there a method or methodology that will produce a fully  
>>>>> fledged Seq  object with all the associated metadata given a  
>>>>> start and end  position? To clarify, I create a sequence object  
>>>>> from a genbank file:
>>>>> ****
>>>>> my $io  = Bio::SeqIO->new(as per usual);
>>>>> my $seqobj = $io->next_seq();
>>>>> ****
>>>>> I now want:
>>>>> my $sub_seqobj = $seqobj between 300 and 2000
>>>>> where $sub_seqobj is a Seq object (which I appreciate is an   
>>>>> 'aggregate' of objects) too. The "trunc" method only returns a   
>>>>> PrimarySeq object which lacks all the annotation etc. I've  
>>>>> previously  done this task by iterating through feature by  
>>>>> feature and parsing out  what I needed, but thought there might  
>>>>> be a more elegant approach...
>>>>> Regards,
>>>>> Liam Elbourne.
>>>> -- 
>>>> Dr. Roy Chaudhuri
>>>> Department of Veterinary Medicine
>>>> University of Cambridge, U.K.
>>> ______________________________
>>>
>>> Dr Liam Elbourne
>>> Research Fellow (Bioinformatics)
>>> Paulsen Laboratory
>>> Macquarie University
>>> Sydney
>>> Australia.
>>>
>>> http://www2.oxfam.org.au/trailwalker/Sydney/team/228
>>>
>>>
>>>


From cjfields at illinois.edu  Tue Sep 22 17:13:53 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 12:13:53 -0500
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
Message-ID: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>

The POD for Bio::AlignIO::stockholm indicates where the various bits  
of information are stored.  Everything from the header should be in  
there in the latest bioperl; in many cases it's not ideally stored,  
but it's accessible.

You'll need to preprocess your seqs in the SimpleAlign returned  
(iterate through them and change the relevant bits like desc(),  
displayname(), seq_id, etc) and may need to do other modifications,  
but it should work.
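
If you'd rather not go through SimpleAlign at all, the same idea can be sketched in plain Perl. This is a rough illustration under stated assumptions, not how Bio::AlignIO::stockholm stores things: stockholm_to_fasta is a made-up helper name, it expects the standard "name, whitespace, sequence" layout on one line, it appends the #=GF ID and DE values to every header, and it strips '.' and '-' gap characters from the output.

```perl
use strict;
use warnings;

# Convert Stockholm-format text to FASTA, copying the #=GF ID and DE
# values into every FASTA header. Hypothetical helper for illustration.
sub stockholm_to_fasta {
    my ($text) = @_;
    my (%gf, %seq, @order);
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/ || $line =~ m{^//};
        if ($line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/) {
            # Tags that span several lines (e.g. CC) are joined with a space.
            $gf{$1} = defined $gf{$1} ? "$gf{$1} $2" : $2;
            next;
        }
        next if $line =~ /^#/;          # format line, #=GS, #=GR, #=GC
        my ($name, $residues) = split ' ', $line;
        next unless defined $residues;
        push @order, $name unless exists $seq{$name};
        $seq{$name} .= $residues;       # interleaved blocks are concatenated
    }

    my $extra = join ' ', grep { defined } @gf{qw(ID DE)};
    my $fasta = '';
    for my $name (@order) {
        (my $ungapped = $seq{$name}) =~ tr/.\-//d;   # drop gap characters
        $fasta .= ">$name" . (length $extra ? " $extra" : '') . "\n$ungapped\n";
    }
    return $fasta;
}

my $demo = <<'END';
# STOCKHOLM 1.0
#=GF AC   RF00003
#=GF ID   U1
#=GF DE   U1 spliceosomal RNA
X63783.1/2024-2186   UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
X58845.1/1-161       ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
//
END
print stockholm_to_fasta($demo);
```

Keeping the gaps instead is a matter of removing the tr line, if an aligned FASTA file is what you are after.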

chris

On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:

> Hi All,      I am trying to convert stockholm to fasta format. I am  
> using
> "sreformat" for this purpose. I am getting a fasta file but the  
> problem is i
> want header information from stockholm in my fasta file.
> Like:
> # STOCKHOLM 1.0
>
> #=GF AC   RF00003
> #=GF ID   U1
> #=GF DE   U1 spliceosomal RNA
> - - - - - - - - - -  - - - -
> - - - - - - - - - - - -- -
> - - - - - - -- - - - - -
> #=GF RL   J Biol Chem 2001;276:21476-21481.
> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the  
> spliceosome
> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms  
> complementary
> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
> #=GF CC   donor site of an intron.
> #=GF CC   There are significant differences in sequence and secondary
> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter  
> being
> #=GF CC   much longer (568 nucleotides as compared to 164  
> nucleotides in
> #=GF CC   human). Nevertheless, secondary structure predictions  
> suggest
> #=GF CC   that all U1 snRNAs share a 'common core' consisting of  
> helices I,
> #=GF CC   II, the proximal region of III, and IV [1].
> #=GF CC   This family does not contain the larger yeast sequences.
> #=GF SQ   100
>
>
> X63783.1/2024-2186
> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/1394-1556
> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
> X58845.1/1-161
> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
> X63783.1/596-756
> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
> M29062.1/238-387
> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>
> As a output i am just getting a fasta file with the headers like
> "X63783.1/2024-2186" but what i want is that it should include some
> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>
> I would really appreciate if anyone can help me out.
>
> Thanks
> Shalabh


From shalabh.sharma7 at gmail.com  Tue Sep 22 20:17:11 2009
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Tue, 22 Sep 2009 16:17:11 -0400
Subject: [Bioperl-l] Stockholm to fasta
In-Reply-To: <EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
References: <9fcc48c70909220948t7988b48eu7a8dcf89ee2d6042@mail.gmail.com>
	<EA566A7E-C146-4C2C-9AD5-88B9BB34EC43@illinois.edu>
Message-ID: <9fcc48c70909221317i509a45cbm19783c1210f7c69b@mail.gmail.com>

Hi Chris,

Thanks a lot, it was really helpful.

Thanks
Shalabh


On Tue, Sep 22, 2009 at 1:13 PM, Chris Fields <cjfields at illinois.edu> wrote:

> The POD for Bio::AlignIO::stockholm indicates where the various bits of
> information are stored.  Everything from the header should be in there in
> the latest bioperl; in many cases it's not ideally stored, but it's
> accessible.
>
> You'll need to preprocess your seqs in the SimpleAlign returned (iterate
> through them and change the relevant bits like desc(), displayname(),
> seq_id, etc) and may need to do other modifications, but it should work.
>
> chris
>
>
> On Sep 22, 2009, at 11:48 AM, shalabh sharma wrote:
>
>  Hi All,      I am trying to convert stockholm to fasta format. I am using
>> "sreformat" for this purpose. I am getting a fasta file but the problem is
>> i
>> want header information from stockholm in my fasta file.
>> Like:
>> # STOCKHOLM 1.0
>>
>> #=GF AC   RF00003
>> #=GF ID   U1
>> #=GF DE   U1 spliceosomal RNA
>> - - - - - - - - - -  - - - -
>> - - - - - - - - - - - -- -
>> - - - - - - -- - - - - -
>> #=GF RL   J Biol Chem 2001;276:21476-21481.
>> #=GF CC   U1 is a small nuclear RNA (snRNA) component of the spliceosome
>> #=GF CC   (involved in pre-mRNA splicing). Its 5' end forms complementary
>> #=GF CC   base pairs with the 5' splice junction, thus defining the 5'
>> #=GF CC   donor site of an intron.
>> #=GF CC   There are significant differences in sequence and secondary
>> #=GF CC   structure between metazoan and yeast U1 snRNAs, the latter being
>> #=GF CC   much longer (568 nucleotides as compared to 164 nucleotides in
>> #=GF CC   human). Nevertheless, secondary structure predictions suggest
>> #=GF CC   that all U1 snRNAs share a 'common core' consisting of helices
>> I,
>> #=GF CC   II, the proximal region of III, and IV [1].
>> #=GF CC   This family does not contain the larger yeast sequences.
>> #=GF SQ   100
>>
>>
>> X63783.1/2024-2186
>> UUACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/1394-1556
>> UUACUUACCUGGCUGG.AGUUA.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X58845.1/1-161
>> ..ACUUACCUGGCUGG.AGUUU.GCUA...UCGAUCAU.GAAG.GGUAG.
>> X63783.1/596-756
>> UAAAUUACAAUGUUGU.AGUUA.GCUA...UAUAUCAA.AAAA.UAUAG.
>> M29062.1/238-387
>> UUACUUACCUGGCAUG.AGUUU..CUG...CAGCACAA.GAAU.UGUGG.
>>
>> As a output i am just getting a fasta file with the headers like
>> "X63783.1/2024-2186" but what i want is that it should include some
>> information like U1 or U1 spliceosomal RNA from the stockholm headers.
>>
>> I would really appreciate if anyone can help me out.
>>
>> Thanks
>> Shalabh
>>
>
>


From cjfields at illinois.edu  Tue Sep 22 20:29:28 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 22 Sep 2009 15:29:28 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
Message-ID: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>

The third alpha is now out and propagating its way around the  
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file  
autogenerated with Build.PL tests that somehow made its way into the  
last alpha release.  It is now properly cleaned up, along with its  
test database, using './Build clean'.  BTW, very nice SQLite  
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris


From dan.bolser at gmail.com  Tue Sep 22 21:33:13 2009
From: dan.bolser at gmail.com (Dan Bolser)
Date: Tue, 22 Sep 2009 22:33:13 +0100
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com>
	<B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu>
	<CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
Message-ID: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>

2009/9/22 Jason Stajich <jason at bioperl.org>

>
>
> However, the above method does not work here. Is this for some deep
>
> reason, or could the above method (based on the way SeqIO works) be
>
> made to work? I'm guessing that the SearchIO object conversion is
>
> simply harder to do than with SeqIO?
>
>
> This is something Jason could probably speak up on, but from my perspective
> it comes down to 'why?'.  This opens up a very hard-to-implement door
> (converting to and from, for instance, BLAST to HMMER), which doesn't make
> sense from the end-user perspective.  What most users want out of those
> formats is getting at the data in an easily accessible way, to further
> process them (filter, to GFF, etc), or to have them summarized.  the Writer
> classes take care of the latter.
>
>
> There is a very generic, all-purpose write_result in Bio::SearchIO that
> just calls a ResultWriter object (and dies if it isn't present).  Note
> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
> after all. I think this kind of goes against the well-established API that
> exists with the other write_foo implementations for the IO classes, where
> the input/output format should match, but there you have it.
>
> Dan -
> I'm confused about what you are trying to do or what is broken - are you
> just annoyed that the API isn't the same style as Bio::SeqIO?
>

No, I'm not annoyed. I was just confused initially because it didn't work as
'expected', and then I was wondering why (I was just curious). I take Chris's
point that this could be a lot of work to implement for a very marginal use
case.

Very simply, what I am trying to do is this: a) read in a blasttable, b)
filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
in blasttable format.

I was stuck at step c, but I'm not saying anything is broken (just my
understanding of how to use SearchIO::Writer::HSPTableWriter).

I'll look again at Chris's suggestions to see if I can get code to just
'round trip' the blasttable format. From there I think I should be able to
do what I want.
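
As a fallback, if the filtering only ever drops whole HSP lines, one low-tech option would be to skip the object layer and pass the surviving lines through untouched, so the output is a blasttable by construction. A rough sketch, assuming the standard 12-column tab-separated layout (query, subject, % identity, alignment length, mismatches, gap opens, q.start, q.end, s.start, s.end, e-value, bit score). The name filter_blasttable and its two thresholds are made up for illustration, and "first N per query" assumes the file is already sorted best-first within each query.

```perl
use strict;
use warnings;

# Filter tabular BLAST lines and return the surviving lines unchanged.
# Keeps at most $max_per_query HSPs per query with e-value <= $max_evalue.
sub filter_blasttable {
    my ($lines, $max_evalue, $max_per_query) = @_;
    my (%seen, @out);
    for my $line (@$lines) {
        next if $line =~ /^#/ || $line !~ /\S/;   # comments, blank lines
        my @col = split /\t/, $line;
        next unless @col >= 12;                   # not a full HSP row
        my ($query, $evalue) = @col[0, 10];
        next if $evalue > $max_evalue;
        next if ++$seen{$query} > $max_per_query; # file order decides "best"
        push @out, $line;
    }
    return \@out;
}

# Made-up rows purely for demonstration.
my @lines = map { join("\t", @$_) . "\n" } (
    [qw(q1 s1 98.5 200 3  0 1 200 11 210 1e-50 380)],
    [qw(q1 s2 80.0 150 30 2 5 154 20 169 0.5   35)],
    [qw(q2 s1 91.2 180 16 1 1 180 40 219 1e-20 210)],
);
print @{ filter_blasttable(\@lines, 1e-5, 1) };
```

This sidesteps the writer classes entirely, which is fine for a pure filter but no help if the HSPs themselves need to be modified before writing.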


Cheers,
Dan.


--
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>


From maj at fortinbras.us  Tue Sep 22 22:32:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 22 Sep 2009 18:32:15 -0400
Subject: [Bioperl-l] Converting between allowed SearchIO formats?
In-Reply-To: <2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
References: <2c8757af0909220609n518243efh63608aa05df13d1c@mail.gmail.com><B7F6253D-F9EE-4EC0-9ABE-53CB85E37D16@illinois.edu><CE021960-F0DC-4BA7-91B7-21A5B2F6F1BF@bioperl.org>
	<2c8757af0909221433p6d8b5dbeuf8c16218b732e54e@mail.gmail.com>
Message-ID: <9C7D7F02BFBD4F2AA16E151B52125C93@NewLife>

Apropos this, here's something I ran across the other day:

"Just remember when using BioPerl that it was never designed
to 'round trip' your favorite formats. Rather, it was designed to
store sequence data from many widely different formats into a
common object framework and make that framework available
to other sequence manipulation tasks in a programmatic fashion."

from HOWTO:SeqIO#Caveats

Food for thought, anyway--- MAJ

----- Original Message ----- 
From: "Dan Bolser" <dan.bolser at gmail.com>
To: "Jason Stajich" <jason at bioperl.org>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 22, 2009 5:33 PM
Subject: Re: [Bioperl-l] Converting between allowed SearchIO formats?


> 2009/9/22 Jason Stajich <jason at bioperl.org>
>
>>
>>
>> However, the above method does not work here. Is this for some deep
>>
>> reason, or could the above method (based on the way SeqIO works) be
>>
>> made to work? I'm guessing that the SearchIO object conversion is
>>
>> simply harder to do than with SeqIO?
>>
>>
>> This is something Jason could probably speak up on, but from my perspective
>> it comes down to 'why?'.  This opens up a very hard-to-implement door
>> (converting to and from, for instance, BLAST to HMMER), which doesn't make
>> sense from the end-user perspective.  What most users want out of those
>> formats is getting at the data in an easily accessible way, to further
>> process them (filter, to GFF, etc), or to have them summarized.  the Writer
>> classes take care of the latter.
>>
>>
>> There is a very generic, all-purpose write_result in Bio::SearchIO that
>> just calls a ResultWriter object (and dies if it isn't present).  Note
>> that this expects a ResultWriter, not a Hit/HSPWriter; it is write_result()
>> after all. I think this kind of goes against the well-established API that
>> exists with the other write_foo implementations for the IO classes, where
>> the input/output format should match, but there you have it.
>>
>> Dan -
>> I'm confused about what you are trying to do or what is broken - are you
>> just annoyed that the API isn't the same style as Bio::SeqIO?
>>
>
> No, I'm not annoyed. I was just confused initially because it didn't work as
> 'expected', and then I was wondering why (I was just curious). I take Chris's
> point that this could be a lot of work to implement for a very marginal use
> case.
>
> Very simply, what I am trying to do is this: a) read in a blasttable, b)
> filter the HSPs per 'result' (per query sequence), and c) write the HSPs out
> in blasttable format.
>
> I was stuck at step c, but I'm not saying anything is broken (just my
> understanding of how to use SearchIO::Writer::HSPTableWriter).
>
> I'll look again at Chris's suggestions to see if I can get code to just
> 'round trip' the blasttable format. From there I think I should be able to
> do what I want.
>
>
> Cheers,
> Dan.
>
>
> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>
> 


From clements at nescent.org  Tue Sep 22 23:15:50 2009
From: clements at nescent.org (Dave Clements)
Date: Tue, 22 Sep 2009 16:15:50 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
Message-ID: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>

Hello all,

For open source project wikis, it's nice if the home page
1) Lets new users know that this is an active project with a lot going on.
2) Encourages people to contribute to the project and the wiki.

Both the BioPython.org and GMOD.org sites include a list of links to news
items on the home page.  This is done in both sites with a MediaWiki
extension.

The GMOD.org home page also includes a list of new and recently updated wiki
pages.  This achieves both goals, by showing what's happening, and by giving
people a slight reward for updating the wiki by placing a link to the page
on the wiki.  This is also done with MediaWiki extensions.

My 2¢,

Dave C

-- 
GMOD News: http://gmod.org/wiki/GMOD_News


From David.Messina at sbc.su.se  Wed Sep 23 11:37:02 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 23 Sep 2009 13:37:02 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
Message-ID: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>

I think either Chris' version or Mark's earlier, slightly more verbose
version would work well and fulfill the goals of reducing clutter and making
it easier to find what you're looking for for visitors new and old.

I do like the idea of a newsfeed, which summarizes what's been going on
lately and lets new users know the project is active. Embedding the BioPerl
twitter feed would be an easy solution.


> The GMOD.org home page also includes a list of new and recently updated
> wiki pages.  This achieves both goals, by showing what's happening, and by
> giving people a slight reward for updating the wiki by placing a link to the
> page on the wiki.
>

I like this idea too.


Dave


From maj at fortinbras.us  Wed Sep 23 11:47:24 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 07:47:24 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <0AD07A69C66B4B5BB8599BA5483145D7@NewLife>

Johnathan, Dave and Dave -- thanks for these helpful comments-
I'm beginning to think there is a happy medium for this medium.
MAJ
  ----- Original Message ----- 
  From: Dave Messina 
  To: Dave Clements 
  Cc: bioperl-l at lists.open-bio.org ; Mark A. Jensen ; Chris Fields 
  Sent: Wednesday, September 23, 2009 7:37 AM
  Subject: Re: [Bioperl-l] a Main Page proposal


  I think either Chris' version or Mark's earlier, slightly more verbose version would work well and fulfill the goals of reducing clutter and making it easier to find what you're looking for for visitors new and old.


  I do like the idea of a newsfeed, which summarizes what's been going on lately and let's new users know the project is active. Embedding the BioPerl twitter feed would be an easy solution.


    The GMOD.org home page also includes a list of new and recently updated wiki pages.  This achieves both goals, by showing what's happening, and by giving people a slight reward for updating the wiki by placing a link to the page on the wiki.


  I like this idea too.


  Dave 


From biopython at maubp.freeserve.co.uk  Wed Sep 23 12:12:56 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Sep 2009 13:12:56 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
Message-ID: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>

On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> I think either Chris' version or Mark's earlier, slightly more verbose
> version would work well and fulfill the goals of reducing clutter and making
> it easier to find what you're looking for for visitors new and old.
>
> I do like the idea of a newsfeed, which summarizes what's been going on
> lately and let's new users know the project is active. Embedding the BioPerl
> twitter feed would be an easy solution.

Embedding your news feed would be just as easy:

http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom

Which (news server vs twitter feed) is preferable is down to you guys,
although for 2009 at least there has been more activity on twitter.
I'm not sure if you have the news posts re-tweeted or not (the last
news server post was back in Feb), but Biopython and the OBF
twitter accounts are doing this via twitterfeed.

Peter


From maj at fortinbras.us  Wed Sep 23 12:51:15 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 08:51:15 -0400
Subject: [Bioperl-l] Protein Sequence QSARs
In-Reply-To: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
References: <627d998d0909070117u760c8ef3k47a894cf52d099f1@mail.gmail.com>
Message-ID: <3B9AACAB654F4F4DBB6CE00A9B26FBF6@NewLife>

Hi Brett--
I doubt if anything this specialized exists in BioPerl.
I'd say go for it, but R may be better suited for the calculations you
want to do. For dealing with matrices, you may want to check out
the Bio::Matrix namespace.
cheers Mark
----- Original Message ----- 
From: "Brett Bowman" <bnbowman at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 07, 2009 4:17 AM
Subject: [Bioperl-l] Protein Sequence QSARs


I've been working on a script for my personal edification for annotating
protein sequence for QSARs, as described in the paper below, because I
didn't see anything in Bioperl to do it for me.  Essentially converting a
protein sequence of length N into a numerical matrix of size 3-by-N by
substitution, and then calculating the auto- and cross-correlation values
for a lag of L amino acids.  I was considering turning it into a
full blown module, but I wanted to ask if A) it had been done before and I
had just missed it, and B) whether anyone other than me would find such a
module useful.
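
For readers unfamiliar with the approach, the two steps described above (substitution into a 3-by-N matrix, then lagged auto- and cross-terms) can be sketched in plain Perl. This is only an illustration under stated assumptions: the descriptor values in %z are made-up placeholders, not the published scales, and acc() uses one simple definition (the mean of lagged products, without centering or scaling) that may differ from the normalization used in the paper.

```perl
use strict;
use warnings;

# PLACEHOLDER descriptors: three made-up numbers per residue, for
# illustration only. Substitute the published values before any real use.
my %z = (
    A => [  0.1, -1.0,  0.2 ],
    G => [  2.0, -4.0,  1.5 ],
    K => [  2.5,  1.5, -3.0 ],
    L => [ -4.0, -1.5, -1.0 ],
);

# Convert a sequence of length N into a 3-by-N matrix (one row per descriptor).
sub encode {
    my ($seq) = @_;
    my @matrix = ([], [], []);
    for my $aa (split //, uc $seq) {
        my $d = $z{$aa} or die "no descriptors for residue '$aa'\n";
        push @{ $matrix[$_] }, $d->[$_] for 0 .. 2;
    }
    return \@matrix;
}

# Mean product of descriptor $j at position i and descriptor $k at i + lag.
# $j == $k gives an auto term, $j != $k a cross term.
sub acc {
    my ($matrix, $j, $k, $lag) = @_;
    my $n = @{ $matrix->[0] } - $lag;
    die "lag too large for this sequence\n" if $n < 1;
    my $sum = 0;
    $sum += $matrix->[$j][$_] * $matrix->[$k][ $_ + $lag ] for 0 .. $n - 1;
    return $sum / $n;
}

my $m = encode('AGKL');
printf "auto(0,0) lag 1:  %.3f\n", acc($m, 0, 0, 1);
printf "cross(0,1) lag 2: %.3f\n", acc($m, 0, 1, 2);
```

Looping acc() over all descriptor pairs and all lags from 1 to L yields a fixed-length vector per protein, regardless of sequence length.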

Wold S, Jonsson J, Sjöström M, Sandberg M, Rännar S: DNA and peptide
sequences and chemical processes multivariately modeled by principal
component analysis and partial least-squares projections to latent
structures. Anal Chim Acta 1993, 277:239-253.

Brett Bowman
bnbowman at gmail.com
Woelk Lab, Stein Cancer Research Center
UCSD/SDSU Joint Program in Bioinformatics

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Wed Sep 23 13:04:48 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 09:04:48 -0400
Subject: [Bioperl-l] Fw:  problem parsing msf file
Message-ID: <4851B51372DE4761B8CC26D685B57344@NewLife>

neglected the list
----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Paola Bisignano" <paola.bisignano at gmail.com>
Sent: Wednesday, September 23, 2009 9:04 AM
Subject: Re: [Bioperl-l] problem parsing msf file


> Hi Paola--
> I think you need column_from_residue_number() off the SimpleAlign object,
> and location_from_column off the LocatableSeq object. For your example, 
> try
> 
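> use Bio::AlignIO;
> 
> # read the first alignment from the MSF file
> $alnio = Bio::AlignIO->new( -file=>"my.msf", -format=>"msf");
> $aln = $alnio->next_aln;
> 
> $s1 = $aln->get_seq_by_pos(1);
> $s2 = $aln->get_seq_by_pos(2);
> 
> # alignment column that holds residue 28 of sequence 1
> $col = $aln->column_from_residue_number( $s1->id, 28);
> # location in sequence 2 at the column just before it
> $s2coord = $s2->location_from_column( $col - 1);
> 
> Now, $s2coord is a Bio::Location::Simple object, and $s2coord->start
> should equal 4 (the coordinate of the R before the I
> that aligns with the V in sequence 1).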
> MAJ
> 
> 
> ----- Original Message ----- 
> From: "Paola Bisignano" <paola.bisignano at gmail.com>
> To: "Mark A. Jensen" <maj at fortinbras.us>; <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 04, 2009 8:28 AM
> Subject: [Bioperl-l] problem parsing msf file
> 
> 
>>I have a problem with the parsing of msf file...I can't find the exact
>> object of Bio::SimpleAlign for my case...
>> I have to identify residues (from a list) in aligned sequences...but
>> when I parse the alignment from fasta file, I save as msf file, where
>> I have to identify my residue (from the list, numbering as the pdb
>> file) and the residue aligned in the aligned sequences...
>> 
>> this is a piece of the file...
>> 
>> NoName   MSF: 2  Type: P  Wed Aug 26 10:32:50 2009  Check: 00 ..
>> 
>> Name: Sequence/23-178  Len:    156  Check:  8937  Weight:  1.00
>> Name: 2zhz:A/1-148     Len:    156  Check:  9006  Weight:  1.00
>> 
>> //
>> 
>> 
>>                      1                                                   50
>> Sequence/23-178       NDPRVAAYGE VDELNSWVGY TKSLINSHTQ VLSNELEEIQ QLLFDCGHDL
>> 2zhz:A/1-148          DDARIAAIGD VDELNSQIGV L--LAEPLPD DVRAALSAIQ HDLFDLGGEL
>> 
>> 
>>                      51                                                 100
>> Sequence/23-178       ATPADDERHS FKFKQEQPTV WLEEKIDNYT QVVPAVKKHI LPGGTQLASA
>> 2zhz:A/1-148          CIPGHAAITD AHLARLDG-- WLA----HYN GQLPPLEEFI LPGGARGAAL
>> 
>> 
>>                      101                                                150
>> Sequence/23-178       LHVARTITRR AERQIVQLMR EEQINQDVLI FINRLSDYFF AAARYANYLE
>> 2zhz:A/1-148          AHVCRTVCRR AERSIVALGA SEPLNAAPRR YVNRLSDLLF VLARVLNRAA
>> 
>> 
>>                      151                                                200
>> Sequence/23-178       QQPDML
>> 2zhz:A/1-148          GGADVL
>> 
>> for example in this I have to identify the residue that is in front of
>> Val 28 (that is in Sequen) in 2zhz:A (that manually conting is Ile
>> 5)....
>> Tyr4-> has no residue in front of it because the alignment starts from
>> N23 of Sequence...
>> how can I find the way to enter the residue of my sequen, and extract
>> the residue from the other????
>> 
>> 
>> I wish you all dear friends..and I'm actually in atrouble with this..
>> Thanks for suggestions
>> 
>> Paola
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>>
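
To make the mapping in this thread concrete (residue number in one sequence, to alignment column, to residue in the other sequence), here is a pure-Perl sketch that does not use BioPerl at all. The two strings are the first 30 columns of the MSF excerpt quoted above; the helper names column_of_residue and residue_at_column are made up for this sketch, and the start values are simply taken from the ranges in the sequence names (23-178 and 1-148).

```perl
use strict;
use warnings;

# First 30 alignment columns from the MSF excerpt; '-' marks a gap.
my %aln = (
    'Sequence' => { start => 23, seq => 'NDPRVAAYGEVDELNSWVGYTKSLINSHTQ' },
    '2zhz:A'   => { start => 1,  seq => 'DDARIAAIGDVDELNSQIGVL--LAEPLPD' },
);

# Map a residue number to its 1-based alignment column.
# Returns nothing if the residue is not in this row.
sub column_of_residue {
    my ($row, $resnum) = @_;
    my $n     = $row->{start} - 1;
    my @chars = split //, $row->{seq};
    for my $i (0 .. $#chars) {
        next if $chars[$i] eq '-';      # gaps do not consume a residue number
        $n++;
        return $i + 1 if $n == $resnum;
    }
    return;
}

# Map a 1-based column to (letter, residue number) in this row.
# Returns an empty list for a gap or an out-of-range column.
sub residue_at_column {
    my ($row, $col) = @_;
    return if $col < 1 || $col > length $row->{seq};
    my $char = substr $row->{seq}, $col - 1, 1;
    return if $char eq '-';
    my $prefix = substr $row->{seq}, 0, $col;
    my $count  = () = $prefix =~ /[^-]/g;   # residues up to and including $col
    return ($char, $row->{start} + $count - 1);
}

my $col = column_of_residue($aln{'Sequence'}, 27);
my ($aa, $num) = residue_at_column($aln{'2zhz:A'}, $col);
print "column $col of 2zhz:A holds $aa$num\n";
```

Taking the 23-178 range at face value, the valine in column 5 is residue 27 of Sequence rather than 28; if the PDB numbering differs from the range in the file, adjust start accordingly. The empty-list return for gaps covers cases like the one described above, where a residue has nothing aligned opposite it.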


From cjfields at illinois.edu  Wed Sep 23 14:41:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 09:41:14 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
Message-ID: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>

On Sep 23, 2009, at 7:12 AM, Peter wrote:

> On Wed, Sep 23, 2009 at 12:37 PM, Dave Messina <David.Messina at sbc.su.se 
> > wrote:
>> I think either Chris' version or Mark's earlier, slightly more  
>> verbose
>> version would work well and fulfill the goals of reducing clutter  
>> and making
>> it easier to find what you're looking for for visitors new and old.
>>
>> I do like the idea of a newsfeed, which summarizes what's been  
>> going on
>> lately and let's new users know the project is active. Embedding  
>> the BioPerl
>> twitter feed would be an easy solution.
>
> Embedding your news feed would be just as easy:
>
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rdf
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/rss2
> http://news.open-bio.org/news/category/obf-projects/bioperl/feed/atom
>
> Which (news server vs twitter feed) is preferable is down to you guys,
> although for 2009 at least there has been more activity on twitter.
> I'm not sure if you have the news posts re-tweeted or not (the last
> news server post was back in Feb), but Biopython and the OBF
> twitter accounts are doing this via twitterfeed.
>
> Peter

Not to add yet more to the list, but I also think a concise list of  
projects using (or 'powered by') bioperl should be front-and-center;  
not a lot of users know when/where bioperl is used.  This applies to  
the other bio* as well, particularly biopython (seeing it popping up  
more and more).

For an example, see the biomart homepage:

http://www.biomart.org/

chris


From adlai at refenestration.com  Wed Sep 23 14:38:32 2009
From: adlai at refenestration.com (adlai burman)
Date: Wed, 23 Sep 2009 16:38:32 +0200
Subject: [Bioperl-l] Newbie: Format GenBank
Message-ID: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>

I have finally got past two major hurdles (for me) only to get stumped:
1. I have written a perl script that can take a genbank formatted text  
file as a filehandle and do all sorts of nifty (for me) things with it.
2. I have gotten my BioPerl installation working on a web hosting  
service so my advisor can use this through a browser.

BUT the code I have to fetch a GB record can only print it as a single HTML  
line, and what I need is for it to assign the retrieved file to a  
scalar variable. I am going blind trying to figure out how to access  
(not write) the gb file from a SeqIO object and assign it to a  
variable.

Here's an example of the code I have going on the server:

#!/usr/bin/perl
print "Content-type: text/html\n\n";
use Bio::SeqIO;
use Bio::DB::GenBank;

$genBank = new Bio::DB::GenBank;  # This object knows how to talk to GenBank

my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession

my $seqOut = new Bio::SeqIO(-format => 'genbank');

$seqOut->write_seq($seq);


exit;

where 'DQ897681' will be replaced by a CGI post.

I know that write_seq is not what I need, and I assume that this is a  
simple problem but can anyone tell me how to assign the retrieved gb  
file to a scalar?

Thanks,
Adlai


From joseguillin at hotmail.com  Tue Sep 22 14:39:52 2009
From: joseguillin at hotmail.com (Jose .)
Date: Tue, 22 Sep 2009 15:39:52 +0100
Subject: [Bioperl-l] dnastatistics
In-Reply-To: <A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
References: <BLU104-W2453ADE4584D2C479071A4A0E40@phx.gbl>
	<7AD546C5A6BE4B66BF9705BC885E08B1@NewLife>
	<8B440DC9-A1C8-4900-A0AB-96448616E46A@bioperl.org>
	<A5C3A80C-03F0-4CEC-BA43-2271B58F6DC4@science.mq.edu.au>
Message-ID: <BLU104-W475752FF9D5EADD0269E7A0DC0@phx.gbl>


Hi Liam,
I've tried analyzing the same alignment with both programs (DNAStatistics and dnadist), using the same analysis method (Jukes-Cantor), and I got pretty much the same results:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
my $stats = Bio::Align::DNAStatistics->new();
my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                         -format => 'fasta');
my $aln = $alignin->next_aln;
my $jcmatrix = $stats-> distance (-align => $aln,
               -method => 'Jukes-Cantor');
print $jcmatrix->print_matrix;
RESULT:
A              0.00000  0.40900  0.41834  0.38044
B              0.40900  0.00000  0.41358  0.37240
C              0.41834  0.41358  0.00000  0.37809
D              0.38044  0.37240  0.37809  0.00000

I used the web-based dnadist ( http://mobyle.pasteur.fr/cgi-bin/portal.py?form=dnadist ), which is mentioned in the CPAN-dnadist documentation ( http://search.cpan.org/~birney/bioperl-run-1.4/Bio/Tools/Run/PiseApplication/dnadist.pm ), setting Jukes-Cantor as Distance (D), and these are the results:
    4
A          0.000000 0.408996 0.418335 0.380436
B          0.408996 0.000000 0.413575 0.372400
C          0.418335 0.413575 0.000000 0.378086
D          0.380436 0.372400 0.378086 0.000000

The difference is because of rounding off.
Could it be by any chance that your analyses were made using two different methods, by default? (I think dnadist uses F84 instead of Jukes-Cantor by default.)

Using F84 instead of Jukes-Cantor in dnadist gives:
    4
A          0.000000 0.470013 0.479477 0.435071
B          0.470013 0.000000 0.468730 0.417669
C          0.479477 0.468730 0.000000 0.421582
D          0.435071 0.417669 0.421582 0.000000

On the other hand, the DNAStatistics documentation offers the possibility of using F84, but it's not yet implemented:

MSG: Abstract method "Bio::Align::DNAStatistics::D_F84" is not implemented by package Bio::Align::DNAStatistics.
This is not your fault - author of Bio::Align::DNAStatistics should be blamed!


So, I think Jukes-Cantor works the same in Bio::Align::DNAStatistics and web-based dnadist, but other methods may not.
I want to thank you for letting me know about Data::Dumper. I've read the documentation and it seems very handy. I think it could help me sooner or later. I'll try it out!
As I'm using DNAStatistics for a project, please let me know if you find what is wrong, or if I can help you further somehow.
Regards,
Jose G.
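
As a cross-check on the comparison above, the Jukes-Cantor distance can be computed by hand: with p the proportion of differing sites, d = -3/4 * ln(1 - 4p/3). The following is a small pure-Perl sketch of that formula. The function name jukes_cantor and the decision to skip any column containing a gap or a non-ACGT character are assumptions of this sketch; DNAStatistics and dnadist may treat such columns differently.

```perl
use strict;
use warnings;

# Jukes-Cantor distance between two aligned sequences of equal length.
# Columns with a gap or non-ACGT symbol in either sequence are skipped
# (an assumption of this sketch).
sub jukes_cantor {
    my ($s1, $s2) = map { uc } @_;
    die "sequences differ in length\n" unless length($s1) == length($s2);

    my ($sites, $diffs) = (0, 0);
    for my $i (0 .. length($s1) - 1) {
        my ($x, $y) = (substr($s1, $i, 1), substr($s2, $i, 1));
        next unless $x =~ /[ACGT]/ && $y =~ /[ACGT]/;
        $sites++;
        $diffs++ if $x ne $y;
    }
    return unless $sites;          # nothing comparable

    my $p = $diffs / $sites;       # proportion of differing sites
    return if $p >= 0.75;          # distance is undefined at saturation
    return -0.75 * log(1 - 4 * $p / 3);
}

# Toy input, not taken from the alignment in this thread.
printf "%.5f\n", jukes_cantor('ACGTACGT', 'ACGTACGA');
```

Applied pairwise to the rows of an alignment, this gives a matrix of the same shape as the ones shown above, which is a quick way to check which program is applying which correction.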


Subject: dnastatistics
From: lelbourn at science.mq.edu.au
Date: Tue, 22 Sep 2009 17:14:44 +1000
CC: maj at fortinbras.us; bioperl-l at bioperl.org; joseguillin at hotmail.com
To: jason at bioperl.org


So I also had no problem running the code as written by Jose (Bioperl 1.6.0, perl 5.10), but in the documentation for DNAStatistics it says:
"The routines are not well tested and do contain errors at this point. Work is underway to correct them, but do not expect this code to give you the right answer currently!"!
So I'm using dnadist (as I think the documentation recommends), and it does produce different numbers to $stats->distance(-).
I tried write_matrix from Bio::Matrix::IO - got a message saying it hasn't been implemented yet?
And if Jose hasn't already found it, try Data::Dumper; it will change your life....
Regards,
Liam.

On 15/09/2009, at 3:54 AM, Jason Stajich wrote:

Yeah it seems like more of a bioperl problem -- possible that the older code didn't recognize 'jukes-cantor' but you can try the abbreviation 'jc' -- better to just upgrade tho!

This isn't the cause of the problem but I would also encourage use of Bio::Matrix::IO for printing the matrix (use the 'write_matrix' function) rather than print_matrix on the matrix itself.

-jason
On Sep 14, 2009, at 10:00 AM, Mark A. Jensen wrote:

Hi Jose--
I don't get any problem with your script as written. You should upgrade to
BioPerl 1.6 and try again.
The "unblessed reference" is $jcmatrix. It may be undef for some reason.
MAJ
----- Original Message ----- From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Monday, September 14, 2009 8:48 AM
Subject: [Bioperl-l] Bio/Align/DNAStatistics.html print$jcmatrix->print_matrix;


Hello,

I'm trying to use Bio::Align::DNAStatistics, but I get the following message:

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Other modules do work, such as Bio::SimpleAlign;


My code is basically a modification of the code I found in http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Align/DNAStatistics.html, as it is as follows:

use strict;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;


my $stats = Bio::Align::DNAStatistics->new();

my $alignin = Bio::AlignIO->new(-file => 'e1_output_uno_solo.fas',
                          -format => 'fasta');
my $aln = $alignin->next_aln;

my $jcmatrix = $stats-> distance (-align => $aln,
                -method => 'Jukes-Cantor');

print $jcmatrix->print_matrix;

And the file 'e1_output_uno_solo.fas' has the following sequences:

A
GGTTATCTCAACAACTGTCACC--GTGGGCGCTGGTCATTGGTACGGGTGAACGAGAGTT
AAACGGTCGTTAACCATAGAAACAAAACACACTGCACCTTAACTCACTGAATAGTTGACG
GTCTGCCTCAGGGCTTGAGACAACGGATGGATCTAAACTCATGCTGTAGCCTATCAAACT
TAGCCCCAGGGTACTTCCGTCCCTAGCCTCGCTACAAGGCCAGAAAGGGTTTTGAAGTCT
ACTCACTGTGACCAGCGGTCTAGTCAGGTTATGCTTCGGCACAAAACCTCAGAATCGGTA
ACCAGCCACTACACGAACTGAAATCAAATCGCGGGAGGTGGTCCATCTTTGTCCACGCTG
CGATGATTGGGTTGCTTTATAGTCTAGCTGCAAGGTTTTGCGTTCTGGTGGGAAGCGGCA
TCCAAGGGGTTGACTCCGCTCGTTTATAACATGCCTTGGGCCTCCATGGTGAGTCGCAAC
GTCAGCGTAGGCCTAGACGGCT

B
GGATATCTCGACAACTTTTAGC--CTGGGCGCTTGGCATTGGTACACGTGACTTGCAGTT
AAAGGGTCGTTATACATAGAATCACTACCCAC--CAGGCGAACTCGCTGGAGAGCTGAGG
GTCACCCTCAGCGGTTGAGTTAACTGCTCGATGTTAACCGATGTTGGATCATAGGTAACT
TATCCTCAGTGTTCCTCTGTCCCTAGACTGGCTACAGGGCTACACCGGGTTTGAGGGGAT
ACTGACTGTTTTCAGCGGTAGTGTAAGTGTATGGTCCAACCCAAGGGTTCATGACCGGTA
AACTGCCCGTTCCCGCATTGAAATCAAATTGCAGGAGTTGGTACTTATTTGTCAACCTTA
CGATGATTGGGATGCATTTTAGTCGGGCTGGGCGGATTTGCGATCTGGGTGGAAGAGAGA
TGCATGGGGCTAACTCGTCTTGGTGAGTACCGGCATTGCACCGCAATGGACCGCCAAAAC
ATAAGAGTAGGTCGGGATGGCA

C
GCTTATCTCAACAACCGACACGAAGTCGTCGCAGGTCAATGGTACACGTGAATTGAAGTC
ATAAGATCAGTAATGATCGAACCACCAAACCCTTAACCTCGACTCACGCGATAGCCGAGG
GTCTGCCTCCAGGGTTGATTTAAAGGTTCTATTTAAGACCGTTTTCGATCATAGGTTACT
TATCCCCAGAGTTCTACCGTCGTGAGAATGGCTACAAGGCTAGAATAGGTTTTAGGGT-T
ACTTACGGTCTGCAGCCGTATTGTGAGGTTATGGTCCGGCCCTAGGCGTCATGACCGATA
ATCAGCCCCTACCTGAAATGAAATCAAATCGCGGGAGTTGGTACTTATCTGTCAACGTTG
CGATGATGGGGATACATGTTGGTCTACCGCGACGGACTAGCGATCACGGGGGAAGCGGAT
TGCCCGGTGGTGACTCGACACGTTTAAAACCTGCCTGGTTCCCGCATGGATCGTCACAAC
GTATGTGCAGGTCGAAACGAGT

D
CGTGATCGCAACAACTGTCACC--GTGGGCGCTGGCCGTTGGACCACGTGAAATGCTGTT
AAACGATCGTTCACCATAGAACCACTACACTCTTCACCTCAACCCGCGGGACAGGTGATG
GTGTCCCCCAGGGGTTGAGTGAACGGCTCGATGTAAACCCATGTTCGATCATAGGTAACG
TAGCCCCAGGGTGATTCCGTTCCTAAACTGGTTACAAGGCTAAAACGTGTTTTAGAGTAT
AATGACTGTCTACGGCGGTATTGTGATGTTATCATCCGTCCCTAGGCGTGGCGACCGTTA
AACAGCCTCTTCCCTAACTGATATCTAATCGTAGGAGTTGCTACGCATTTGTCAACGCAG
CGATGATGGTGATGCATCTTAATCTAGCTGG----TTTTTTGATCTCGGGTGACGCAGAT
AGTCAGGGGTTGACTCGCGTCGTTTGAAACGTGCCTTGCTCCTCAATGGACCCTCCGAAC
CTAAGAGTAGCTCGACACGGCT


I think the $aln object is OK, as I can use it with SimpleAlign.

Moreover, if I write
        print $jcmatrix;
instead of
        print $jcmatrix->print_matrix;
I get the memory reference, as normal===> ARRAY(0x859f08)

So my question is:

Why do I have an unblessed reference?

Can't call method "print_matrix" on unblessed reference at Tree.pl line 32, <GEN0> line 44.

Thank you very much in advance.

Jose G.

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




From A.J.Pemberton at bham.ac.uk  Tue Sep 22 17:06:04 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Tue, 22 Sep 2009 18:06:04 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>

Folks,

I am experiencing problems installing bioperl-db. I followed the instructions on the website, both installing via CPAN and downloading the source tarball, and I get the same error either way. I think I have missing prerequisites; the first error I get is:

Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/local/BioPerl-db-1.6.0/blib/lib 
/usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 
/usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 
/usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.

Can anyone help?

Regards,

Tony P.


**************************************************************
Mr. A. Pemberton			Tel:+44 121 414 3388
School of Biosciences,			Fax:+44 121 414 5925
The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
Birmingham B15 2TT U.K.
**************************************************************


From joseguillin at hotmail.com  Wed Sep 23 15:08:04 2009
From: joseguillin at hotmail.com (Jose .)
Date: Wed, 23 Sep 2009 16:08:04 +0100
Subject: [Bioperl-l] Bio::Matrix::IO
Message-ID: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma missing:
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also on the CPAN site: http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And on the BioPerl site: http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl beginners (like me) or absent-minded advanced BioPerl users who rely on the SYNOPSIS code.
Thank you! :)


From maj at fortinbras.us  Wed Sep 23 15:36:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:36:59 -0400
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
Message-ID: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>

hi Tony- missing prereqs are the issue with this message,yes-
the brute force approach would be to install each of these
as they come up; you can do

$ cpan
cpan> install Array::Compare

etc., then attempt the bioperl-db install again; lather, rinse, repeat.
MAJ
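
To shorten the install, retry, repeat loop, one can first ask perl which modules fail to load. The snippet below is a generic sketch: missing_modules is a made-up helper, and the only module name taken from this thread is Array::Compare. The real prerequisite list for bioperl-db is not reproduced here.

```perl
use strict;
use warnings;

# Return the names of the modules in @_ that cannot be loaded.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Accept only plain module names before passing them to string eval.
        die "bad module name '$module'\n"
            unless $module =~ /\A\w+(?:::\w+)*\z/;
        push @missing, $module unless eval "require $module; 1";
    }
    return @missing;
}

# Array::Compare is the module named in the error message; extend the list
# with whatever the next "Can't locate ..." message reports.
my @todo = missing_modules('Array::Compare');
print @todo ? "install: @todo\n" : "nothing missing\n";
```

Each name it prints can then be installed from the cpan shell as shown above.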
----- Original Message ----- 
From: "Anthony Pemberton" <A.J.Pemberton at bham.ac.uk>
To: <bioperl-l at bioperl.org>
Sent: Tuesday, September 22, 2009 1:06 PM
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)


> Folks,
>
> I am experiencing problems installing bioperl-db. I followed the instructions 
> on the website both installing via CPAN and downloading the source tarball. 
> Get the same error. I think I have missing prerequistes, the first error I get 
> is:
>
> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t 
> /usr/local/BioPerl-db-1.6.0/blib/lib
> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 
> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
> /usr/lib/perl5/5.8.5 
> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/site_perl/5.8.5
> /usr/lib/perl5/site_perl 
> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl/5.8.5
> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi 
> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>
> Can anyone help?
>
> Regards,
>
> Tony P.
>
>
> **************************************************************
> Mr. A. Pemberton Tel:+44 121 414 3388
> School of Biosciences, Fax:+44 121 414 5925
> The University of Birmingham                    Email:a.j.pemberton at bham.ac.uk
> Birmingham B15 2TT U.K.
> **************************************************************
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Wed Sep 23 15:46:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 23 Sep 2009 11:46:03 -0400
Subject: [Bioperl-l] Bio::Matrix::IO
In-Reply-To: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
References: <BLU104-W13A9E771FB4CC77748AAC5A0DB0@phx.gbl>
Message-ID: <E37AFAC689C84477817EFF38511B5709@NewLife>

thanks Jose - fixed it
MAJ
----- Original Message ----- 
From: "Jose ." <joseguillin at hotmail.com>
To: <bioperl-l at bioperl.org>
Sent: Wednesday, September 23, 2009 11:08 AM
Subject: [Bioperl-l] Bio::Matrix::IO


Hi,
I've found a typo in the Bio/Matrix/IO/phylip.pm documentation. There's a comma 
missing,
=head1 SYNOPSIS

  use Bio::Matrix::IO;
  my $parser = Bio::Matrix::IO->new(-format   => 'phylip'    <------ comma 
missing
                                   -file     => 't/data/phylipdist.out');
  my $matrix = $parser->next_matrix;

It's also in the CPAN 
web:http://search.cpan.org/~cjfields/BioPerl-1.6.0_2/Bio/Matrix/IO/phylip.pm
And the BioPerl 
web:http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Matrix/IO/phylip.html

This could mislead BioPerl begginers (like me) or absentminded BioPerl advanced 
who rely on the SYNOPSIS code.
Thank you! :)
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From roy.chaudhuri at gmail.com  Wed Sep 23 16:27:26 2009
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 23 Sep 2009 17:27:26 +0100
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <4ABA4C6E.60609@gmail.com>

Hi Adlai,

In Perl you can open a string as if it was a file:

my $string;
open my $fh, '>', \$string or die $!;
my $seqOut=Bio::SeqIO->new(-fh=>$fh, -format=>'genbank');

$seqOut->write_seq($seq) should now write to the string.

However, are you sure this is your problem? Printing to STDOUT (which is 
what SeqIO does if you don't specify a file) should work fine with a CGI 
script. Your sequence is being displayed as one line because HTML 
ignores newline characters, but you can get around that by using a <pre> 
tag to specify pre-formatted text:

my $seqOut = new Bio::SeqIO(-format => 'genbank');
print "<pre>\n";
$seqOut->write_seq($seq);
print "</pre>\n";

Hope this helps.
Roy.
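
The two suggestions above combine naturally: capture the record in a scalar, then escape it before placing it inside the <pre> element, since GenBank feature locations can contain < and > (for example <1..>12), which a browser would otherwise read as markup. A minimal sketch in plain Perl follows. The record text is a placeholder standing in for what write_seq would produce, and escape_html is a made-up helper, not a BioPerl function.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Collect output in a scalar through an in-memory filehandle.
my $record = '';
open my $fh, '>', \$record or die "Cannot open in-memory handle: $!";

# Placeholder lines. With BioPerl this would instead be
# $seqOut->write_seq($seq) on a Bio::SeqIO object created with -fh => $fh.
print {$fh} "LOCUS       PLACEHOLDER\n";
print {$fh} "     CDS             <1..>12\n";

# Close first so that everything buffered has reached $record.
close $fh or die "Cannot close in-memory handle: $!";

print "Content-type: text/html\n\n";
print "<pre>\n", escape_html($record), "</pre>\n";
```

Once the handle is closed, $record is an ordinary string and can be passed to whatever code expects the GenBank text.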

adlai burman wrote:
> I have finally got past two major hurdles (for me) only to get stumped:
> 1. I have written a perl script that can take a genbank formatted text
> file as a filehandle and do all sorts of nifty (for me) things with it.
> 2. I have gotten my BioPerl installation working on a web hosting
> service so my advisor can use this through a browser.
> 
> BUT the code I have to fetch a GB record can only print it as a single HTML
> line, and what I need is for it to assign the retrieved file to a
> scalar variable. I am going blind trying to figure out how to access
> (not write) the gb file from a SeqIO object and assign it to a
> variable.
> 
> Here's an example of the code I have going on the server:
> 
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
> 
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to GenBank
> 
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession
> 
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
> 
> $seqOut->write_seq($seq);
> 
> 
> exit;
> 
> where 'DQ897681' will be replaced by a CGI post.
> 
> I know that write_seq is not what I need, and I assume that this is a
> simple problem but can anyone tell me how to assign the retrieved gb
> file to a scalar?
> 
> Thanks,
> Adlai


From cjfields at illinois.edu  Wed Sep 23 17:47:51 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:51 -0500
Subject: [Bioperl-l] Newbie: Format GenBank
In-Reply-To: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
References: <BA67A13E-EAF0-4297-8013-22656D3D1740@refenestration.com>
Message-ID: <16121E7E-7619-4F02-82CC-20C6F5F6B230@illinois.edu>

On Sep 23, 2009, at 9:38 AM, adlai burman wrote:

> I have finally got past two major hurdles (for me) only to get
> stumped:
> 1. I have written a perl script that can take a genbank formatted
> text file as a filehandle and do all sorts of nifty (for me) things
> with it.
> 2. I have gotten my BioPerl installation working on a web hosting
> service so my advisor can use this through a browser.
>
> BUT the code I have to fetch a GB record can only print it as a single HTML
> line, and what I need is for it to assign the retrieved file to a
> scalar variable. I am going blind trying to figure out how to access
> (not write) the gb file from a SeqIO object and assign it to a
> variable.
>
> Here's an example of the code I have going on the server:
>
> #!/usr/bin/perl
> print "Content-type: text/html\n\n";
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
> $genBank = new Bio::DB::GenBank;  # This object knows how to talk to GenBank
>
> my $seq = $genBank->get_Seq_by_acc('DQ897681');  # get a record by accession
>
> my $seqOut = new Bio::SeqIO(-format => 'genbank');
>
> $seqOut->write_seq($seq);
>
> exit;
>
> where 'DQ897681' will be replaced by a CGI post.
>
> I know that write_seq is not what I need, and I assume that this is
> a simple problem but can anyone tell me how to assign the retrieved
> gb file to a scalar?
>
> Thanks,
> Adlai

Actually, there are two ways you can do this, one involving write_seq.

(1) The first is to just grab the raw data using Bio::DB::EUtilities:

use Bio::DB::EUtilities;

my $eutil = Bio::DB::EUtilities->new(-eutil     => 'efetch',
                                      -db        => 'nuccore',
                                      -id        => 'DQ897681',
                                      -rettype   => 'gb');

my $var = $eutil->get_Response->content;

(2) Use IO::String (see the SeqIO HOWTO), or Roy's example code.  That  
would 'filter' everything through SeqIO via next_seq/write_seq, so the  
output is what BioPerl spits out and may not be exactly the same as the  
original GenBank record.

chris


From cjfields at illinois.edu  Wed Sep 23 17:47:56 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 12:47:56 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
Message-ID: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>

Appears Array::Compare is used for Test::Warn, so it isn't a true  
requirement (probably a test_requires or somesuch).

chris

On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:

> hi Tony- missing prereqs are the issue with this message,yes-
> the brute force approach would be to install each of these
> as they come up; you can do
>
> $ cpan
> cpan> install Array::Compare
>
> etc., then attempt the bioperl-db install again; lather, rinse,  
> repeat.
> MAJ
> ----- Original Message ----- From: "Anthony Pemberton"
> <A.J.Pemberton at bham.ac.uk>
> To: <bioperl-l at bioperl.org>
> Sent: Tuesday, September 22, 2009 1:06 PM
> Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
>
>
>> Folks,
>>
>> I am experiencing problems installing bioperl-db. I followed the
>> instructions on the website both installing via CPAN and
>> downloading the source tarball. Get the same error. I think I have
>> missing prerequisites, the first error I get is:
>>
>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t
>> /usr/local/BioPerl-db-1.6.0/blib/lib
>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0
>> /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/5.8.5
>> /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/site_perl/5.8.5
>> /usr/lib/perl5/site_perl
>> /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi
>> /usr/lib/perl5/vendor_perl/5.8.5
>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi
>> /usr/lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>
>> Can anyone help?
>>
>> Regards,
>>
>> Tony P.
>>
>>
>> **************************************************************
>> Mr. A. Pemberton Tel:+44 121 414 3388
>> School of Biosciences, Fax:+44 121 414 5925
>> The University of Birmingham                     
>> Email:a.j.pemberton at bham.ac.uk
>> Birmingham B15 2TT U.K.
>> **************************************************************


From cjfields at illinois.edu  Wed Sep 23 20:58:37 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 15:58:37 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
Message-ID: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris

On Sep 23, 2009, at 3:52 PM, Kristine Briedis wrote:

> Hi Chris,
>
> We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot  
> regressions and noticed a small problem.  The fasta validation check  
> for '>' in SeqIO::fasta (line 127) throws when used with  
> Index::Fasta on Windows because the position after '>' is being  
> indexed.  It looks like you already fixed the same problem for Linux  
> (comment in line 190 of Index::Fasta).  Do you want me to put this  
> into bugzilla?  Let me know if you have any questions.  Thanks!
>
> Cheers,
> Kristine
>
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
> Sent: Tuesday, September 22, 2009 1:29 PM
> To: BioPerl List
> Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
>
> The third alpha is now out and propagating its way around the
> intertubes:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/
>
> Pick your favorite archive here:
>
> http://bioperl.org/DIST/RC/
>
> This includes some unmerged changes from 1.6.0.  Test failures from
> the last alpha indicated these somehow were missed, so I basically ran
> a global diff against main trunk to check for missing commits (all
> located in t/ as it turned out).
>
> Also fixed are the SeqFeature_SQLite.t failures; this is a file
> autogenerated with Build.PL tests that somehow made its way into the
> last alpha release.  This is now properly cleaned up along with its
> test database using './Build clean'.  BTW, very nice SQLite
> implementation; I may be using it!
>
> Please let me know if anything pops up; I'm hoping to release 1.6.1 by
> this Thursday-Friday.
>
> Enjoy!
>
> chris


From KBriedis at accelrys.com  Wed Sep 23 20:52:09 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 16:52:09 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>

Hi Chris,

We tested BioPerl 1.6.0 alpha 3 with our set of Pipeline Pilot regressions and noticed a small problem.  The fasta validation check for '>' in SeqIO::fasta (line 127) throws when used with Index::Fasta on Windows because the position after '>' is being indexed.  It looks like you already fixed the same problem for Linux (comment in line 190 of Index::Fasta).  Do you want me to put this into bugzilla?  Let me know if you have any questions.  Thanks!
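
To make the symptom concrete, here is a rough sketch of indexing FASTA records by byte offset and then checking that each stored offset really lands on '>'. It is not the Bio::Index::Fasta code: index_fasta and the sample data are hypothetical, and the use of binmode (so that CRLF line endings do not shift offsets) is an assumption about one way to keep offsets consistent across platforms.

```perl
use strict;
use warnings;

# Return a hash ref mapping record id => byte offset of the '>' that
# starts the record.
sub index_fasta {
    my ($fh) = @_;
    binmode $fh;                 # byte offsets, no CRLF translation
    my %offset;
    while (1) {
        my $pos  = tell $fh;     # offset before the line is read
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    return \%offset;
}

# Sample data with Windows-style line endings.
my $fasta = ">seq1 first\r\nACGT\r\n>seq2 second\r\nGGCC\r\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my $offset = index_fasta($fh);

# Validation: seeking to a stored offset must yield '>'.
for my $id (sort keys %$offset) {
    seek $fh, $offset->{$id}, 0 or die "Cannot seek: $!";
    read $fh, my $char, 1;
    die "Offset for $id does not point at '>'" unless $char eq '>';
}
```

If the stored position were one past the '>', the check in the final loop is the kind of validation that would throw.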

Cheers,
Kristine


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Tuesday, September 22, 2009 1:29 PM
To: BioPerl List
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

The third alpha is now out and propagating its way around the  
intertubes:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_3/

Pick your favorite archive here:

http://bioperl.org/DIST/RC/

This includes some unmerged changes from 1.6.0.  Test failures from  
the last alpha indicated these somehow were missed, so I basically ran  
a global diff against main trunk to check for missing commits (all  
located in t/ as it turned out).

Also fixed are the SeqFeature_SQLite.t failures; this is a file  
autogenerated with Build.PL tests that somehow made its way into the  
last alpha release.  This is now properly cleaned up along with its  
test database using './Build clean'.  BTW, very nice SQLite  
implementation; I may be using it!

Please let me know if anything pops up; I'm hoping to release 1.6.1 by  
this Thursday-Friday.

Enjoy!

chris


From KBriedis at accelrys.com  Wed Sep 23 22:40:10 2009
From: KBriedis at accelrys.com (Kristine Briedis)
Date: Wed, 23 Sep 2009 18:40:10 -0400
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
Message-ID: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>

Hi Chris,

ActivePerl.  I'll open a bug.  Thanks!

Cheers,
Kristine


-----Original Message-----
From: Chris Fields [mailto:cjfields at illinois.edu] 
Sent: Wednesday, September 23, 2009 1:59 PM
To: Kristine Briedis
Cc: BioPerl List
Subject: Re: [Bioperl-l] BioPerl 1.6.0 alpha 3 released

Yes, that would be good.  I don't have immediate access to anything  
running WinXP/vista/7 but I can probably look into this sometime  
tomorrow or Monday.

Just to make sure, is this with ActivePerl or Strawberry Perl?

chris



From cjfields at illinois.edu  Wed Sep 23 22:49:45 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 17:49:45 -0500
Subject: [Bioperl-l] BioPerl.pm and 1.6.1
References: <1253727169.18486.1336281841@webmail.messagingengine.com>
Message-ID: <1AF393BC-2352-4ADA-A4E3-3EF13B99CAE8@illinois.edu>

All,

I've recently noticed that CPAN is not grabbing the correct  
descriptive information from Build.PL.  The current description is  
coming from Bio::LiveSeq::IO::BioPerl, which is the first module found  
with the same 'BioPerl' namesake:

http://search.cpan.org/search?query=bioperl&mode=dist

Therefore we need something that acts as the description and main page  
for the distributions.  We have a bioperl.pod already, just need to  
update it and add it to trunk, and maybe release another alpha with it  
included to make sure it's working.  I also want to fix the recent  
Windows issue reported by Kristine.

Therefore, I will be adding this for core and the other  
distributions per Curtis Jewell's suggestion (below).  Please let me  
know if there are any disagreements with this; I'll probably push  
another alpha out with this in the next few days (also hopefully  
containing the bug fix mentioned above).
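
For reference, a placeholder module along the lines Curtis describes might look roughly like the sketch below. The version string and the NAME/DESCRIPTION wording are assumptions made for illustration, not the final file.

```perl
package BioPerl;

use strict;
use warnings;

# Placeholder version string (an assumption); the real value would
# track the release being packaged.
our $VERSION = '1.006001';

=head1 NAME

BioPerl - Perl modules for biology

=head1 DESCRIPTION

Placeholder module holding the distribution's top-level documentation.
It declares a package and a version and does nothing else.

=cut

1;
```

The NAME section is the part an indexer would normally read to get the one-line description.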

chris

Begin forwarded message:

> From: "Curtis Jewell" <lists.perl.module-authors at csjewell.fastmail.us>
> Date: September 23, 2009 12:32:49 PM CDT
> To: "Chris Fields" <cjfields at illinois.edu>
> Subject: Re: distribution description
>
> Chris, I'd make it a BioPerl.pm that just declares a package and  
> version
> and does nothing else other than being a holder for Pod - because the
> first thing I wanted to do when I heard about it and wanted to check
> whether it worked in Strawberry is to do 'cpan BioPerl', which of
> course, blows up.
>
> --Curtis
>
> On Tue, 22 Sep 2009 22:23 -0500, "Chris Fields"  
> <cjfields at illinois.edu>
> wrote:
>> I've noticed in the last number of CPAN releases of BioPerl that the
>> description for the distribution is being pulled from one of our
>> modules (Bio::LiveSeq::IO::BioPerl).  I'm guessing this is b/c it's
>> the first match to the distribution name.
>>
>> Is there any way to make sure the description is pulled from the
>> abstract?  We're using a subclass of Module::Build and have defined
>> dist_abstract (I'm thinking of adding a BioPerl.pod to the root
>> directory just to catch this).
>>
>> chris
> --
> Curtis Jewell
> swordsman at csjewell.fastmail.us
>
> %DCL-E-MEM-BAD, bad memory
> -VMS-F-PDGERS, pudding between the ears
>
> [I use PC-Alpine, which deliberately does not display colors and  
> pictures in HTML mail]
>


From cjfields at illinois.edu  Wed Sep 23 23:00:55 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 23 Sep 2009 18:00:55 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 3 released
In-Reply-To: <3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
References: <A59164B5-0408-4A94-9262-8B814DD48CE1@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5D30@exch1-hi.accelrys.net>
	<EA6593D4-5F3D-4CD9-95C6-598B9C561609@illinois.edu>
	<3BEA4A335B853745AE0BA5E81DE3782A09DD5DF8@exch1-hi.accelrys.net>
Message-ID: <D704BD1B-C44B-4AB5-9C14-9F4F63A46FEE@illinois.edu>

Kristine,

I have been planning on installing a temp WinXP VM using VirtualBox,  
so this'll give me an excuse to set that up ;>

chris

On Sep 23, 2009, at 5:40 PM, Kristine Briedis wrote:

> Hi Chris,
>
> ActivePerl.  I'll open a bug.  Thanks!
>
> Cheers,
> Kristine


From David.Messina at sbc.su.se  Thu Sep 24 09:38:19 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 24 Sep 2009 11:38:19 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
Message-ID: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>

>
> Not to add yet more to the list, but I also think a concise list of
> projects using (or 'powered by') bioperl should be front-and-center; not a
> lot of users know when/where bioperl is used.  This applies to the other
> bio* as well, particularly biopython (seeing it popping up more and more).
>


Along these lines, it'd be great to publicize not only
BioPerl-*powered* projects, but ones which interface with it, too.

Just this week, for example, there is this, which could go both on a static
page and in the newsfeed:
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1

MOODS: fast search for position weight matrix matches in DNA sequences.

Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
Department of Computer Science and Helsinki Institute for Information
Technology,
University of Helsinki, Helsinki, Finland.

SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
matching position weight matrices against DNA sequences. MOODS implements
state-of-the-art on-line matching algorithms, achieving considerably faster
scanning speed than with a simple brute-force search. MOODS is written in C++,
with bindings for the popular BioPerl and Biopython toolkits. It can easily be
adapted for different purposes and integrated into existing workflows. It can
also be used as a C++ library. AVAILABILITY: The package with documentation and
examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
source code is also available under the terms of a GNU General Public License
(GPL). CONTACT: janne.h.korhonen at helsinki.fi.

PMID: 19773334 [PubMed - as supplied by publisher]


From maj at fortinbras.us  Thu Sep 24 14:17:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 10:17:26 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
Message-ID: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>

Gurus of a db stripe:
 
ActiveState 5.10 has such a problem with BDB that it
disables their ppm build of the DB_File module. I know
what the *ultimate* solution is...however...

I did a quick grep of 'use DB_File' across the trunk, and 
it seems there are two categories of dependency--

(1) use of BDB is an option among other dbms
      (e.g., among the  Bio::DB::GFF::Adaptor::)

(2) BDB is the developer's personal choice
    (e.g., possibly Bio::DB::FileCache)

In Bio::DB::Fasta, AnyDBM_File is used to allow the 
user a choice. Are there fundamental reasons not to 
convert the type (2) dependencies to AnyDBM_File?
I will try to do this (on a branch) if there are no technical
objections. General derision, however, will only goad
me into action-
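
To show what the conversion would amount to, a rough sketch of tying a hash through AnyDBM_File with a preference list follows. The list (DB_File first, then the SDBM_File that ships with Perl) and the file name are assumptions for illustration, not what Bio::DB::Fasta actually uses.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_RDWR);
use File::Temp qw(tempdir);

# The preference list has to be in place before AnyDBM_File is loaded.
# Assumed order: DB_File if present, else SDBM_File.
BEGIN { @AnyDBM_File::ISA = qw(DB_File SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/cache";          # hypothetical cache file name

# Write one entry through whichever backend was picked.
my %cache;
tie(%cache, 'AnyDBM_File', $file, O_CREAT | O_RDWR, 0644)
    or die "Cannot tie $file: $!";
$cache{'DQ897681'} = 'cached record text';
untie %cache;

# Reopen the same file and read the entry back.
tie(%cache, 'AnyDBM_File', $file, O_RDWR, 0644)
    or die "Cannot reopen $file: $!";
my $value = $cache{'DQ897681'};
untie %cache;

print "backend: $AnyDBM_File::ISA[0], value: $value\n";
```

One possible technical objection: SDBM_File has a small per-record size limit, so a cache of whole sequence records could outgrow the fallback.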

Thanks,
MAJ


From A.J.Pemberton at bham.ac.uk  Thu Sep 24 15:08:06 2009
From: A.J.Pemberton at bham.ac.uk (Anthony Pemberton)
Date: Thu, 24 Sep 2009 16:08:06 +0100
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
Message-ID: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>

Chris, Mark,

Thank you, I have made significant progress with the install. I had to do a 

cpan> force install Array::Compare

to get the module properly installed. 

However, I now have a new error. When I do

cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz

I get the following error (now only 1 of the 16 tests fails):

t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
Exception
------------- EXCEPTION -------------
MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0-xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.


Can you help with this one?

Regards,

Tony Pemberton




From jason at bioperl.org  Thu Sep 24 16:23:44 2009
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Sep 2009 09:23:44 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
Message-ID: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>

If someone also wants to volunteer to keep up the publications page -  
this is where I *had* been curating a list based on citations and Google  
Scholar searches for 'bioperl' and things that reference the 2002 paper.

Seems like this is where the static copy of that information should go  
- but highlighting things on a page with a circulating list or  
something that just listed recent additions to the list could be done  
by the web dev gurus and could be kewl.
The current issue is that it is large, so I think the pubmed plugin  
rendering can be slow (or gets broken, as it seems to be now).
http://bioperl.org/wiki/BioPerl_publications
http://bioperl.org/wiki/BioPerl_publications/2008
http://bioperl.org/wiki/BioPerl_publications/2007
etc....

-jason
On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:

>>
>> Not to add yet more to the list, but I also think a concise list of
>> projects using (or 'powered by') bioperl should be front-and-center; not a
>> lot of users know when/where bioperl is used.  This applies to the other
>> bio* as well, particularly biopython (seeing it popping up more and more).
>>
>
>
> Along these lines, it'd be great to publicize not only
> BioPerl-*powered* projects, but ones which interface with it, too.
>
> Just this week, for example, there is this, which could go both on a  
> static
> page and in the newsfeed:
> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>
> MOODS: fast search for position weight matrix matches in DNA sequences.
>
> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
> Department of Computer Science and Helsinki Institute for Information
> Technology,
> University of Helsinki, Helsinki, Finland.
>
> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package for
> matching position weight matrices against DNA sequences. MOODS implements
> state-of-the-art on-line matching algorithms, achieving considerably faster
> scanning speed than with a simple brute-force search. MOODS is written in C++,
> with bindings for the popular BioPerl and Biopython toolkits. It can easily be
> adapted for different purposes and integrated into existing workflows. It can
> also be used as a C++ library. AVAILABILITY: The package with documentation and
> examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The
> source code is also available under the terms of a GNU General Public License
> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>
> PMID: 19773334 [PubMed - as supplied by publisher]
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From rmb32 at cornell.edu  Thu Sep 24 16:28:08 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 24 Sep 2009 09:28:08 -0700
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <4ABB9E18.3060003@cornell.edu>

Sounds like a good idea to me.

Rob


From cjfields at illinois.edu  Thu Sep 24 16:58:32 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 11:58:32 -0500
Subject: [Bioperl-l] Problems installing latest stable bioperl-db (1.6)
In-Reply-To: <3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
References: <3A5B0BBDAF00724AB5F10155650102306F86D3F6@LESMBX1.adf.bham.ac.uk>
	<3E7712FC278A4C9C89CBFC9A683AE301@NewLife>
	<67AB606C-5CC9-4C1E-84EE-EFB7C37667E9@illinois.edu>
	<3A5B0BBDAF00724AB5F10155650102306F86D403@LESMBX1.adf.bham.ac.uk>
Message-ID: <2BDD197A-3DEF-44CE-9F98-6B3F117084EE@illinois.edu>

Tony,

The error points to the problem: Graph::Directed is missing, so install
it via CPAN.

That said, we need to add it as a 'recommends' for the db package
and skip those tests if Graph::Directed isn't present.  Will do that
now.
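
A minimal sketch of the kind of guard described above, assuming a plain
Test::More test file. The helper name have_module and the test layout
are hypothetical; this is not the actual change made to t/12ontology.t:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if the named module can be loaded.
sub have_module {
    my ($module) = @_;
    return eval "require $module; 1" ? 1 : 0;
}

# A core module always loads, so this test always runs.
ok( have_module('strict'), 'core module loads' );

SKIP: {
    # Skip (rather than fail) when the optional dependency is absent.
    skip 'Graph::Directed not installed', 1
        unless have_module('Graph::Directed');
    ok( Graph::Directed->can('new'), 'Graph::Directed is usable' );
}
```

With this pattern a missing Graph::Directed shows up as a skipped test
rather than a failure.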

chris

On Sep 24, 2009, at 10:08 AM, Anthony Pemberton wrote:

> Chris, Mark,
>
> Thank you, I have made significant progress with the install. I had  
> to do a
>
> Cpan> force install Array::Compare
>
> To get the module properly installed.
>
> However, I now have a new error. When I do
>
> Cpan> install CJFIELDS/BioPerl-db-1.6.0.tar.gz
>
> I get the following error (now only 1 of the 16 tests fails):
>
> t/12ontology.t .... 1/740 Bio::OntologyIO: soflat cannot be found
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::OntologyIO::soflat. Can't locate  
> Graph/Directed.pm in @INC (@INC contains: t/lib t /root/.cpan/build/ 
> BioPerl-db-1.6.0-xim2YV/blib/lib /root/.cpan/build/BioPerl-db-1.6.0- 
> xim2YV/blib/arch /root/.cpan/build/BioPerl-db-1.6.0-xim2YV /usr/ 
> lib64/perl5/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/5.8.5 / 
> usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/ 
> perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/ 
> vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/ 
> vendor_perl/5.8.5 /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux- 
> thread-multi /usr/lib/perl5/vendor_perl .) at /usr/lib/perl5/ 
> site_perl/5.8.5/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/ 
> Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm line 118.
>
>
> Can you help with this one?
>
> Regards,
>
> Tony Pemberton
>
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at illinois.edu]
>> Sent: 23 September 2009 18:48
>> To: Mark A. Jensen
>> Cc: Anthony Pemberton; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Problems installing latest stable bioperl-db
>> (1.6)
>>
>> Appears Array::Compare is used for Test::Warn, so it isn't a true
>> requirement (probably a test_requires or somesuch).
>>
>> chris
>>
>> On Sep 23, 2009, at 10:36 AM, Mark A. Jensen wrote:
>>
>>> hi Tony- missing prereqs are the issue with this message,yes-
>>> the brute force approach would be to install each of these
>>> as they come up; you can do
>>>
>>> $ cpan
>>> cpan> install Array::Compare
>>>
>>> etc., then attempt the bioperl-db install again; lather, rinse,
>>> repeat.
>>> MAJ
>>> ----- Original Message ----- From: "Anthony Pemberton"
>> <A.J.Pemberton at bham.ac.uk
>>>>
>>> To: <bioperl-l at bioperl.org>
>>> Sent: Tuesday, September 22, 2009 1:06 PM
>>> Subject: [Bioperl-l] Problems installing latest stable bioperl-db
>>> (1.6)
>>>
>>>
>>>> Folks,
>>>>
>>>> I am experiencing problems installing bioperl-db. I followed the
>>>> instructions on the website both installing via CPAN and
>>>> downloading the source tarball. Get the same error. I think I have
>>>> missing prerequisites; the first error I get is:
>>>>
>>>> Can't locate Array/Compare.pm in @INC (@INC contains: t/lib t /usr/
>>>> local/BioPerl-db-1.6.0/blib/lib
>>>> /usr/local/BioPerl-db-1.6.0/blib/arch /usr/local/BioPerl-db-1.6.0 /
>>>> usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi
>>>> /usr/lib/perl5/5.8.5 /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-
>>>> thread-multi /usr/lib/perl5/site_perl/5.8.5
>>>> /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-
>>>> linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5
>>>> /usr/lib64/perl5/vendor_perl/5.8.3/x86_64-linux-thread-multi /usr/
>>>> lib/perl5/vendor_perl .) at t/lib/Test/Warn.pm line 228.
>>>>
>>>> Can anyone help?
>>>>
>>>> Regards,
>>>>
>>>> Tony P.
>>>>
>>>>
>>>> **************************************************************
>>>> Mr. A. Pemberton Tel:+44 121 414 3388
>>>> School of Biosciences, Fax:+44 121 414 5925
>>>> The University of Birmingham
>>>> Email:a.j.pemberton at bham.ac.uk
>>>> Birmingham B15 2TT U.K.
>>>> **************************************************************
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at illinois.edu  Thu Sep 24 17:50:34 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 12:50:34 -0500
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
Message-ID: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>

I do support doing this for sheer flexibility, but it's not an  
absolute showstopper for ActivePerl.  There is a working DB_File PPM  
available for ActivePerl 5.10.1 in the Trouchelle PPM repo:

http://trouchelle.com/ppm10/

That repo is listed in the 'Suggested' list in the latest PPM4  
Preferences (Repositories tab). I had to install it to fix that WinXP  
Bio::Index bug.

(Based on that, the Bio::Index modules also seem to have this requirement;
at least, their tests were being skipped for lack of DB_File.)
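
For reference, a minimal sketch of the AnyDBM_File pattern under
discussion. The backend order, the temporary file name and the key are
assumptions for illustration; GDBM_File is left out of the list because
it takes different open flags than the Fcntl ones used here:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);

# AnyDBM_File loads the first backend in this list that is installed.
# SDBM_File ships with Perl, so there is always a fallback.
BEGIN { @AnyDBM_File::ISA = qw(DB_File NDBM_File SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/cache";    # hypothetical cache location

# Same tie call regardless of which backend was picked.
tie my %cache, 'AnyDBM_File', $file, O_RDWR | O_CREAT, 0644
    or die "Cannot tie $file: $!";

$cache{seq1} = 'ACGT';
my $stored = $cache{seq1};
untie %cache;

print "stored: $stored\n";
```

Code that relies on DB_File-specific features (BTREE ordering, for
example) would not convert this directly.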

chris

On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:

> Gurus of a db stripe:
>
> ActiveState 5.10 has such a problem with BDB that it
> disables their ppm build of the DB_File module. I know
> what the *ultimate* solution is...however...
>
> I did a quick grep of 'use DB_File' across the trunk, and
> it seems there are two categories of dependency--
>
> (1) use of BDB is an option among other dbms
>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>
> (2) BDB is the developer's personal choice
>    (e.g., possibly Bio::DB::FileCache)
>
> In Bio::DB::Fasta, AnyDBM_File is used to allow the
> user a choice. Are there fundamental reasons not to
> convert the type (2) dependencies to AnyDBM_File?
> I will try to do this (on a branch) if there are no technical
> objections. General derision, however, will only goad
> me into action-
>
> Thanks,
> MAJ
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at illinois.edu  Thu Sep 24 18:03:48 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Sep 2009 13:03:48 -0500
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
Message-ID: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>

Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
for Windows using the last alpha or bioperl-live?  I'm getting a  
pretty significant fail with the last alpha release (I've managed to  
fix the others) via my remote desktop setup (haven't set up virtualbox  
yet).  I just want to confirm this is occurring elsewhere and plan  
accordingly, namely by noting that the module doesn't work on Windows  
for the time being.

Build test --test-files t/SeqIO/scf.t --verbose

chris


From maj at fortinbras.us  Thu Sep 24 18:39:38 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:39:38 -0400
Subject: [Bioperl-l] DB_File dependency and ActiveState 5.10
In-Reply-To: <759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
References: <1A8A9461E94441EE9BD73D02A8F81F52@NewLife>
	<759F1C97-401A-434C-956C-20A1DED9D834@illinois.edu>
Message-ID: <3715F68607084E4684A4B54E542468E4@NewLife>

All righty. I did find the trouchelle repo, but my ppm
didn't believe that DB_File was in it.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 1:50 PM
Subject: Re: [Bioperl-l] DB_File dependency and ActiveState 5.10


>I do support doing this for sheer flexibility, but it's not an  
> absolute showstopper for ActivePerl.  There is a working DB_File PPM  
> available for ActivePerl 5.10.1 in the Trouchelle PPM repo:
> 
> http://trouchelle.com/ppm10/
> 
> That repo is listed in the 'Suggested' list in the latest PPM4  
> Preferences (Repositories tag). I had to install it to fix that WinXP  
> Bio::Index bug.
> 
> (Based on that Bio::Index modules also have this requirement, at least  
> tests were being skipped based on lack of DB_File)
> 
> chris
> 
> On Sep 24, 2009, at 9:17 AM, Mark A. Jensen wrote:
> 
>> Gurus of a db stripe:
>>
>> ActiveState 5.10 has such a problem with BDB that it
>> disables their ppm build of the DB_File module. I know
>> what the *ultimate* solution is...however...
>>
>> I did a quick grep of 'use DB_File' across the trunk, and
>> it seems there are two categories of dependency--
>>
>> (1) use of BDB is an option among other dbms
>>      (e.g., among the  Bio::DB::GFF::Adaptor::)
>>
>> (2) BDB is the developer's personal choice
>>    (e.g., possibly Bio::DB::FileCache)
>>
>> In Bio::DB::Fasta, AnyDBM_File is used to allow the
>> user a choice. Are there fundamental reasons not to
>> convert the type (2) dependencies to AnyDBM_File?
>> I will try to do this (on a branch) if there are no technical
>> objections. General derision, however, will only goad
>> me into action-
>>
>> Thanks,
>> MAJ
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
>


From maj at fortinbras.us  Thu Sep 24 18:40:03 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Thu, 24 Sep 2009 14:40:03 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?
In-Reply-To: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
References: <159370FD-B6F5-4702-AF35-B7126BA7399A@illinois.edu>
Message-ID: <791B5C5CB3C34A8AAC348DC59E934198@NewLife>

aye-aye
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, September 24, 2009 2:03 PM
Subject: [Bioperl-l] Bio::SeqIO::scf tests failing on WinXP?


> Can someone (Mark?) who has a WinXP setup run tests on Bio::SeqIO::scf  
> for Windows using the last alpha or bioperl-live?  I'm getting a  
> pretty significant fail with the last alpha release (I've managed to  
> fix the others) via my remote desktop setup (haven't set up virtualbox  
> yet).  I just want to confirm this is occurring elsewhere and plan  
> accordingly, namely indicating the module doesn't work with windows  
> for the time being.
> 
> Build test --test-files t/SeqIO/scf.t --verbose
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From e.osimo at gmail.com  Fri Sep 25 07:59:10 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Fri, 25 Sep 2009 09:59:10 +0200
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org> 
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu> 
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com> 
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com> 
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com> 
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com> 
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
Message-ID: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>

Dear Jason,
I have been trying to connect to
http://bioperl.org/wiki/BioPerl_publications for more than 24 hours, but
it won't work.
Emanuele


On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org> wrote:

> If someone also wants to volunteer to keep up the publications page - this
> is where I *had* been curating a list from citations and Google Scholar
> searches for 'bioperl' and things that reference the 2002 paper.
>
> Seems like this is where the static copy of that information should go -
> but highlighting things on a page with a circulating list or something
> that just listed recent additions to the list could be done by the web dev
> gurus and could be kewl.
> The current issue is that it is large, so I think the PubMed plugin rendering
> can be slow (or gets broken, as it seems to be now).
> http://bioperl.org/wiki/BioPerl_publications
> http://bioperl.org/wiki/BioPerl_publications/2008
> http://bioperl.org/wiki/BioPerl_publications/2007
> etc....
>
> -jason
>
> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>
>
>>> Not to add yet more to the list, but I also think a concise list of
>>> projects using (or 'powered by') bioperl should be front-and-center; not
>>> a
>>> lot of users know when/where bioperl is used.  This applies to the other
>>> bio* as well, particularly biopython (seeing it popping up more and
>>> more).
>>>
>>>
>>
>> Along these lines, it'd be great to publicize not only
>> BioPerl-*powered* projects, but ones which interface with it, too.
>>
>> Just this week, for example, there is this, which could go both on a
>> static
>> page and in the newsfeed:
>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>
>> MOODS: fast search for position weight matrix matches in DNA sequences.
>>
>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>> Department of Computer Science and Helsinki Institute for Information
>> Technology,
>> University of Helsinki, Helsinki, Finland.
>>
>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software package
>> for
>> matching position weight matrices against DNA sequences. MOODS implements
>> state-of-the-art on-line matching algorithms, achieving considerably
>> faster
>> scanning speed than with a simple brute-force search. MOODS is written in
>> C++,
>> with bindings for the popular BioPerl and Biopython toolkits. It can
>> easily be
>> adapted for different purposes and integrated into existing workflows. It
>> can
>> also be used as a C++ library. AVAILABILITY: The package with
>> documentation and
>> examples of usage is available at
>> http://www.cs.helsinki.fi/group/pssmfind. The
>> source code is also available under the terms of a GNU General Public
>> License
>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>
>> PMID: 19773334 [PubMed - as supplied by publisher]
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From hlapp at gmx.net  Fri Sep 25 11:26:37 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 25 Sep 2009 07:26:37 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<4AB84B8D.5080005@ieee.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
Message-ID: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>

Odd. Something's going on in the page that upsets MediaWiki. I can  
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages  
look odd.

	-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> I have been trying to connect to
> http://bioperl.org/wiki/BioPerl_publications for more than 24 hours, but
> it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org>  
> wrote:
>
>> If someone also wants to volunteer to keep up the publications page -
>> this is where I *had* been curating a list from citations and Google
>> Scholar searches for 'bioperl' and things that reference the 2002 paper.
>>
>> Seems like this is where the static copy of that information should go -
>> but highlighting things on a page with a circulating list or something
>> that just listed recent additions to the list could be done by the web
>> dev gurus and could be kewl.
>> The current issue is that it is large, so I think the PubMed plugin
>> rendering can be slow (or gets broken, as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and- 
>>>> center; not
>>>> a
>>>> lot of users know when/where bioperl is used.  This applies to  
>>>> the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered* projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static
>>> page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA  
>>> sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for  
>>> Information
>>> Technology,
>>> University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software  
>>> package
>>> for
>>> matching position weight matrices against DNA sequences. MOODS  
>>> implements
>>> state-of-the-art on-line matching algorithms, achieving considerably
>>> faster
>>> scanning speed than with a simple brute-force search. MOODS is  
>>> written in
>>> C++,
>>> with bindings for the popular BioPerl and Biopython toolkits. It can
>>> easily be
>>> adapted for different purposes and integrated into existing  
>>> workflows. It
>>> can
>>> also be used as a C++ library. AVAILABILITY: The package with
>>> documentation and
>>> examples of usage is available at
>>> http://www.cs.helsinki.fi/group/pssmfind. The
>>> source code is also available under the terms of a GNU General  
>>> Public
>>> License
>>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From biopython at maubp.freeserve.co.uk  Fri Sep 25 11:40:33 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 25 Sep 2009 12:40:33 +0100
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org>
	<2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu>
	<f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com>
	<628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com>
	<320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com>
	<9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu>
	<628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com>
	<3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org>
	<2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <320fb6e00909250440i18ee4216o80cedd418feed842@mail.gmail.com>

On Fri, Sep 25, 2009 at 12:26 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
> Odd. Something's going on in the page that upsets MediaWiki. I can actually
> pull up the page in edit mode.
>
> Is the citation extension working correctly? The year-by-year pages look
> odd.

It is working on the Biopython and BioJava pages (which use the same
server and mediawiki installation, right?),

http://biopython.org/wiki/Documentation#Papers
http://biopython.org/wiki/Publications
http://biojava.org/wiki/BioJava:BioJavaInside

[I know there are references with a funny character in them, the extension
doesn't like accents. I normally redo those references by hand but it is
a hassle and just giving a PMID is much easier]

Peter


From maj at fortinbras.us  Fri Sep 25 12:50:26 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 08:50:26 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
References: <mailman.7482.1253574466.2696.bioperl-l@lists.open-bio.org><4AB84B8D.5080005@ieee.org><2B4BD21D-D06A-43A1-8860-A8F3771130E7@illinois.edu><f135c01c0909221615y4e79fb48u7e1ce6771b046d10@mail.gmail.com><628aabb70909230437p13de2164sef4950cc3be2ce23@mail.gmail.com><320fb6e00909230512u3d0c2031xb418e3253476be2f@mail.gmail.com><9D6376D4-DFAC-4363-BA1C-0E27AB01373E@illinois.edu><628aabb70909240238v439d6c46l93a5ead53f161c37@mail.gmail.com><3B49F41D-4FBB-48CD-BA33-3D6C783CBA38@bioperl.org><2ac05d0f0909250059p56c75124hfb8b16b865a831c@mail.gmail.com>
	<9B33DB9A-C82D-42E5-87D3-A26BD166F7F5@gmx.net>
Message-ID: <4E35933353E14BB98975BCCAF79F5E0B@NewLife>

I've been playing with this. I think it's either a numbers problem (>230
references => bork) or a timeout problem. Attempting to isolate the error
to a single "BioPerl publications/200x" page gives inconsistent results,
but including enough of these pages to exceed about 230 references
reproduces the error (using preview).
----- Original Message ----- 
From: "Hilmar Lapp" <hlapp at gmx.net>
To: "Emanuele Osimo" <e.osimo at gmail.com>
Cc: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 7:26 AM
Subject: Re: [Bioperl-l] a Main Page proposal


Odd. Something's going on in the page that upsets MediaWiki. I can
actually pull up the page in edit mode.

Is the citation extension working correctly? The year-by-year pages
look odd.

-hilmar

On Sep 25, 2009, at 3:59 AM, Emanuele Osimo wrote:

> Dear Jason,
> I have been trying to connect to
> http://bioperl.org/wiki/BioPerl_publications for more than 24 hours, but
> it won't work.
> Emanuele
>
>
> On Thu, Sep 24, 2009 at 18:23, Jason Stajich <jason at bioperl.org>  wrote:
>
>> If someone also wants to volunteer to keep up the publications page - this
>> is where I *had* been curating a list from citations and Google Scholar
>> searches for 'bioperl' and things that reference the 2002 paper.
>>
>> Seems like this is where the static copy of that information should go -
>> but highlighting things on a page with a circulating list or something
>> that just listed recent additions to the list could be done by the web dev
>> gurus and could be kewl.
>> The current issue is that it is large, so I think the PubMed plugin rendering
>> can be slow (or gets broken, as it seems to be now).
>> http://bioperl.org/wiki/BioPerl_publications
>> http://bioperl.org/wiki/BioPerl_publications/2008
>> http://bioperl.org/wiki/BioPerl_publications/2007
>> etc....
>>
>> -jason
>>
>> On Sep 24, 2009, at 2:38 AM, Dave Messina wrote:
>>
>>
>>>> Not to add yet more to the list, but I also think a concise list of
>>>> projects using (or 'powered by') bioperl should be front-and- center; not
>>>> a
>>>> lot of users know when/where bioperl is used.  This applies to  the other
>>>> bio* as well, particularly biopython (seeing it popping up more and
>>>> more).
>>>>
>>>>
>>>
>>> Along these lines, it'd be great to publicize not only
>>> BioPerl-*powered* projects, but ones which interface with it, too.
>>>
>>> Just this week, for example, there is this, which could go both on a
>>> static
>>> page and in the newsfeed:
>>> http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp554v1
>>>
>>> MOODS: fast search for position weight matrix matches in DNA  sequences.
>>>
>>> Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E.
>>> Department of Computer Science and Helsinki Institute for  Information
>>> Technology,
>>> University of Helsinki, Helsinki, Finland.
>>>
>>> SUMMARY: MOODS (MOtif Occurrence Detection Suite) is a software  package
>>> for
>>> matching position weight matrices against DNA sequences. MOODS  implements
>>> state-of-the-art on-line matching algorithms, achieving considerably
>>> faster
>>> scanning speed than with a simple brute-force search. MOODS is  written in
>>> C++,
>>> with bindings for the popular BioPerl and Biopython toolkits. It can
>>> easily be
>>> adapted for different purposes and integrated into existing  workflows. It
>>> can
>>> also be used as a C++ library. AVAILABILITY: The package with
>>> documentation and
>>> examples of usage is available at
>>> http://www.cs.helsinki.fi/group/pssmfind. The
>>> source code is also available under the terms of a GNU General  Public
>>> License
>>> (GPL). CONTACT: janne.h.korhonen at helsinki.fi.
>>>
>>> PMID: 19773334 [PubMed - as supplied by publisher]
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================




From maj at fortinbras.us  Fri Sep 25 13:08:10 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:08:10 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <3327C0C1167C4889A980809FD642A0A2@NewLife>

The idea I now have is that <biblio> is hitting the server too rapidly and 
getting bounced after a while.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text-heavy, information-rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across as too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't  stare 
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From maj at fortinbras.us  Fri Sep 25 13:30:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 09:30:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
Message-ID: <A06AF115F63B4C558D368B730BFB441D@NewLife>

It's ugly, but it works now.
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Peter" <biopython at maubp.freeserve.co.uk>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
<maj at fortinbras.us>
Sent: Monday, September 21, 2009 9:05 AM
Subject: Re: [Bioperl-l] a Main Page proposal


>
> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>
>> Peter wrote:
>>>> We had some similar discussions about the Biopython wiki
>>>> based homepage - although our old one was nowhere near
>>>> as busy as the current BioPerl main page, it was still not as
>>>> welcoming as our current version *tries* to be.
>>>> ...
>>>> I can dig out links to our mailing list archive if anyone is
>>>> interested in the discussion.
>>
>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>
>>> I'd appreciate those links, Peter- thanks
>>> MAJ
>>
>> OK, here you are - this was most of it, I'd have to dig through
>> my old emails to see what else I can find:
>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>
>> Remember Biopython went from a very minimal home page, to
>> something aiming to be more newcomer friendly. BioPerl on the
>> other hand seems to want to move away from the current very
>> text heavy information rich page to something more focused and
>> newcomer friendly. To me at least the current page is too dense,
>> intimidating, and the important bits get lost in all the content.
>>
>> [My apologies if any of this feedback comes across too blunt.]
>
> Not at all; I'm thinking the same thing.
>
>> If you haven't already looked at them, you should check out the
>> other OBF project pages for ideas. The BioJava homepage is
>> also using the wiki - in my opinion it is a bit cluttered, but is
>> still more accessible than the current BioPerl page. Also,
>> the BioRuby page is very nice - although not wiki based.
>>
>> Regards,
>>
>> Peter
>
> I think the Biopython layout is very nice and focused.  Maybe a bit  too 
> minimal, but then again I don't like scrolling up and down the  page to find 
> the relevant bits, so less may be better.
>
> Reminds me of the simplified design on the perl6 main page (just don't  stare 
> at the hallucinogenic butterfly too long):
>
> http://www.perl6.org/
>
> So, maybe a structured layout with the most important links, and  additional 
> links on a separate page.
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From jason at bioperl.org  Fri Sep 25 15:47:55 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 08:47:55 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <A06AF115F63B4C558D368B730BFB441D@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk>
	<4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com>
	<D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu>
	<A06AF115F63B4C558D368B730BFB441D@NewLife>
Message-ID: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>

thanks - yeah I had separated it by year to make it easier to update  
them since the main file was too large, but I liked having them all  
pulled in onto one page in order to see the total number of cites.  
Brian's graphic is nice but a little out of date, and only reflects a  
pubmed query.

Basically that system doesn't work well enough with biblio since it  
isn't caching the lookups very well.   We can probably do better  
somehow, but someone would have to really be dedicated to it, so I can  
kind of see now why we could use something like this to generate the  
citations so they'd be static.
http://sumsearch.uthscsa.edu/cite/

I had used Biblio extension as it was so easy but maybe it just can't  
scale for that number of needed refs as it doesn't do very good local  
caching AFAIK.

-jason
On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:

> It's ugly, but it works now.
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu 
> >
> To: "Peter" <biopython at maubp.freeserve.co.uk>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" <maj at fortinbras.us 
> >
> Sent: Monday, September 21, 2009 9:05 AM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>>
>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>
>>> Peter wrote:
>>>>> We had some similar discussions about the Biopython wiki
>>>>> based homepage - although our old one was nowhere near
>>>>> as busy as the current BioPerl main page, it was still not as
>>>>> welcoming as our current version *tries* to be.
>>>>> ...
>>>>> I can dig out links to our mailing list archive if anyone is
>>>>> interested in the discussion.
>>>
>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>
>>>> I'd appreciate those links, Peter- thanks
>>>> MAJ
>>>
>>> OK, here you are - this was most of it, I'd have to dig through
>>> my old emails to see what else I can find:
>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>
>>> Remember Biopython went from a very minimal home page, to
>>> something aiming to be more newcomer friendly. BioPerl on the
>>> other hand seems to want to move away from the current very
>>> text heavy information rich page to something more focused and
>>> newcomer friendly. To me at least the current page is too dense,
>>> intimidating, and the important bits get lost in all the content.
>>>
>>> [My apologies if any of this feedback comes across too blunt.]
>>
>> Not at all; I'm thinking the same thing.
>>
>>> If you haven't already looked at them, you should check out the
>>> other OBF project pages for ideas. The BioJava homepage is
>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>> still more accessible than the current BioPerl page. Also,
>>> the BioRuby page is very nice - although not wiki based.
>>>
>>> Regards,
>>>
>>> Peter
>>
>> I think the Biopython layout is very nice and focused.  Maybe a  
>> bit  too minimal, but then again I don't like scrolling up and down  
>> the  page to find the relevant bits, so less may be better.
>>
>> Reminds me of the simplified design on the perl6 main page (just  
>> don't  stare at the hallucinogenic butterfly too long):
>>
>> http://www.perl6.org/
>>
>> So, maybe a structured layout with the most important links, and   
>> additional links on a separate page.
>>
>> chris
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From jason at bioperl.org  Fri Sep 25 16:54:36 2009
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 25 Sep 2009 09:54:36 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
	<3575DEFF2D0342D0A2553D87EB958D6E@NewLife>
Message-ID: <7275015E-45FC-4E2A-9379-89F7447DEB32@bioperl.org>

cheers - any efforts are appreciated.  I am not sure what the best  
way is to provide this info to folks without spending a ton of time  
curating.  Ideally the software would work well enough that a  
volunteer only spends time adding the info, not debugging the  
display or code.  It might be that something better exists -- online  
reference management like citeulike or mendeley -- that could then be  
linked in via an API.  .... Webservices, etc. will save us all, right?   
Okay, not really, but at least we can try to keep this organized until  
it is clear what the alternative solutions are.  Martin has stopped  
working on Biblio as far as I know, and php-hacking is not my favorite  
pastime.

-jason
On Sep 25, 2009, at 9:38 AM, Mark A. Jensen wrote:

> I figured you really wanted the 'hundreds-o-cites' effect-- I'm just  
> thinking of this
> as a workaround until the issues are resolved. Not sure I can devote  
> too much
> time to playing with it now (procrastinating using other projects at  
> the mo') but
> I can put it in the todo list on the Documentation Project page....
> cheers MAJ
> ----- Original Message ----- From: "Jason Stajich" <jason at bioperl.org>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" <bioperl-l at lists.open-bio.org 
> >; "Peter" <biopython at maubp.freeserve.co.uk>
> Sent: Friday, September 25, 2009 11:47 AM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> thanks - yeah I had separated it by year to make it easier to  
>> update  them since the main file was too large, but I liked having  
>> them all  pulled in onto one page in order to see the total number  
>> of cites.  Brian's graphic is nice but a little out of date, and  
>> only reflects a  pubmed query.
>>
>> Basically that system doesn't work well enough with biblio since  
>> it  isn't caching the lookups very well.   We can probably do  
>> better  somehow, but someone would have to really be dedicated to  
>> it, so I can  kind of see now why we could use something like this  
>> to generate the  citations so they'd be static.
>> http://sumsearch.uthscsa.edu/cite/
>>
>> I had used Biblio extension as it was so easy but maybe it just  
>> can't  scale for that number of needed refs as it doesn't do very  
>> good local  caching AFAIK.
>>
>> -jason
>> On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:
>>
>>> It's ugly, but it works now.
>>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu
>>> >
>>> To: "Peter" <biopython at maubp.freeserve.co.uk>
>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A.  
>>> Jensen" <maj at fortinbras.us
>>> >
>>> Sent: Monday, September 21, 2009 9:05 AM
>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>
>>>
>>>>
>>>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>>>
>>>>> Peter wrote:
>>>>>>> We had some similar discussions about the Biopython wiki
>>>>>>> based homepage - although our old one was nowhere near
>>>>>>> as busy as the current BioPerl main page, it was still not as
>>>>>>> welcoming as our current version *tries* to be.
>>>>>>> ...
>>>>>>> I can dig out links to our mailing list archive if anyone is
>>>>>>> interested in the discussion.
>>>>>
>>>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>>>
>>>>>> I'd appreciate those links, Peter- thanks
>>>>>> MAJ
>>>>>
>>>>> OK, here you are - this was most of it, I'd have to dig through
>>>>> my old emails to see what else I can find:
>>>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>>>
>>>>> Remember Biopython went from a very minimal home page, to
>>>>> something aiming to be more newcomer friendly. BioPerl on the
>>>>> other hand seems to want to move away from the current very
>>>>> text heavy information rich page to something more focused and
>>>>> newcomer friendly. To me at least the current page is too dense,
>>>>> intimidating, and the important bits get lost in all the content.
>>>>>
>>>>> [My apologies if any of this feedback comes across too blunt.]
>>>>
>>>> Not at all; I'm thinking the same thing.
>>>>
>>>>> If you haven't already looked at them, you should check out the
>>>>> other OBF project pages for ideas. The BioJava homepage is
>>>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>>>> still more accessible than the current BioPerl page. Also,
>>>>> the BioRuby page is very nice - although not wiki based.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Peter
>>>>
>>>> I think the Biopython layout is very nice and focused.  Maybe a   
>>>> bit  too minimal, but then again I don't like scrolling up and  
>>>> down  the  page to find the relevant bits, so less may be better.
>>>>
>>>> Reminds me of the simplified design on the perl6 main page (just   
>>>> don't stare at the hallucinogenic butterfly too long):
>>>>
>>>> http://www.perl6.org/
>>>>
>>>> So, maybe a structured layout with the most important links, and  
>>>> additional links on a separate page.
>>>>
>>>> chris
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> --
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From maj at fortinbras.us  Fri Sep 25 16:38:40 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 12:38:40 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><4AB71C40.10902@sendu.me.uk><4AB72DEF.2010008@cornell.edu><320fb6e00909210245hc455330o109515b100b21153@mail.gmail.com><3C8F39ACAD954917ACDEFD863EC99B16@NewLife><320fb6e00909210528v849cd43h3bd4677b69222575@mail.gmail.com><D336A804-1B73-465E-971F-AD1A9EE5465C@illinois.edu><A06AF115F63B4C558D368B730BFB441D@NewLife>
	<2F3B82E8-3A61-4FDB-A55E-38899C262ED6@bioperl.org>
Message-ID: <3575DEFF2D0342D0A2553D87EB958D6E@NewLife>

I figured you really wanted the 'hundreds-o-cites' effect-- I'm just thinking of
this as a workaround until the issues are resolved. Not sure I can devote too much
time to playing with it now (procrastinating using other projects at the mo')
but I can put it in the todo list on the Documentation Project page....
cheers MAJ
----- Original Message ----- 
From: "Jason Stajich" <jason at bioperl.org>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "Chris Fields" <cjfields at illinois.edu>; "BioPerl List" 
<bioperl-l at lists.open-bio.org>; "Peter" <biopython at maubp.freeserve.co.uk>
Sent: Friday, September 25, 2009 11:47 AM
Subject: Re: [Bioperl-l] a Main Page proposal


> thanks - yeah I had separated it by year to make it easier to update  them 
> since the main file was too large, but I liked having them all  pulled in onto 
> one page in order to see the total number of cites.  Brian's graphic is nice 
> but a little out of date, and only reflects a  pubmed query.
>
> Basically that system doesn't work well enough with biblio since it  isn't 
> caching the lookups very well.   We can probably do better  somehow, but 
> someone would have to really be dedicated to it, so I can  kind of see now why 
> we could use something like this to generate the  citations so they'd be 
> static.
> http://sumsearch.uthscsa.edu/cite/
>
> I had used Biblio extension as it was so easy but maybe it just can't  scale 
> for that number of needed refs as it doesn't do very good local  caching 
> AFAIK.
>
> -jason
> On Sep 25, 2009, at 6:30 AM, Mark A. Jensen wrote:
>
>> It's ugly, but it works now.
>> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu
>> >
>> To: "Peter" <biopython at maubp.freeserve.co.uk>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Mark A. Jensen" 
>> <maj at fortinbras.us
>> >
>> Sent: Monday, September 21, 2009 9:05 AM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>>
>>> On Sep 21, 2009, at 7:28 AM, Peter wrote:
>>>
>>>> Peter wrote:
>>>>>> We had some similar discussions about the Biopython wiki
>>>>>> based homepage - although our old one was nowhere near
>>>>>> as busy as the current BioPerl main page, it was still not as
>>>>>> welcoming as our current version *tries* to be.
>>>>>> ...
>>>>>> I can dig out links to our mailing list archive if anyone is
>>>>>> interested in the discussion.
>>>>
>>>> On Mon, Sep 21, 2009 at 12:32 PM, Mark A. Jensen wrote:
>>>>>
>>>>> I'd appreciate those links, Peter- thanks
>>>>> MAJ
>>>>
>>>> OK, here you are - this was most of it, I'd have to dig through
>>>> my old emails to see what else I can find:
>>>> http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005867.html
>>>>
>>>> Remember Biopython went from a very minimal home page, to
>>>> something aiming to be more newcomer friendly. BioPerl on the
>>>> other hand seems to want to move away from the current very
>>>> text heavy information rich page to something more focused and
>>>> newcomer friendly. To me at least the current page is too dense,
>>>> intimidating, and the important bits get lost in all the content.
>>>>
>>>> [My apologies if any of this feedback comes across too blunt.]
>>>
>>> Not at all; I'm thinking the same thing.
>>>
>>>> If you haven't already looked at them, you should check out the
>>>> other OBF project pages for ideas. The BioJava homepage is
>>>> also using the wiki - in my opinion it is a bit cluttered, but is
>>>> still more accessible than the current BioPerl page. Also,
>>>> the BioRuby page is very nice - although not wiki based.
>>>>
>>>> Regards,
>>>>
>>>> Peter
>>>
>>> I think the Biopython layout is very nice and focused.  Maybe a  bit  too 
>>> minimal, but then again I don't like scrolling up and down  the  page to 
>>> find the relevant bits, so less may be better.
>>>
>>> Reminds me of the simplified design on the perl6 main page (just  don't 
>>> stare at the hallucinogenic butterfly too long):
>>>
>>> http://www.perl6.org/
>>>
>>> So, maybe a structured layout with the most important links, and 
>>> additional links on a separate page.
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 


From jcline at ieee.org  Fri Sep 25 19:11:20 2009
From: jcline at ieee.org (Jonathan Cline)
Date: Fri, 25 Sep 2009 14:11:20 -0500
Subject: [Bioperl-l] LIMS::Controller and LIMS::Web
Message-ID: <4ABD15D8.9020304@ieee.org>

Does anyone using the CPAN LIMS::Web or associated modules have a web site
that demonstrates the functionality?  The links in the .pod are not current.

>From CPAN:

DESCRIPTION

LIMS::Controller is a versatile object-oriented Perl module designed to
control a LIMS database and its web interface. Inheriting from the
LIMS::Web::Interface and LIMS::Database::Util classes, the module
provides automation for many core and advanced functions required of a
web/database object layer, enabling rapid development of Perl CGI scripts.

-- 

## Jonathan Cline
## jcline at ieee.org
## Mobile: +1-805-617-0223
########################


From bosborne11 at verizon.net  Sat Sep 26 02:13:16 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:13:16 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <42FBB964C0EA44FABCB50364C567A009@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife>
	<628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com>
	<42FBB964C0EA44FABCB50364C567A009@NewLife>
Message-ID: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>

Mark,

Really nice, and a significant improvement over the existing.

You've gotten good feedback, you've considered these thoughts and  
incorporated them - is it time to move the beta to Main? Yes. In my  
opinion your 'beta' is far superior - just do it.

Brian O.


On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

> A nearly completely minimal solution is at Main Page Beta
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se 
> >
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Monday, September 21, 2009 1:03 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Hi Mark,
>> Thanks for taking on this (much needed) refresh.
>> I think your current version is substantially better than what we  
>> have now.
>> Still, I'd argue that something much more concise like the  
>> Biopython page
>> would make a bigger impact on visitors' ability to find what  
>> they're looking
>> for.
>> It's not that the details you have under each section shouldn't be
>> available, but rather that they could be clicked through to instead  
>> of being
>> on the front page.
>> The About section is a good example. I would bet most visitors to the
>> BioPerl website skip over the About section because they already  
>> know what
>> BioPerl is, and that section has the most valuable real estate on  
>> the page.
>> Those who don't know and are curious will probably be able to find  
>> it (the
>> word About on the front page of a website has become an idiom for  
>> "click here
>> to read the details about this").
>> Dave
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From maj at fortinbras.us  Sat Sep 26 02:22:49 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:22:49 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
Message-ID: <ACA5C04C052442259262125A5F0B8E74@NewLife>

Cheers, Brian-- I am becoming swayed now by Chris' whack 
at it, on his talk page. My thought is that we'll hammer out the 
final version after the release, then pull the trigger-- Your thoughts?
MAJ
----- Original Message ----- 
From: "Brian Osborne" <bosborne11 at verizon.net>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, September 25, 2009 10:13 PM
Subject: Re: [Bioperl-l] a Main Page proposal


> Mark,
> 
> Really nice, and a significant improvement over the existing.
> 
> You've gotten good feedback, you've considered these thoughts and  
> incorporated them - is it time to move the beta to Main? Yes. In my  
> opinion your 'beta' is far superior - just do it.
> 
> Brian O.
> 
> 
> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
> 
>> A nearly completely minimal solution is at Main Page Beta
>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se 
>> >
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>> Sent: Monday, September 21, 2009 1:03 PM
>> Subject: Re: [Bioperl-l] a Main Page proposal
>>
>>
>>> Hi Mark,
>>> Thanks for taking on this (much needed) refresh.
>>> I think your current version is substantially better than what we  
>>> have now.
>>> Still, I'd argue that something much more concise like the  
>>> Biopython page
>>> would make a bigger impact on visitors' ability to find what  
>>> they're looking
>>> for.
>>> It's not that the details you have under each section shouldn't be
>>> available, but rather that they could be clicked through to instead  
>>> of being
>>> on the front page.
>>> The About section is a good example. I would bet most visitors to the
>>> BioPerl website skip over the About section because they already  
>>> know what
>>> BioPerl is, and that section has the most valuable real estate on  
>>> the page.
>>> Those who don't know and are curious will probably be able to find  
>>> it (the
>>> word About on the front page of a website has become an idiom for  
>>> "click here
>>> to read the details about this").
>>> Dave
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>


From maj at fortinbras.us  Sat Sep 26 02:45:21 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Fri, 25 Sep 2009 22:45:21 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <E53214D989154E8184BA97573C925DF9@NewLife>

sounds good-- I can make the changes (soon) and we'll tweak it from the echte page
(unless I hear diff'rnt)
cheers MAJ
  ----- Original Message ----- 
  From: Brian Osborne 
  To: Mark A. Jensen 
  Cc: BioPerl List 
  Sent: Friday, September 25, 2009 10:42 PM
  Subject: Re: [Bioperl-l] a Main Page proposal


  Mark,


  I don't love the italics in the version that Chris made but that's just personal preference. He's right in thinking that putting more in the top of the page is good: less scrolling.


  One could color the backgrounds of his tables, that might look nice.


  Either way, or a combination of both, is preferable to what we have. There really is no need to wait since the current page is abysmal. I can say that freely since I'm probably one of its authors!


  One thought though: move the "search" up to a center-left location, below "main links". The Wiki search is pretty good at finding pages so if someone doesn't find what they're looking for in the main section they might be drawn to search for it.


  Brian O.


  On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:


    Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on his talk page. My thought is that we'll hammer out the final version after the release, then pull the trigger-- Your thoughts?
    MAJ
    ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net>
    To: "Mark A. Jensen" <maj at fortinbras.us>
    Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
    Sent: Friday, September 25, 2009 10:13 PM
    Subject: Re: [Bioperl-l] a Main Page proposal


      Mark,

      Really nice, and a significant improvement over the existing.

      You've gotten good feedback, you've considered these thoughts and  incorporated them - is it time to move the beta to Main? Yes. In my  opinion your 'beta' is far superior - just do it.

      Brian O.

      On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:

        A nearly completely minimal solution is at Main Page Beta

        ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
        To: "Mark A. Jensen" <maj at fortinbras.us>
        Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
        Sent: Monday, September 21, 2009 1:03 PM
        Subject: Re: [Bioperl-l] a Main Page proposal


          Hi Mark,
          Thanks for taking on this (much needed) refresh.
          I think your current version is substantially better than what we have now.
          Still, I'd argue that something much more concise like the Biopython page
          would make a bigger impact on visitors' ability to find what they're looking
          for.
          It's not that the details you have under each section shouldn't be
          available, but rather that they could be clicked through to instead of being
          on the front page.
          The About section is a good example. I would bet most visitors to the
          BioPerl website skip over the About section because they already know what
          BioPerl is, and that section has the most valuable real estate on the page.
          Those who don't know and are curious will probably be able to find it (the
          word About on the front page of a website has become an idiom for "click here
          to read the details about this").
          Dave
          _______________________________________________
          Bioperl-l mailing list
          Bioperl-l at lists.open-bio.org
          http://lists.open-bio.org/mailman/listinfo/bioperl-l

        _______________________________________________
        Bioperl-l mailing list
        Bioperl-l at lists.open-bio.org
        http://lists.open-bio.org/mailman/listinfo/bioperl-l

      _______________________________________________
      Bioperl-l mailing list
      Bioperl-l at lists.open-bio.org
      http://lists.open-bio.org/mailman/listinfo/bioperl-l


    _______________________________________________
    Bioperl-l mailing list
    Bioperl-l at lists.open-bio.org
    http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bosborne11 at verizon.net  Sat Sep 26 02:42:38 2009
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 25 Sep 2009 22:42:38 -0400
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <ACA5C04C052442259262125A5F0B8E74@NewLife>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
Message-ID: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>

Mark,

I don't love the italics in the version that Chris made but that's  
just personal preference. He's right in thinking that putting more in  
the top of the page is good: less scrolling.

One could color the backgrounds of his tables, that might look nice.

Either way, or a combination of both, is preferable to what we have.  
There really is no need to wait since the current page is abysmal. I  
can say that freely since I'm probably one of its authors!

One thought though: move the "search" up to a center-left location,  
below "main links". The Wiki search is pretty good at finding pages so  
if someone doesn't find what they're looking for in the main section  
they might be drawn to search for it.

Brian O.


On Sep 25, 2009, at 10:22 PM, Mark A. Jensen wrote:

> Cheers, Brian-- I am becoming swayed now by Chris' whack at it, on  
> his talk page. My thought is that we'll hammer out the final version  
> after the release, then pull the trigger-- Your thoughts?
> MAJ
> ----- Original Message ----- From: "Brian Osborne" <bosborne11 at verizon.net 
> >
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, September 25, 2009 10:13 PM
> Subject: Re: [Bioperl-l] a Main Page proposal
>
>
>> Mark,
>> Really nice, and a significant improvement over the existing.
>> You've gotten good feedback, you've considered these thoughts and   
>> incorporated them - is it time to move the beta to Main? Yes. In  
>> my  opinion your 'beta' is far superior - just do it.
>> Brian O.
>> On Sep 21, 2009, at 1:45 PM, Mark A. Jensen wrote:
>>> A nearly completely minimal solution is at Main Page Beta
>>> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
>>> To: "Mark A. Jensen" <maj at fortinbras.us>
>>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>> Sent: Monday, September 21, 2009 1:03 PM
>>> Subject: Re: [Bioperl-l] a Main Page proposal
>>>
>>>
>>>> Hi Mark,
>>>> Thanks for taking on this (much needed) refresh.
>>>> I think your current version is substantially better than what we
>>>> have now.
>>>> Still, I'd argue that something much more concise like the
>>>> Biopython page would make a bigger impact on visitors' ability to
>>>> find what they're looking for.
>>>> It's not that the details you have under each section shouldn't be
>>>> available, but rather that they could be clicked through to instead
>>>> of being on the front page.
>>>> The About section is a good example. I would bet most visitors to
>>>> the BioPerl website skip over the About section because they
>>>> already know what BioPerl is, and that section has the most
>>>> valuable real estate on the page.
>>>> Those who don't know and are curious will probably be able to find
>>>> it (the word About on the front page of a website has become an
>>>> idiom for "click here to read the details about this").
>>>> Dave


From cjfields at illinois.edu  Sat Sep 26 04:04:57 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:04:57 -0500
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
References: <094CC6C2C9DE448F9A8097AEEFE86B03@NewLife><7C51BBB552AB4EBEA5ED60B49BDB9171@NewLife><628aabb70909211003w221b4c9fn5c11f7ff08fc952d@mail.gmail.com><42FBB964C0EA44FABCB50364C567A009@NewLife>
	<B4584560-48AE-4EFB-BE94-E1481FD24E1C@verizon.net>
	<ACA5C04C052442259262125A5F0B8E74@NewLife>
	<EB52E49A-37B3-4652-9BFD-441BA174FF84@verizon.net>
Message-ID: <68A162A4-45F1-4ADC-87C9-57E388DF2666@illinois.edu>

Brian, Mark,

Agreed about the italics; there's a lot more that can be done with  
tables if needed:

http://meta.wikimedia.org/wiki/Help:Table

I say go ahead and pull the trigger.  No need to wait 'til 1.6.1 on  
this, the sooner it's fixed the better.  We can tweak the rest (add  
News updates, etc) along the way.

chris



From cjfields at illinois.edu  Sat Sep 26 04:52:35 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 25 Sep 2009 23:52:35 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 4 released
Message-ID: <2EDBBBF5-2109-456A-B768-178B012A8192@illinois.edu>

All,

Core 1.6.0 alpha 4 is now floating about on the intertubes and CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.0_4/

http://bioperl.org/DIST/RC/

So far this is passing all tests for ActivePerl on WinXP once DB_File  
is installed.  I'll try running some tests for Strawberry Perl, but no  
promises.

At this late stage any additional updates will only be doc tweaks and  
small bug fixes prior to 1.6.1.  The only renaming issue is that I  
need to rename BioPerl.pod to BioPerl.pm and add a simple VERSION to  
it (per Curtis Jewell's suggestion).  I may post a very short alpha 5  
to test that, with 1.6.1 posted by Sunday.

Enjoy!

chris


From e.osimo at gmail.com  Sun Sep 27 09:00:17 2009
From: e.osimo at gmail.com (Emanuele Osimo)
Date: Sun, 27 Sep 2009 11:00:17 +0200
Subject: [Bioperl-l] setting a strand in Bio::Graphics
Message-ID: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>

Hello,
I've tried all the arrows suggested in
http://search.cpan.org/~lds/Bio-Graphics-1.982/lib/Bio/Graphics/Glyph/arrow.pm,
but I can't figure out how to tell in the options of $panel->add_track the
strand of the feature I'm adding.
I'm drawing DNA elements from a local DB, and I have a field "strand" which
can be + or -.
Please help!
Emanuele


From maj at fortinbras.us  Mon Sep 28 00:54:04 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 27 Sep 2009 20:54:04 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
Message-ID: <6CF05E74FEAE45679CDEDF48B7E15856@NewLife>

Emos- Without the code, I can only guess, but you might not be providing
the options correctly. Have a look at
http://www.bioperl.org/wiki/Drawing_with_multiple_glyphs_in_a_single_track
for something that may help.
MAJ


From cjfields at illinois.edu  Mon Sep 28 04:34:01 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Sun, 27 Sep 2009 23:34:01 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 5 released
Message-ID: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>

All,

The last alpha for the 1.6.1 release is out and should be propagating  
around CPAN now.  This should be a quick one (it has a few last-minute  
bug fixes for some problems that popped up on CPAN RT and fixes one  
mistake I made in the last alpha).

You can currently get it here (.tar.gz only for now):

http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz

The final 1.6.1 release should drop in the next day or two.

chris


From adsj at novozymes.com  Mon Sep 28 07:51:15 2009
From: adsj at novozymes.com (Adam Sjøgren)
Date: Mon, 28 Sep 2009 09:51:15 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
Message-ID: <87hbunv764.fsf@topper.koldfront.dk>

  Hi.


I am wondering whether this is a buglet or just a case of "Don't do
that":

If I set a very long /label on a feature and output the sequence in EMBL
format, the qualifier value gets wrapped, but not quoted.

When BioPerl reads such a file, an exception is thrown.

I probably shouldn't be setting very long labels... But oughtn't BioPerl
throw an exception when a too long label is set, or automatically quote
the value when it is long enough to be wrapped, or know how to read a
wrapped yet unquoted value?

I will be happy to try and provide a patch for whichever solution is
preferred.

Here is an example script:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use IO::String;

  use Bio::Seq;
  use Bio::SeqFeature::Generic;
  use Bio::SeqIO;

  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";

  my $seq=Bio::Seq->new(-seq=>'ATG');
  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
  $seq->add_SeqFeature($feature);

  my $out_string=out($seq);
  print $out_string;

  my $fh=IO::String->new($out_string);
  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
  my $in_seq=$in->next_seq;

  print "Done\n";

  sub out {
      my ($seq)=@_;

      my $string='';
      my $fh=IO::String->new($string);
      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
      $out->write_seq($seq);

      return $string;
  }

Which gives this output when run:

  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  FT                   youthink
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //

  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: Can't see new qualifier in: youthink
  from:
  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
  youthink

  STACK: Error::throw
  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
  -----------------------------------------------------------

If I change the value to include "-quotes ("simulating" that embl.pm
quotes the value), BioPerl can read the EMBL string it produces fine:

  -----------------------------------------------------------
  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl 
  BioPerl 1.0069
  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
  XX
  AC   unknown;
  XX
  XX
  FH   Key             Location/Qualifiers
  FH
  FT   misc_feature    1..3
  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
  FT                   ntyouthink""
  XX
  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
       atg                                                                       3
  //
  Done


  Best regards,

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From paola_bisignano at yahoo.it  Mon Sep 28 10:00:07 2009
From: paola_bisignano at yahoo.it (Paola Bisignano)
Date: Mon, 28 Sep 2009 10:00:07 +0000 (GMT)
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
Message-ID: <504748.72296.qm@web25704.mail.ukl.yahoo.com>

Hi dear friends,


I used Bio::AlignIO to parse the msf file and, as you suggested, the
method column_from_residue_number to obtain the alignment position of
the residues of interest (those in contact with my ligand). Now I need
to check the residue itself: I want to extract the residue type. I ask
my question using the residue number from the PDB, and I want the
script to return the residue as well. So if I want to know the position
of Ala21, I do:

my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);

This returns the position (e.g. 5), but I want to check whether
position 5 of the alignment really holds an A (for Ala). I looked in
the documentation, but I couldn't find anything for that.


Thank you all for the help you have given and will give me,


best regards,


paola


From David.Messina at sbc.su.se  Mon Sep 28 11:28:27 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 13:28:27 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
References: <504748.72296.qm@web25704.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com>

Hi Paola,

> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
>
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
>
> my $col = $aln->column_from_residue_number( $s1->id, 21);


# extract sequences and check values for the alignment column $col
foreach my $seq ($aln->each_seq) {
    my $res = $seq->subseq($col, $col);
    if ($res eq 'A') {
        # do something
    }
}


Please try the above code. I haven't tested it, but I think it will do what
you want.
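
For anyone who wants to try the idea without an msf file at hand, here
is a minimal, self-contained sketch. The two toy sequences and the ids
'query' and 'templ' are invented for illustration, and it reads the
column with a plain substr on the gapped string rather than subseq; it
also assumes the residue number is given in the sequence's own
start/end numbering, which may differ from the PDB numbering.

```perl
use strict;
use warnings;
use Bio::LocatableSeq;
use Bio::SimpleAlign;

# Two toy aligned sequences (hypothetical data, not from this thread).
my $aln = Bio::SimpleAlign->new();
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'query', -seq => 'MK-ALV', -start => 1, -end => 5));
$aln->add_seq(Bio::LocatableSeq->new(
    -id => 'templ', -seq => 'MKGA-V', -start => 1, -end => 5));

# Map residue 3 of 'query' (in its own numbering) to an alignment column.
my $col = $aln->column_from_residue_number('query', 3);

# Read the character each sequence has in that column. Columns are
# 1-based, substr offsets are 0-based; a gap shows up as '-'.
my %residue_at_col;
for my $seq ($aln->each_seq) {
    $residue_at_col{ $seq->id } = substr($seq->seq, $col - 1, 1);
}

print "column $col: query=$residue_at_col{query} templ=$residue_at_col{templ}\n";
```

Comparing $residue_at_col{query} against the expected one-letter code
(here 'A') is then the check Paola asked about.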

Best,
Dave

PS - I found that code in the documentation for Bio::Align::AlignI. Right
now there is an effort to improve the BioPerl documentation, and it would be
helpful if you could let us know where you looked for the answer to your
question so we can try to make it easier to find.

Did you look in Bio::AlignIO? Did you also look anywhere else?

Thanks for your help!


From David.Messina at sbc.su.se  Mon Sep 28 12:05:58 2009
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 28 Sep 2009 14:05:58 +0200
Subject: [Bioperl-l] parsing msf file (sorry last question about it)
In-Reply-To: <678730.88068.qm@web25708.mail.ukl.yahoo.com>
References: <628aabb70909280428q54e08ef9sa005aeab9f3a7b62@mail.gmail.com> 
	<678730.88068.qm@web25708.mail.ukl.yahoo.com>
Message-ID: <628aabb70909280505l2c5f02b7k8387d5dfd3643575@mail.gmail.com>

On Mon, Sep 28, 2009 at 13:56, Paola Bisignano <paola_bisignano at yahoo.it> wrote:

> Yes, I had a look at
> http://doc.bioperl.org/releases/bioperl-1.0/Bio/AlignIO.html
> but I didn't find your suggestion there.
>
> Thanks, I'll try it in a while.
> Sorry, I did not search in AlignI.


No problem, Paola -- thanks for letting us know.

Dave


From maj at fortinbras.us  Mon Sep 28 14:32:39 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 10:32:39 -0400
Subject: [Bioperl-l] setting a strand in Bio::Graphics
In-Reply-To: <2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
References: <2ac05d0f0909270200j3bb478b3t77b83bccc1e5022c@mail.gmail.com>
	<6CF05E74FEAE45679CDEDF48B7E15856@NewLife>
	<2ac05d0f0909280728y791a5e60r904be0d7e8f747f7@mail.gmail.com>
Message-ID: <A45CF4D6E34B405B86E5F2DF651B8964@NewLife>

Now that's what I call user-friendly.
  ----- Original Message ----- 
  From: Emanuele Osimo 
  To: Mark A. Jensen 
  Sent: Monday, September 28, 2009 10:28 AM
  Subject: Re: [Bioperl-l] setting a strand in Bio::Graphics


  Hello everyone,
  thank you, I found what I needed. You have to add

  -strand_arrow => 1

  in $panel->add_track, and

  -strand => 1 (or -1 for the minus strand)

  in $feature = Bio::SeqFeature::Generic->new options.
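
Put together, the two options might be used roughly as follows. This is
only a sketch: the %row hash stands in for a record from the local DB
(its field names are invented), and the coordinates, panel size and
output file name are arbitrary.

```perl
use strict;
use warnings;
use Bio::Graphics;
use Bio::SeqFeature::Generic;

# Hypothetical database row: the strand is stored as '+' or '-'.
my %row = (name => 'element1', start => 100, end => 400, strand => '-');

my $feature = Bio::SeqFeature::Generic->new(
    -display_name => $row{name},
    -start        => $row{start},
    -end          => $row{end},
    -strand       => ($row{strand} eq '-' ? -1 : 1),   # map '+'/'-' to 1/-1
);

my $panel = Bio::Graphics::Panel->new(-length => 500, -width => 600);
$panel->add_track(
    $feature,
    -glyph        => 'generic',
    -strand_arrow => 1,    # ask the glyph to indicate strandedness
    -label        => 1,
);

# Write the image to a file (the file name is arbitrary).
open my $out, '>', 'strand_demo.png' or die "Cannot write PNG: $!";
binmode $out;
print {$out} $panel->png;
close $out;
```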

  Thanks
  Emanuele




From paolo.pavan at gmail.com  Mon Sep 28 15:51:52 2009
From: paolo.pavan at gmail.com (Paolo Pavan)
Date: Mon, 28 Sep 2009 17:51:52 +0200
Subject: [Bioperl-l] BioPerl object deep copy
Message-ID: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>

Hi all,
I would like just a programming hint: is there a way in
BioPerl (or just in Perl) to get a deep copy or a clone of an object?
That is, to get a new object with all the fields copied one by one.

At least, can I do so for a Bio::SeqI or a Bio::AlignI compliant object?

Thank you,
Paolo


From s.denaxas at gmail.com  Mon Sep 28 15:56:09 2009
From: s.denaxas at gmail.com (Spiros Denaxas)
Date: Mon, 28 Sep 2009 16:56:09 +0100
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <bba689ec0909280856q3fa3c8b1pf5b5dd48bc493eb4@mail.gmail.com>

Hi Paolo,

You can use Clone [1]. Blindly cloning blessed objects is not a good
idea, though, so make sure you know what each one instantiates.

Spiros

[1] http://perldoc.net/Clone.pm
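
A minimal sketch of what that looks like for a sequence object,
assuming the Clone module from CPAN is installed. The id and sequence
are made up, and whether a blind copy is safe for other BioPerl classes
still has to be checked case by case.

```perl
use strict;
use warnings;
use Clone qw(clone);
use Bio::Seq;

my $orig = Bio::Seq->new(-id => 'demo', -seq => 'ATGC');
my $copy = clone($orig);    # recursive copy; the copy keeps its class

# Changing the copy should leave the original untouched.
$copy->seq('TTTT');
$copy->display_id('demo_copy');

printf "%s %s\n", $orig->display_id, $orig->seq;
printf "%s %s\n", $copy->display_id, $copy->seq;
```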



From maj at fortinbras.us  Mon Sep 28 16:05:42 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 12:05:42 -0400
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
Message-ID: <5A61641A14AE4D80A495047A56659894@NewLife>

For some relatively careful examples of cloning code, 
you can look at the source for 
Bio::Tree::TreeFunctionsI::clone
and 
Bio::Restriction::Enzyme::clone (not clone_depr)
MAJ



From cjfields at illinois.edu  Mon Sep 28 16:29:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 11:29:14 -0500
Subject: [Bioperl-l] BioPerl object deep copy
In-Reply-To: <5A61641A14AE4D80A495047A56659894@NewLife>
References: <56be91b60909280851g2299726bvfbdd6ef44e262fe7@mail.gmail.com>
	<5A61641A14AE4D80A495047A56659894@NewLife>
Message-ID: <05BB0DB4-6017-40A1-92B2-6F441CCACDC6@illinois.edu>

As Spiros points out, Clone works in almost all cases and is very fast  
(XS-based I think).  IIRC the only time it borks out is if there is a  
code ref, as with Bio::Tree::Tree, but if it doesn't work you should  
get an error indicating the problem.

chris



From cjfields at illinois.edu  Mon Sep 28 17:00:09 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:00:09 -0500
Subject: [Bioperl-l] BioPerl 1.6.0 alpha 6 released (?!?)
In-Reply-To: <20090928063013.GB1081@kunpuu.plessy.org>
References: <277ED183-2F43-479F-88D2-A0A325105C53@illinois.edu>
	<20090928063013.GB1081@kunpuu.plessy.org>
Message-ID: <CFD37E37-2B74-402F-BA0F-898A1642FFE8@illinois.edu>

Charles (and everyone else),

This bug was a bit sneaky.  The tests were skipped on pretty much every  
system because they required both DB_File and BerkeleyDB (i.e. unless  
both were installed, the tests were skipped).  I committed a fix for  
it; unfortunately that means I need to set up another alpha for  
testing, so...

The final final alpha has just been uploaded to CPAN and is now  
available here:

http://bioperl.org/DIST/RC/BioPerl-1.6.0_6.tar.gz

The final 1.6.1 release should still be in the next day or two, just  
awaiting some test reports via CPAN...

chris

On Sep 28, 2009, at 1:30 AM, Charles Plessy wrote:

> On Sun, Sep 27, 2009 at 11:34:01PM -0500, Chris Fields wrote:
>>
>> http://bioperl.org/DIST/RC/BioPerl-1.6.0_5.tar.gz
>>
>
> Hi Chris,
>
> I have the following errors when building bioperl with perl 5.10.1  
> on Debian:
>
> Test Summary Report
> -------------------
> t/LocalDB/Registry.t                       (Wstat: 2304 Tests: 13 Failed: 1)
>  Failed test:  13
>  Non-zero exit status: 9
>  Parse errors: Bad plan.  You planned 14 tests but ran 13.
> t/RemoteDB/EUtilities.t                    (Wstat: 256 Tests: 309 Failed: 1)
>  Failed test:  309
>  Non-zero exit status: 1
> t/Tools/Run/RemoteBlast.t                  (Wstat: 65280 Tests: 13 Failed: 0)
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 16 tests but ran 13.
> Files=329, Tests=20766, 434 wallclock secs ( 2.64 usr  0.51 sys + 100.55 cusr  6.24 csys = 109.94 CPU)
>
>
> t/Align/AlignStats.t ......................... ok
> t/Align/AlignUtil.t .......................... ok
> t/Align/SimpleAlign.t ........................ ok
> t/Align/TreeBuild.t .......................... ok
> t/Align/Utilities.t .......................... ok
> t/AlignIO/AlignIO.t .......................... ok
> t/AlignIO/arp.t .............................. ok
> t/AlignIO/bl2seq.t ........................... ok
> t/AlignIO/clustalw.t ......................... ok
> t/AlignIO/emboss.t ........................... ok
> t/AlignIO/fasta.t ............................ ok
> t/AlignIO/largemultifasta.t .................. ok
> t/AlignIO/maf.t .............................. ok
> t/AlignIO/mase.t ............................. ok
> t/AlignIO/mega.t ............................. ok
> t/AlignIO/meme.t ............................. ok
> t/AlignIO/metafasta.t ........................ ok
> t/AlignIO/msf.t .............................. ok
> t/AlignIO/nexus.t ............................ ok
> t/AlignIO/pfam.t ............................. ok
> t/AlignIO/phylip.t ........................... ok
> t/AlignIO/po.t ............................... ok
> t/AlignIO/prodom.t ........................... ok
> t/AlignIO/psi.t .............................. ok
> t/AlignIO/selex.t ............................ ok
> t/AlignIO/stockholm.t ........................ ok
> t/AlignIO/xmfa.t ............................. ok
> t/Alphabet.t ................................. ok
> t/Annotation/Annotation.t .................... ok
> t/Annotation/AnnotationAdaptor.t ............. ok
> t/Assembly/Assembly.t ........................ ok
> t/Assembly/ContigSpectrum.t .................. ok
> t/Biblio/Biblio.t ............................ ok
> t/Biblio/References.t ........................ ok
> t/Biblio/biofetch.t .......................... ok
> t/Biblio/eutils.t ............................ ok
> t/ClusterIO/ClusterIO.t ...................... ok
> t/ClusterIO/SequenceFamily.t ................. ok
> t/ClusterIO/unigene.t ........................ ok
> t/Coordinate/CoordinateGraph.t ............... ok
> t/Coordinate/CoordinateMapper.t .............. ok
> t/Coordinate/GeneCoordinateMapper.t .......... ok
> t/LiveSeq/Chain.t ............................ ok
> t/LiveSeq/LiveSeq.t .......................... ok
> t/LiveSeq/Mutation.t ......................... ok
> t/LiveSeq/Mutator.t .......................... ok
> t/LocalDB/BioDBGFF.t ......................... ok
> t/LocalDB/BlastIndex.t ....................... ok
> t/LocalDB/DBFasta.t .......................... ok
> t/LocalDB/DBQual.t ........................... ok
> t/LocalDB/Flat.t ............................. ok
> t/LocalDB/Index.t ............................ ok
> t/LocalDB/Registry.t ......................... 1/14
> --------------------- WARNING ---------------------
> MSG:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
> STACK: Bio::SeqIO::fasta::next_seq Bio/SeqIO/fasta.pm:127
> STACK: Bio::DB::Flat::BDB::get_Seq_by_id Bio/DB/Flat/BDB.pm:143
> STACK: Bio::DB::Failover::get_Seq_by_id Bio/DB/Failover.pm:122
> STACK: t/LocalDB/Registry.t:69
> -----------------------------------------------------------
>
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: No sequence retrieved by database Bio::DB::Flat::BDB::fasta
> ---------------------------------------------------
>
> #   Failed test at t/LocalDB/Registry.t line 70.
> Can't call method "seq" on an undefined value at t/LocalDB/Registry.t line 71, <GEN17> line 1.
> # Looks like you planned 14 tests but ran 13.
> # Looks like you failed 1 test of 13 run.
> # Looks like your test exited with 9 just after 13.
> t/LocalDB/Registry.t ......................... Dubious, test returned 9 (wstat 2304, 0x900)
> Failed 2/14 subtests
> t/LocalDB/SeqFeature.t ....................... ok
> t/LocalDB/transfac_pro.t ..................... ok
> t/Map/Cyto.t ................................. ok
> t/Map/Linkage.t .............................. ok
> t/Map/Map.t .................................. ok
> t/Map/MapIO.t ................................ ok
> t/Map/MicrosatelliteMarker.t ................. ok
> t/Map/Physical.t ............................. ok
> t/Matrix/IO/masta.t .......................... ok
> t/Matrix/IO/psm.t ............................ ok
> t/Matrix/InstanceSite.t ...................... ok
> t/Matrix/Matrix.t ............................ ok
> t/Matrix/ProtMatrix.t ........................ ok
> t/Matrix/ProtPsm.t ........................... ok
> t/Matrix/SiteMatrix.t ........................ ok
> t/Ontology/GOterm.t .......................... ok
> t/Ontology/GraphAdaptor.t .................... ok
> t/Ontology/IO/go.t ........................... ok
> t/Ontology/IO/interpro.t ..................... ok
> t/Ontology/IO/obo.t .......................... ok
> t/Ontology/Ontology.t ........................ ok
> t/Ontology/OntologyEngine.t .................. ok
> t/Ontology/OntologyStore.t ................... ok
> t/Ontology/Relationship.t .................... ok
> t/Ontology/RelationshipType.t ................ ok
> t/Ontology/Term.t ............................ ok
> t/Perl.t ..................................... ok
> t/Phenotype/Correlate.t ...................... ok
> t/Phenotype/MeSH.t ........................... ok
> t/Phenotype/Measure.t ........................ ok
> t/Phenotype/MiniMIMentry.t ................... ok
> t/Phenotype/OMIMentry.t ...................... ok
> t/Phenotype/OMIMentryAllelicVariant.t ........ ok
> t/Phenotype/OMIMparser.t ..................... ok
> t/Phenotype/Phenotype.t ...................... ok
> t/PodSyntax.t ................................ ok
> t/PopGen/Coalescent.t ........................ ok
> t/PopGen/HtSNP.t ............................. ok
> t/PopGen/MK.t ................................ ok
> t/PopGen/PopGen.t ............................ ok
> t/PopGen/PopGenSims.t ........................ ok
> t/PopGen/TagHaplotype.t ...................... ok
> t/RemoteDB/BioFetch.t ........................ ok
> t/RemoteDB/CUTG.t ............................ ok
> t/RemoteDB/EMBL.t ............................ ok
> t/RemoteDB/EUtilities.t ...................... 309/309
> #   Failed test 'EPost to EFetch'
> #   at t/RemoteDB/EUtilities.t line 159.
> #          got: '0'
> #     expected: '5'
> # Looks like you failed 1 test of 309.
> t/RemoteDB/EUtilities.t ...................... Dubious, test  
> returned 1 (wstat 256, 0x100)
> Failed 1/309 subtests
> t/RemoteDB/EntrezGene.t ...................... ok
> t/RemoteDB/GenBank.t ......................... ok
> t/RemoteDB/GenPept.t ......................... ok
> t/RemoteDB/HIV/HIV.t ......................... ok
> t/RemoteDB/HIV/HIVAnnotProcessor.t ........... ok
> t/RemoteDB/HIV/HIVQuery.t .................... 22/41 Use of  
> uninitialized value $rest[0] in join or string at (eval 68) line 15.
> t/RemoteDB/HIV/HIVQuery.t .................... ok
> t/RemoteDB/HIV/HIVQueryHelper.t .............. ok
> t/RemoteDB/MeSH.t ............................ ok
> t/RemoteDB/Query/GenBank.t ................... ok
> t/RemoteDB/RefSeq.t .......................... ok
> t/RemoteDB/SeqHound.t ........................ ok
> t/RemoteDB/SeqRead_fail.t .................... ok
> t/RemoteDB/SeqVersion.t ...................... ok
> t/RemoteDB/SwissProt.t ....................... ok
> t/RemoteDB/Taxonomy.t ........................ ok
> t/Restriction/Analysis-refac.t ............... ok
> t/Restriction/Analysis.t ..................... ok
> t/Restriction/Gel.t .......................... ok
> t/Restriction/IO.t ........................... ok
> t/Root/Exception.t ........................... ok
> t/Root/RootI.t ............................... ok
> t/Root/RootIO.t .............................. ok
> t/Root/Storable.t ............................ ok
> t/Root/Tempfile.t ............................ ok
> t/Root/Utilities.t ........................... ok
> t/SearchDist.t ............................... skipped: The optional  
> module Bio::Ext::Align (or dependencies thereof) was not installed
> t/SearchIO/CigarString.t ..................... ok
> t/SearchIO/SearchIO.t ........................ ok
> t/SearchIO/SimilarityPair.t .................. ok
> t/SearchIO/Tiling.t .......................... ok
> t/SearchIO/Writer/GbrowseGFF.t ............... ok
> t/SearchIO/Writer/HSPTableWriter.t ........... ok
> t/SearchIO/Writer/HTMLWriter.t ............... ok
> t/SearchIO/Writer/HitTableWriter.t ........... ok
> t/SearchIO/blast.t ........................... ok
> t/SearchIO/blast_pull.t ...................... ok
> t/SearchIO/blasttable.t ...................... ok
> t/SearchIO/blastxml.t ........................ ok
> t/SearchIO/cross_match.t ..................... ok
> t/SearchIO/erpin.t ........................... ok
> t/SearchIO/exonerate.t ....................... ok
> t/SearchIO/fasta.t ........................... ok
> t/SearchIO/gmap_f9.t ......................... ok
> t/SearchIO/hmmer.t ........................... ok
> t/SearchIO/hmmer_pull.t ...................... ok
> t/SearchIO/infernal.t ........................ ok
> t/SearchIO/megablast.t ....................... ok
> t/SearchIO/psl.t ............................. ok
> t/SearchIO/rnamotif.t ........................ ok
> t/SearchIO/sim4.t ............................ ok
> t/SearchIO/waba.t ............................ ok
> t/SearchIO/wise.t ............................ ok
> t/Seq/DBLink.t ............................... ok
> t/Seq/EncodedSeq.t ........................... ok
> t/Seq/LargeLocatableSeq.t .................... ok
> t/Seq/LargePSeq.t ............................ ok
> t/Seq/LocatableSeq.t ......................... ok
> t/Seq/MetaSeq.t .............................. ok
> t/Seq/PrimaryQual.t .......................... ok
> t/Seq/PrimarySeq.t ........................... ok
> t/Seq/PrimedSeq.t ............................ ok
> t/Seq/Quality.t .............................. ok
> t/Seq/Seq.t .................................. ok
> t/Seq/WithQuality.t .......................... ok
> t/SeqEvolution.t ............................. ok
> t/SeqFeature/FeatureIO.t ..................... ok
> t/SeqFeature/Location.t ...................... ok
> t/SeqFeature/LocationFactory.t ............... ok
> t/SeqFeature/Primer.t ........................ ok
> t/SeqFeature/Range.t ......................... ok
> t/SeqFeature/RangeI.t ........................ ok
> t/SeqFeature/SeqAnalysisParser.t ............. ok
> t/SeqFeature/SeqFeatAnnotated.t .............. ok
> t/SeqFeature/SeqFeatCollection.t ............. ok
> t/SeqFeature/SeqFeature.t .................... ok
> t/SeqFeature/SeqFeaturePrimer.t .............. ok
> t/SeqFeature/Unflattener.t ................... ok
> t/SeqFeature/Unflattener2.t .................. ok
> t/SeqIO.t .................................... ok
> t/SeqIO/Handler.t ............................ ok
> t/SeqIO/MultiFile.t .......................... ok
> t/SeqIO/Multiple_fasta.t ..................... ok
> t/SeqIO/SeqBuilder.t ......................... ok
> t/SeqIO/Splicedseq.t ......................... ok
> t/SeqIO/abi.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/ace.t ................................ ok
> t/SeqIO/agave.t .............................. ok
> t/SeqIO/alf.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/asciitree.t .......................... ok
> t/SeqIO/bsml.t ............................... ok
> t/SeqIO/bsml_sax.t ........................... ok
> t/SeqIO/chadoxml.t ........................... ok
> t/SeqIO/chaos.t .............................. ok
> t/SeqIO/chaosxml.t ........................... ok
> t/SeqIO/ctf.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/embl.t ............................... ok
> t/SeqIO/entrezgene.t ......................... ok
> t/SeqIO/excel.t .............................. ok
> t/SeqIO/exp.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/fasta.t .............................. ok
> t/SeqIO/fastq.t .............................. ok
> t/SeqIO/flybase_chadoxml.t ................... ok
> t/SeqIO/game.t ............................... ok
> t/SeqIO/gcg.t ................................ ok
> t/SeqIO/genbank.t ............................ ok
> t/SeqIO/interpro.t ........................... ok
> t/SeqIO/kegg.t ............................... ok
> t/SeqIO/largefasta.t ......................... ok
> t/SeqIO/lasergene.t .......................... ok
> t/SeqIO/locuslink.t .......................... ok
> t/SeqIO/metafasta.t .......................... ok
> t/SeqIO/phd.t ................................ ok
> t/SeqIO/pir.t ................................ ok
> t/SeqIO/pln.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqIO/qual.t ............................... ok
> t/SeqIO/raw.t ................................ ok
> t/SeqIO/scf.t ................................ ok
> t/SeqIO/strider.t ............................ ok
> t/SeqIO/swiss.t .............................. ok
> t/SeqIO/tab.t ................................ ok
> t/SeqIO/table.t .............................. ok
> t/SeqIO/tigr.t ............................... ok
> t/SeqIO/tigrxml.t ............................ ok
> t/SeqIO/tinyseq.t ............................ ok
> t/SeqIO/ztr.t ................................ skipped: The optional  
> module Bio::SeqIO::staden::read (or dependencies thereof) was not  
> installed
> t/SeqTools/Backtranslate.t ................... ok
> t/SeqTools/CodonTable.t ...................... ok
> t/SeqTools/ECnumber.t ........................ ok
> t/SeqTools/GuessSeqFormat.t .................. ok
> t/SeqTools/OddCodes.t ........................ ok
> t/SeqTools/SeqPattern.t ...................... ok
> t/SeqTools/SeqStats.t ........................ ok
> t/SeqTools/SeqUtils.t ........................ ok
> t/SeqTools/SeqWords.t ........................ ok
> t/Species.t .................................. ok
> t/Structure/IO.t ............................. ok
> t/Structure/Structure.t ...................... ok
> t/Symbol.t ................................... ok
> t/TaxonTree.t ................................ skipped: All tests  
> are being skipped, probably because the module(s) being tested here  
> are now deprecated
> t/Tools/Alignment/Consed.t ................... ok
> t/Tools/Analysis/DNA/ESEfinder.t ............. ok
> t/Tools/Analysis/Protein/Domcut.t ............ ok
> t/Tools/Analysis/Protein/ELM.t ............... ok
> t/Tools/Analysis/Protein/GOR4.t .............. ok
> t/Tools/Analysis/Protein/HNN.t ............... ok
> t/Tools/Analysis/Protein/Mitoprot.t .......... ok
> t/Tools/Analysis/Protein/NetPhos.t ........... ok
> t/Tools/Analysis/Protein/Scansite.t .......... ok
> t/Tools/Analysis/Protein/Sopma.t ............. ok
> t/Tools/EMBOSS/Palindrome.t .................. ok
> t/Tools/EUtilities/EUtilParameters.t ......... ok
> t/Tools/EUtilities/egquery.t ................. ok
> t/Tools/EUtilities/einfo.t ................... ok
> t/Tools/EUtilities/elink_acheck.t ............ ok
> t/Tools/EUtilities/elink_lcheck.t ............ ok
> t/Tools/EUtilities/elink_llinks.t ............ ok
> t/Tools/EUtilities/elink_ncheck.t ............ ok
> t/Tools/EUtilities/elink_neighbor.t .......... ok
> t/Tools/EUtilities/elink_neighbor_history.t .. ok
> t/Tools/EUtilities/elink_scores.t ............ ok
> t/Tools/EUtilities/epost.t ................... ok
> t/Tools/EUtilities/esearch.t ................. ok
> t/Tools/EUtilities/espell.t .................. ok
> t/Tools/EUtilities/esummary.t ................ ok
> t/Tools/Est2Genome.t ......................... ok
> t/Tools/FootPrinter.t ........................ ok
> t/Tools/GFF.t ................................ ok
> t/Tools/Geneid.t ............................. ok
> t/Tools/Genewise.t ........................... ok
> t/Tools/Genomewise.t ......................... ok
> t/Tools/Genpred.t ............................ ok
> t/Tools/Hmmer.t .............................. ok
> t/Tools/IUPAC.t .............................. ok
> t/Tools/Lucy.t ............................... ok
> t/Tools/Match.t .............................. ok
> t/Tools/Phylo/Gerp.t ......................... ok
> t/Tools/Phylo/Molphy.t ....................... ok
> t/Tools/Phylo/PAML.t ......................... ok
> t/Tools/Phylo/Phylip/ProtDist.t .............. ok
> t/Tools/Primer3.t ............................ ok
> t/Tools/Promoterwise.t ....................... ok
> t/Tools/Pseudowise.t ......................... ok
> t/Tools/QRNA.t ............................... ok
> t/Tools/RandDistFunctions.t .................. ok
> t/Tools/RepeatMasker.t ....................... ok
> t/Tools/Run/RemoteBlast.t .................... 13/16
> --------------------- WARNING ---------------------
> MSG: Server failed to return any data
> ---------------------------------------------------
> # Looks like you planned 16 tests but ran 13.
> t/Tools/Run/RemoteBlast.t .................... Dubious, test  
> returned 255 (wstat 65280, 0xff00)
> Failed 3/16 subtests
> t/Tools/Run/RemoteBlast_rpsblast.t ........... ok
> t/Tools/Run/StandAloneBlast.t ................ ok
> t/Tools/Run/WrapperBase.t .................... ok
> t/Tools/Seg.t ................................ ok
> t/Tools/SiRNA.t .............................. ok
> t/Tools/Sigcleave.t .......................... ok
> t/Tools/Signalp.t ............................ ok
> t/Tools/Signalp/ExtendedSignalp.t ............ ok
> t/Tools/Sim4.t ............................... ok
> t/Tools/Spidey/Spidey.t ...................... ok
> t/Tools/TandemRepeatsFinder.t ................ ok
> t/Tools/TargetP.t ............................ ok
> t/Tools/Tmhmm.t .............................. ok
> t/Tools/ePCR.t ............................... ok
> t/Tools/pICalculator.t ....................... ok
> t/Tools/rnamotif.t ........................... skipped: All tests  
> are being skipped, probably because the module(s) being tested here  
> are now deprecated
> t/Tools/tRNAscanSE.t ......................... ok
> t/Tree/Compatible.t .......................... ok
> t/Tree/Node.t ................................ ok
> t/Tree/PhyloNetwork/Factory.t ................ ok
> t/Tree/PhyloNetwork/GraphViz.t ............... ok
> t/Tree/PhyloNetwork/MuVector.t ............... ok
> t/Tree/PhyloNetwork/PhyloNetwork.t ........... ok
> t/Tree/PhyloNetwork/RandomFactory.t .......... skipped: The optional  
> module Math::Random (or dependencies thereof) was not installed
> t/Tree/PhyloNetwork/TreeFactory.t ............ ok
> t/Tree/RandomTreeFactory.t ................... ok
> t/Tree/Tree.t ................................ ok
> t/Tree/TreeIO.t .............................. ok
> t/Tree/TreeIO/lintree.t ...................... ok
> t/Tree/TreeIO/newick.t ....................... ok
> t/Tree/TreeIO/nexus.t ........................ ok
> t/Tree/TreeIO/nhx.t .......................... ok
> t/Tree/TreeIO/phyloxml.t ..................... ok
> t/Tree/TreeIO/svggraph.t ..................... 1/4 Use of  
> uninitialized value $txt[0] in join or string at /usr/share/perl5/ 
> SVG/Element.pm line 1195, <GEN0> line 1.
> t/Tree/TreeIO/svggraph.t ..................... ok
> t/Tree/TreeIO/tabtree.t ...................... ok
> t/Tree/TreeStatistics.t ...................... ok
> t/Variation/AAChange.t ....................... ok
> t/Variation/AAReverseMutate.t ................ ok
> t/Variation/Allele.t ......................... ok
> t/Variation/DNAMutation.t .................... ok
> t/Variation/RNAChange.t ...................... ok
> t/Variation/SNP.t ............................ ok
> t/Variation/SeqDiff.t ........................ ok
> t/Variation/Variation_IO.t ................... ok
>
>
> Cheers,
>
> -- 
> Charles Plessy
> Debian Med packaging team,
> http://www.debian.org/devel/debian-med
> Tsurumi, Kanagawa, Japan


From cjfields at illinois.edu  Mon Sep 28 17:28:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 28 Sep 2009 12:28:29 -0500
Subject: [Bioperl-l] Policy on tests
Message-ID: <00F31D5F-D531-4A5E-A11E-F7B67283FA8B@illinois.edu>

All,

This is a bit of a rant related to the spate of alphas I've had to  
release over the last few weeks.  We have a fairly loose policy on  
testing: for instance, most CPAN installations should not run network-  
or DB-dependent tests, or other developer-oriented tests (POD  
formatting, for instance), by default, and tests that need a  
'recommended' module should be skipped if it is not installed.  That  
is currently in place.

However, I do think all tests that are skipped need to be reported  
somehow, and optional tests that are off by default should NOT be  
skipped once they are specifically requested.  This is not currently  
the behavior.  So far I have been bitten twice by this.

The last instance was with the latest alpha, where OBDA-related tests  
were mistakenly skipped when BerkeleyDB wasn't installed.  As it turns  
out, BerkeleyDB isn't required, but (according to standard test  
harness output) t/LocalDB/Registry.t passed w/o reporting any problems  
when in reality it silently skipped over 90% of the tests (this is  
only seen with --verbose output).  In the past I have also run into  
network tests silently passing when the remote server was not in  
service anymore (IIRC this was with XEMBL modules, which are no longer  
in the distribution).

 From my point of view, speaking as both a user and developer, I need  
to know when these tests are skipped or fail.  When I specifically  
request a set of tests and one of them hits a problem (a server-side  
issue, a problem with the DB connection, etc.), it *should* fail quite  
loudly and catastrophically.  Tests shouldn't be skipped over when a  
problem arises; otherwise a legitimate bug silently passes.  If it is  
something I haven't set up correctly (a DB connection, for instance),  
I would like to know about it via the test failures.

Am I the only one thinking along these lines?  Should we come up with  
a simple policy on how we're setting up and running tests?
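
To make the proposal concrete, here is a minimal sketch of the
behaviour described above, using only Test::More.  The environment
variable BIOPERL_NETWORK_TESTS, the helper optional_test_action(), and
the hard-coded $available probe are all hypothetical names made up for
illustration; they are not part of the current BioPerl test setup.

```perl
use strict;
use warnings;
use Test::More;

# Decide what to do with an optional (network- or DB-dependent) test group.
# Hypothetical helper: returns 'skip', 'fail', or 'run'.
sub optional_test_action {
    my (%arg) = @_;
    return 'skip' unless $arg{requested};    # off by default: skip, with a reason
    return 'fail' unless $arg{available};    # requested but unusable: fail loudly
    return 'run';
}

# Hypothetical switch a user sets to request the network tests explicitly.
my $requested = $ENV{BIOPERL_NETWORK_TESTS} ? 1 : 0;
my $available = 0;    # stand-in for a real server or DB connection probe

ok( 1, 'core test that always runs' );

SKIP: {
    my $action = optional_test_action(
        requested => $requested,
        available => $available,
    );

    # Not requested: skip, but record why in the TAP output.
    skip 'network tests not requested (set BIOPERL_NETWORK_TESTS=1 to run them)', 1
        if $action eq 'skip';

    # Requested but the resource is missing: stop the run instead of skipping.
    BAIL_OUT('network tests were requested but the server is unavailable')
        if $action eq 'fail';

    ok( 1, 'remote query returned data' );
}

done_testing();
```

The point of the design is that a skip only ever means "nobody asked
for this"; once the tests are requested, every setup problem shows up
as a failure rather than a quiet pass.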

chris


From paola.bisignano at gmail.com  Mon Sep 28 09:50:52 2009
From: paola.bisignano at gmail.com (Paola Bisignano)
Date: Mon, 28 Sep 2009 11:50:52 +0200
Subject: [Bioperl-l] parsing msf file
Message-ID: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>

Hi dear friends,

I used Bio::AlignIO to parse an msf file and, as you suggested, the
method column_from_residue_number to obtain the alignment position of
the residues of interest (those in contact with my ligand). Now I need
to check the residue itself: I query by the residue number in the PDB,
and I want the script to return the residue type as well. So, to find
the position of Ala21, I do:

use Bio::AlignIO;

my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
my $aln = $alnio->next_aln;

my $s1 = $aln->get_seq_by_pos(1);
my $s2 = $aln->get_seq_by_pos(2);

my $col = $aln->column_from_residue_number( $s1->id, 21);

and it will return the position (e.g. 5), but I want to check whether
position 5 of the alignment holds an A (for Ala). I looked in the
documentation, but I couldn't find anything for that.


Thank you all for the help you have given and will give me,

best regards,

paola


From maj at fortinbras.us  Tue Sep 29 01:25:33 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Mon, 28 Sep 2009 21:25:33 -0400
Subject: [Bioperl-l] parsing msf file
In-Reply-To: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
References: <e9cf89740909280250u40f1a118pa7527a2f27c5bc0@mail.gmail.com>
Message-ID: <1C5008B41F6D4BFF9F5160633D284442@NewLife>

Hi Paola--
I think you're saying you want to see if A is present in the other 
sequences in the alignment at alignment column 5. Here's
where you use location_from_column, which is a method 
of the sequence objects themselves. The idea is to do 

# $col is obtained as in your script...
for my $seq ($aln->each_seq) {
  # location_from_column returns undef if only gaps precede $col
  my $loc = $seq->location_from_column($col);
  if ( $loc && $seq->subseq($loc) eq 'A' ) {
     print "si!";
  } else {
     print "no!";
  }
}

You might find the code at 
http://www.bioperl.org/wiki/Site_entropy_in_an_alignment
helpful since it uses these principles. 
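
If it helps to see the idea without any BioPerl objects, here is a
plain-Perl sketch of the same two steps: map a residue number to an
alignment column, then look at what each sequence has in that column.
The subroutine names and the toy alignment are made up for
illustration, and this is not how BioPerl implements the methods above;
it only assumes gaps are written as '-' or '.'.

```perl
use strict;
use warnings;

# Map a residue number to a 1-based alignment column.
# $gapped is the aligned sequence; $start is the number of its first residue.
sub column_for_residue {
    my ( $gapped, $start, $resnum ) = @_;
    my $current = $start - 1;
    for my $col ( 1 .. length $gapped ) {
        my $char = substr $gapped, $col - 1, 1;
        next if $char eq '-' or $char eq '.';    # gaps do not consume a residue number
        $current++;
        return $col if $current == $resnum;
    }
    return;    # residue number not present in this sequence
}

# Character (residue or gap) found at a 1-based alignment column.
sub residue_at_column {
    my ( $gapped, $col ) = @_;
    return substr $gapped, $col - 1, 1;
}

# Toy alignment (invented data); seq1 is numbered from residue 19.
my %aln = (
    seq1 => { start => 19, gapped => 'MK--ALVT' },
    seq2 => { start => 1,  gapped => 'MKGSG-VT' },
);

my $col = column_for_residue( $aln{seq1}{gapped}, $aln{seq1}{start}, 21 );
for my $id ( sort keys %aln ) {
    my $res = residue_at_column( $aln{$id}{gapped}, $col );
    print "$id column $col: $res ", ( $res eq 'A' ? 'si!' : 'no!' ), "\n";
}
```

With real alignment objects, the string returned by $seq->seq on an
aligned sequence includes the gaps, so substr( $seq->seq, $col - 1, 1 )
is another way to read the character in column $col.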
Mark
----- Original Message ----- 
From: "Paola Bisignano" <paola.bisignano at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Monday, September 28, 2009 5:50 AM
Subject: [Bioperl-l] parsing msf file


> Hi dear friends,
> 
> I used Bio::AlignIO to parse msf file, using method
> colum_from_residue_number, as you suggested to obtain the position in
> the alignment of  residues of interest (in contact with my ligand) and
> I have to do a check of the residue:
> I want to extract the type of the residue...I ask my question using
> the number of the residue in the PDB, and i want the script return
> also the residue so if I want to know the position af ala21, I  will
> do:
> 
> my $alnio = Bio::AlignIO->new( -file=>"my file.msf");
> my $aln = $alnio->next_aln;
> 
> my $s1 = $aln->get_seq_by_pos(1);
> my $s2 = $aln->get_seq_by_pos(2);
> 
> my $col = $aln->column_from_residue_number( $s1->id, 21)
> 
> and It will return the position (es. 5) but I want to check if in
> position 5 of the alignment there is A (for ala)....I looked in
> documentation, but I couldn't find anything for that
> 
> 
> Thank you all for help you gave and will give to me,
> 
> best regards,
> 
> paola


From martin.senger at gmail.com  Tue Sep 29 05:31:41 2009
From: martin.senger at gmail.com (Martin Senger)
Date: Tue, 29 Sep 2009 13:31:41 +0800
Subject: [Bioperl-l] a Main Page proposal
Message-ID: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>

> Martin has stopped working on Biblio as far as I know and php-hacking is
> not my favorite pastime.


That's true. I can still revive the code - but the question is (always has
been) where to host the server (of the web services providing the biblio
data). It was hosted, and maintained, at EBI. But I do not know if EBI is
still maintaining it, or willing to do so.

Cheers,
Martin

-- 
Martin Senger
email: martin.senger at gmail.com,m.senger at cgiar.org
skype: martinsenger


From jason at bioperl.org  Tue Sep 29 05:43:30 2009
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 28 Sep 2009 22:43:30 -0700
Subject: [Bioperl-l] a Main Page proposal
In-Reply-To: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
References: <4d93f07c0909282231k35bc636as73993fe031034340@mail.gmail.com>
Message-ID: <E9D67D22-ABC3-4199-B8D9-E0675197B9BF@bioperl.org>

hah! I actually meant the Biblio.php MediaWiki plugin by Martin Jambon  
-- but hey, the Bio::Biblio db stuff should be discussed too.

-jason
On Sep 28, 2009, at 10:31 PM, Martin Senger wrote:

>> Martin has stopped working on Biblio as far as I know and php- 
>> hacking is
>> not my favorite pastime.
>
>
> That's true. I can still revive the code - but the question is  
> (always has
> been) where to host the server (of the web services providing the  
> biblio
> data). It was hosted, and maintained, at EBI. But I do not know if  
> EBI is
> still maintaining it, or willing to do so.
>
> Cheers,
> Martin
>
> -- 
> Martin Senger
> email: martin.senger at gmail.com,m.senger at cgiar.org
> skype: martinsenger

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


From cjfields at illinois.edu  Tue Sep 29 18:01:29 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 13:01:29 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
Message-ID: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>

We are pleased to announce the availability of BioPerl 1.6.1, the  
latest release of BioPerl core code.  You can grab it here:

Via CPAN:

http://search.cpan.org/~cjfields/BioPerl-1.6.1/

Via the BioPerl website:

http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
http://bioperl.org/DIST/BioPerl-1.6.1.zip

The PPM for Windows should also finally be available this week,  
ActivePerl problems permitting (we will post more information when it  
becomes available).

Tons of bug fixes and changes have been incorporated into this  
release.  For a more complete change list please see the 'Changes'  
file included with the distribution.

A few highlights:

* FASTQ parsing and interconversion of the three FASTQ variants  
(Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
* Significant refactoring of Bio::Restriction methods
* Complete refactoring of Bio::Search-related tiling code, including  
HOWTO documentation
* GBrowse-related fixes
    - berkeleydb database now autoindexes wig files and locks correctly
    - add Pg, SQLite, and faster BerkeleyDB implementations
* Infernal 1.0 output is now parsed
* New SearchIO-based parser for gmap -f9 output
* BLAST XML parsing essentially complete
* Installation via CPANPLUS should now work
* For those using Strawberry Perl on Windows, the latest build is  
expected to pass all tests.
* 'raw' sequence format now parsed by line or optionally as a single  
sequence
* SCF parsing/writing now round-trips
* Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
* Bio::Tools::SeqPattern now has a backtranslate() method
* Bio::Tree::Statistics now has methods to calculate Fitch-based  
score, internal trait values, statratio(), sum of leaf distances  
[heikki]
* scripts
    - update to bp_seqfeature_load for SQLite [lstein]
    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
    - fastam9_to_table - fix for MPI output [jason]
    - gccalc - total stats [jason]
    - einfo  - simple script to find up-to-date NCBI database list,  
list field and link values for a specific database
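
For anyone unfamiliar with the FASTQ variants mentioned in the
highlights: they differ in how quality scores are encoded as ASCII
characters.  The following is a plain-Perl sketch of the simplest case,
Illumina 1.3+ (Phred + 64) to Sanger (Phred + 33); it is an
illustration only, not the code used by the BioPerl parser, and the
subroutine name is made up.  Solexa scores use a different (log-odds)
scale and need a non-linear conversion, which is not shown here.

```perl
use strict;
use warnings;

# Convert an Illumina 1.3+ quality string (Phred + 64) to Sanger (Phred + 33).
sub illumina_to_sanger_qual {
    my ($qual) = @_;
    return join '', map {
        my $phred = ord($_) - 64;    # recover the Phred score
        die "quality character '$_' is below the Illumina offset\n" if $phred < 0;
        chr( $phred + 33 );          # re-encode with the Sanger offset
    } split //, $qual;
}

print illumina_to_sanger_qual('hhhh^B'), "\n";
```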

We will shortly release updates for BioPerl-db, BioPerl-run, and  
BioPerl-network.  Enjoy!

chris


From rmb32 at cornell.edu  Tue Sep 29 18:22:03 2009
From: rmb32 at cornell.edu (Robert Buels)
Date: Tue, 29 Sep 2009 11:22:03 -0700
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC2504B.1000707@cornell.edu>

Chris Fields wrote:
 > We are pleased to announce the availability of BioPerl 1.6.1, the
 > latest release of BioPerl core code.

Hooray!  You rock Chris!  Tremendous thanks for your many hours of work 
to get it out the door!

Rob


From scott at scottcain.net  Tue Sep 29 18:23:08 2009
From: scott at scottcain.net (Scott Cain)
Date: Tue, 29 Sep 2009 14:23:08 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <536f21b00909291123h12a7c941tdd3edb7fadbb1149@mail.gmail.com>

Chris,

Congratulations and thanks so much for the time and effort that went into this.

Scott


On Tue, Sep 29, 2009 at 2:01 PM, Chris Fields <cjfields at illinois.edu> wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest
> release of BioPerl core code.  You can grab it here:
>
> Via CPAN:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>
> Via the BioPerl website:
>
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>
> The PPM for Windows should also finally be available this week, ActivePerl
> problems permitting (we will post more information when it becomes
> available).
>
> Tons of bug fixes and changes have been incorporated into this release.  For
> a more complete change list please see the 'Changes' file included with the
> distribution.
>
> A few highlights:
>
> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger,
> Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including HOWTO
> documentation
> * GBrowse-related fixes
>   - berkeleydb database now autoindexes wig files and locks correctly
>   - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is expected
> to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based score,
> internal trait values, statratio(), sum of leaf distances [heikki]
> * scripts
>   - update to bp_seqfeature_load for SQLite [lstein]
>   - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>   - fastam9_to_table - fix for MPI output [jason]
>   - gccalc - total stats [jason]
>   - einfo  - simple script to find up-to-date NCBI database list, list field
> and link values for a specific database
>
> We will shortly release updates for BioPerl-db, BioPerl-run, and
> BioPerl-network.  Enjoy!
>
> chris


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


From hlapp at gmx.net  Tue Sep 29 19:56:58 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 15:56:58 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>

Congrats from me too - awesome Chris, and thanks on behalf of the  
project!

	-hilmar

On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:

> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
>
> Via CPAN:
>
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>
> Via the BioPerl website:
>
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when  
> it becomes available).
>
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
>
> A few highlights:
>
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>   - berkeleydb database now autoindexes wig files and locks correctly
>   - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>   - update to bp_seqfeature_load for SQLite [lstein]
>   - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>   - fastam9_to_table - fix for MPI output [jason]
>   - gccalc - total stats [jason]
>   - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
>
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at illinois.edu  Tue Sep 29 20:38:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 15:38:04 -0500
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
Message-ID: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>

No prob.  Next up is db, run, and network!

chris

On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:

> Congrats from me too - awesome Chris, and thanks on behalf of the  
> project!
>
> 	-hilmar
>
> On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
>
>> We are pleased to announce the availability of BioPerl 1.6.1, the  
>> latest release of BioPerl core code.  You can grab it here:
>>
>> Via CPAN:
>>
>> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
>>
>> Via the BioPerl website:
>>
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
>> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
>> http://bioperl.org/DIST/BioPerl-1.6.1.zip
>>
>> The PPM for Windows should also finally be available this week,  
>> ActivePerl problems permitting (we will post more information when  
>> it becomes available).
>>
>> Tons of bug fixes and changes have been incorporated into this  
>> release.  For a more complete change list please see the 'Changes'  
>> file included with the distribution.
>>
>> A few highlights:
>>
>> * FASTQ parsing and interconversion of the three FASTQ variants  
>> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
>> * Significant refactoring of Bio::Restriction methods
>> * Complete refactoring of Bio::Search-related tiling code,  
>> including HOWTO documentation
>> * GBrowse-related fixes
>>  - berkeleydb database now autoindexes wig files and locks correctly
>>  - add Pg, SQLite, and faster BerkeleyDB implementations
>> * Infernal 1.0 output is now parsed
>> * New SearchIO-based parser for gmap -f9 output
>> * BLAST XML parsing essentially complete
>> * Installation via CPANPLUS should now work
>> * For those using Strawberry Perl on Windows, the latest build is  
>> expected to pass all tests.
>> * 'raw' sequence format now parsed by line or optionally as a  
>> single sequence
>> * SCF parsing/writing now round-trips
>> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
>> * Bio::Tools::SeqPattern now has a backtranslate() method
>> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
>> score, internal trait values, statratio(), sum of leaf distances  
>> [heikki]
>> * scripts
>>  - update to bp_seqfeature_load for SQLite [lstein]
>>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>>  - fastam9_to_table - fix for MPI output [jason]
>>  - gccalc - total stats [jason]
>>  - einfo  - simple script to find up-to-date NCBI database list,  
>> list field and link values for a specific database
>>
>> We will shortly release updates for BioPerl-db, BioPerl-run, and  
>> BioPerl-network.  Enjoy!
>>
>> chris
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>


From cjfields at illinois.edu  Tue Sep 29 21:11:33 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 16:11:33 -0500
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
Message-ID: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>

Right now all our subdistributions have a naming scheme like BioPerl- 
db.  I'm thinking we should subtly change those to BioPerl-DB, BioPerl- 
Run, BioPerl-Network, etc.  The primary reason is that the prior  
method of naming doesn't quite match the syntax of other distributions:

Win32-Console
Win32-EventLog
MooseX-Aliases
etc etc

I'll go ahead and make these changes unless there is rabid dissent ;>

chris


From bix at sendu.me.uk  Tue Sep 29 19:06:17 2009
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Sep 2009 20:06:17 +0100
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <4AC25AA9.5080803@sendu.me.uk>

Chris Fields wrote:
> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> release of BioPerl core code.  You can grab it here:

Great job Chris. *cheers*


From hlapp at gmx.net  Tue Sep 29 21:49:07 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 29 Sep 2009 17:49:07 -0400
Subject: [Bioperl-l] Naming of BioPerl-run/db/network
In-Reply-To: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
References: <4384F324-E30E-490D-A6FB-3EB4C54E4481@illinois.edu>
Message-ID: <6C5CBE0E-EDA5-4079-BFD7-DEE95E8C749C@gmx.net>

Fine with me :-)

	-hilmar

On Sep 29, 2009, at 5:11 PM, Chris Fields wrote:

> Right now all our subdistributions have a naming scheme like BioPerl- 
> db.  I'm thinking we should subtly change those to BioPerl-DB,  
> BioPerl-Run, BioPerl-Network, etc.  The primary reason is that the  
> prior method of naming doesn't quite match the syntax of other  
> distributions:
>
> Win32-Console
> Win32-EventLog
> MooseX-Aliases
> etc etc
>
> I'll go ahead and make these changes unless there is rabid dissent ;>
>
> chris
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From maj at fortinbras.us  Tue Sep 29 22:33:23 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 29 Sep 2009 18:33:23 -0400
Subject: [Bioperl-l] BioPerl 1.6.1 released
In-Reply-To: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
Message-ID: <5D35D16E84554CA687C6CA4758806884@NewLife>

Gnarly, dude.
MAJ
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Tuesday, September 29, 2009 2:01 PM
Subject: [Bioperl-l] BioPerl 1.6.1 released


> We are pleased to announce the availability of BioPerl 1.6.1, the  
> latest release of BioPerl core code.  You can grab it here:
> 
> Via CPAN:
> 
> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> 
> Via the BioPerl website:
> 
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> 
> The PPM for Windows should also finally be available this week,  
> ActivePerl problems permitting (we will post more information when it  
> becomes available).
> 
> Tons of bug fixes and changes have been incorporated into this  
> release.  For a more complete change list please see the 'Changes'  
> file included with the distribution.
> 
> A few highlights:
> 
> * FASTQ parsing and interconversion of the three FASTQ variants  
> (Sanger, Illumina, Solexa) now works (a concerted OBF effort!)
> * Significant refactoring of Bio::Restriction methods
> * Complete refactoring of Bio::Search-related tiling code, including  
> HOWTO documentation
> * GBrowse-related fixes
>    - berkeleydb database now autoindexes wig files and locks correctly
>    - add Pg, SQLite, and faster BerkeleyDB implementations
> * Infernal 1.0 output is now parsed
> * New SearchIO-based parser for gmap -f9 output
> * BLAST XML parsing essentially complete
> * Installation via CPANPLUS should now work
> * For those using Strawberry Perl on Windows, the latest build is  
> expected to pass all tests.
> * 'raw' sequence format now parsed by line or optionally as a single  
> sequence
> * SCF parsing/writing now round-trips
> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> * Bio::Tools::SeqPattern now has a backtranslate() method
> * Bio::Tree::Statistics now has methods to calculate Fitch-based  
> score, internal trait values, statratio(), sum of leaf distances  
> [heikki]
> * scripts
>    - update to bp_seqfeature_load for SQLite [lstein]
>    - hivq.pl - command-line interface to Bio::DB::HIV [maj]
>    - fastam9_to_table - fix for MPI output [jason]
>    - gccalc - total stats [jason]
>    - einfo  - simple script to find up-to-date NCBI database list,  
> list field and link values for a specific database
> 
> We will shortly release updates for BioPerl-db, BioPerl-run, and  
> BioPerl-network.  Enjoy!
> 
> chris
> 
>


From cjfields at illinois.edu  Wed Sep 30 03:54:04 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 29 Sep 2009 22:54:04 -0500
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <87hbunv764.fsf@topper.koldfront.dk>
References: <87hbunv764.fsf@topper.koldfront.dk>
Message-ID: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>

Adam,

Not sure, but this could be a case of 'both'.  Qualifier values that are
quoted and those that aren't are currently distinguished via a global
hash lookup (%FTQUAL_NO_QUOTE) due to the way the parser works; there is
some logic behind this, I just can't quite recall at the moment why it is
this way.  You could set a hash key for the label in cases where it
isn't quoted; that should work.  You can also test out the
Bio::SeqIO::embldriver version (-format => 'embldriver').

If the above doesn't work out it's worth filing a bug for this
behavior, though I'm not sure how easy it will be to fix.
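
A minimal sketch of the embldriver suggestion above, assuming BioPerl is
installed. The helper name embl_reader and the input file name are made
up for illustration; only the -format => 'embldriver' argument comes from
this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Open an EMBL file with the alternative 'embldriver' parser.
# A different format name (e.g. 'EMBL') can be passed for comparison.
sub embl_reader {
    my ($path, $format) = @_;
    $format = 'embldriver' unless defined $format;
    require Bio::SeqIO;    # loaded lazily; needs BioPerl at run time
    return Bio::SeqIO->new(-file => $path, -format => $format);
}

# Usage (hypothetical file name): perl read_embl.pl wrapped_label.embl
if (@ARGV) {
    my $in = embl_reader($ARGV[0]);
    while (my $seq = $in->next_seq) {
        for my $feat ($seq->get_SeqFeatures) {
            next unless $feat->has_tag('label');
            # Print each /label value that the parser recovered.
            print join(',', $feat->get_tag_values('label')), "\n";
        }
    }
}
```

Running the same file through both 'EMBL' and 'embldriver' is a quick way
to see whether the two parsers disagree on a wrapped, unquoted qualifier.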

chris

On Sep 28, 2009, at 2:51 AM, Adam Sjøgren wrote:

>  Hi.
>
>
> I am wondering whether this is a buglet or just a case of "Don't do
> that":
>
> If I set a very long /label on a feature and output the sequence in  
> EMBL
> format, the qualifier value gets wrapped, but not quoted.
>
> When BioPerl reads such a file, an exception is thrown.
>
> I probably shouldn't be setting very long labels... But oughtn't  
> BioPerl
> throw an exception when a too long label is set, or automatically  
> quote
> the value when it is long enough to be wrapped, or know how to read a
> wrapped yet unquoted value?
>
> I will be happy to try and provide a patch for whichever solution is
> preferred.
>
> Here is an example script:
>
>  #!/usr/bin/perl
>
>  use strict;
>  use warnings;
>
>  use IO::String;
>
>  use Bio::Seq;
>  use Bio::SeqFeature::Generic;
>  use Bio::SeqIO;
>
>  print 'BioPerl ' . $Bio::Root::Version::VERSION . "\n";
>
>  my $seq=Bio::Seq->new(-seq=>'ATG');
>  my $feature=Bio::SeqFeature::Generic->new(-primary=>'misc_feature', -start=>1, -end=>3);
>  $feature->add_tag_value(label=>'averylonglabelthisisindeedbutitoughttoworkanywaydontyouthink');
>  $seq->add_SeqFeature($feature);
>
>  my $out_string=out($seq);
>  print $out_string;
>
>  my $fh=IO::String->new($out_string);
>  my $in=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>  my $in_seq=$in->next_seq;
>
>  print "Done\n";
>
>  sub out {
>      my ($seq)=@_;
>
>      my $string='';
>      my $fh=IO::String->new($string);
>      my $out=Bio::SeqIO->new(-fh=>$fh, -format=>'EMBL');
>      $out->write_seq($seq);
>
>      return $string;
>  }
>
> Which gives this output when run:
>
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  FT                   youthink
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>
>  ------------- EXCEPTION: Bio::Root::Exception -------------
>  MSG: Can't see new qualifier in: youthink
>  from:
>  /label=averylonglabelthisisindeedbutitoughttoworkanywaydont
>  youthink
>
>  STACK: Error::throw
>  STACK: Bio::Root::Root::throw Bio/Root/Root.pm:368
>  STACK: Bio::SeqIO::embl::_read_FTHelper_EMBL Bio/SeqIO/embl.pm:1294
>  STACK: Bio::SeqIO::embl::next_seq Bio/SeqIO/embl.pm:392
>  STACK: /z/home/adsj/bugs/bioperl/embl/embl.pl:24
>  -----------------------------------------------------------
>
> If I change the value to include "-quotes ("simulating" that embl.pm
> quotes the value), BioPerl can read the EMBL string it produces fine:
>
>  -----------------------------------------------------------
>  adsj at ala:~/work/bioperl/bioperl-live$ perl -I. ~/bugs/bioperl/embl/embl.pl
>  BioPerl 1.0069
>  ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 3 BP.
>  XX
>  AC   unknown;
>  XX
>  XX
>  FH   Key             Location/Qualifiers
>  FH
>  FT   misc_feature    1..3
>  FT                   /label=""averylonglabelthisisindeedbutitoughttoworkanywaydo
>  FT                   ntyouthink""
>  XX
>  SQ   Sequence 3 BP; 1 A; 0 C; 1 G; 1 T; 0 other;
>       atg                                                                        3
>  //
>  Done
>
>
>  Best regards,
>
>     Adam
>
> -- 
>                                                          Adam Sjøgren
>                                                    adsj at novozymes.com
>


From adsj at novozymes.com  Wed Sep 30 09:50:36 2009
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Wed, 30 Sep 2009 11:50:36 +0200
Subject: [Bioperl-l] Long /labels are wrapped, but can't be read
In-Reply-To: <86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu> (Chris
	Fields's message of "Tue, 29 Sep 2009 22:54:04 -0500")
References: <87hbunv764.fsf@topper.koldfront.dk>
	<86373CE8-4C61-4124-BCF3-35975523CC9C@illinois.edu>
Message-ID: <87vdj0g3rn.fsf@topper.koldfront.dk>

On Tue, 29 Sep 2009 22:54:04 -0500, Chris wrote:

> Not sure, but this could be a case of 'both'. Labels that are quoted
> and aren't are currently distinguished via a global hash lookup
> (%FTQUAL_NO_QUOTE) due to the way the parser works; there is some
> logic behind this, just can't quite recall at the moment why it is
> this way.

Yes, I saw that there are a number of qualifiers that aren't quoted
automatically.

The very easy "fix" for me would be to simply remove "label" from
%FTQUAL_NO_QUOTE, but I'm not really sure what the reason for not
quoting all values is, so I was hesitant to just propose that.

> You could set a hash key for the label in cases where it isn't quoted,
> that should work. You can also test out the Bio::SeqIO::embldriver
> version (-format => 'embldriver').

Ah, embldriver reads the wrapped qualifier when it isn't quoted without
problem. Nice! I hadn't noticed embldriver.

I wonder which one is correct in this case?

And should I switch to using embldriver to read, or does it make sense
to try and concoct a patch that changes embl?
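
A rough sketch of the "remove label from %FTQUAL_NO_QUOTE" idea mentioned
above. It assumes the hash is a package variable reachable as
%Bio::SeqIO::embl::FTQUAL_NO_QUOTE, which has not been checked against
every BioPerl version, and the helper name quote_qualifier_on_write is
invented for this example. It is a local workaround, not a proposed patch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove a qualifier name from embl.pm's no-quote table so that the
# EMBL writer treats its values like ordinary quoted qualifiers.
# Returns 1 if the name was in the table, 0 otherwise.
sub quote_qualifier_on_write {
    my ($qualifier) = @_;
    require Bio::SeqIO::embl;    # needs BioPerl at run time
    no warnings 'once';
    # Assumption: the table is a package-level hash in Bio::SeqIO::embl.
    my $old = delete $Bio::SeqIO::embl::FTQUAL_NO_QUOTE{$qualifier};
    return defined $old ? 1 : 0;
}

# Usage: call once, before writing any sequence in EMBL format:
#   quote_qualifier_on_write('label');
```

Because this changes a global table, it affects every EMBL writer in the
running program, so it is probably best limited to small scripts.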


  Thanks for the feedback!

     Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From sidd.basu at gmail.com  Wed Sep 30 17:24:53 2009
From: sidd.basu at gmail.com (Siddhartha Basu)
Date: Wed, 30 Sep 2009 12:24:53 -0500
Subject: [Bioperl-l]  Re: BioPerl 1.6.1 released
In-Reply-To: <5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
References: <7E88F275-4D4B-48DD-84C7-A97C2C16EA97@illinois.edu>
	<C06DA705-6249-4D86-BE9D-E2E4DCEBFAF0@gmx.net>
	<5B8C4E37-5F3D-4E76-AB94-1C613AE04CDF@illinois.edu>
Message-ID: <4ac39469.0637560a.5a63.1fee@mx.google.com>

Congrats chris,  really appreciate your time and effort.

-siddhartha

On Tue, 29 Sep 2009, Chris Fields wrote:

> No prob.  Next up is db, run, and network!
>
> chris
>
> On Sep 29, 2009, at 2:56 PM, Hilmar Lapp wrote:
>
> > Congrats from me too - awesome Chris, and thanks on behalf of the project!
> >
> > 	-hilmar
> >
> > On Sep 29, 2009, at 2:01 PM, Chris Fields wrote:
> >
> >> We are pleased to announce the availability of BioPerl 1.6.1, the latest 
> >> release of BioPerl core code.  You can grab it here:
> >>
> >> Via CPAN:
> >>
> >> http://search.cpan.org/~cjfields/BioPerl-1.6.1/
> >>
> >> Via the BioPerl website:
> >>
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.bz2
> >> http://bioperl.org/DIST/BioPerl-1.6.1.tar.gz
> >> http://bioperl.org/DIST/BioPerl-1.6.1.zip
> >>
> >> The PPM for Windows should also finally be available this week, 
> >> ActivePerl problems permitting (we will post more information when it 
> >> becomes available).
> >>
> >> Tons of bug fixes and changes have been incorporated into this release.  
> >> For a more complete change list please see the 'Changes' file included 
> >> with the distribution.
> >>
> >> A few highlights:
> >>
> >> * FASTQ parsing and interconversion of the three FASTQ variants (Sanger, 
> >> Illumina, Solexa) now works (a concerted OBF effort!)
> >> * Significant refactoring of Bio::Restriction methods
> >> * Complete refactoring of Bio::Search-related tiling code, including 
> >> HOWTO documentation
> >> * GBrowse-related fixes
> >>  - berkeleydb database now autoindexes wig files and locks correctly
> >>  - add Pg, SQLite, and faster BerkeleyDB implementations
> >> * Infernal 1.0 output is now parsed
> >> * New SearchIO-based parser for gmap -f9 output
> >> * BLAST XML parsing essentially complete
> >> * Installation via CPANPLUS should now work
> >> * For those using Strawberry Perl on Windows, the latest build is 
> >> expected to pass all tests.
> >> * 'raw' sequence format now parsed by line or optionally as a single 
> >> sequence
> >> * SCF parsing/writing now round-trips
> >> * Demo code for using RPS-BLAST and Bio::Tools::Run::RemoteBlast
> >> * Bio::Tools::SeqPattern now has a backtranslate() method
> >> * Bio::Tree::Statistics now has methods to calculate Fitch-based score, 
> >> internal trait values, statratio(), sum of leaf distances [heikki]
> >> * scripts
> >>  - update to bp_seqfeature_load for SQLite [lstein]
> >>  - hivq.pl - command-line interface to Bio::DB::HIV [maj]
> >>  - fastam9_to_table - fix for MPI output [jason]
> >>  - gccalc - total stats [jason]
> >>  - einfo  - simple script to find up-to-date NCBI database list, list 
> >> field and link values for a specific database
> >>
> >> We will shortly release updates for BioPerl-db, BioPerl-run, and 
> >> BioPerl-network.  Enjoy!
> >>
> >> chris
> >>
> >
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
> >
> >
>


From antonina.iagovitina at epfl.ch  Wed Sep 30 18:09:17 2009
From: antonina.iagovitina at epfl.ch (Antonina Iagovitina)
Date: Wed, 30 Sep 2009 20:09:17 +0200
Subject: [Bioperl-l] assistance with bioperl
Message-ID: <4AC39ECD.6060405@epfl.ch>

Here is the error message I get when I try to align a sequence to an existing
alignment. Please help.
I am using Windows XP and ClustalW version 1.83.

 MSG:
 ERROR: Could not open sequence file (-profile) 
 No. of seqs. read = -1. No alignment!
 
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::Tools::Run::Alignment::Clustalw;

my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
$aln = $str->next_aln();
$str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
$seq = $str1->next_seq();
$aln = $factory->profile_align($aln,$seq);
end


From maj at fortinbras.us  Wed Sep 30 18:24:59 2009
From: maj at fortinbras.us (Mark A. Jensen)
Date: Wed, 30 Sep 2009 14:24:59 -0400
Subject: [Bioperl-l] assistance with bioperl
In-Reply-To: <4AC39ECD.6060405@epfl.ch>
References: <4AC39ECD.6060405@epfl.ch>
Message-ID: <569E83EDBFE044638187504E5E7A8C11@NewLife>

Antonina--
Try the following:
Make sure that cysprot1a.msf and cysprot1b.fa are in the current directory, 
or use full path names for the files. 
MAJ
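
A small sketch of the check suggested above, using only core Perl
modules. The helper name check_inputs is made up for this example; the
two file names are the ones from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Resolve each file name to an absolute path and sort the names into
# those that exist and are readable, and those that are not.
sub check_inputs {
    my @names = @_;
    my (%found, @missing);
    for my $name (@names) {
        my $path = File::Spec->rel2abs($name);
        if (-f $path && -r $path) {
            $found{$name} = $path;
        }
        else {
            push @missing, $path;
        }
    }
    return (\%found, \@missing);
}

my ($found, $missing) = check_inputs('cysprot1a.msf', 'cysprot1b.fa');
warn "Cannot read input file: $_\n" for @$missing;

# When nothing is missing, the absolute paths can be handed to BioPerl:
#   Bio::AlignIO->new(-file => $found->{'cysprot1a.msf'});
#   Bio::SeqIO->new(-file   => $found->{'cysprot1b.fa'});
```

If both files are reported as readable and the ClustalW error persists,
the problem is more likely elsewhere (for example in how the clustalw
executable is located or invoked) than in the input paths.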
----- Original Message ----- 
From: "Antonina Iagovitina" <antonina.iagovitina at epfl.ch>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 30, 2009 2:09 PM
Subject: [Bioperl-l] assistance with bioperl


> Here is the error message I get when I try to align a sequence to an existing
> alignment. Please help
> I am using Windows XP and Clustalw version1.83
> 
> MSG:
> ERROR: Could not open sequence file (-profile) 
> No. of seqs. read = -1. No alignment!
> 
> use Bio::AlignIO;
> use Bio::SeqIO;
> use Bio::Seq;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $str = Bio::AlignIO->new(-file=> 'cysprot1a.msf');
> $aln = $str->next_aln();
> $str1 = Bio::SeqIO->new(-file=> 'cysprot1b.fa');
> $seq = $str1->next_seq();
> $aln = $factory->profile_align($aln,$seq);
> end
> 
> 
>


From me at miguel.weapps.com  Wed Sep 30 22:16:38 2009
From: me at miguel.weapps.com (Luis M Rodriguez-R)
Date: Wed, 30 Sep 2009 17:16:38 -0500
Subject: [Bioperl-l] Nexus symbols
Message-ID: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>

Dear all,

Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
from the "format" line in the Nexus output of Bio::AlignIO?

My code (snippet) is:

my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}

And I would like to remove the symbols (they are not compatible with MrBayes
v3.1.2: "Could not find parameter "symbols"").

Also, it would be nice to be able to change the TITLE comment.

Thanks all!
Regards,

Luis M. Rodriguez-R
[http://bioinf.uniandes.edu.co/~miguel/]
---------------------------------
Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Colombia
[http://bioinf.uniandes.edu.co]

+ 57 1 3394949 ext 2619
luisrodr at uniandes.edu.co
me at miguel.weapps.com


From jason at bioperl.org  Wed Sep 30 22:40:33 2009
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 30 Sep 2009 15:40:33 -0700
Subject: [Bioperl-l] Nexus symbols
In-Reply-To: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
References: <0EFFDCCA-48C6-4609-8503-17E61FCDD67B@miguel.weapps.com>
Message-ID: <483DB389-9332-4573-84C7-3AF09AC2BACA@bioperl.org>

Pass -show_symbols => 0 when you create the Nexus writer.

If you use the bp_sreformat.pl script, specifying --special="mrbayes"
will set both the show_endblock and show_symbols values to 0.


perldoc Bio::AlignIO::nexus

        new

         Title   : new
         Usage   : $alignio = Bio::AlignIO->new(-format => 'nexus', -file => 'filename');
         Function: returns a new Bio::AlignIO object to handle nexus files
         Returns : Bio::AlignIO::nexus object
         Args    : -verbose => verbosity setting (-1,0,1,2)
                   -file    => name of file to read in or with ">" - writeout
                   -fh      => alternative to -file param - provide a filehandle
                               to read from/write to
                   -format  => type of Alignment Format to process or produce

                   Customization of nexus flavor output

                   -show_symbols => print the symbols="ATGC" in the data definition
                                    (MrBayes does not like this)
                                    boolean [default is 1]
                   -show_endblock => print an 'endblock;' at the end of the data
                                    (MrBayes does not like this)
                                    boolean [default is 1]
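
Put together with the snippet from the question, the options above could
be used roughly like this. It is a sketch that assumes BioPerl is
installed; the helper names mrbayes_nexus_args and fasta_to_mrbayes_nexus
are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Constructor arguments for a Nexus writer without the symbols="..."
# entry and without the trailing endblock, as described in the perldoc.
sub mrbayes_nexus_args {
    my ($path) = @_;
    return (
        -file          => ">$path",
        -format        => 'nexus',
        -show_symbols  => 0,
        -show_endblock => 0,
    );
}

# Convert every alignment in a FASTA file to Nexus.
sub fasta_to_mrbayes_nexus {
    my ($fasta_path, $nexus_path) = @_;
    require Bio::AlignIO;    # needs BioPerl at run time
    my $in  = Bio::AlignIO->new(-file => "<$fasta_path", -format => 'fasta');
    my $out = Bio::AlignIO->new(mrbayes_nexus_args($nexus_path));
    while (my $aln = $in->next_aln) {
        $out->write_aln($aln);
    }
    return;
}

# Usage: perl fasta2nexus.pl input.aln.fasta output.aln.nex
fasta_to_mrbayes_nexus(@ARGV) if @ARGV == 2;
```

Keeping the writer arguments in one place makes it easy to switch a
single option back on if a different Nexus consumer needs it.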

On Sep 30, 2009, at 3:16 PM, Luis M Rodriguez-R wrote:

> Dear all,
>
> Is there a way to remove the "symbols" (i.e. the 'symbols="ATCG"')  
> from the "format" line in the Nexus output of Bio::AlignIO?
>
> My code (snippet) is:
>
> my $fasta_i = Bio::AlignIO->new(-file=>"<$outfile.aln.fasta", '-format'=>"fasta");
> my $nexus_o = Bio::AlignIO->new(-file=>">$outfile.aln.nex", '-format'=>"nexus");
> while(my $fasta_aln=$fasta_i->next_aln){$nexus_o->write_aln($fasta_aln);}
>
> And I would like to remove the symbols (is not compatible with  
> MrBayes v3.1.2: "Could not find parameter "symbols"").
>
> Also, it would be nice to be able to change the TITLE comment.
>
> Thanks all!
> Regards,
>
> Luis M. Rodriguez-R
> [http://bioinf.uniandes.edu.co/~miguel/]
> ---------------------------------
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Colombia
> [http://bioinf.uniandes.edu.co]
>
> + 57 1 3394949 ext 2619
> luisrodr at uniandes.edu.co
> me at miguel.weapps.com
>

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org

