From rmb32 at cornell.edu Sun Aug 1 15:17:14 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Sun, 01 Aug 2010 12:17:14 -0700
Subject: [Bioperl-l] GMOD Evo Hackathon Open Call for Participation
Message-ID: <4C55C83A.3060700@cornell.edu>
We are seeking participants for the GMOD Tools for Evolutionary Biology
Hackathon, held November 8-12, 2010 at the US National Evolutionary
Synthesis Center (NESCent) in Durham, NC.
This hackathon targets three critical gaps in the capabilities of the
GMOD toolbox that currently limit its utility for evolutionary research:
1. Visualization of comparative genomics data
2. Visualization of phylogenetic data and trees
3. Support for population diversity and phenotype data
If you are interested in these areas and have relevant expertise, you
are strongly encouraged to apply. Relevant areas of expertise include
more than just software development: if you are a GMOD power user,
visualization guru, domain expert (comparative, phylogenetics,
population, ...), or documentation wizard, then your skills are needed!
How To Apply:
Fill out the online application form at http://bit.ly/gmodevohack.
Applications are due August 25.
About GMOD:
GMOD is an intercompatible suite of open-source software components for
storing, managing, analyzing, and visualizing genome-scale data. GMOD
includes many widely-used software components: GBrowse and JBrowse, both
genome viewers; GBrowse_syn, a comparative genomics viewer; Chado, a
generic and modular database schema; CMap, a comparative map viewer; as
well as many other components including Apollo, MAKER, BioMart,
InterMine, and Galaxy. We hope to extend the functionality of existing
GMOD components, and integrate new components as well.
About Hackathons:
A hackathon is an intense event at which a group of programmers with
different backgrounds and skills collaborate hands-on and face-to-face
to develop working code that is of utility to the community as a whole.
The mix of people will include domain experts and computer-savvy end-users.
More details about the event, its motivation, organization, procedures,
and attendees, as well as URLs to the hackathon and related websites are
included below.
Sincerely,
The GMOD EvoHack Organizing Committee (and project affiliations as
relevant):
Nicole Washington, Chair (LBNL, modENCODE, Phenote)
Robert Buels (SGN, Chado NatDiv)
Scott Cain (OICR, GMOD)
Dave Clements (NESCent, GMOD)
Hilmar Lapp (NESCent, Phenoscape, Chado NatDiv)
Sheldon McKay (University of Arizona, iPlant, GBrowse_syn)
-----------------------------
About the GMOD Evo Hackathon
Overview
We are organizing a hackathon to fill critical gaps in the capabilities
of the Generic Model Organism Database (GMOD) toolbox that currently
limit its utility for evolutionary research. Specifically, we will focus
on tools for
1) viewing comparative genomics data;
2) visualizing phylogenomic data; and
3) supporting population diversity data and phenotype annotation.
The event will be hosted at NESCent and bring together a group of 20 or
more software developers, end-user representatives, and documentation
experts who would otherwise not meet. The participants will include key
developers of GMOD components that currently lack features critical for
emerging evolutionary biology research, developers of informatics tools
in evolutionary research that lack GMOD integration, and
informatics-savvy biologists who can represent end-user requirements.
The event will provide a unique opportunity to infuse the GMOD developer
community with a heightened awareness of unmet needs in evolutionary
biology that GMOD components have the potential to fill, and for tool
developers in evolutionary biology to better understand how best to
extend or integrate with already existing GMOD components.
Before the Event
Discussion of ideas and sometimes even design actually starts well
before the hackathon, on mailing lists, wiki pages, and conference calls
set up among accepted attendees. This advance work lays the foundation
for participants to be productive from the very first day. This also
means that participants should be willing to contribute some time in
advance of the hackathon itself to participate in this preparatory
discussion.
During the Event
Typically, hackathon participants use the morning of the first day of
the event to organize themselves into working groups of between 3 and 6
people, each with a focused implementation objective. Ideas and
objectives are discussed, and attendees coalesce around the projects in
which they have the most experience or interest.
Deliverables / Event Results
The meeting's attendance, working groups, and outcomes will be fully
logged and documented on the GMOD wiki (http://gmod.org). Each working
group during the event will typically have its own wiki page, linked
from the main EvoHack page, where it documents its minutes and design
notes, and provides links to the code and documentation it produces.
Also, since GMOD and NESCent are both committed to open source
principles, all code and documentation produced by participants during
the hackathon must be published under an OSI-approved open source
license. As contributions to existing GMOD tools, all hackathon products
will most likely satisfy this requirement automatically.
NESCent
This event is sponsored by the US National Evolutionary Synthesis Center
(NESCent, http://www.nescent.org) through its Informatics Whitepapers
program (http://www.nescent.org/informatics/whitepapers.php). NESCent
promotes the synthesis of information, concepts and knowledge to address
significant, emerging, or novel questions in evolutionary science and
its applications. NESCent achieves this by supporting research and
education across disciplinary, institutional, geographic, and
demographic boundaries (see http://www.nescent.org/science/proposals.php).
Links
Main GMOD EvoHack page, and full proposal:
http://gmod.org/wiki/GMOD_Evo_Hackathon
NESCent: http://www.nescent.org/
GMOD: http://gmod.org
Similar past NESCent events: http://hackathon.nescent.org/
GMOD hackathon application: http://bit.ly/gmodevohack
--
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/GMOD_Europe_2010
http://gmod.org/wiki/Help_Desk_Feedback
From maj at fortinbras.us Sun Aug 1 19:19:16 2010
From: maj at fortinbras.us (Mark A. Jensen)
Date: Sun, 1 Aug 2010 19:19:16 -0400
Subject: [Bioperl-l] SOAP Eutilities
In-Reply-To:
References:
Message-ID: <627BEC8B2E624A69A0B11EEBC8C93B71@NewLife>
Turns out that module lives in bioperl-run; try
git clone git://github.com/bioperl/bioperl-run.git
MAJ
----- Original Message -----
From: "Robson de Souza"
To:
Sent: Saturday, July 31, 2010 4:56 PM
Subject: [Bioperl-l] SOAP Eutilities
> Hi,
>
> Bio::DB::SoapEUtilities, referred to in the HOWTO on EUtilities, seems to
> have disappeared from the Git repository.
> A simple
>
> git clone git://github.com/bioperl/bioperl-live.git
>
> does not download it. Any ideas why?
> Robson
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
From David.Messina at sbc.su.se Mon Aug 2 09:58:10 2010
From: David.Messina at sbc.su.se (Dave Messina)
Date: Mon, 2 Aug 2010 15:58:10 +0200
Subject: [Bioperl-l] phyloxml and element order
In-Reply-To:
References:
Message-ID:
Hi Fred,
Thanks for letting us know about this. It definitely sounds like a bug.
Would you please submit this to our bug tracker?
http://bugzilla.open-bio.org
(You can just copy and paste your previous email.)
Dave
On Jul 30, 2010, at 06:59, Frédéric Romagné wrote:
> Hi,
>
> I'm using BioPerl to create phyloXML trees. After a few attempts, I got my
> tree with all the elements and attributes I want, but when I write the tree,
> the elements are not written in the order specified by the XSD schema.
>
> For example, i got :
>
>
>
> Loxosceles intermedia
>
> Araneomorphae Sicariidae
>
>
> 969
> HAAERADSRKPIWDIAHMVNDLELVD
>
>
>
> Araneomorphae Sicariidae
>
>
>
> The program forester complains that should be written before the
> element.
>
> According to
> http://phyloxml.wordpress.com/2009/11/25/order-of-elements-in-phyloxml this
> is what bioperl is supposed to do.
>
> All my elements and attributes are set before writing the tree using the
> 'add_Annotation', 'add_tag_value' and 'sequence' methods of a
> Bio::Tree::AnnotatableNode object, so I think the error comes from the
> write_tree method.
>
> Any help would be appreciated.
>
> Thank you,
> Fred
From shalabh.sharma7 at gmail.com Mon Aug 2 15:44:35 2010
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Mon, 2 Aug 2010 15:44:35 -0400
Subject: [Bioperl-l] clustalw to maf format
Message-ID:
Hi,
I am trying to convert clustalw to maf format.
I am trying to use AlignIO for that, but it's not working.
It's giving me the following error:
EXCEPTION Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::AlignIO::maf::write_aln" is not implemented by package Bio::AlignIO::maf.
This is not your fault - author of Bio::AlignIO::maf should be blamed!
STACK Bio::Root::RootI::throw_not_implemented /Library/Perl/5.8.8/Bio/Root/RootI.pm:707
STACK Bio::AlignIO::maf::write_aln /Library/Perl/5.8.8/Bio/AlignIO/maf.pm:176
STACK Bio::AlignIO::PRINT /Library/Perl/5.8.8/Bio/AlignIO.pm:492
STACK toplevel msf2mafy.pl:11
Is there any other way I can convert clustalw to maf?
I would really appreciate it if anyone can help me out.
Thanks
Shalabh
From Russell.Smithies at agresearch.co.nz Mon Aug 2 16:25:26 2010
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 3 Aug 2010 08:25:26 +1200
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To:
References:
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
This might work if you only have a few:
http://www.ibi.vu.nl/programs/convertalignwww/
--Russell
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
From shalabh.sharma7 at gmail.com Mon Aug 2 16:53:31 2010
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Mon, 2 Aug 2010 16:53:31 -0400
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
References:
<18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
Message-ID:
Hi Russell,
Thanks for the reply, but I have around 400 alignments, and some of
them are huge :(
Thanks
Shalabh
From biopython at maubp.freeserve.co.uk Mon Aug 2 17:24:09 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 2 Aug 2010 22:24:09 +0100
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To:
References:
Message-ID:
On Mon, Aug 2, 2010 at 8:44 PM, shalabh sharma
wrote:
> Hi,
> I am trying to convert clustalw to maf format.
> I am trying to use AlignIO for that, but it's not working.
Could you tell us why you have to use maf format?
I'm curious because all of the phylogenetics tools I've
had to work with personally will take some other format
which is more widely supported (e.g. FASTA, PFAM,
ClustalW, PHYLIP, ...).
Peter
From bernd.web at gmail.com Mon Aug 2 17:25:52 2010
From: bernd.web at gmail.com (Bernd Web)
Date: Mon, 2 Aug 2010 23:25:52 +0200
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To:
References:
<18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
Message-ID:
Hi Shalabh,
ConvertAlign does not write maf either; it only reads it (I made
it). I found some other converters on the web, but they do not export
to maf format either...
http://biotechvana.uv.es/servers/afc/main.php
http://www.hiv.lanl.gov/content/sequence/FORMAT_CONVERSION/form.html
Galaxy has a MAF to Fasta converter:
http://main.g2.bx.psu.edu/root?tool_id=MAF_To_Fasta1
Regards,
Bernd
From cjfields at illinois.edu Mon Aug 2 17:31:20 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 2 Aug 2010 16:31:20 -0500
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To:
References:
<18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
Message-ID: <6E9C9D64-D23A-4FC8-B213-FC8A7FFA4F27@illinois.edu>
No other format will work? The main reason you see unimplemented methods like this is there is no active interest in working with this format beyond getting the information stored within them into objects and other commonly-used formats.
chris
From shalabh.sharma7 at gmail.com Mon Aug 2 18:30:41 2010
From: shalabh.sharma7 at gmail.com (shalabh sharma)
Date: Mon, 2 Aug 2010 18:30:41 -0400
Subject: [Bioperl-l] clustalw to maf format
In-Reply-To: <6E9C9D64-D23A-4FC8-B213-FC8A7FFA4F27@illinois.edu>
References:
<18DF7D20DFEC044098A1062202F5FFF32F02147B68@exchsth.agresearch.co.nz>
<6E9C9D64-D23A-4FC8-B213-FC8A7FFA4F27@illinois.edu>
Message-ID:
Hi All,
Thanks for the replies.
Actually, I am working on a pipeline involving RNAz.
I had the impression that a converter must be available, as their web server
can take xmfa or maf format, but the standalone version only accepts maf.
I think I will use a program that can output xmfa, and write to the RNAz
developers to ask whether they can provide me with the converter.
Thanks
Shalabh
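
Since Bio::AlignIO::maf reads MAF but does not implement write_aln, one
possible workaround is to read the ClustalW files with AlignIO and print
the MAF blocks by hand. The following is only a minimal sketch in plain
Perl, not a BioPerl feature. The function name maf_block is hypothetical,
and the coordinates are assumptions: each sequence is treated as a complete
source, so start is 0, strand is '+', and srcSize equals the ungapped
length. Whether RNAz accepts such placeholder coordinates has not been
checked here.

```perl
use strict;
use warnings;

# Format one alignment as a MAF block.
# $rows is an array ref of [ $name, $gapped_sequence ] pairs.
# Assumption: no genomic coordinates are known, so every "s" line uses
# start = 0, strand = '+', and srcSize = ungapped length.
sub maf_block {
    my ($rows) = @_;
    my $out = "a score=0\n";
    for my $row (@$rows) {
        my ($name, $text) = @$row;
        (my $ungapped = $text) =~ tr/-//d;    # count residues without gaps
        my $size = length $ungapped;
        $out .= join(' ', 's', $name, 0, $size, '+', $size, $text) . "\n";
    }
    return $out . "\n";    # a blank line terminates the block
}

print "##maf version=1\n";
print maf_block([ [ 'seqA', 'ACG-TT' ], [ 'seqB', 'AC-GTT' ] ]);
```

With BioPerl installed, the rows could be collected from each alignment
returned by Bio::AlignIO->new(-file => $file, -format => 'clustalw')->next_aln,
using display_id and seq on every object from each_seq.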
From chiragmatkarbioinfo at gmail.com Tue Aug 3 03:47:37 2010
From: chiragmatkarbioinfo at gmail.com (chirag matkar)
Date: Tue, 3 Aug 2010 13:17:37 +0530
Subject: [Bioperl-l] Pubmed Parsing
Message-ID:
Hello all,
I have a list of Pubmed Ids.
I want to parse the articles to find specific SNP-related information.
Can I do this with a script?
--
Regards,
Chirag Matkar
From genehack at genehack.org Tue Aug 3 05:03:35 2010
From: genehack at genehack.org (John Anderson)
Date: Tue, 3 Aug 2010 05:03:35 -0400
Subject: [Bioperl-l] Pubmed Parsing
In-Reply-To:
References:
Message-ID: <5E557C44-224B-4460-9C2C-E375555B8BE6@genehack.org>
On Aug 3, 2010, at 3:47 AM, chirag matkar wrote:
> I have a list of Pubmed Ids.
> I want to parse the articles to find specific SNP-related information.
> Can I do this with a script?
Can you provide a more specific example of what you'd like to do? For example, something along the lines of, "for PMID 1234, get ... about SNP 5678" (where '...' is replaced with whatever it is you're trying to get). Even describing how you would obtain this information using the website yourself will be helpful.
thanks,
john.
From gowthaman.ramasamy at seattlebiomed.org Tue Aug 3 01:29:10 2010
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Mon, 2 Aug 2010 22:29:10 -0700
Subject: [Bioperl-l] Getting pileup consensus from BAM files using
Bio::DB::Sam
In-Reply-To:
Message-ID:
Hi List,
I am trying to derive the consensus from a pileup via Bio::DB::Sam. Using the following script I can get the reference base and the bases from the reads at each position, but I cannot find a method to derive the consensus, similar to the values produced by "samtools pileup -c -f xxxxxx.fasta yyyyyyy.bam".
The script I use now retrieves the reference base and the query bases for each position. How do I improve it to get the consensus?
Thanks very much in advance,
Gowthaman
use strict;
use warnings;
use Bio::DB::Sam;

my $bam = Bio::DB::Sam->new(-bam   => 'something.bam',
                            -fasta => 'something.fasta');

# Called once per reference position with the reads covering it.
my $cb = sub {
    my ($seqid, $pos, $pileups) = @_;
    my $refBase = $bam->segment($seqid, $pos, $pos)->dna;
    print "\n$pos\t$refBase=>";
    for my $pileup (@$pileups) {
        my $al    = $pileup->alignment;
        my $qBase = substr($al->qseq, $pileup->qpos, 1);    # read base at this position
        print "$qBase,";
    }
};

$bam->pileup('Lin.chr10i', $cb);
From scott at scottcain.net Tue Aug 3 06:32:59 2010
From: scott at scottcain.net (Scott Cain)
Date: Tue, 3 Aug 2010 06:32:59 -0400
Subject: [Bioperl-l] Getting pileup consensus from BAM files using
Bio::DB::Sam
In-Reply-To:
References:
Message-ID:
Hi Gowthaman,
I don't see a method to extract the consensus. You are welcome to
submit a patch :-)
Scott
--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research
From lincoln.stein at gmail.com Tue Aug 3 12:57:52 2010
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 3 Aug 2010 12:57:52 -0400
Subject: [Bioperl-l] Getting pileup consensus from BAM files using
Bio::DB::Sam
In-Reply-To:
References:
Message-ID:
Samtools is running MAQ on the pileup. You could either implement MAQ in
perl, or come up with your own consensus caller.
Lincoln
--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa
From biopython at maubp.freeserve.co.uk Tue Aug 3 13:06:46 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 3 Aug 2010 18:06:46 +0100
Subject: [Bioperl-l] Getting pileup consensus from BAM files using
Bio::DB::Sam
In-Reply-To:
References:
Message-ID:
On Tue, Aug 3, 2010 at 5:57 PM, Lincoln Stein wrote:
> Samtools is running MAQ on the pileup. You could either implement MAQ in
> perl, or come up with your own consensus caller.
>
> Lincoln
See also: http://seqanswers.com/forums/showthread.php?t=6241
From gowthaman.ramasamy at seattlebiomed.org Tue Aug 3 13:28:36 2010
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Tue, 3 Aug 2010 10:28:36 -0700
Subject: [Bioperl-l] Getting pileup consensus from BAM files using
Bio::DB::Sam
In-Reply-To:
References:
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C47613B34@mail02.sbri.org>
Hi Lincoln,
That's a good lead. I will try to use MAQ in Perl rather than using my simple majority rule.
-gowtham
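For reference, the simple majority rule mentioned above can be sketched in a few lines of plain Perl. This is only an illustration under stated assumptions, not MAQ's consensus model: it ignores base and mapping qualities, the helper name majority_base is hypothetical, and both the alphabetical tie-breaking and the 'N' returned for an empty pileup are arbitrary choices.

```perl
use strict;
use warnings;

# Hypothetical helper: return the most frequent base among the reads
# covering one position. Ties are broken alphabetically, and an empty
# pileup yields 'N'. Base qualities are ignored (unlike MAQ).
sub majority_base {
    my @bases = map { uc $_ } @_;
    return 'N' unless @bases;
    my %count;
    $count{$_}++ for @bases;
    my ($best) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return $best;
}

print majority_base(qw(A A C a G)), "\n";
```

Inside the pileup callback from the original script, one could push each $qBase onto an array and pass that array to majority_base once the loop over @$pileups has finished.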
________________________________________
From: Lincoln Stein [lincoln.stein at gmail.com]
Sent: Tuesday, August 03, 2010 9:57 AM
To: Gowthaman Ramasamy
Cc: bioperl-l
Subject: Re: [Bioperl-l] Getting pileup consensus from BAM files using Bio::DB::Sam
Samtools is running MAQ on the pileup. You could either implement MAQ in perl, or come up with your own consensus caller.
Lincoln
On Tue, Aug 3, 2010 at 1:29 AM, Gowthaman Ramasamy wrote:
Hi List,
I am trying to find out the consensus using pileup via Bio::DB::Sam. Using the following script I can parse out the ref_base and the different bases from the reads at each position. However, I cannot find a method to derive a consensus similar to the values produced by "samtools pileup -c -f xxxxxx.fasta yyyyyyy.bam".
The script I use now retrieves the ref base and query bases for each position. How do I improve it to get the consensus?
Thanks very much in advance,
Gowthaman
use Bio::DB::Sam;
my $bam = Bio::DB::Sam->new(-bam => 'something.bam',
-fasta => 'something.fasta'
);
my $cb = sub {
my ($seqid, $pos, $pileups) = @_;
my $refBase = $bam->segment($seqid, $pos, $pos)->dna;
print "\n$pos\t$refBase=>";
for my $pileup (@$pileups){
my $al = $pileup->alignment;
my $qBase = substr($al->qseq, $pileup->qpos, 1);
print "$qBase,";
}
};
$bam->pileup('Lin.chr10i', $cb);
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa
From stefan.kirov at bms.com Tue Aug 3 16:22:35 2010
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Tue, 03 Aug 2010 16:22:35 -0400
Subject: [Bioperl-l] nmica parser
Message-ID: <4C587A8B.8090603@bms.com>
Has anyone written an nmica parser? If not, I will perhaps do that. It
should be straightforward: the output is XML.
Stefan
From fs5 at sanger.ac.uk Wed Aug 4 04:45:39 2010
From: fs5 at sanger.ac.uk (Frank Schwach)
Date: Wed, 04 Aug 2010 09:45:39 +0100
Subject: [Bioperl-l] Pubmed Parsing
In-Reply-To:
References:
Message-ID: <1280911539.3499.46.camel@deskpro15336.dynamic.sanger.ac.uk>
Hi Chiraq,
have a look at this earlier post:
http://bioperl.org/pipermail/bioperl-l/2009-April/029690.html
However, you won't be able to retrieve all full texts and it is quite a
task to parse natural language and get useful information about a gene,
protein, SNP etc out of a manuscript.
Frank
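As a small illustration of the easy end of this task: if the abstracts or full texts are already available as plain text, dbSNP-style identifiers (rs followed by digits) can be pulled out with a regular expression. This is a minimal sketch, assuming the text has already been retrieved. The helper name find_rs_ids is made up for this example, and it does nothing to interpret what the article actually says about each SNP.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct rs identifiers found in a
# block of text, lower-cased, in order of first appearance.
sub find_rs_ids {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } map { lc $_ } $text =~ /\b(rs\d+)\b/gi;
}

my $abstract = "We genotyped rs12345 and RS678; rs12345 was associated with the trait.";
print join( ", ", find_rs_ids($abstract) ), "\n";
```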
On Tue, 2010-08-03 at 13:17 +0530, chirag matkar wrote:
> Hello all,
> I have a list of Pubmed Ids.
> I want to parse articles to find specific SNP related information.
> Can i work it out using a Script?
>
>
>
>
>
--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
From David.Messina at sbc.su.se Thu Aug 5 08:16:17 2010
From: David.Messina at sbc.su.se (Dave Messina)
Date: Thu, 5 Aug 2010 14:16:17 +0200
Subject: [Bioperl-l] call for a TreeIO volunteer
Message-ID: <91AC5B00-5969-4C56-B08A-6EEA76916A10@sbc.su.se>
Hi everybody,
We've got a couple of small open bugs related to the Bio::TreeIO modules, and we could really use someone to take a look at them. Ideally, that someone would have familiarity with TreeIO already.*
It'd help us to get the next release (1.6.2) out the door.
The bugs in question are:
- TreeIO::newick writes root node branch length incorrectly
http://bugzilla.open-bio.org/show_bug.cgi?id=3039
- Bio::TreeIO::nhx cannot parse empty [&&NHX] + round-trip failure
http://bugzilla.open-bio.org/show_bug.cgi?id=3007
Thanks,
Dave
on behalf of the core developers
* Even if you don't, though, if you've been looking for an opportunity to contribute to BioPerl, and this sounds like something you'd like to work on, by all means raise your hand.
From clements at nescent.org Thu Aug 5 13:15:41 2010
From: clements at nescent.org (Dave Clements)
Date: Thu, 5 Aug 2010 10:15:41 -0700
Subject: [Bioperl-l] GMOD Europe 2010, 13-16 Sept, Cambridge, UK
In-Reply-To:
References:
Message-ID:
GMOD Europe 2010
================
13-16 September 2010
Cambridge, UK
http://gmod.org/wiki/GMOD_Europe_2010
We are pleased to announce GMOD Europe 2010, four days of GMOD events being
held 13-16 September 2010, at the University of Cambridge. GMOD Europe 2010
includes:
1) GMOD Community Meeting, Monday & Tuesday: Project updates, developer and
user presentations and best practices, project direction.
2) GMOD Satellite Meetings, Wednesday: Special interest groups where GMOD
community members meet to discuss specific topics of interest.
3) InterMine Workshop, Wednesday: A one day workshop on installing,
configuring and using the InterMine biological data warehouse system.
4) BioMart Workshop, Thursday: A one day workshop on using the BioMart
biological data warehouse system, including accessing data through APIs.
Registration is now open for these events. There is a £50 registration fee
for the GMOD Meeting to cover catered lunches and other expenses.
Registration for all other events is free, but required, as space is
limited. These events are open to all: GMOD users, developers, prospective
users, biologists, and computer scientists. See
http://gmod.org/wiki/January_2010_GMOD_Meeting for an idea of what goes on
at GMOD meetings.
GMOD is a collection of interoperable open source software components for
managing, visualizing and annotating biological data. GMOD incorporates
many widely used tools, including GBrowse and JBrowse for genome browsing,
InterMine and BioMart for data mining, Galaxy and Ergatis for workflow,
Chado for data management, GBrowse_syn and CMap for comparative genomics,
plus many other tools (Apollo, MAKER, Pathway Tools, Textpresso, ...). GMOD
is also an active community of researchers and developers addressing common
challenges in exploiting their data. If you are struggling to fully exploit
your data then please consider attending GMOD Europe 2010.
Please let us know if you have any questions, and we hope to see you in
Cambridge.
Thanks,
Scott Cain and Dave Clements
--
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/GMOD_Evo_Hackathon
http://gmod.org/wiki/GMOD_Europe_2010
http://gmod.org/wiki/Help_Desk_Feedback
From abhishek.vit at gmail.com Thu Aug 5 18:15:56 2010
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Thu, 5 Aug 2010 18:15:56 -0400
Subject: [Bioperl-l] Wrapper for Picard tools in Bioperl
Message-ID:
Hi All
Just wondering if there is any Picard wrapper/s available in Bioperl.
Thanks!
-Abhi
-----------------------------
Abhishek Pratap
Bioinformatics Software Engineer II
Genomics Resource Center
Institute for Genome Sciences
School of Medicine, Univ of Maryland
801, W. Baltimore Street, Baltimore, MD 21209
Ph: (+1)-410-706-2296
www.igs.umaryland.edu/
From Russell.Smithies at agresearch.co.nz Thu Aug 5 18:37:46 2010
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri, 6 Aug 2010 10:37:46 +1200
Subject: [Bioperl-l] Wrapper for Picard tools in Bioperl
In-Reply-To:
References:
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32F02262E96@exchsth.agresearch.co.nz>
Might be part of the "Enterprise" package.
If not, some developer should "make it so".
:-)
--Russell
(I hate Fridays)
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
> Sent: Friday, 6 August 2010 10:16 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Wrapper for Picard tools in Bioperl
>
> Hi All
>
> Just wondering if there is any Picard wrapper/s available in Bioperl.
>
>
> Thanks!
> -Abhi
>
> -----------------------------
> Abhishek Pratap
> Bioinformatics Software Engineer II
> Genomics Resource Center
> Institute for Genome Sciences
> School of Medicine, Univ of Maryland
> 801, W. Baltimore Street, Baltimore, MD 21209
> Ph: (+1)-410-706-2296
> www.igs.umaryland.edu/
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
From cjfields at illinois.edu Thu Aug 5 19:10:16 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 5 Aug 2010 18:10:16 -0500
Subject: [Bioperl-l] Wrapper for Picard tools in Bioperl
In-Reply-To:
References:
Message-ID: <26E3E5B6-47CF-4744-9687-199C218B5571@illinois.edu>
Picard uses samtools, which has a perl API:
http://search.cpan.org/dist/Bio-SamTools/
which uses BioPerl. Ah, the circle of life...
chris
On Aug 5, 2010, at 5:15 PM, Abhishek Pratap wrote:
> Hi All
>
> Just wondering if there is any Picard wrapper/s available in Bioperl.
>
>
> Thanks!
> -Abhi
>
> -----------------------------
> Abhishek Pratap
> Bioinformatics Software Engineer II
> Genomics Resource Center
> Institute for Genome Sciences
> School of Medicine, Univ of Maryland
> 801, W. Baltimore Street, Baltimore, MD 21209
> Ph: (+1)-410-706-2296
> www.igs.umaryland.edu/
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
From dan.kortschak at adelaide.edu.au Thu Aug 5 21:06:45 2010
From: dan.kortschak at adelaide.edu.au (Dan Kortschak)
Date: Fri, 06 Aug 2010 10:36:45 +0930
Subject: [Bioperl-l] MUMmer parser work
Message-ID: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
Hello Everyone,
I've just noticed the absence of a MUMmer parser and thought that it
might be a worthwhile contribution to bioperl-run (I won't be able to
start on this for a while, but given Mark's excellent work on
CommandExts, it shouldn't take too long to get up when I do have time). Has
anyone made any effort in this direction that I would be stepping on, or
if they have left it, that I could pick up to shorten the work time?
cheers
Dan
From cjfields at illinois.edu Thu Aug 5 23:13:51 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 5 Aug 2010 22:13:51 -0500
Subject: [Bioperl-l] MUMmer parser work
In-Reply-To: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
References: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
Message-ID: <80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
Dan,
Just so you know, there is a proposed MUMmer AlignIO parser that John (genehack) is planning on trying to incorporate in:
http://bugzilla.open-bio.org/show_bug.cgi?id=2701
It currently lacks significant tests, so feel free to chip in there as needed.
chris
On Aug 5, 2010, at 8:06 PM, Dan Kortschak wrote:
> Hello Everyone,
>
> I've just noticed the absence of a MUMmer parser and thought that it
> might be a worthwhile contribution to bioperl-run (I won't be able to
> start on this for a while, but given Mark's excellent work on
> CommandExts, it shouldn't take too long to get up when I do have time). Has
> anyone made any effort in this direction that I would be stepping on, or
> if they have left it, that I could pick up to shorten the work time?
>
> cheers
> Dan
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
From greg at ebi.ac.uk Fri Aug 6 05:47:21 2010
From: greg at ebi.ac.uk (Gregory Jordan)
Date: Fri, 6 Aug 2010 10:47:21 +0100
Subject: [Bioperl-l] call for a TreeIO volunteer
In-Reply-To: <91AC5B00-5969-4C56-B08A-6EEA76916A10@sbc.su.se>
References: <91AC5B00-5969-4C56-B08A-6EEA76916A10@sbc.su.se>
Message-ID:
I can help out with these. I'm pretty sure I've previously fought with (and
perhaps even come up with a fix for) bug 3039, and I can take a look at 3007
too.
Now lemme just see if I can get up and running with the Bioperl test suite.
I'll give a shout if I run into any problems.
Cheers,
Greg
On 5 August 2010 13:16, Dave Messina wrote:
> Hi everybody,
>
> We've got a couple of small open bugs related to the Bio::TreeIO modules,
> and we could really use someone to take a look at them. Ideally, that
> someone would have familiarity with TreeIO already.*
>
> It'd help us to get the next release (1.6.2) out the door.
>
> The bugs in question are:
> - TreeIO::newick writes root node branch length incorrectly
> http://bugzilla.open-bio.org/show_bug.cgi?id=3039
>
> - Bio::TreeIO::nhx cannot parse empty [&&NHX] + round-trip failure
> http://bugzilla.open-bio.org/show_bug.cgi?id=3007
>
>
> Thanks,
> Dave
> on behalf of the core developers
>
>
> * Even if you don't, though, if you've been looking for an opportunity to
> contribute to BioPerl, and this sounds like something you'd like to work on,
> by all means raise your hand.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
From jun.yin at ucd.ie Fri Aug 6 06:52:14 2010
From: jun.yin at ucd.ie (Jun Yin)
Date: Fri, 06 Aug 2010 11:52:14 +0100
Subject: [Bioperl-l] Packages retrieving online alignment sequences
Message-ID: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
Hi, all,
I am the Google Summer of Code student working on refactoring the Bio::Align
subsystem. I recently implemented several packages for retrieving online
alignment sequences. The aim of the packages is to provide convenient
methods for BioPerl users to retrieve online alignment sequences. The
alignment sequences are converted into a Bio::SimpleAlign object after
retrieval, which is easy to manipulate and write to local disk. The
packages currently support the Pfam, Rfam, Prosite and Entrez Protein Clusters databases.
Here is the structure of the packages:
Packages
Bio::DB::Align (interface, and calling other packages)
Bio::DB::Align::Pfam (retrieving alignment from Pfam)
Bio::DB::Align::Rfam (retrieving alignment from Rfam)
Bio::DB::Align::Prosite (retrieving alignment from Prosite)
Bio::DB::Align::ProtClustDB (retrieving alignment from Entrez Protein
Clusters Database)
Usually four methods are provided for each package:
Methods
get_Aln_by_id (retrieving alignment by id and returns Bio::SimpleAlign
object)
get_Aln_by_acc (retrieving alignment by accession and returns
Bio::SimpleAlign object) (Rfam and Prosite support only this method)
id2acc (id to accession conversion)
acc2id (accession to id conversion)
These packages depend on LWP::UserAgent, HTTP::Request and
Bio::DB::GenericWebAgent. Bio::DB::Align::ProtClustDB also depends on
Bio::DB::EUtilities.
The packages can be called like this:
my $dbobj = Bio::DB::Align->new(-db => "rfam");
# Or call a database-specific package directly, e.g.:
# my $dbobj = Bio::DB::Align::Pfam->new();
my $aln  = $dbobj->get_Aln_by_acc("RF0001");
my $aln2 = $dbobj->get_Aln_by_acc(-accession => "RF0001", -alignment => "full");
print $aln->length();
foreach my $seq ($aln->each_Seq) {
    # do something
}
I have done some tests on these packages and will write them up as
standard tests later. Any suggestions on these packages are welcome.
Cheers,
Jun Yin
Ph.D. student in U.C.D.
Bioinformatics Laboratory
Conway Institute
University College Dublin
From David.Messina at sbc.su.se Fri Aug 6 08:59:19 2010
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 6 Aug 2010 14:59:19 +0200
Subject: [Bioperl-l] call for a TreeIO volunteer
In-Reply-To:
References: <91AC5B00-5969-4C56-B08A-6EEA76916A10@sbc.su.se>
Message-ID: <6D6DAA77-2A2F-4AAA-B36D-FACED1FDE383@sbc.su.se>
> I can help out with these. I'm pretty sure I've previously fought with (and perhaps even come up with a fix for) bug 3039, and I can take a look at 3007 too.
Awesome, thanks Greg!
> Now lemme just see if I can get up and running with the Bioperl test suite. I'll give a shout if I run into any problems.
Please do.
Dave
From David.Messina at sbc.su.se Fri Aug 6 09:06:47 2010
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 6 Aug 2010 15:06:47 +0200
Subject: [Bioperl-l] Packages retrieving online alignment sequences
In-Reply-To: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
References: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
Message-ID:
Sounds great, Jun!
Did you happen to test your code on very large alignments? I know there's one in Pfam that's something like 100,000 sequences. An rRNA, I believe.
Dave
From jun.yin at ucd.ie Fri Aug 6 09:11:41 2010
From: jun.yin at ucd.ie (Jun Yin)
Date: Fri, 06 Aug 2010 14:11:41 +0100
Subject: [Bioperl-l] Packages retrieving online alignment sequences
In-Reply-To:
References: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
Message-ID: <00fc01cb3568$e97968b0$bc6c3a10$%yin@ucd.ie>
Hi, Dave,
Thx for reminding me of this. I will definitely try it.
Cheers,
Jun Yin
Ph.D. student in U.C.D.
Bioinformatics Laboratory
Conway Institute
University College Dublin
-----Original Message-----
From: Dave Messina [mailto:David.Messina at sbc.su.se]
Sent: Friday, August 06, 2010 2:07 PM
To: Jun Yin
Cc: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Packages retrieving online alignment sequences
Sounds great, Jun!
Did you happen to test your code on very large alignments? I know there's
one in Pfam that's something like 100,000 sequences. An rRNA, I believe.
Dave
__________ Information from ESET Smart Security, version of virus signature
database 5346 (20100806) __________
The message was checked by ESET Smart Security.
http://www.eset.com
From cjfields at illinois.edu Fri Aug 6 09:19:54 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 6 Aug 2010 08:19:54 -0500
Subject: [Bioperl-l] call for a TreeIO volunteer
In-Reply-To: <6D6DAA77-2A2F-4AAA-B36D-FACED1FDE383@sbc.su.se>
References: <91AC5B00-5969-4C56-B08A-6EEA76916A10@sbc.su.se>
<6D6DAA77-2A2F-4AAA-B36D-FACED1FDE383@sbc.su.se>
Message-ID: <8CB3DE9A-4C5C-42A3-94B4-8818D7143951@illinois.edu>
On Aug 6, 2010, at 7:59 AM, Dave Messina wrote:
>
>> I can help out with these. I'm pretty sure I've previously fought with (and perhaps even come up with a fix for) bug 3039, and I can take a look at 3007 too.
>
> Awesome, thanks Greg!
>
>
>> Now lemme just see if I can get up and running with the Bioperl test suite. I'll give a shout if I run into any problems.
>
> Please do.
>
>
>
> Dave
Agreed, and thanks for helping out!
chris
From dianabowley at gmail.com Fri Aug 6 18:33:57 2010
From: dianabowley at gmail.com (DRBowley)
Date: Fri, 6 Aug 2010 15:33:57 -0700 (PDT)
Subject: [Bioperl-l] BioPerl install issues
Message-ID:
I'm new to both perl and bioperl and I'm having issues installing
bioperl. I'm trying to install on a Mac OS 10.6.4, and I've already
installed perl (5.10.0). I tried installing using the recommended
approach for Mac - via Fink...
"fink install bioperl-pm5100"
Looking back over the terminal window text it looks like the problem
is:
"This package requires Module::Build v0.2805 or greater to install
itself."
I tried doing "fink selfupdate" and that did not fix the problem.
Any suggestions?
Thanks!
Diana
From Kevin.M.Brown at asu.edu Fri Aug 6 18:50:45 2010
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 6 Aug 2010 15:50:45 -0700
Subject: [Bioperl-l] BioPerl install issues
In-Reply-To:
References:
Message-ID: <1A4207F8295607498283FE9E93B775B406E44A05@EX02.asurite.ad.asu.edu>
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_THE_EASY_WAY_USING_Build.PL
Not sure why you had to install perl, since it should have been part of
the stock OS X install (or at least it was the last time I logged onto a
Mac). Not sure why the Fink method has so many issues, but you might try the
above, which works for Linux or BSD.
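Since the Fink error quoted below complains about the Module::Build version, it may help to check what the perl you are running can actually load. This is a minimal sketch in plain Perl. The name module_is_at_least is a hypothetical helper, and the only mechanism it relies on is the standard VERSION method, which dies when the installed version is older than the one requested.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module can be loaded by this perl
# and its version is at least $min_version.
sub module_is_at_least {
    my ( $module, $min_version ) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Module::Build -> Module/Build.pm
    return eval { require $file; $module->VERSION($min_version); 1 } ? 1 : 0;
}

print "Module::Build >= 0.2805: ",
    ( module_is_at_least( 'Module::Build', '0.2805' ) ? "yes" : "no" ), "\n";
```

If this prints "no", installing or upgrading Module::Build for that perl is probably the first thing to try before building BioPerl.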
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of DRBowley
Sent: Friday, August 06, 2010 3:34 PM
To: bioperl-l at bioperl.org
Subject: [Bioperl-l] BioPerl install issues
I'm new to both perl and bioperl and I'm having issues installing
bioperl. I'm trying to install on a Mac OS 10.6.4, and I've already
installed perl (5.10.0). I tried installing using the recommended
approach for Mac - via Fink...
"fink install bioperl-pm5100"
Looking back over the terminal window text it looks like the problem
is:
"This package requires Module::Build v0.2805 or greater to install
itself."
I tried doing "fink selfupdate" and that did not fix the problem.
Any suggestions?
Thanks!
Diana
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
From skastu01 at students.poly.edu Fri Aug 6 20:03:50 2010
From: skastu01 at students.poly.edu (Lakshmi Kastury)
Date: Sat, 7 Aug 2010 00:03:50 +0000
Subject: [Bioperl-l] BioPerl install issues
Message-ID:
Hi -
I went through several failed attempts on MACOS Snow Leopard, and fink was a dead end. Eventually I succeeded in installing on Windows Vista using CPAN. I am not sure if this method will work with MACOS:
1. Opened command prompt.
2. Typed command: >perl -MCPAN -e "install Bundle::BioPerl"
3. Answered yes to the series of questions, which prompts install of several bundles and a compiler.
The instructions were in a link from:
http://bioperl.org/Core/Latest/INSTALL
All the best,
Lakshmi
> Date: Fri, 6 Aug 2010 15:33:57 -0700
> From: dianabowley at gmail.com
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] BioPerl install issues
>
> I'm new to both perl and bioperl and I'm having issues installing
> bioperl. I'm trying to install on a Mac OS 10.6.4, and I've already
> installed perl (5.10.0). I tried installing using the recommended
> approach for Mac - via Fink...
> "fink install bioperl-pm5100"
>
> Looking back over the terminal window text it looks like the problem
> is:
> "This package requires Module::Build v0.2805 or greater to install
> itself."
>
> I tried doing "fink selfupdate" and that did not fix the problem.
>
> Any suggestions?
>
> Thanks!
> Diana
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
From David.Messina at sbc.su.se Sat Aug 7 02:47:40 2010
From: David.Messina at sbc.su.se (Dave Messina)
Date: Sat, 7 Aug 2010 08:47:40 +0200
Subject: [Bioperl-l] BioPerl install issues
In-Reply-To:
References:
Message-ID: <5BE9DB7C-9A51-4C09-8F83-8CA8ED4AADFE@sbc.su.se>
On Aug 7, 2010, at 02:03 , Lakshmi Kastury wrote:
> I am not sure if this method will work with MACOS:
It will. CPAN is cross-platform and is the best way to install BioPerl.
Dave
From cjfields at illinois.edu Sat Aug 7 09:58:56 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 7 Aug 2010 08:58:56 -0500
Subject: [Bioperl-l] BioPerl install issues
In-Reply-To: <5BE9DB7C-9A51-4C09-8F83-8CA8ED4AADFE@sbc.su.se>
References:
<5BE9DB7C-9A51-4C09-8F83-8CA8ED4AADFE@sbc.su.se>
Message-ID:
It should work fine. Even installing from trunk right now works w/o failing tests.
chris
On Aug 7, 2010, at 1:47 AM, Dave Messina wrote:
>
> On Aug 7, 2010, at 02:03 , Lakshmi Kastury wrote:
>
>> I am not sure if this method will work with MACOS:
>
> It will. CPAN is cross-platform and is the best way to install BioPerl.
>
>
> Dave
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
From greg at ebi.ac.uk Sat Aug 7 17:14:58 2010
From: greg at ebi.ac.uk (Gregory Jordan)
Date: Sat, 7 Aug 2010 22:14:58 +0100
Subject: [Bioperl-l] Packages retrieving online alignment sequences
In-Reply-To: <00fc01cb3568$e97968b0$bc6c3a10$%yin@ucd.ie>
References: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
<00fc01cb3568$e97968b0$bc6c3a10$%yin@ucd.ie>
Message-ID:
Maybe I'm just a bit naive here, but what is the expected difference between
accession and ID and why do we need a separate method for each? Seems to me
that one could just have a single method, get_Aln, which determines under
the hood whether the query string is an accession or ID.
It would be nice if the SimpleAlign object had its Annotation filled with
some extra metadata (such as accession, ID, database version number, URI,
etc.).
One other thing: have you thought about adding an Ensembl adaptor? Or maybe
something similar already exists in BioPerl...?
Sure Ensembl provides their own Perl API, but for someone who doesn't want
to go through the hassle of installing it from CVS (pardon my french, but
wtf!?! Who still uses CVS) and learning a whole new API, it might be
convenient to have a simple BioPerl module for quickly grabbing gene family
alignments from the public Ensembl MySQL databases. I'd be willing to help
write the necessary SQL queries for this.
greg
On 6 August 2010 14:11, Jun Yin wrote:
> Hi, Dave,
>
> Thx for reminding me this. I will definitely try it.
>
> Cheers,
> Jun Yin
> Ph.D. student in U.C.D.
>
> Bioinformatics Laboratory
> Conway Institute
> University College Dublin
>
>
> -----Original Message-----
> From: Dave Messina [mailto:David.Messina at sbc.su.se]
> Sent: Friday, August 06, 2010 2:07 PM
> To: Jun Yin
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Packages retrieving online alignment sequences
>
> Sounds great, Jun!
>
> Did you happen to test your code on very large alignments? I know there's
> one in Pfam that's something like 100,000 sequences. An rRNA, I believe.
>
>
> Dave
>
>
> __________ Information from ESET Smart Security, version of virus signature
> database 5346 (20100806) __________
>
> The message was checked by ESET Smart Security.
>
> http://www.eset.com
>
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
From cjfields at illinois.edu Sat Aug 7 18:07:39 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 7 Aug 2010 17:07:39 -0500
Subject: [Bioperl-l] Packages retrieving online alignment sequences
In-Reply-To:
References: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
<00fc01cb3568$e97968b0$bc6c3a10$%yin@ucd.ie>
Message-ID: <21E3B6D7-01BC-4DDA-B5B3-06F1F5AD7105@illinois.edu>
On Aug 7, 2010, at 4:14 PM, Gregory Jordan wrote:
> Maybe I'm just a bit naive here, but what is the expected difference between
> accession and ID and why do we need a separate method for each?
Depends on the remote service, but in many cases there is a difference. With NCBI eutils you can have either an accession or the unique identifier (UID, or GI for nuc/protein seqs). efetch can use either, but only the UID is guaranteed to retrieve a single sequence all the time; the accession can (very rarely) map to more than one sequence.
The other eutils services require either a string (esearch) or a UID, but do not allow an accession.
> Seems to me
> that one could just have a single method, get_Aln, which determines under
> the hood whether the query string is an accession or ID.
A simpler method could be introduced, but I can see that being potentially brittle in the long run. A naked alphanumeric string doesn't reveal much about what it is at face value w/o knowing database/service-specific behavior. And then we're reliant on that behavior not changing, which we can't guarantee (this has bitten us in the past). What would one do if NCBI (for instance) allowed accessions composed entirely of digits, or conversely a unique ID with mixed alphanumerics?
Using methods specific for ID/acc at least guarantees a behavior on the backend w/o guessing, and if there is no danger of overlap (a service accepts either/or) one could simply be an alias of the other.
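To make the trade-off concrete, here is a rough sketch of the kind of guessing a single get_Aln method would need. Everything in it is an assumption for illustration: the function name guess_query_type, and the patterns, which encode the accession formats Pfam (PF plus five digits, optionally versioned) and Rfam (RF plus five digits) use today. If a database changed its format, the guess would silently go wrong, which is the brittleness described above.

```perl
use strict;
use warnings;

# Assumed per-database accession patterns (illustrative only).
my %ACC_PATTERN = (
    pfam => qr/^PF\d{5}(?:\.\d+)?$/,
    rfam => qr/^RF\d{5}$/,
);

# Hypothetical helper: guess whether $query is an accession ('acc')
# or an ID ('id') for the named database.
sub guess_query_type {
    my ( $db, $query ) = @_;
    my $pattern = $ACC_PATTERN{ lc($db) }
        or die "No accession pattern known for database '$db'\n";
    return $query =~ $pattern ? 'acc' : 'id';
}

print guess_query_type( 'pfam', 'PF00001' ), "\n";
print guess_query_type( 'rfam', '5S_rRNA' ), "\n";
```

Explicit get_Aln_by_id and get_Aln_by_acc methods avoid this table of patterns entirely, at the cost of asking the caller to know which kind of identifier they hold.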
> It would be nice if the SimpleAlign object had its Annotation filled with
> some extra metadata (such as accession, ID, database version number, URI,
> etc.).
According to the deobfuscator SimpleAlign does have accession() and id(). The others could be simple attributes, and can be added as simple getter/setters, or as annotation via Bio::Annotation (this is the way Stockholm annotation is currently handled). Something to think about.
> One other thing: have you thought about adding an Ensembl adaptor? Or maybe
> something similar already exists in BioPerl...?
That's a good idea, though it might make more sense if this was done when mem-efficient (possibly DB-dependent) AlignI modules are present within bioperl, which is part of the GSoC (see below). For instance, have a Bio::Align::AlignI with a backend ensembl DB adaptor that works lazily.
If using the Ensembl Perl API, a few possible roadblocks/problems might pop up. Ensembl currently requires bioperl (v1.2.3, but it works with the latest as well, at least when I've used it). If using the ensembl perl API we would just need to ensure we aren't conflicting with ensembl code that pulls in bioperl classes expecting a v1.2.3 API when we only support the latest. I don't foresee this being an issue, though (there is precedent for this, see Sendu's Ensembl module Bio::Tools::Run::Ensembl in bioperl-run).
> Sure Ensembl provides their own Perl API, but for someone who doesn't want
> to go through the hassle of installing it from CVS (pardon my french, but
> wtf!?! Who still uses CVS) and learning a whole new API, it might be
> convenient to have a simple BioPerl module for quickly grabbing gene family
> alignments from the public Ensembl MySQL databases. I'd be willing to help
> write the necessary SQL queries for this.
>
> greg
The GSoC project on alignment subsystem refactoring will be finishing up this month, so I'm sure Jun can discuss ideas for initial DB-dependent implementations. The more input and coders implementing the better, IMO.
As for writing up an adaptor to ensembl outside of its API, overall I don't think it's a bad idea, but if it's possible maybe start without reinventing things, then move to direct SQL. Unless it's easier to use SQL.
chris
> On 6 August 2010 14:11, Jun Yin wrote:
>
>> Hi, Dave,
>>
>> Thx for reminding me this. I will definitely try it.
>>
>> Cheers,
>> Jun Yin
>> Ph.D. student in U.C.D.
>>
>> Bioinformatics Laboratory
>> Conway Institute
>> University College Dublin
>>
>>
>> -----Original Message-----
>> From: Dave Messina [mailto:David.Messina at sbc.su.se]
>> Sent: Friday, August 06, 2010 2:07 PM
>> To: Jun Yin
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Packages retrieving online alignment sequences
>>
>> Sounds great, Jun!
>>
>> Did you happen to test your code on very large alignments? I know there's
>> one in Pfam that's something like 100,000 sequences. An rRNA, I believe.
>>
>>
>> Dave
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
From hartzell at alerce.com Sat Aug 7 17:45:04 2010
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 7 Aug 2010 14:45:04 -0700
Subject: [Bioperl-l] BioPerl install issues
In-Reply-To:
References:
<5BE9DB7C-9A51-4C09-8F83-8CA8ED4AADFE@sbc.su.se>
Message-ID: <19549.54240.499140.501136@gargle.gargle.HOWL>
Chris Fields writes:
> It should work fine. Even installing from trunk right now works
> w/o failing tests.
As a slight aside, if you're looking to build a current perl binary
for your mac (e.g. 5.12.1) you should take a look at perlbrew
(http://search.cpan.org/dist/App-perlbrew/). The three steps at the
top of the installation section of the README are all you need to get
going. Even a manager can do it.
If you're using bash on the mac via terminal you'll probably want to
put the one-liner they prescribe into your .bash_profile instead of
your .bashrc, but everything else just flows right along.
Once you have that in place you have a nicely isolated system into
which you can install things to your heart's content without worrying
about PERL5LIB and local::lib and the rest.
g.
From cjfields at illinois.edu Sat Aug 7 21:19:54 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Sat, 7 Aug 2010 20:19:54 -0500
Subject: [Bioperl-l] BioPerl install issues
In-Reply-To: <19549.54240.499140.501136@gargle.gargle.HOWL>
References:
<5BE9DB7C-9A51-4C09-8F83-8CA8ED4AADFE@sbc.su.se>
<19549.54240.499140.501136@gargle.gargle.HOWL>
Message-ID:
On Aug 7, 2010, at 4:45 PM, George Hartzell wrote:
> Chris Fields writes:
>> It should work fine. Even installing from trunk right now works
>> w/o failing tests.
>
> As a slight aside, if you're looking to build a current perl binary
> for your mac (e.g. 5.12.1) you should take a look at perlbrew
> (http://search.cpan.org/dist/App-perlbrew/). The three steps at the
> top of the installation section of the README are all you need to get
> going. Even a manager can do it.
>
> If you're using bash on the mac via terminal you'll probably want to
> put the one-liner they prescribe into your .bash_profile instead of
> your .bashrc, but everything else just flows right along.
>
> Once you have that in place you have a nicely isolated system into
> which you can install things to your hearts content without worrying
> about PERL5LIB and local::lib and the rest.
>
> g.
I have to second using perlbrew; I started using it for my local Ubuntu installation (I don't have it running on my MacBook yet, but it's in the plans).
chris
From greg at ebi.ac.uk Sun Aug 8 02:12:41 2010
From: greg at ebi.ac.uk (Gregory Jordan)
Date: Sun, 8 Aug 2010 07:12:41 +0100
Subject: [Bioperl-l] Packages retrieving online alignment sequences
In-Reply-To: <21E3B6D7-01BC-4DDA-B5B3-06F1F5AD7105@illinois.edu>
References: <00d901cb3555$6e3a5500$4aaeff00$%yin@ucd.ie>
<00fc01cb3568$e97968b0$bc6c3a10$%yin@ucd.ie>
<21E3B6D7-01BC-4DDA-B5B3-06F1F5AD7105@illinois.edu>
Message-ID:
On 7 August 2010 23:07, Chris Fields wrote:
>
> A simpler method could be introduced, but I can see that being potentially
> brittle in the long run. A naked alphanumeric string doesn't reveal much
> about what it is at face value w/o knowing database/service-specific
> behavior. And then we're reliant on that behavior not changing, which we
> can't guarantee (this has bitten us in the past). What would one do if NCBI
> (for instance) allowed accessions derived completely of digits, or
> conversely a unique ID with mixed alphanumerics?
>
> Using methods specific for ID/acc at least guarantees a behavior on the
> backend w/o guessing, and if there is no danger of overlap (a service
> accepts either/or) one could simply be an alias of the other.
>
Thanks for the clarification on IDs vs accessions. As long as the behavior
and distinction are well-documented, I'm sure it won't make too much of a
difference.
My main concern was just that having two similar methods -- with no clearly
laid out distinction between the two and one of them only supported by half
of the implementing subclasses -- might confuse potential users.
As a point of reference: both Rfam and Pfam allow either an ID or an
accession in their front-page search interface (http://www.pfam.org /
http://www.rfam.org/). In fact, they seem to entirely hide the distinction
between ID and Accession from the end user; nowhere on the Rfam page for an
individual result is it clear which string is the accession and which is the
ID (http://rfam.sanger.ac.uk/family/snoZ107_R87).
Thus, a potential user of the Rfam module wouldn't know whether to call the
get_by_ID or get_by_Accession method, even after looking at the Rfam page
for his / her desired alignment!
As you can probably tell, I'm all in favor of a unified search whenever
feasible / possible. :-)
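A unified lookup of the kind described above could be sketched as follows. This is only an illustration under stated assumptions: classify_family_identifier() and get_by_any() are hypothetical helpers, not BioPerl API, and the pattern relies on Pfam and Rfam accessions keeping their current PF/RF-plus-five-digits shape, which is exactly the kind of service-specific behavior Chris warns might change.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): guess from its shape alone
# whether a Pfam/Rfam identifier is an accession or an ID.
sub classify_family_identifier {
    my ($string) = @_;
    # Assumption: Pfam accessions look like PF00001 (optionally followed by
    # a ".version" suffix) and Rfam accessions look like RF00001.
    # Anything else is treated as an ID.
    return 'accession' if $string =~ /^(?:PF|RF)\d{5}(?:\.\d+)?$/;
    return 'id';
}

# Hypothetical unified entry point. $db is assumed to be an object that
# provides the get_by_Accession and get_by_ID methods named in this thread.
sub get_by_any {
    my ($db, $string) = @_;
    return classify_family_identifier($string) eq 'accession'
        ? $db->get_by_Accession($string)
        : $db->get_by_ID($string);
}

print classify_family_identifier('RF00001'), "\n";      # accession
print classify_family_identifier('snoZ107_R87'), "\n";  # id
```

Keeping the two specific methods and layering the guess on top means users who know what they have can still bypass the heuristic.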
> As for writing up an adaptor to ensembl outside of it's API, overall I
> don't think it's a bad idea, but if it's possible maybe start without
> reinventing things, then move to direct SQL. Unless it's easier to use SQL.
>
>
For fetching Ensembl's gene family alignments, using the SQL will be
easiest. They don't tend to get unreasonably large in terms of memory -- I
think the biggest tend to be ~700 sequences with a few thousand alignment
columns or so -- and it's a simple table join or two to get both the tree
and alignment from the database.
For genomic alignments, I agree that a more memory-efficient and/or lazy
backend would be necessary. And it's pretty much impossible to get those
things out of the Ensembl tables without using their API.
--greg
From dan.kortschak at adelaide.edu.au Sun Aug 8 20:53:43 2010
From: dan.kortschak at adelaide.edu.au (Dan Kortschak)
Date: Mon, 09 Aug 2010 10:23:43 +0930
Subject: [Bioperl-l] MUMmer parser work
In-Reply-To: <80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
References: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
<80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
Message-ID: <1281315223.2414.48.camel@zoidberg.mbs.adelaide.edu.au>
Hi Chris,
Is that set of files planned to be included in the git repository on
bioperl-live? I don't want to push something that is being organised by
someone else.
cheers
Dan
On Thu, 2010-08-05 at 22:13 -0500, Chris Fields wrote:
> Dan,
>
> Just so you know, there is a proposed MUMmer AlignIO parser that John (genehack) is planning on trying to incorporate in:
>
> http://bugzilla.open-bio.org/show_bug.cgi?id=2701
>
> It currently lacks significant tests, so feel free to chip in there as needed.
>
> chris
From genehack at genehack.org Sun Aug 8 21:42:27 2010
From: genehack at genehack.org (John SJ Anderson)
Date: Sun, 8 Aug 2010 21:42:27 -0400
Subject: [Bioperl-l] MUMmer parser work
In-Reply-To: <1281315223.2414.48.camel@zoidberg.mbs.adelaide.edu.au>
References: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
<80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
<1281315223.2414.48.camel@zoidberg.mbs.adelaide.edu.au>
Message-ID: <5BEA6ECA-B7A7-4417-BC91-763AB956347A@genehack.org>
I'm working on getting those files into a topic branch in bioperl-live so they can be reviewed -- that'll probably be pushed back to the main master within the next couple days at the latest.
j.
On Aug 8, 2010, at 20:53 , Dan Kortschak wrote:
> Hi Chris,
>
> Is that set of files planned to be included in the git repository on
> bioperl-live? I don't want to push something that is being organised by
> someone else.
>
> cheers
> Dan
>
> On Thu, 2010-08-05 at 22:13 -0500, Chris Fields wrote:
>> Dan,
>>
>> Just so you know, there is a proposed MUMmer AlignIO parser that John (genehack) is planning on trying to incorporate in:
>>
>> http://bugzilla.open-bio.org/show_bug.cgi?id=2701
>>
>> It currently lacks significant tests, so feel free to chip in there as needed.
>>
>> chris
>
>
From dan.kortschak at adelaide.edu.au Sun Aug 8 22:03:52 2010
From: dan.kortschak at adelaide.edu.au (Dan Kortschak)
Date: Mon, 09 Aug 2010 11:33:52 +0930
Subject: [Bioperl-l] MUMmer parser work
In-Reply-To: <5BEA6ECA-B7A7-4417-BC91-763AB956347A@genehack.org>
References: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
<80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
<1281315223.2414.48.camel@zoidberg.mbs.adelaide.edu.au>
<5BEA6ECA-B7A7-4417-BC91-763AB956347A@genehack.org>
Message-ID: <1281319432.2414.49.camel@zoidberg.mbs.adelaide.edu.au>
Excellent. Thanks for that.
Dan
On Sun, 2010-08-08 at 21:42 -0400, John SJ Anderson wrote:
> I'm working on getting those files into a topic branch in bioperl-live so they can be reviewed -- that'll probably be pushed back to the main master within the next couple days at the latest.
>
> j.
From cjfields at illinois.edu Mon Aug 9 22:40:07 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Mon, 9 Aug 2010 21:40:07 -0500
Subject: [Bioperl-l] bioperl-live, moving Bio->lib/Bio
Message-ID:
Any objections to moving the Bio directory to lib/Bio in bioperl-live? It's a more standard location for code in most distributions; I have a branch (topic/cjfields_standard_lib) that has this working, though it's possible that it needs more work.
chris
From genehack at genehack.org Tue Aug 10 04:30:44 2010
From: genehack at genehack.org (John SJ Anderson)
Date: Tue, 10 Aug 2010 04:30:44 -0400
Subject: [Bioperl-l] bioperl-live, moving Bio->lib/Bio
In-Reply-To:
References:
Message-ID:
On Aug 9, 2010, at 22:40 , Chris Fields wrote:
> Any objections to moving the Bio directory to lib/Bio in bioperl-live?
+1 on this idea.
j.
From genehack at genehack.org Tue Aug 10 07:21:51 2010
From: genehack at genehack.org (John Anderson)
Date: Tue, 10 Aug 2010 07:21:51 -0400
Subject: [Bioperl-l] MUMmer parser work
In-Reply-To: <5BEA6ECA-B7A7-4417-BC91-763AB956347A@genehack.org>
References: <1281056805.2414.26.camel@zoidberg.mbs.adelaide.edu.au>
<80AF6158-9ADF-47A6-97EC-C322F75C8959@illinois.edu>
<1281315223.2414.48.camel@zoidberg.mbs.adelaide.edu.au>
<5BEA6ECA-B7A7-4417-BC91-763AB956347A@genehack.org>
Message-ID: <7A4F93AB-1BF7-4775-BC0E-38E7B431ECC6@genehack.org>
On Aug 8, 2010, at 9:42 PM, John SJ Anderson wrote:
> I'm working on getting those files into a topic branch in bioperl-live so they can be reviewed -- that'll probably be pushed back to the main master within the next couple days at the latest.
Okay, the files have been added to topic/bug-2701 -- see .
Please note, these are just the files from the bug report, slotted into the appropriate spots. I haven't reviewed the code or done anything about the non-BioPerl-y tests or the general lack of test coverage. I hope to do something about that in the coming week, but if somebody beats me to it, that would be okay too.
j.
From maj at fortinbras.us Tue Aug 10 19:52:05 2010
From: maj at fortinbras.us (Mark A. Jensen)
Date: Tue, 10 Aug 2010 19:52:05 -0400
Subject: [Bioperl-l] bioperl-live, moving Bio->lib/Bio
In-Reply-To:
References:
Message-ID: <1C55239986494A8D82BDC21A85B324E9@NewLife>
+1
----- Original Message -----
From: "Chris Fields"
To: "BioPerl List"
Sent: Monday, August 09, 2010 10:40 PM
Subject: [Bioperl-l] bioperl-live, moving Bio->lib/Bio
> Any objections to moving the Bio directory to lib/Bio in bioperl-live? It's a
> more standard location for code in most distributions; I have a branch
> (topic/cjfields_standard_lib) that has this working, though it's possible that
> it needs more work.
>
> chris
From fayroz_farouk at yahoo.com Sun Aug 8 04:24:31 2010
From: fayroz_farouk at yahoo.com (fayroz)
Date: Sun, 8 Aug 2010 01:24:31 -0700 (PDT)
Subject: [Bioperl-l] using HMMER
Message-ID: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
I need your help. I am a new Perl user and want to use BioPerl modules to run
the HMMER program (hmmsearch). I have a model file ("model.hmm") and a FASTA
file, and I want to see which of the sequences are similar to the model.
I wrote this code, but there is a problem:
#!/usr/local/bin/perl -w
use Bio::AlignIO;
use Bio::SearchIO;
use Bio::SeqIO;
use Bio::Tools::Run::Hmmer;

# run hmmsearch (similar for hmmpfam)
my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'h6_avian.hmm', -informat => 'fasta');
my $seq = Bio::SeqIO->new('-file' => "one_seq.fa", '-format' => 'Fasta');

# Pass the factory a Bio::Seq object or a file name, returns a Bio::SearchIO
my $searchio = $factory->hmmsearch($seq);

while (my $result = $searchio->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            print join("\t", ( $result->query_name,
                               $hsp->query->start,
                               $hsp->query->end,
                               $hit->name,
                               $hsp->hit->start,
                               $hsp->hit->end,
                               $hsp->score,
                               $hsp->evalue,
                               $hsp->seq_str,
                             )), "\n";
        }
    }
}
exceptions:
MSG: Unknown kind of input 'Bio::SeqIO::fasta=HASH(0x329a504)'
STACK Bio::Tools::Run::Hmmer::_setinput
D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:381
STACK Bio::Tools::Run::Hmmer::hmmsearch
D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:352
STACK toplevel test_bioperl.pl:12
thank you
fayroz
From douglas.hoen at gmail.com Tue Aug 10 21:54:53 2010
From: douglas.hoen at gmail.com (Douglas Hoen)
Date: Tue, 10 Aug 2010 21:54:53 -0400
Subject: [Bioperl-l] Bio::SeqFeature::SimilarityPair->from_searchResult()?
Message-ID: <4513D6B2-F7B3-4A6E-91CA-879C9E372E84@gmail.com>
Hi,
I was wondering why the Synopsis in the docs for Bio::SeqFeature::SimilarityPair has the following:
$sim_pair = Bio::SeqFeature::SimilarityPair->from_searchResult($blastHit);
There doesn't actually seem to be a from_searchResult method. Am I missing something?
Thanks,
-- Doug
From zhaoy at mail.cbi.pku.edu.cn Wed Aug 11 04:17:42 2010
From: zhaoy at mail.cbi.pku.edu.cn (zhaoy at mail.cbi.pku.edu.cn)
Date: Wed, 11 Aug 2010 16:17:42 +0800 (CST)
Subject: [Bioperl-l] About extracting sequence from genewise format result
Message-ID: <53663.162.105.250.100.1281514662.squirrel@mail.cbi.pku.edu.cn>
Dear authors:
Hello!
Recently I have been trying to parse genewise-format results to extract
the nucleotide sequence using the "hit_string" method in the "SearchIO"
module; however, the result is empty. Worse, the loop does not seem to
work, because I always get the last result. I'm confused.
My Perl code is shown below:
#!/usr/bin/perl -w
use strict;
use warnings;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format   => 'wise',
                            -wisetype => 'genewise',
                            -file     => 'test');
while (my $result = $in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            print "Query=", $result->query_name, "\n",
                  "Length=", $hsp->length('total'), "\n",
                  "hit_string:", $hsp->hit_string, "\n";
        }
    }
}
And one of the genewise format results is shown below:
genewise $Name: wise2-4-0alpha $ (unreleased release)
This program is freely distributed under a GPL. See source directory
Copyright (c) GRL limited: portions of the code are from separate copyright
Query protein: Cpa_s110_24
Comp Matrix: BLOSUM62.bla
Gap open: 12
Gap extension: 2
Start/End global
Target Sequence Bdi_chr3:38292015..38292302
Strand: forward
Start/End (protein) global
Gene Parameter file: gene.stat
Splice site model: GT/AG only
Codon Table: codon.table
Subs error: 1e-06
Indel error: 1e-06
Null model syn
Algorithm 623
genewise output
Score 37.97 bits over entire alignment
Scores as bits over a synchronous coding model
Warning: The bits scores is not probablistically correct for single seqs
See WWW help for more info
Cpa_s110_24 1 MGNCQAVDAATLAIQHPS-GKVDRLYWPVSASEVMRTNPGHYVALLI--
MGNCQA DAA + IQHP+ GKV+RLYWP +A++VMR NPGHYVAL++
MGNCQAADAAAVVIQHPAEGKVERLYWPATAADVMRKNPGHYVALVVVH
Bdi_chr3:382920 1 agatcggggggggacccgggaggccttcgaggggacaacgctggcgggc
tgagaccaccctttaaccagatagtagcccccattgaacgaatctttta
gctcgggtggcggcgcgcgggcgcccggccgcccgcgcccccccccccc
Cpa_s110_24 47 ----STTLCPSNSNASNAESVRVTRIKLLRPTDTLVLGQVYRLITTQEV
P+ + A + R+T++KLL+P DTL++GQVYRLIT+Q
VSGGAGETDPAVAGGGAAAAARITKVKLLKPRDTLLIGQVYRLITSQ--
Bdi_chr3:382920 148 gtgggggagcgggggggggggaaaagaccaccgaccagcgtccaatc
tcggcgacacctcgggcccccgtcatattacgactttgatagttcca
cctcctgtcccacaaaattccgccgcgccgcgctgcccgccccccca
Cpa_s110_24 92 MKGLWAKKCAKMKKYQEADHKDGLKPETIPGRRSGPERDTQVAKHERHR
-------------------------------------------------
Bdi_chr3:382920 289
Cpa_s110_24 141 SRVAASTNQAGLKSRTWQPSLKSISEAAS
-----------------------------
Bdi_chr3:382920 289
//
Gene 1
Gene 1 288
Exon 1 288 phase 0
Supporting 1 54 1 18
Supporting 58 141 19 46
Supporting 160 288 47 89
//
......
Part of the output of this code is shown below:
Query=Aly_481360
Length=0
hit_string:
Query=Aly_481360
Length=0
hit_string:
......
What's wrong with my code and how can I get the correct result? I'm
looking forward to your reply.
Thanks very much!
Best regards,
Zackaly
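This does not explain why hit_string comes back empty, but as a possible workaround the exon coordinates in the gene summary block of the output above ("Exon 1 288 phase 0") can be read with plain Perl and used to cut the nucleotide sequence out of the target. A minimal sketch, assuming the Exon lines keep the layout shown above and the gene is on the forward strand; parse_genewise_exons() and exon_sequence() are hypothetical helpers, not BioPerl methods:

```perl
use strict;
use warnings;

# Collect start, end and phase for every "Exon" line in the
# gene-summary part of genewise output.
sub parse_genewise_exons {
    my ($text) = @_;
    my @exons;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*Exon\s+(\d+)\s+(\d+)\s+phase\s+(\d+)/) {
            push @exons, { start => $1, end => $2, phase => $3 };
        }
    }
    return @exons;
}

# Join the exon substrings of a forward-strand target sequence.
# Coordinates are taken to be 1-based and inclusive.
sub exon_sequence {
    my ($target_dna, @exons) = @_;
    return join '',
        map { substr($target_dna, $_->{start} - 1, $_->{end} - $_->{start} + 1) }
        @exons;
}

my $summary = <<'END';
Gene 1
Gene 1 288
  Exon 1 288 phase 0
     Supporting 1 54 1 18
//
END

my @exons = parse_genewise_exons($summary);
printf "exon %d..%d phase %d\n", @{$_}{qw(start end phase)} for @exons;
# prints "exon 1..288 phase 0"
```

The target sequence itself (here Bdi_chr3:38292015..38292302) would still have to be loaded separately, for example with Bio::SeqIO, and passed to exon_sequence() as a string.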
From roy.chaudhuri at gmail.com Wed Aug 11 10:32:39 2010
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Wed, 11 Aug 2010 15:32:39 +0100
Subject: [Bioperl-l] using HMMER
In-Reply-To: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
References: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
Message-ID: <4C62B487.9090103@gmail.com>
Hi Fayroz,
Your $seq variable contains a Bio::SeqIO object (a biological
filehandle), not a Bio::Seq (sequence object).
You need to change that line to:
my $seqio = Bio::SeqIO->new(-file=>'one_seq.fa', -format=>'fasta');
my $seq=$seqio->next_seq;
If you have multiple sequences in the file, then you will need to loop
over them:
while (my $seq=$seqio->next_seq) {
# Code to run Hmmer goes here
}
Also, I don't think you need to specify -informat for your
Bio::Tools::Run::Hmmer object, since you're passing it a sequence
object, not a filename.
Hope this helps.
Roy.
On 08/08/2010 09:24, fayroz wrote:
> i need your help, i am a new perl user and want to use bioperl modules to run
> HMMER program ( HMMsearch) i have" model.hmm" and a "fasta file" to see which of
> them are similar with the model
> i write this code but there is a problems
>
> #!/usr/local/bin/perl W
> use Bio::AlignIO;
> use Bio::SearchIO;
> use Bio::SeqIO ;
> use Bio::Tools::Run::Hmmer;
>
> # run hmmsearch (similar for hmmpfam)
> my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'h6_avian.hmm',-informat =>
> 'fasta');
> my $seq = Bio::SeqIO->new('-file'=> "one_seq.fa", '-format'=>'Fasta');
>
> # Pass the factory a Bio::Seq object or a file name, returns a Bio::SearchIO
> my $searchio = $factory->hmmsearch($seq);
>
> while (my $result = $searchio->next_result){
> while(my $hit = $result->next_hit){
> while (my $hsp = $hit->next_hsp){
> print join("\t", ( $result->query_name,
> $hsp->query->start,
> $hsp->query->end,
> $hit->name,
> $hsp->hit->start,
> $hsp->hit->end,
> $hsp->score,
> $hsp->evalue,
> $hsp->seq_str,
> )), "\n";
> }
> }
> }
>
>
> exceptions:
> MSG: Unknown kind of input 'Bio::SeqIO::fasta=HASH(0x329a504)'
> STACK Bio::Tools::Run::Hmmer::_setinput
> D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:381
> STACK Bio::Tools::Run::Hmmer::hmmsearch
> D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:352
> STACK toplevel test_bioperl.pl:12
> thank you
>
> fayroz
>
>
>
>
From cjfields at illinois.edu Wed Aug 11 11:07:36 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 11 Aug 2010 10:07:36 -0500
Subject: [Bioperl-l] using HMMER
In-Reply-To: <4C62B487.9090103@gmail.com>
References: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
<4C62B487.9090103@gmail.com>
Message-ID: <62C86AFB-FF3A-44C6-A413-50C3F839DF34@illinois.edu>
You might also want to check whether you are using HMMER2 or HMMER3. I'm not sure the wrapper works with HMMER3.
chris
On Aug 11, 2010, at 9:32 AM, Roy Chaudhuri wrote:
> Hi Fayroz,
>
> Your $seq variable contains a Bio::SeqIO object (a biological filehandle), not a Bio::Seq (sequence object).
>
> You need to change that line to:
> my $seqio = Bio::SeqIO->new(-file=>'one_seq.fa', -format=>'fasta');
> my $seq=$seqio->next_seq;
>
> If you have multiple sequences in the file, then you will need to loop over them:
> while (my $seq=$seqio->next_seq) {
> # Code to run Hmmer goes here
> }
>
> Also, I don't think you need to specify -informat for your Bio::Tools::Run::Hmmer object, since you're passing it a sequence object, not a filename.
>
> Hope this helps.
> Roy.
>
> On 08/08/2010 09:24, fayroz wrote:
>> i need your help, i am a new perl user and want to use bioperl modules to run
>> HMMER program ( HMMsearch) i have" model.hmm" and a "fasta file" to see which of
>> them are similar with the model
>> i write this code but there is a problems
>>
>> #!/usr/local/bin/perl W
>> use Bio::AlignIO;
>> use Bio::SearchIO;
>> use Bio::SeqIO ;
>> use Bio::Tools::Run::Hmmer;
>>
>> # run hmmsearch (similar for hmmpfam)
>> my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'h6_avian.hmm',-informat =>
>> 'fasta');
>> my $seq = Bio::SeqIO->new('-file'=> "one_seq.fa", '-format'=>'Fasta');
>>
>> # Pass the factory a Bio::Seq object or a file name, returns a Bio::SearchIO
>> my $searchio = $factory->hmmsearch($seq);
>>
>> while (my $result = $searchio->next_result){
>> while(my $hit = $result->next_hit){
>> while (my $hsp = $hit->next_hsp){
>> print join("\t", ( $result->query_name,
>> $hsp->query->start,
>> $hsp->query->end,
>> $hit->name,
>> $hsp->hit->start,
>> $hsp->hit->end,
>> $hsp->score,
>> $hsp->evalue,
>> $hsp->seq_str,
>> )), "\n";
>> }
>> }
>> }
>>
>>
>> exceptions:
>> MSG: Unknown kind of input 'Bio::SeqIO::fasta=HASH(0x329a504)'
>> STACK Bio::Tools::Run::Hmmer::_setinput
>> D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:381
>> STACK Bio::Tools::Run::Hmmer::hmmsearch
>> D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:352
>> STACK toplevel test_bioperl.pl:12
>> thank you
>>
>> fayroz
>>
>>
>>
>>
From douglas.hoen at gmail.com Wed Aug 11 15:13:49 2010
From: douglas.hoen at gmail.com (Doug)
Date: Wed, 11 Aug 2010 12:13:49 -0700 (PDT)
Subject: [Bioperl-l] How to store results of searches of translated DNA in
SeqFeature::Store database of the original DNA?
Message-ID: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
Hi,
I am trying to store in a SeqFeature::Store database the results of
searches of translated DNA. The DB contains the original DNA
sequences. For instance, I have done HMMER searches of 6-frame
translations of the sequences stored in the DB. I want to store these
results "at" their (equivalent) DNA positions, which I can calculate.
Preferably, I would like to directly store the SeqFeature::Similarity
objects that I get from parsing these searches. But they are of course
located on different coordinate systems than the DNA, so I guess I
can't (or shouldn't) create a SeqFeature (e.g. Generic) at the correct
DNA position and then store the Similarity objects as sub-SeqFeatures.
I could just set the Similarity's position to the (calculated) DNA
coordinates, or alternately make a new SeqFeature and copy in the
attributes I want. But is there a more elegant solution?
Thanks,
-- Doug
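The coordinate calculation mentioned above can be sketched in plain Perl. This is a minimal illustration, not BioPerl API: protein_to_dna_coords() is a hypothetical helper, and it assumes frames are numbered +1..+3 for the forward strand and -1..-3 for translations of the reverse complement, with frame -1 starting at the last base of the forward sequence. Some tools number negative frames differently, so check the convention of whatever produced the translations.

```perl
use strict;
use warnings;

# Map a 1-based, inclusive protein interval in a given translation frame
# back to 1-based, inclusive coordinates on the forward DNA strand.
# Returns (dna_start, dna_end, strand).
sub protein_to_dna_coords {
    my ($p_start, $p_end, $frame, $dna_length) = @_;
    die "frame must be one of +/-1, +/-2, +/-3\n"
        unless $frame =~ /^[+-]?[123]$/;

    # Position of the interval on the strand that was translated.
    my $offset = abs($frame) - 1;
    my $start  = $offset + 3 * ($p_start - 1) + 1;
    my $end    = $offset + 3 * $p_end;

    return ($start, $end, 1) if $frame > 0;

    # Reverse frames: flip the interval back onto the forward strand.
    return ($dna_length - $end + 1, $dna_length - $start + 1, -1);
}

print join(' ', protein_to_dna_coords(10, 20,  2, 300)), "\n";  # 29 61 1
print join(' ', protein_to_dna_coords(10, 20, -2, 300)), "\n";  # 240 272 -1
```

Both results span 33 bases, matching the 11 residues of the protein interval.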
From douglas.hoen at gmail.com Wed Aug 11 16:11:26 2010
From: douglas.hoen at gmail.com (Doug)
Date: Wed, 11 Aug 2010 13:11:26 -0700 (PDT)
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
Message-ID:
One possible answer to my own question: use
Bio::SeqFeature::PositionProxy objects? Would this work?
On Aug 11, 3:13 pm, Doug wrote:
> Hi,
>
> I am trying to store in a SeqFeature::Store database the results of
> searches of translated DNA. The DB contains the original DNA
> sequences. For instance, I have done HMMER searches of 6-frame
> translations of the sequences stored in the DB. I want to store these
> results "at" their (equivalent) DNA positions, which I can calculate.
> Preferably, I would like to directly store the SeqFeature::Similarity
> objects that I get from parsing these searches. But they are of course
> located on different coordinate systems than the DNA, so I guess I
> can't (or shouldn't) create a SeqFeature (e.g. Generic) at the correct
> DNA position and then store the Similarity's as sub-SeqFeatures.
>
> I could just set the Similarity's position to the (calculated) DNA
> coordinates, or alternately make a new SeqFeature and copy in the
> attributes I want. But is there a more elegant solution?
>
> Thanks,
> -- Doug
From scott at scottcain.net Wed Aug 11 16:16:22 2010
From: scott at scottcain.net (Scott Cain)
Date: Wed, 11 Aug 2010 16:16:22 -0400
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To:
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
Message-ID:
Hi Doug,
I don't know if any of the things you've thought of would work; I've
never tried it. My inclination would be to express your data in GFF3
and use the standard loader.
Scott
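Writing the hits out as GFF3 could look roughly like the sketch below. It is only an illustration: gff3_line() and gff3_escape() are hypothetical helpers, the seq_id, ID and coordinates are made-up placeholders, and "protein_match" is one plausible Sequence Ontology type rather than a required one. The start and end are assumed to be DNA coordinates already.

```perl
use strict;
use warnings;

# Percent-encode characters that are reserved in the GFF3 attributes column.
sub gff3_escape {
    my ($value) = @_;
    $value =~ s/([%;=&,\t\n\r])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

# Build one nine-column GFF3 feature line from named fields.
sub gff3_line {
    my (%f) = @_;
    my $strand = $f{strand} > 0 ? '+' : $f{strand} < 0 ? '-' : '.';
    my $attributes = join ';',
        map { "$_=" . gff3_escape($f{attributes}{$_}) }
        sort keys %{ $f{attributes} };
    return join "\t",
        $f{seq_id}, $f{source}, $f{type},
        $f{start}, $f{end},
        defined $f{score} ? $f{score} : '.',
        $strand,
        '.',            # phase only applies to CDS features
        $attributes;
}

print "##gff-version 3\n";
print gff3_line(
    seq_id     => 'chr1',            # placeholder reference name
    source     => 'HMMER',
    type       => 'protein_match',
    start      => 240,
    end        => 272,
    score      => 37.9,
    strand     => -1,
    attributes => { ID => 'hit1', Name => 'h6_avian' },
), "\n";
```

The resulting file can then be handed to the standard loader (presumably bp_seqfeature_load.pl) instead of storing the Similarity objects directly.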
On Wed, Aug 11, 2010 at 4:11 PM, Doug wrote:
> One possible answer to my own question: Use
> Bio::SeqFeature::PositionProxy's? Would this work?
>
> On Aug 11, 3:13 pm, Doug wrote:
>> Hi,
>>
>> I am trying to store in a SeqFeature::Store database the results of
>> searches of translated DNA. The DB contains the original DNA
>> sequences. For instance, I have done HMMER searches of 6-frame
>> translations of the sequences stored in the DB. I want to store these
>> results "at" their (equivalent) DNA positions, which I can calculate.
>> Preferably, I would like to directly store the SeqFeature::Similarity
>> objects that I get from parsing these searches. But they are of course
>> located on different coordinate systems than the DNA, so I guess I
>> can't (or shouldn't) create a SeqFeature (e.g. Generic) at the correct
>> DNA position and then store the Similarity's as sub-SeqFeatures.
>>
>> I could just set the Similarity's position to the (calculated) DNA
>> coordinates, or alternately make a new SeqFeature and copy in the
>> attributes I want. But is there a more elegant solution?
>>
>> Thanks,
>> -- Doug
--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research
From douglas.hoen at gmail.com Wed Aug 11 16:38:54 2010
From: douglas.hoen at gmail.com (Doug)
Date: Wed, 11 Aug 2010 13:38:54 -0700 (PDT)
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To:
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
Message-ID: <6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
Hi Scott,
Good idea. Would you happen to know of an existing HMMER3 to GFF3
converter?
Thanks for your advice,
-- Doug
On Aug 11, 4:16 pm, Scott Cain wrote:
> Hi Doug,
>
> I don't know if any of the things you've thought of would work; I've
> never tried it. My inclination would be to express your data in GFF3
> and use the standard loader.
>
> Scott
>
>
>
>
>
> On Wed, Aug 11, 2010 at 4:11 PM, Doug wrote:
> > One possible answer to my own question: Use
> > Bio::SeqFeature::PositionProxy's? Would this work?
>
> > On Aug 11, 3:13 pm, Doug wrote:
> >> Hi,
>
> >> I am trying to store in a SeqFeature::Store database the results of
> >> searches of translated DNA. The DB contains the original DNA
> >> sequences. For instance, I have done HMMER searches of 6-frame
> >> translations of the sequences stored in the DB. I want to store these
> >> results "at" their (equivalent) DNA positions, which I can calculate.
> >> Preferably, I would like to directly store the SeqFeature::Similarity
> >> objects that I get from parsing these searches. But they are of course
> >> located on different coordinate systems than the DNA, so I guess I
> >> can't (or shouldn't) create a SeqFeature (e.g. Generic) at the correct
> >> DNA position and then store the Similarity's as sub-SeqFeatures.
>
> >> I could just set the Similarity's position to the (calculated) DNA
> >> coordinates, or alternately make a new SeqFeature and copy in the
> >> attributes I want. But is there a more elegant solution?
>
> >> Thanks,
> >> -- Doug
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> Ontario Institute for Cancer Research
From douglas.hoen at gmail.com Wed Aug 11 16:53:35 2010
From: douglas.hoen at gmail.com (Doug)
Date: Wed, 11 Aug 2010 13:53:35 -0700 (PDT)
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To: <6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
<6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
Message-ID:
One more note: I did try using PositionProxy but it failed. It doesn't
implement seq_id() and so can't be stored in the DB:
------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::SeqFeatureI::seq_id" is not implemented by
package Bio::SeqFeature::PositionProxy.
This is not your fault - author of Bio::SeqFeature::PositionProxy
should be blamed!
...
On Aug 11, 4:38 pm, Doug wrote:
> Hi Scott,
>
> Good idea. Would you happen to know of an existing HMMER3 to GFF3
> converter?
>
> Thanks for your advice,
> -- Doug
>
> On Aug 11, 4:16 pm, Scott Cain wrote:
>
>
>
>
>
> > Hi Doug,
>
> > I don't know if any of the things you've thought of would work; I've
> > never tried it. My inclination would be to express your data in GFF3
> > and use the standard loader.
>
> > Scott
>
> > On Wed, Aug 11, 2010 at 4:11 PM, Doug wrote:
> > > One possible answer to my own question: Use
> > > Bio::SeqFeature::PositionProxy's? Would this work?
>
> > > On Aug 11, 3:13 pm, Doug wrote:
> > >> Hi,
>
> > >> I am trying to store in a SeqFeature::Store database the results of
> > >> searches of translated DNA. The DB contains the original DNA
> > >> sequences. For instance, I have done HMMER searches of 6-frame
> > >> translations of the sequences stored in the DB. I want to store these
> > >> results "at" their (equivalent) DNA positions, which I can calculate.
> > >> Preferably, I would like to directly store the SeqFeature::Similarity
> > >> objects that I get from parsing these searches. But they are of course
> > >> located on different coordinate systems than the DNA, so I guess I
> > >> can't (or shouldn't) create a SeqFeature (e.g. Generic) at the correct
> > >> DNA position and then store the Similarity's as sub-SeqFeatures.
>
> > >> I could just set the Similarity's position to the (calculated) DNA
> > >> coordinates, or alternately make a new SeqFeature and copy in the
> > >> attributes I want. But is there a more elegant solution?
>
> > >> Thanks,
> > >> -- Doug
From cjfields at illinois.edu Wed Aug 11 16:45:00 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 11 Aug 2010 15:45:00 -0500
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To: <6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
<6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
Message-ID: <190AF658-E8FE-43D7-A71F-196AE54DA1DB@illinois.edu>
HMMER3 is parsed by Bio::SearchIO now in bioperl-live, and I think there is a generic SearchIO->GFF3 script floating around the intertubes somewheres...
chris
On Aug 11, 2010, at 3:38 PM, Doug wrote:
> Hi Scott,
>
> Good idea. Would you happen to know of an existing HMMER3 to GFF3
> converter?
>
> Thanks for your advice,
> -- Doug
From scott at scottcain.net Wed Aug 11 17:05:25 2010
From: scott at scottcain.net (Scott Cain)
Date: Wed, 11 Aug 2010 17:05:25 -0400
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To: <190AF658-E8FE-43D7-A71F-196AE54DA1DB@illinois.edu>
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
<6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
<190AF658-E8FE-43D7-A71F-196AE54DA1DB@illinois.edu>
Message-ID:
Um, yeah, it's in bioperl: bp_search2gff.pl.
Scott
On Wed, Aug 11, 2010 at 4:45 PM, Chris Fields wrote:
> HMMER3 is parsed by Bio::SearchIO now in bioperl-live, and I think there is a generic SearchIO->GFF3 script floating around the intertubes somewheres...
>
> chris
--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                  216-392-3087
Ontario Institute for Cancer Research
From cjfields at illinois.edu Wed Aug 11 17:07:20 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 11 Aug 2010 16:07:20 -0500
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To:
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
<6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
<190AF658-E8FE-43D7-A71F-196AE54DA1DB@illinois.edu>
Message-ID:
For some reason I thought there was a more up-to-date one somewhere. Ah well, can't keep track of all the code in bioperl :>
chris
On Aug 11, 2010, at 4:05 PM, Scott Cain wrote:
> Um, yeah, it's in bioperl: bp_search2gff.pl.
>
> Scott
From douglas.hoen at gmail.com Wed Aug 11 17:11:20 2010
From: douglas.hoen at gmail.com (Douglas Hoen)
Date: Wed, 11 Aug 2010 17:11:20 -0400
Subject: [Bioperl-l] How to store results of searches of translated DNA
in SeqFeature::Store database of the original DNA?
In-Reply-To:
References: <1d774f4c-0aa0-45e3-964d-82dbdab4f261@j8g2000yqd.googlegroups.com>
<6e28dc26-ada0-4be2-9f62-a4d632aaf0bb@j8g2000yqd.googlegroups.com>
<190AF658-E8FE-43D7-A71F-196AE54DA1DB@illinois.edu>
Message-ID:
Great, thanks so much for the info.
On 2010-08-11, at 5:05 PM, Scott Cain wrote:
> Um, yeah, it's in bioperl: bp_search2gff.pl.
>
> Scott
From Russell.Smithies at agresearch.co.nz Wed Aug 11 17:31:32 2010
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 12 Aug 2010 09:31:32 +1200
Subject: [Bioperl-l] AlignIO and Gbrowse_syn
In-Reply-To:
References:
Message-ID: <18DF7D20DFEC044098A1062202F5FFF32F0237EAB7@exchsth.agresearch.co.nz>
I know there was some brief discussion about .maf format a few weeks ago but I've had an enquiry (as below) from a colleague.
If GBrowse_syn is using .maf format, does AlignIO need more work?
Any comments?
--Russell
I'd like to plug LASTZ alignments into GBrowse_syn. LASTZ can produce a limited number of alignment formats (http://www.bx.psu.edu/miller_lab/dist/README.lastz-1.02.00/README.lastz-1.02.00a.html#options_output). GBrowse_syn accepts clustalw format plus "other commonly used formats recognized by BioPerl's AlignIO parser" (http://gmod.org/wiki/GBrowse_syn_Database). Since LASTZ doesn't produce clustalw, I've tried converting LASTZ maf output to clustalw (and other alignment formats) using AlignIO; however, I run into the following issues:
* Strand info is lost (probably fair enough, since this isn't part of the clustalw format per se; incorporating strand info within sequence IDs is a GBrowse_syn clustalw specification).
* The coordinate system for reverse-strand matches differs between LASTZ .maf and BioPerl .maf: for LASTZ, coordinates relate to the reverse-complemented sequence, whereas for BioPerl/GBrowse, coordinates relate to the original (non-reverse-complemented) sequence. E.g. a coordinate of "1" in the LASTZ .maf file refers to the last base of the original sequence; AlignIO prints "1" to the output clustalw file, but since strand info is lost it is construed as the first position at the very start of the original sequence. As a result, all reverse-match coordinates in the resulting clustalw output file are incorrect.
* AlignIO is unable to parse multiple, individual aligned regions within the same .maf file; it interleaves them.
I would be interested to hear whether anyone has already found a solution to integrating LASTZ and GBrowse_syn, and also whether any development of AlignIO to improve support of maf format is planned.
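The reverse-strand conversion described in the second point can be sketched in a few lines. This is only an illustration in Python, not part of AlignIO or GBrowse_syn. It assumes the UCSC MAF convention, in which the start on an "s" line is zero-based and, for "-" strand rows, is counted on the reverse-complemented source sequence. The function names (parse_s_line, iter_blocks, maf_to_forward) and the sample line are made up for the example.

```python
def parse_s_line(line):
    """Split a MAF 's' line into (src, start, size, strand, src_size, text)."""
    tag, src, start, size, strand, src_size, text = line.split()
    if tag != "s":
        raise ValueError("not an 's' line")
    return src, int(start), int(size), strand, int(src_size), text


def iter_blocks(lines):
    """Yield one list of parsed 's' lines per alignment block.

    A new block starts at each 'a' line, so separate aligned regions in the
    same file stay separate instead of being interleaved.
    """
    block = []
    for line in lines:
        line = line.strip()
        if line == "a" or line.startswith("a "):
            if block:
                yield block
            block = []
        elif line.startswith("s "):
            block.append(parse_s_line(line))
    if block:
        yield block


def maf_to_forward(start, size, strand, src_size):
    """Return 1-based inclusive (start, end) on the original forward strand.

    Assumes zero-based MAF starts; for '-' rows the start is measured on the
    reverse complement, so it is flipped using the source length.
    """
    if strand == "+":
        return start + 1, start + size
    if strand == "-":
        return src_size - (start + size) + 1, src_size - start
    raise ValueError(f"unexpected strand: {strand!r}")


# Hypothetical row: the first 10 bases of the reverse complement of a
# 100-base sequence are the last 10 bases of the original sequence.
src, start, size, strand, src_size, _ = parse_s_line("s chrA 0 10 - 100 ACGTACGTAC")
print(src, maf_to_forward(start, size, strand, src_size), strand)
# prints: chrA (91, 100) -
```

Carrying the strand alongside the converted coordinates is what would let it be written into the sequence IDs that GBrowse_syn expects.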
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
From cjfields at illinois.edu Wed Aug 11 18:02:38 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Wed, 11 Aug 2010 17:02:38 -0500
Subject: [Bioperl-l] AlignIO and Gbrowse_syn
In-Reply-To: <18DF7D20DFEC044098A1062202F5FFF32F0237EAB7@exchsth.agresearch.co.nz>
References:
<18DF7D20DFEC044098A1062202F5FFF32F0237EAB7@exchsth.agresearch.co.nz>
Message-ID:
Russell,
We have had very few requests to support .maf until recently, which is why there has been little done with it. We welcome any help to improve it.
chris
On Aug 11, 2010, at 4:31 PM, Smithies, Russell wrote:
> I know there was some brief discussion about .maf format a few weeks ago but I've had an enquiry (as below) from a colleague.
> If GBrowse_syn is using .maf format, does AlignIO need more work?
> Any comments?
>
> --Russell
From douglas.hoen at gmail.com Thu Aug 12 01:59:37 2010
From: douglas.hoen at gmail.com (Doug Hoen)
Date: Wed, 11 Aug 2010 22:59:37 -0700 (PDT)
Subject: [Bioperl-l] HMMER3 to GFF3
Message-ID: <4bb89ced-69d9-43ff-ae20-4ce134efc40a@f6g2000yqa.googlegroups.com>
Hi,
I am trying to convert HMMER3 (hmmscan) output files into GFF3 files.
Based on previous advice (see the thread, "How to store results of
searches of translated DNA in SeqFeature::Store database of the
original DNA?"), I have installed bioperl-live for its new HMMER3
parsing capabilities (in SearchIO) and am trying to use
bp_search2gff.pl to do the file conversion.
The hmmscan was done on translated chromosome sequences with conserved
domain models. I want the GFF 'start' and 'end' columns to be based on
the coordinates of the translated chromosome sequences, not those of
the models. To do this (with my files), it seems I need to use the
option "--type hit". However, this changes the "Target" sequence name
from the model name to the chromosome name, and the model name does not
appear anywhere in the output (see below).
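For the eventual goal from the earlier thread, placing such hits at their equivalent DNA positions, the coordinate arithmetic might look roughly like the sketch below. It is written in Python purely for illustration and rests on assumptions not stated in this thread: that each translated sequence is a straight codon-by-codon translation of one whole reading frame, that frames are numbered +1 to +3 and -1 to -3, and that the reverse frames were translated from the reverse complement. The function name protein_to_dna is hypothetical.

```python
def protein_to_dna(aa_start, aa_end, frame, dna_length):
    """Map 1-based inclusive protein coordinates onto the DNA.

    frame      -- +1, +2, +3 (forward) or -1, -2, -3 (reverse); assumed naming
    dna_length -- length of the DNA sequence; only needed for reverse frames
    Returns (start, end, strand) with start <= end in forward-strand,
    1-based inclusive DNA coordinates.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError(f"unexpected frame: {frame!r}")
    offset = abs(frame) - 1                    # bases skipped before codon 1
    start = 3 * (aa_start - 1) + offset + 1    # first base of the first codon
    end = 3 * aa_end + offset                  # last base of the last codon
    if frame > 0:
        return start, end, "+"
    # Reverse frames: positions were counted on the reverse complement,
    # so flip them back onto the forward strand.
    return dna_length - end + 1, dna_length - start + 1, "-"


# Toy examples on a 12-base sequence.
print(protein_to_dna(2, 3, 2, 12))    # prints: (5, 10, '+')
print(protein_to_dna(1, 1, -1, 12))   # prints: (10, 12, '-')
```

If the translation was produced differently (for example, split at stop codons), the offsets would need to be adjusted accordingly.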
Could someone please confirm whether the results are incorrect and, if
so, perhaps suggest a fix? It may well be that this problem is due to
the unusual way I am using hmmscan, rather than a problem with HMMER3
parsing...?
Many thanks,
-- Doug
========================================================
Here's what it looks like if I do *not* use the "--type hit" option.
(RVT_2 is a conserved domain name. I need this in the output.)
COMMAND:
------------------
bp_search2gff.pl -i ../chr1-tesigsv2.hmmscan \
    -o chr1-tesigsv2-hmmscan-original-locations-v2.gff3 \
    --format hmmer3 --source HMMER3 --version 3 --component
OUTPUT:
------------------
==> chr1-tesigsv2-hmmscan-original-locations-v2.gff3 <==
##gff-version 3
Chr1_1 chromosome Component 1 10142557 . . 1 sequence=Chr1_1
Chr1_1 HMMER3 similarity 1 245 307.3 . 0 Target=Sequence:RVT_2 1898330 1898579
Chr1_1 HMMER3 similarity 1 244 329.5 . 0 Target=Sequence:RVT_2 2573551 2573796
Chr1_1 HMMER3 similarity 1 245 308.8 . 0 Target=Sequence:RVT_2 3159685 3159930
Chr1_1 HMMER3 similarity 1 102 108.2 . 0 Target=Sequence:RVT_2 3438684 3438791
Chr1_1 HMMER3 similarity 2 245 277.2 . 0 Target=Sequence:RVT_2 3566642 3566891
Chr1_1 HMMER3 similarity 13 213 251.4 . 0 Target=Sequence:RVT_2 4251160 4251373
Chr1_1 HMMER3 similarity 1 244 310.6 . 0 Target=Sequence:RVT_2 4252791 4253036
Chr1_1 HMMER3 similarity 6 99 94.2 . 0 Target=Sequence:RVT_2 4271555 4271653
========================================================
And here's what it looks like if I *do* use the "--type hit" option.
The coordinates look good but the model name has disappeared (and the
Target=Sequence seems wrong).
COMMAND:
------------------
bp_search2gff.pl -i ../chr1-tesigsv2.hmmscan \
    -o chr1-tesigsv2-hmmscan-original-locations-v3.gff3 \
    --format hmmer3 --type hit --source HMMER3 --version 3 --component
OUTPUT:
------------------
==> chr1-tesigsv2-hmmscan-original-locations-v3.gff3 <==
##gff-version 3
RVT_2 HMMER3 similarity 1898330 1898579 307.3 . 0 Target=Sequence:Chr1_1 1 245
RVT_2 HMMER3 similarity 2573551 2573796 329.5 . 0 Target=Sequence:Chr1_1 1 244
RVT_2 HMMER3 similarity 3159685 3159930 308.8 . 0 Target=Sequence:Chr1_1 1 245
RVT_2 HMMER3 similarity 3438684 3438791 108.2 . 0 Target=Sequence:Chr1_1 1 102
RVT_2 HMMER3 similarity 3566642 3566891 277.2 . 0 Target=Sequence:Chr1_1 2 245
RVT_2 HMMER3 similarity 4251160 4251373 251.4 . 0 Target=Sequence:Chr1_1 13 213
RVT_2 HMMER3 similarity 4252791 4253036 310.6 . 0 Target=Sequence:Chr1_1 1 244
RVT_2 HMMER3 similarity 4271555 4271653 94.2 . 0 Target=Sequence:Chr1_1 6 99
RVT_2 HMMER3 similarity 4481232 4481477 281.5 . 0 Target=Sequence:Chr1_1 2 245
========================================================
And here's what the input HMMER3 result file looks like:
==> ../chr1-tesigsv2.hmmscan <==
# hmmscan :: search sequence(s) against a profile database
# HMMER 3.0rc1 (February 2010); http://hmmer.org/
# Copyright (C) 2010 Howard Hughes Medical Institute.
# Freely distributed under the GNU General Public License (GPLv3).
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query sequence file: [...]/whole_chromosomes/translated/chr1.pep
# target HMM database: [...]/signatures/Pfam-A.hmm
# output directed to file: chr1-tesigsv2.hmmscan
# model-specific thresholding: TC cutoffs
# Max sensitivity mode: on [all heuristic filters off]
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Query: Chr1_1 [L=10142557]
Description: CHROMOSOME dumped from ADB: Jun/20/09 14:53; last updated: 2009-02-02
Scores for complete sequence (score includes all domains):
 --- full sequence ---  --- best 1 domain ---  -#dom-
  E-value  score  bias   E-value  score  bias   exp  N  Model           Description
  ------- ------ -----   ------- ------ -----  ---- --  --------        -----------
        0 3971.3  17.7  2.6e-101  329.5   0.6  19.4 17  RVT_2           Reverse transcriptase (RNA-dependent DNA pol
        0 3040.7  23.0    1e-206  678.6   0.1  12.2 10  ATHILA          ATHILA ORF-1 family
        0 1681.9  79.1   1.9e-46  149.9   0.4  28.0 21  RVT_1           Reverse transcriptase (RNA-dependent DNA pol
        0 1446.9  27.4   3.6e-95  309.1   0.2   7.6  5  Transposase_21  Transposase family tnp2
        0 1168.4  50.3   1.4e-29   94.4   0.3  21.5 18  rve             Integrase core domain
 9.1e-300  960.0  69.0   3.1e-20   64.0   0.0  28.8 20  Retrotrans_gag  Retrotransposon gag protein
 1.5e-180  577.0  31.6   1.6e-29   93.1   1.5   9.5  8  Transposase_23  TNP1/EN/SPM transposase
 4.4e-143  456.9  82.8   4.8e-18   56.4   0.1  12.9 11  MuDR            MuDR family transposase
 3.8e-116  371.4  19.6   1.2e-18   58.9   0.0  13.7  7  MULE            MULE transposase domain
 7.1e-106  344.1   5.6   2.7e-97  316.0   0.0   3.6  1  Plant_tran      Plant transposon protein
  9.2e-85  275.4  22.9   5.4e-60  194.4   0.3   6.4  3  Peptidase_C48   Ulp1 protease family, C-terminal catalytic d
  1.8e-77  249.8  24.8   4.4e-28   89.8   0.1  10.8  3  Transposase_24  Plant transposase (Ptta/En/Spm family)
  2.8e-47  150.1   1.2   5.5e-23   72.3   0.2   3.7  2  hATC            hAT family dimerisation domain
  5.7e-28   89.4   3.6   4.7e-13   41.1   0.0   6.5  1  RVP_2           Retroviral aspartyl protease
    1e-16   53.3   0.0   4.4e-07   22.1   0.0   6.8  1  RnaseH          RNase H
  1.5e-08   25.3   2.4   0.00016   12.1   0.0   4.9  0  Transposase_mut Transposase, Mutator family
Domain annotation for each model (and alignments):
>> RVT_2 Reverse transcriptase (RNA-dependent DNA polymerase)
  #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
  1 !  307.3   0.0   5.3e-95   1.5e-94       1     245 [. 1898330 1898578 .. 1898330 1898579 .. 0.99
  2 !  329.5   0.6  8.9e-102  2.6e-101       1     244 [. 2573551 2573794 .. 2573551 2573796 .. 0.99
  3 !  308.8   0.0   1.8e-95   5.2e-95       1     245 [. 3159685 3159929 .. 3159685 3159930 .. 0.99
  4 !  108.2   0.1   3.4e-34   9.7e-34       1     102 [. 3438684 3438785 .. 3438684 3438791 .. 0.96
  5 !  277.2   0.0   8.1e-86   2.3e-85       2     245 .. 3566643 3566890 .. 3566642 3566891 .. 0.99
  6 !  251.4   0.0   6.2e-78   1.8e-77      13     213 .. 4251164 4251364 .. 4251160 4251373 .. 0.97
  7 !  310.6   0.0   5.1e-96   1.5e-95       1     244 [. 4252791 4253034 .. 4252791 4253036 .. 0.99
  8 !   94.2   0.1   6.1e-30   1.8e-29       6      99 .. 4271560 4271653 .. 4271555 4271653 .. 0.97
  9 !  281.5   0.9   3.9e-87   1.1e-86       2     245 .. 4481233 4481476 .. 4481232 4481477 .. 0.98
 10 !  248.2   0.0   5.9e-77   1.7e-76       1     190 [. 4521040 4521233 .. 4521040 4521237 .. 0.97
 11 !  314.6   0.1   3.2e-97   9.2e-97       1     244 [. 4652456 4652702 .. 4652456 4652704 .. 0.98
 12 !   40.7   0.0   1.3e-13   3.7e-13       2      92 .. 5219607 5219697 .. 5219606 5219701 .. 0.90
 13 !  221.0   0.0   1.2e-68   3.4e-68       2     245 .. 5241015 5241258 .. 5241014 5241259 .. 0.95
 14 !   81.2   0.0   5.6e-26   1.6e-25       2     115 .. 5501957 5502070 .. 5501956 5502080 .. 0.92
 15 !  272.4   0.0   2.3e-84   6.7e-84      30     245 .. 6483057 6483271 .. 6483050 6483272 .. 0.98
 16 !  178.5   0.0   1.2e-55   3.3e-55      81     244 .. 7250563 7250726 .. 7250552 7250728 .. 0.96
 17 !  313.7   0.0   5.9e-97   1.7e-96       2     245 .. 7707124 7707367 .. 7707123 7707368 .. 0.99
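What the desired GFF3 rows could look like, with the chromosome in column 1 and the model name kept in the attributes, can be sketched directly from the domain table above. This is an illustrative Python sketch, not how bp_search2gff.pl works. The function name domain_row_to_gff3, the use of the env coordinates for start and end, and the Name/Target attribute layout are all assumptions. Note that the coordinates are still positions in the translated sequence, not in the DNA.

```python
def domain_row_to_gff3(row, seqid, model, source="HMMER3", ftype="similarity"):
    """Turn one row of an hmmscan per-domain table into a GFF3 line.

    row   -- one data row of the table (16 whitespace-separated fields)
    seqid -- name of the query sequence, e.g. the 'Query:' value
    model -- name of the profile, e.g. the '>>' value above the table
    """
    fields = row.split()
    if len(fields) != 16:
        raise ValueError(f"expected 16 fields, found {len(fields)}")
    score = fields[2]
    hmm_from, hmm_to = fields[6], fields[7]      # position within the model
    env_from, env_to = fields[12], fields[13]    # envelope on the query
    # Assumed layout: the model name goes into Name and Target so that it
    # survives in column 9 while column 1 holds the chromosome.
    attributes = f"Name={model};Target={model} {hmm_from} {hmm_to}"
    return "\t".join(
        [seqid, source, ftype, env_from, env_to, score, ".", ".", attributes]
    )


row = "1 ! 307.3 0.0 5.3e-95 1.5e-94 1 245 [. 1898330 1898578 .. 1898330 1898579 .. 0.99"
print(domain_row_to_gff3(row, "Chr1_1", "RVT_2"))
```

Strand and phase are left as "." here because the table itself carries no strand information for the translated sequence.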
Alignments for each domain:
== domain 1 score: 307.3 bits; conditional E-value: 5.3e-95
RVT_2 1
nktwelvelpkgkkviglkWvfklKlnedgeierykARlVakGftqkegidyeetfspvvklesirlllalaaekkleleqlDvktaFLngelee
95
n tw +++lp gkk++g+kWv+k+Kln+dg++erykARlVakG+tq+eg+dy
+tfspv+kl++++ll+a+aa+k+++l+qlD+++aFLng+l+e
Chr1_1 1898330
NGTWVVCSLPVGKKAVGCKWVYKIKLNADGSLERYKARLVAKGYTQTEGLDYVDTFSPVAKLTTVKLLIAVAAAKGWSLSQLDISNAFLNGSLDE
1898424
68*********************************************************************************************
PP
RVT_2 96
evYvkqpeGfedkkk....enkvckLkkslYgLkqapraWyeklsevllklgfkkseadkclfvkkkeeeliivllYVDDlliagsskelieelk
186
e+Y++ p+G++ ++ +n vc+LkkslYgLkqa+r+Wy k+se l++lgf+
+s+ d++lf++k++++ ++vl+YVDD++ia+s +++ e l
Chr1_1 1898425
EIYMTLPPGYSPRQGdsfpPNAVCRLKKSLYGLKQASRQWYLKFSESLKALGFTQSSGDHTLFTRKSKNSYMAVLVYVDDIIIASSCDRETELLR
1898519
***********998889999***************************************************************************
PP
RVT_2 187
eeLkkefemkdlgelkyfLgleierkeegillsqekyvkkllkkfkmedakpvstplea 245
++L+++ +++dlg+l+yfLglei+r+++gi+++q+ky+ +ll+++++ +k++s
+p+e+
Chr1_1 1898520
DALQRSSKLRDLGTLRYFLGLEIARNTDGISICQRKYTLELLAETGLLGCKSSSVPMEP 1898578
*********************************************************97 PP
== domain 2 score: 329.5 bits; conditional E-value: 8.9e-102
RVT_2 1
nktwelvelpkgkkviglkWvfklKlnedgeierykARlVakGftqkegidyeetfspvvklesirlllalaaekkleleqlDvktaFLngelee
95
n+twel++lp+g+k+ig+kWv+k K+n++ge+erykARlVakG++q++gidy+e
+f+pv++le++rl+++laa++k++++q+D k aFLng++ee
Chr1_1 2573551
NDTWELTSLPNGHKAIGVKWVYKAKKNSKGEVERYKARLVAKGYSQRAGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMDFKLAFLNGDFEE
2573645
79*********************************************************************************************
PP
RVT_2 96
evYvkqpeGfedkkkenkvckLkkslYgLkqapraWyeklsevllklgfkkseadkclfvkkkeeeliivllYVDDlliagsskelieelkeeLk
190
evY++qp+G+ +k++e+kv++Lkk+lYgLkqapraW++++++++++++f k+ +
+++l++k ++e+++i +lYVDDl+++g++ ++ ee+k+e++
Chr1_1 2573646
EVYIEQPQGYIVKGEEDKVLRLKKALYGLKQAPRAWNTRIDKYFKEKDFIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSMFEEFKKEMT
2573740
***********************************************************************************************
PP
RVT_2 191
kefemkdlgelkyfLgleierkeegillsqekyvkkllkkfkmedakpvstple 244
kefem+d+g ++y+Lg+e+++++++i+++qe y+k++lkkfkm+d++pv tp
+e
Chr1_1 2573741
KEFEMTDIGLMSYYLGIEVKQEDNRIFITQEGYAKEVLKKFKMDDSNPVCTPME 2573794
****************************************************97 PP
From kai.blin at biotech.uni-tuebingen.de Thu Aug 12 08:16:45 2010
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Thu, 12 Aug 2010 14:16:45 +0200
Subject: [Bioperl-l] HMMER3 to GFF3
In-Reply-To: <4bb89ced-69d9-43ff-ae20-4ce134efc40a@f6g2000yqa.googlegroups.com>
References: <4bb89ced-69d9-43ff-ae20-4ce134efc40a@f6g2000yqa.googlegroups.com>
Message-ID: <20100812141645.1dc6507a.kai.blin@biotech.uni-tuebingen.de>
On Wed, 11 Aug 2010 22:59:37 -0700 (PDT)
Doug Hoen wrote:
Hi Doug,
> Could someone please confirm whether the results are incorrect and, if
> so, perhaps suggest a fix? It may well be that this problem is due to
> the unusual way I am using hmmscan, rather than a problem with HMMER3
> parsing...?
Can you please attach your hmmer input file? Along the way something
inserted line breaks, making it unreadable.
It may well be that the HMMER3 parser still behaves a little differently
from the HMMER2 parser; I haven't tried that script.
Cheers,
Kai
--
Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
D-72076 Tübingen Fax : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
From kai.blin at biotech.uni-tuebingen.de Thu Aug 12 08:09:00 2010
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Thu, 12 Aug 2010 14:09:00 +0200
Subject: [Bioperl-l] using HMMER
In-Reply-To: <62C86AFB-FF3A-44C6-A413-50C3F839DF34@illinois.edu>
References: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
<4C62B487.9090103@gmail.com>
<62C86AFB-FF3A-44C6-A413-50C3F839DF34@illinois.edu>
Message-ID: <20100812140900.291bbb01.kai.blin@biotech.uni-tuebingen.de>
On Wed, 11 Aug 2010 10:07:36 -0500
Chris Fields wrote:
> might also want to check whether you are using hmmer2 vs hmmer3. not sure if the wrapper works for hmmer3.
It might if you initialize it using
my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'model.hmm', -_READMETHOD => 'hmmer3');
at least for the programs that still exist with the same name in
hmmer3. It won't support hmmer3 using the default options, though.
If I have some spare time, I'll look into this, no promises on the
timeframe, though.
Cheers,
Kai
--
Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-University of Tübingen
Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
D-72076 Tübingen Fax : ++49 7071 29-5979
Deutschland
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
From cjfields at illinois.edu Thu Aug 12 11:28:50 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 12 Aug 2010 10:28:50 -0500
Subject: [Bioperl-l] using HMMER
In-Reply-To: <20100812140900.291bbb01.kai.blin@biotech.uni-tuebingen.de>
References: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
<4C62B487.9090103@gmail.com>
<62C86AFB-FF3A-44C6-A413-50C3F839DF34@illinois.edu>
<20100812140900.291bbb01.kai.blin@biotech.uni-tuebingen.de>
Message-ID: <8129B813-5B15-4DDC-AB0D-5D95EFFCE78D@illinois.edu>
On Aug 12, 2010, at 7:09 AM, Kai Blin wrote:
> On Wed, 11 Aug 2010 10:07:36 -0500
> Chris Fields wrote:
>
>> might also want to check whether you are using hmmer2 vs hmmer3. not sure if the wrapper works for hmmer3.
>
> It might if you initialize it using
> my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'model.hmm', -_READMETHOD => 'hmmer3');
>
> at least for the programs that still exist with the same name in
> hmmer3. It won't support hmmer3 using the default options, though.
>
> If I have some spare time, I'll look into this, no promises on the
> timeframe, though.
>
> Cheers,
> Kai
>
> --
> Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de
> Institute for Microbiology and Infection Medicine
> Division of Microbiology/Biotechnology
> Eberhard-Karls-University of Tübingen
> Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
> D-72076 Tübingen Fax : ++49 7071 29-5979
> Deutschland
> Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
Would be nice to convert this over (at some point) to use Mark's CommandExts. I'm thinking of doing this with Infernal, so if I get that running it wouldn't be terribly difficult to get hmmer3 working as well.
chris
From cjfields at illinois.edu Thu Aug 12 12:14:44 2010
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 12 Aug 2010 11:14:44 -0500
Subject: [Bioperl-l] using HMMER
In-Reply-To: <857996.8184.qm@web112610.mail.gq1.yahoo.com>
References: <603590.1072.qm@web112620.mail.gq1.yahoo.com>
<4C62B487.9090103@gmail.com>
<62C86AFB-FF3A-44C6-A413-50C3F839DF34@illinois.edu>
<20100812140900.291bbb01.kai.blin@biotech.uni-tuebingen.de>
<8129B813-5B15-4DDC-AB0D-5D95EFFCE78D@illinois.edu>
<857996.8184.qm@web112610.mail.gq1.yahoo.com>
Message-ID: <43FD0A31-DB95-4AE9-B678-937EE6346BC2@illinois.edu>
Fayroz,
Please keep responses on-list.
It seems you need to update your local bioperl, as 'hmmer3' is a recent addition, after 1.6.1. It will be in 1.6.2 if I can get the time to make a release :>
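A quick way to check both points raised in this thread is sketched below. It is only a sketch under assumptions: the HMMERDIR path is a made-up example location, and module_available is a hypothetical helper, not part of BioPerl. It sets HMMERDIR before any wrapper module would be loaded and reports whether the hmmer3 parser module can be found in @INC.

```perl
use strict;
use warnings;

# Hypothetical install location: adjust to wherever the HMMER binaries live.
# Setting it in BEGIN makes it visible before any wrapper module is loaded.
BEGIN { $ENV{HMMERDIR} = 'C:/hmmer/bin' unless defined $ENV{HMMERDIR}; }

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Bio::SearchIO', 'Bio::SearchIO::hmmer3') {
    printf "%-24s %s\n", $module, module_available($module) ? 'found' : 'missing';
}
```

If Bio::SearchIO is found but Bio::SearchIO::hmmer3 is missing, the installed BioPerl most likely predates the hmmer3 parser.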
chris
On Aug 12, 2010, at 10:58 AM, fayroz wrote:
> Dear Chris,
> In the HMMER documentation I found this statement:
> "The HMMER programs must either be in your path, or you must set the environment
> variable HMMERDIR to point to their location."
> Will that solve the problem?
> How can I do that, please? I work on Windows 7.
>
>
> When I applied this line with hmmer3
> my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'model.hmm', -_READMETHOD => 'hmmer3');
>
> this output appears:
>
> Bio::SearchIO: hmmer3 cannot be found
>
> and when I try with hmmer2 the same output appears:
>
> Exception
> ------------- EXCEPTION -------------
> MSG: Failed to load module Bio::SearchIO::hmmer3. Can't locate
> Bio\SearchIO\hmmer3.pm in @INC (@INC contains: D:\Perl\bin\ D:/Perl/site/lib
> D:/Perl/lib .) at D:/Perl/site/lib/Bio/Root/Root.pm line 439, line 1.
> STACK Bio::Root::Root::_load_module D:/Perl/site/lib/Bio/Root/Root.pm:441
> STACK (eval) D:/Perl/site/lib/Bio/SearchIO.pm:446
> STACK Bio::SearchIO::_load_format_module D:/Perl/site/lib/Bio/SearchIO.pm:445
> STACK Bio::SearchIO::new D:/Perl/site/lib/Bio/SearchIO.pm:189
> STACK Bio::Tools::Run::Hmmer::_run D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:431
> STACK Bio::Tools::Run::Hmmer::hmmsearch
> D:/Perl/site/lib/Bio/Tools/Run/Hmmer.pm:353
> STACK toplevel C:\Users\Khaled\AppData\Local\Temp\dzprltmp.pl:13
> -------------------------------------
> For more information about the SearchIO system please see the SearchIO docs.
> This includes ways of checking for formats at compile time, not run time
> '--informat' is not recognized as an internal or external command,
> operable program or batch file.
> Can't call method "next_result" on an undefined value at
> C:\Users\Khaled\AppData\Local\Temp\dzprltmp.pl line 15, line 1.
>
>
>
> ----- Original Message ----
> From: Chris Fields
> To: Kai Blin
> Cc: fayroz ; bioperl-l at bioperl.org
> Sent: Thu, August 12, 2010 6:28:50 PM
> Subject: Re: [Bioperl-l] using HMMER
>
> On Aug 12, 2010, at 7:09 AM, Kai Blin wrote:
>
>> On Wed, 11 Aug 2010 10:07:36 -0500
>> Chris Fields wrote:
>>
>>> might also want to check whether you are using hmmer2 vs hmmer3. not sure if
>>> the wrapper works for hmmer3.
>>
>> It might if you initialize it using
>> my $factory = Bio::Tools::Run::Hmmer->new(-hmm => 'model.hmm', -_READMETHOD =>
>> 'hmmer3');
>>
>> at least for the programs that still exist with the same name in
>> hmmer3. It won't support hmmer3 using the default options, though.
>>
>> If I have some spare time, I'll look into this, no promises on the
>> timeframe, though.
>>
>> Cheers,
>> Kai
>>
>> --
>> Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de
>> Institute for Microbiology and Infection Medicine
>> Division of Microbiology/Biotechnology
>> Eberhard-Karls-University of Tübingen
>> Auf der Morgenstelle 28 Phone : ++49 7071 29-78841
>> D-72076 Tübingen Fax : ++49 7071 29-5979
>> Deutschland
>> Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben
>
> Would be nice to convert this over (at some point) to use Mark's CommandExts.
> I'm thinking of doing this with Infernal, so if I get that running it wouldn't
> be terribly difficult to get hmmer3 working as well.
>
> chris
>
>
>
From jason at bioperl.org Thu Aug 12 14:37:11 2010
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 12 Aug 2010 11:37:11 -0700
Subject: [Bioperl-l] Other: Script for editing alignments?
In-Reply-To: <20100812061811.4D92468539@evol.biology.mcmaster.ca>
References: <20100812061811.4D92468539@evol.biology.mcmaster.ca>
Message-ID: <4C643F57.3040408@bioperl.org>
Hi Si -
This is pretty straightforward with Bioperl. Here's one solution:
#!/usr/bin/perl -w
use strict;
use Bio::AlignIO;
my $in  = Bio::AlignIO->new(-format => 'fasta', -file => shift @ARGV);
my $out = Bio::AlignIO->new(-format => 'fasta');
while ( my $aln = $in->next_aln ) {
    for my $seq ( $aln->each_seq ) {
        my $str = $seq->seq;
        if ( $str =~ /^(-+)/ ) {
            my $rep = length($1);
            # replace from the 5' end
            substr($str, 0, $rep, 'N' x $rep);
        }
        if ( $str =~ /(-+)$/ ) {
            my $rep = length($1);
            # replace from the 3' end
            substr($str, -1 * $rep, length($str), 'N' x $rep);
        }
        $seq->seq($str);
    }
    # don't print the /start-end info in the FASTA ID
    $aln->set_displayname_flat(1);
    $out->write_aln($aln);
}
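For reference, the same replacement can be sketched in plain Perl on a single sequence string, without BioPerl objects. The helper name mask_terminal_gaps is hypothetical, and the sketch assumes the sequence has already been joined into one string (no line breaks) and that "-" is the only gap character.

```perl
use strict;
use warnings;

# Hypothetical helper: turn leading and trailing gap runs into Ns,
# leaving internal gaps (possible real indels) untouched.
sub mask_terminal_gaps {
    my ($str) = @_;
    $str =~ s/^(-+)/'N' x length($1)/e;    # 5' end
    $str =~ s/(-+)$/'N' x length($1)/e;    # 3' end
    return $str;
}

# The two example sequences from the original question.
print mask_terminal_gaps('-----ATGCTG--TGACTG----TGACT---'), "\n";
print mask_terminal_gaps('---GTATGTTG--TGACTGCT--TGACCGTC'), "\n";
```

This prints the two target sequences given in the question, NNNNNATGCTG--TGACTG----TGACTNNN and NNNGTATGTTG--TGACTGCT--TGACCGTC.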
-jason
evoldir at evol.biology.mcmaster.ca wrote, On 8/11/10 11:18 PM:
> Dear All
>
> Alignment programs like MUSCLE and Clustal often output alignments with
> "-" symbols indicating indels (real events) within sequence alignments,
> but also "-" symbols at the 5' and 3' ends of sequences. The latter
> however, are not real evolutionary events and really should be Ns
> (missing data), depending on the sort of analytical framework you use.
>
> If there is sufficient heterogeneity and signal within the 5' and 3'
> ends of sequences, the "-"s can be manually edited in a text editor to
> Ns with no problem, if the alignment is small. If it is large (e.g. 2000
> seqs), or there are lots of alignments, it becomes a lengthy task.
>
> I'm investigating such alignments presently and so was wondering if
> anyone had a clever way of implementing sed, or had a Perl script that
> would perform such a task. Simply put, it would require replacing the 5'
> and 3' "-" below only with Ns and leaving the within sequence "-"s
> alone. The sequences naturally may span more than one line.
>
> >Taxon 1
> -----ATGCTG--TGACTG----TGACT---
> >Taxon 2
> ---GTATGTTG--TGACTGCT--TGACCGTC
>
> to
>
> >Taxon 1
> NNNNNATGCTG--TGACTG----TGACTNNN
> >Taxon 2
> NNNGTATGTTG--TGACTGCT--TGACCGTC
>
> It's a simple task, but I haven't seen any scripts out there to do the job.
>
> If there are any scripters out there who can help, or if someone knows
> of an application that would help, it would be great to hear from you.
>
> With best wishes and thanks
>
> Si Creer
>
>
From genehack at genehack.org Thu Aug 12 20:32:07 2010
From: genehack at genehack.org (John SJ Anderson)
Date: Thu, 12 Aug 2010 20:32:07 -0400
Subject: [Bioperl-l]
Bio::SeqFeature::SimilarityPair->from_searchResult()?
In-Reply-To: <4513D6B2-F7B3-4A6E-91CA-879C9E372E84@gmail.com>
References: <4513D6B2-F7B3-4A6E-91CA-879C9E372E84@gmail.com>
Message-ID:
On Aug 10, 2010, at 21:54 , Douglas Hoen wrote:
> I was wondering why the Synopsis in the docs for Bio::SeqFeature::SimilarityPair has the following:
> $sim_pair = Bio::SeqFeature::SimilarityPair->from_searchResult($blastHit);
>
> There doesn't actually seem to be a from_searchResult method. Am I missing something?
No, it looks like that method got removed back in 2002 as a part of moving to Bio::SearchIO (which was removed still later...):
Unfortunately, the commit didn't update the documentation. From the tiny little bit I've looked at the code, it looks like you should just be calling the 'new()' method instead (note that it takes a set of arguments, not just a BLAST hit object).
Hope this helps -- if you should happen to have the tuits, a patch to update the documentation to reflect the current interface would be awesome...
chrs,
john.
From david.breimann at gmail.com Fri Aug 13 09:01:10 2010
From: david.breimann at gmail.com (David Breimann)
Date: Fri, 13 Aug 2010 16:01:10 +0300
Subject: [Bioperl-l] Problem executing bp_genbank2gff3.pl from another perl
script
Message-ID:
Hi,
I am trying to run bp_genbank2gff3.pl from another Perl script that
gets a GenBank file as its argument.
This does not work (no output files are generated):
my $command = "bp_genbank2gff3.pl -y -o /tmp $ARGV[0]";
open( my $command_out, "-|", $command );
close $command_out;
but this does
open( my $command_out, "-|", $command );
sleep 3; # why do I need to sleep?
close $command_out;
Why?
I thought that close is supposed to block until the command is done:
"Closing any piped filehandle causes the parent process to wait for the
child to finish..." (see http://perldoc.perl.org/functions/open.html).
Thanks
Dave
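One plausible explanation, offered as an assumption rather than a confirmed diagnosis: close on a piped handle does wait for the child, but it closes the read end of the pipe first, so a child that is still writing to STDOUT can be killed by SIGPIPE before it finishes. Reading the output to end-of-file before calling close avoids that. The sketch below uses a hypothetical helper, run_and_drain, and a trivial Perl one-liner in place of bp_genbank2gff3.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command, read all of its output, then close.
# Returns the exit status followed by the output lines.
sub run_and_drain {
    my @cmd = @_;
    open( my $out, '-|', @cmd ) or die "Cannot start @cmd: $!";
    my @lines = <$out>;    # read to EOF so the child is never cut off
    close $out;            # now just reaps the child and sets $?
    return ( $? >> 8, @lines );
}

# Stand-in command; $^X is the path of the running perl interpreter.
my ( $status, @lines ) = run_and_drain( $^X, '-e', 'print "line $_\n" for 1..3' );
print "exit status: $status\n";
print @lines;
```

For a command whose output is not needed at all, system() with a checked return value is usually the simpler choice.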
From jun.yin at ucd.ie Fri Aug 13 09:36:34 2010
From: jun.yin at ucd.ie (Jun Yin)
Date: Fri, 13 Aug 2010 14:36:34 +0100
Subject: [Bioperl-l] Bio::LocatableSeq end checking inconsistency
Message-ID: <004a01cb3aec$8c2ddd60$a4899820$%yin@ucd.ie>
Hi, all,
I am the Google Summer of Code student working on refactoring the Bio::Align
subsystem. My re-implementation of Bio::SimpleAlign now passes nearly all
the tests, except for a few that check seq/start-end. That brings up a
problem, possibly an old one: Bio::LocatableSeq end assignment and end
checking are inconsistent.
The current end checking method is based on:
$end=$seq->_ungapped_len+$seq->start-1
However, this check may not fit real-world cases.
The inconsistency usually appears when a few columns of the sequence are
removed.
For example:
my $a = Bio::LocatableSeq->new(
    -id     => 'a',
    -strand => 1,
    -seq    => '-tcgatc-atcgatcg',
    -start  => 30,
    -end    => 43
);
If we remove the 1st, 8th and the last columns
$a->seq() will be 'tcgatcatcgatc'
$a->_ungapped_len==12
Actually, in the real world, the first residue will still be 30 (the old
$seq->start), and the last residue is the residue before the 43 (the old
$seq->end), thus 42.
But if you call a validation, the calculation is
$a->_ungapped_len+$a->start-1=12+30-1=41
So the reassignment of the $seq->end will not pass the validation.
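The arithmetic can be re-checked with a small plain-Perl sketch. The helpers ungapped_len and expected_end are hypothetical stand-ins that mirror the formula quoted above; they are not BioPerl methods, and "-" is assumed to be the only gap character. Counted directly, the trimmed string in the example has 13 residues rather than 12, so for that particular input the formula returns 42. The mismatch described here shows up when an internal residue column is dropped: the last residue still sits at 43 in the source sequence, while the formula gives 42.

```perl
use strict;
use warnings;

# Hypothetical helper: number of residues, ignoring '-' gap characters.
sub ungapped_len {
    my ($str) = @_;
    ( my $residues = $str ) =~ tr/-//d;
    return length $residues;
}

# End coordinate implied by the check: ungapped length + start - 1.
sub expected_end {
    my ( $str, $start ) = @_;
    return ungapped_len($str) + $start - 1;
}

my $start = 30;
my @cases = (
    [ 'original alignment row'      => '-tcgatc-atcgatcg' ],
    [ 'columns 1, 8 and 16 removed' => 'tcgatcatcgatc' ],
    [ 'internal column 4 removed'   => '-tcatc-atcgatcg' ],
);
for my $case (@cases) {
    my ( $label, $str ) = @$case;
    printf "%-30s residues=%2d expected end=%d\n",
        $label, ungapped_len($str), expected_end( $str, $start );
}
```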
Unless you save the information to a new sequence object, the original
position information will be lost anyway. But in some cases we have to
change the sequence within its original sequence object.
What is your suggestion on this issue?
A. Pass the test and lose the information. #convenient for coding, but the
start-end annotation is no longer right
B. Keep the information and drop the test. #the object will still
remember where the last residue was in the original sequence. But is that
really meaningful? All the other residues may come from
nowhere
C. Neither of the above. #any other suggestions?
Cheers,
Jun Yin
Ph.D. student in U.C.D.
Bioinformatics Laboratory
Conway Institute
University College Dublin
From jessica.sun at gmail.com Fri Aug 13 11:06:46 2010
From: jessica.sun at gmail.com (Jessica Sun)
Date: Fri, 13 Aug 2010 11:06:46 -0400
Subject: [Bioperl-l] Add sequence feature
Message-ID:
Does anyone know how to open a GenBank file, add a new feature, and then
save a new GenBank file with the new feature added, using BioPerl?
thx
--
Jessica Jingping Sun
From jessica.sun at gmail.com Fri Aug 13 11:27:10 2010
From: jessica.sun at gmail.com (Jessica Sun)
Date: Fri, 13 Aug 2010 11:27:10 -0400
Subject: [Bioperl-l] Add sequence feature
In-Reply-To: <4C6562E0.7090008@gmail.com>
References:
<4C6562E0.7090008@gmail.com>
Message-ID:
Unfortunately, I want to add the feature to the sequence object I got from
the GenBank file. I don't mind saving a new GenBank file, but the new
file should contain the original GenBank format and info plus the
new feature tags I need to add. Is there a quick solution for this?
thx
Jessica
On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri wrote:
> Hi Jessica.
>
> You need to use Bio::SeqIO to read in the GenBank file to a BioPerl
> sequence object, and to write your new GenBank file:
> http://www.bioperl.org/wiki/HOWTO:SeqIO
>
> To add a new feature follow the instructions here:
>
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
>
> (except that you are adding the feature to the sequence object you got from
> the Genbank file, not a new Bio::Seq object).
>
> Cheers.
> Roy.
>
>
> On 13/08/2010 16:06, Jessica Sun wrote:
>
>> Does anyone knows how to open a genbank file, add new feature and then
>> save
>> a new genbank
>> file with new feature added in bioperl ?
>>
>> thx
>>
>>
>
--
Jessica Jingping Sun
From roy.chaudhuri at gmail.com Fri Aug 13 11:21:04 2010
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 13 Aug 2010 16:21:04 +0100
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References:
Message-ID: <4C6562E0.7090008@gmail.com>
Hi Jessica.
You need to use Bio::SeqIO to read in the GenBank file to a BioPerl
sequence object, and to write your new GenBank file:
http://www.bioperl.org/wiki/HOWTO:SeqIO
To add a new feature follow the instructions here:
http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
(except that you are adding the feature to the sequence object you got
from the Genbank file, not a new Bio::Seq object).
Cheers.
Roy.
On 13/08/2010 16:06, Jessica Sun wrote:
> Does anyone knows how to open a genbank file, add new feature and then save
> a new genbank
> file with new feature added in bioperl ?
>
> thx
>
From roy.chaudhuri at gmail.com Fri Aug 13 11:37:20 2010
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 13 Aug 2010 16:37:20 +0100
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References: <4C6562E0.7090008@gmail.com>
Message-ID: <4C6566B0.60706@gmail.com>
I'm not sure I understand, do you mean that you want to load just the
sequence from the GenBank file (ignoring the existing annotation), then
add your own features? There are instructions on how to do that here:
http://www.bioperl.org/wiki/HOWTO:SeqIO#Speed.2C_Bio::Seq::SeqBuilder
On 13/08/2010 16:27, Jessica Sun wrote:
> unfortunately. I want to add the feature to the sequence object I got
> from the Genbank file, I do not mind to save a new genbank file but
> these new genbank file contains the original genbank format and info I
> got plus the new feature tags I need to added to. Any quick solution to
> this?
>
> thx
>
> Jessica
>
>
>
> On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri > wrote:
>
> Hi Jessica.
>
> You need to use Bio::SeqIO to read in the GenBank file to a BioPerl
> sequence object, and to write your new GenBank file:
> http://www.bioperl.org/wiki/HOWTO:SeqIO
>
> To add a new feature follow the instructions here:
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
>
> (except that you are adding the feature to the sequence object you
> got from the Genbank file, not a new Bio::Seq object).
>
> Cheers.
> Roy.
>
>
> On 13/08/2010 16:06, Jessica Sun wrote:
>
> Does anyone knows how to open a genbank file, add new feature
> and then save
> a new genbank
> file with new feature added in bioperl ?
>
> thx
>
>
>
>
>
> --
> Jessica Jingping Sun
From roy.chaudhuri at gmail.com Fri Aug 13 11:57:27 2010
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Fri, 13 Aug 2010 16:57:27 +0100
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References: <4C6562E0.7090008@gmail.com> <4C6566B0.60706@gmail.com>
Message-ID: <4C656B67.5020402@gmail.com>
Please remember to copy replies to the mailing list.
You can loop over the features in your Bio::Seq object:
for my $feat ($seq->get_SeqFeatures) {
    # do something
}
And once you have found the feature you want to modify, you can add a
tag using something like:
$feat->add_tag_value('note',"this is a note");
When you're finished you can write out the modified sequence object to a
new GenBank file.
On 13/08/2010 16:40, Jessica Sun wrote:
> no i want to load the genbank file with existing features and I need to
> add some new feature tags to the existing ones and then save to a new
> update genbank file for local usage. I just not quite good on how to
> easily merge the two steps you recommended into one in a neat way.
>
> thx
>
>
> On Fri, Aug 13, 2010 at 11:37 AM, Roy Chaudhuri > wrote:
>
> I'm not sure I understand, do you mean that you want to load just
> the sequence from the GenBank file (ignoring the existing
> annotation), then add your own features? There are instructions on
> how to do that here:
> http://www.bioperl.org/wiki/HOWTO:SeqIO#Speed.2C_Bio::Seq::SeqBuilder
>
>
> On 13/08/2010 16:27, Jessica Sun wrote:
>
> unfortunately. I want to add the feature to the sequence object
> I got
> from the Genbank file, I do not mind to save a new genbank file but
> these new genbank file contains the original genbank format and
> info I
> got plus the new feature tags I need to added to. Any quick
> solution to
> this?
>
> thx
>
> Jessica
>
>
>
> On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri
>
> >> wrote:
>
> Hi Jessica.
>
> You need to use Bio::SeqIO to read in the GenBank file to a
> BioPerl
> sequence object, and to write your new GenBank file:
> http://www.bioperl.org/wiki/HOWTO:SeqIO
>
> To add a new feature follow the instructions here:
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
>
> (except that you are adding the feature to the sequence
> object you
> got from the Genbank file, not a new Bio::Seq object).
>
> Cheers.
> Roy.
>
>
> On 13/08/2010 16:06, Jessica Sun wrote:
>
> Does anyone knows how to open a genbank file, add new
> feature
> and then save
> a new genbank
> file with new feature added in bioperl ?
>
> thx
>
>
>
>
>
> --
> Jessica Jingping Sun
>
>
>
>
>
> --
> Jessica Jingping Sun
From jessica.sun at gmail.com Fri Aug 13 13:06:32 2010
From: jessica.sun at gmail.com (Jessica Sun)
Date: Fri, 13 Aug 2010 13:06:32 -0400
Subject: [Bioperl-l] Add sequence feature
In-Reply-To: <4C656B67.5020402@gmail.com>
References:
<4C6562E0.7090008@gmail.com>
<4C6566B0.60706@gmail.com>
<4C656B67.5020402@gmail.com>
Message-ID:
Thanks. I somehow get these error messages.
--------------------- WARNING ---------------------
MSG: Bio::SeqIO::genbank=HASH(0xa7ba1c) is not a SeqI compliant module.
Attempting to dump, but may fail!
---------------------------------------------------
Can't locate object method "seq" via package "Bio::SeqIO::genbank" at
/Library/Perl/5.8.8/Bio/SeqIO/genbank.pm line 760, line 447.
by doing this,
my $feat = new Bio::SeqFeature::Generic(-start => 20,
                                        -end => $40,
                                        -primary_tag => 'newfeature' );
$feat->add_tag_value("note","this is notes");
$f->add_SeqFeature($feat); ## f is original feature pointer
$io = Bio::SeqIO->new(-format => "genbank", -file => ">$newoutfile" );
$io->write_seq($seqio_object);
On Fri, Aug 13, 2010 at 11:57 AM, Roy Chaudhuri wrote:
> Please remember to copy replies to the mailing list.
>
> You can loop over the features in your Bio::Seq object:
> for my $feat ($seq->get_SeqFeatures) { # do something }
>
> And once you have found the feature you want to modify, you can add a tag
> using something like:
> $feat->add_tag_value('note',"this is a note");
>
> When you're finished you can write out the modified sequence object to a
> new GenBank file.
>
>
> On 13/08/2010 16:40, Jessica Sun wrote:
>
>> no i want to load the genbank file with existing features and I need to
>> add some new feature tags to the existing ones and then save to a new
>> update genbank file for local usage. I just not quite good on how to
>> easily merge the two steps you recommended into one in a neat way.
>>
>> thx
>>
>>
>> On Fri, Aug 13, 2010 at 11:37 AM, Roy Chaudhuri > > wrote:
>>
>> I'm not sure I understand, do you mean that you want to load just
>> the sequence from the GenBank file (ignoring the existing
>> annotation), then add your own features? There are instructions on
>> how to do that here:
>> http://www.bioperl.org/wiki/HOWTO:SeqIO#Speed.2C_Bio::Seq::SeqBuilder
>>
>>
>> On 13/08/2010 16:27, Jessica Sun wrote:
>>
>> unfortunately. I want to add the feature to the sequence object
>> I got
>> from the Genbank file, I do not mind to save a new genbank file but
>> these new genbank file contains the original genbank format and
>> info I
>> got plus the new feature tags I need to added to. Any quick
>> solution to
>> this?
>>
>> thx
>>
>> Jessica
>>
>>
>>
>> On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri
>>
>> > >> wrote:
>>
>> Hi Jessica.
>>
>> You need to use Bio::SeqIO to read in the GenBank file to a
>> BioPerl
>> sequence object, and to write your new GenBank file:
>> http://www.bioperl.org/wiki/HOWTO:SeqIO
>>
>> To add a new feature follow the instructions here:
>>
>> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
>>
>> (except that you are adding the feature to the sequence
>> object you
>> got from the Genbank file, not a new Bio::Seq object).
>>
>> Cheers.
>> Roy.
>>
>>
>> On 13/08/2010 16:06, Jessica Sun wrote:
>>
>> Does anyone knows how to open a genbank file, add new
>> feature
>> and then save
>> a new genbank
>> file with new feature added in bioperl ?
>>
>> thx
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>
>
--
Jessica Jingping Sun
From drummike at gmail.com Fri Aug 13 13:41:55 2010
From: drummike at gmail.com (Mike Williams)
Date: Fri, 13 Aug 2010 13:41:55 -0400
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References:
<4C6562E0.7090008@gmail.com>
<4C6566B0.60706@gmail.com>
<4C656B67.5020402@gmail.com>
Message-ID:
On Fri, Aug 13, 2010 at 1:06 PM, Jessica Sun wrote:
> Thanks. I somehow get these error messages.
> by doing this,
>
> my $feat = new Bio::SeqFeature::Generic(-start =>20,
> -end => $40,
> -primary_tag => 'newfeature' );
> $feat->add_tag_value("note","this is
> notes");
>
That $40 looks fishy. Try deleting the dollar sign. You did mean just 40,
right?
Mike
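A short sketch of why the dollar sign matters: in Perl, $40 is the capture variable for the 40th regex group, so it compiles cleanly even under strict but is undefined unless a pattern with that many groups has just matched. The variable name $end below is only for illustration.

```perl
use strict;
use warnings;

# $40 is a regex capture variable (40th group), not the number 40.
# No pattern has matched here, so it is undef, and strict does not complain.
my $end = $40;
print defined $end ? "end = $end\n" : "end is undef\n";

# What was presumably intended: the plain number.
my $fixed_end = 40;
print "fixed end = $fixed_end\n";
```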
From MEC at stowers.org Fri Aug 13 13:37:50 2010
From: MEC at stowers.org (Cook, Malcolm)
Date: Fri, 13 Aug 2010 12:37:50 -0500
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References:
<4C6562E0.7090008@gmail.com>
<4C6566B0.60706@gmail.com>
<4C656B67.5020402@gmail.com>
Message-ID:
Jessica,
Show more code!
In particular, where did $f get set?
--Malcolm
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jessica Sun
Sent: Friday, August 13, 2010 12:07 PM
To: Roy Chaudhuri
Cc: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Add sequence feature
Thanks. I somehow get these error messages.
--------------------- WARNING ---------------------
MSG: Bio::SeqIO::genbank=HASH(0xa7ba1c) is not a SeqI compliant module.
Attempting to dump, but may fail!
---------------------------------------------------
Can't locate object method "seq" via package "Bio::SeqIO::genbank" at /Library/Perl/5.8.8/Bio/SeqIO/genbank.pm line 760, line 447.
by doing this,
my $feat = new Bio::SeqFeature::Generic(-start => 20,
                                        -end => $40,
                                        -primary_tag => 'newfeature' );
$feat->add_tag_value("note","this is notes");
$f->add_SeqFeature($feat); ## f is original feature pointer
$io = Bio::SeqIO->new(-format => "genbank", -file => ">$newoutfile" );
$io->write_seq($seqio_object);
On Fri, Aug 13, 2010 at 11:57 AM, Roy Chaudhuri wrote:
> Please remember to copy replies to the mailing list.
>
> You can loop over the features in your Bio::Seq object:
> for my $feat ($seq->get_SeqFeatures) { # do something }
>
> And once you have found the feature you want to modify, you can add a
> tag using something like:
> $feat->add_tag_value('note',"this is a note");
>
> When you're finished you can write out the modified sequence object to
> a new GenBank file.
>
>
> On 13/08/2010 16:40, Jessica Sun wrote:
>
>> no i want to load the genbank file with existing features and I need
>> to add some new feature tags to the existing ones and then save to a
>> new update genbank file for local usage. I just not quite good on how
>> to easily merge the two steps you recommended into one in a neat way.
>>
>> thx
>>
>>
>> On Fri, Aug 13, 2010 at 11:37 AM, Roy Chaudhuri
>> > wrote:
>>
>> I'm not sure I understand, do you mean that you want to load just
>> the sequence from the GenBank file (ignoring the existing
>> annotation), then add your own features? There are instructions on
>> how to do that here:
>>
>> http://www.bioperl.org/wiki/HOWTO:SeqIO#Speed.2C_Bio::Seq::SeqBuilder
>>
>>
>> On 13/08/2010 16:27, Jessica Sun wrote:
>>
>> unfortunately. I want to add the feature to the sequence object
>> I got
>> from the Genbank file, I do not mind to save a new genbank file but
>> these new genbank file contains the original genbank format and
>> info I
>> got plus the new feature tags I need to added to. Any quick
>> solution to
>> this?
>>
>> thx
>>
>> Jessica
>>
>>
>>
>> On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri
>>
>> > >> wrote:
>>
>> Hi Jessica.
>>
>> You need to use Bio::SeqIO to read in the GenBank file to a
>> BioPerl
>> sequence object, and to write your new GenBank file:
>> http://www.bioperl.org/wiki/HOWTO:SeqIO
>>
>> To add a new feature follow the instructions here:
>>
>> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Ow
>> n_Sequences
>>
>> (except that you are adding the feature to the sequence
>> object you
>> got from the Genbank file, not a new Bio::Seq object).
>>
>> Cheers.
>> Roy.
>>
>>
>> On 13/08/2010 16:06, Jessica Sun wrote:
>>
>> Does anyone knows how to open a genbank file, add new
>> feature
>> and then save
>> a new genbank
>> file with new feature added in bioperl ?
>>
>> thx
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>
>
--
Jessica Jingping Sun
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
From Kevin.M.Brown at asu.edu Fri Aug 13 13:53:50 2010
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 13 Aug 2010 10:53:50 -0700
Subject: [Bioperl-l] Add sequence feature
In-Reply-To:
References: <4C6562E0.7090008@gmail.com><4C6566B0.60706@gmail.com><4C656B67.5020402@gmail.com>
Message-ID: <1A4207F8295607498283FE9E93B775B406E4529F@EX02.asurite.ad.asu.edu>
If I'm reading your sample code correctly, then you are mistakenly
trying to output the input SeqIO object and not the actual Bio::Seq
object that was read in by SeqIO.
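use Bio::SeqIO;
# $infile and $outfile are placeholders for your own file names
my $seqio = Bio::SeqIO->new(-file => $infile, -format => 'genbank');
my $seq = $seqio->next_seq;    # the Bio::Seq object read from the stream
# manipulate $seq
my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'genbank');
$out->write_seq($seq);         # write the sequence object, not $seqio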
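
The pattern above can be sketched end to end. This is only a minimal sketch, not the original poster's script: the helper name add_note_feature, the 'misc_feature' key, the coordinates, and the command-line file names are all assumptions made for illustration. The BioPerl calls themselves (Bio::SeqIO->new, next_seq, write_seq, Bio::SeqFeature::Generic->new, and add_SeqFeature on the sequence object) are the standard ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;

# Hypothetical helper (not from the thread): build one new feature and
# attach it directly to the Bio::Seq object, not to the SeqIO stream.
sub add_note_feature {
    my ($seq, $start, $end, $note) = @_;
    my $feat = Bio::SeqFeature::Generic->new(
        -start       => $start,
        -end         => $end,
        -primary_tag => 'misc_feature',      # assumed feature key
        -tag         => { note => $note },
    );
    $seq->add_SeqFeature($feat);
    return $feat;
}

# Usage (file names are placeholders): perl add_feature.pl in.gbk out.gbk
if (@ARGV == 2) {
    my ($infile, $outfile) = @ARGV;
    my $in  = Bio::SeqIO->new(-file => $infile,     -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'genbank');

    # next_seq returns a Bio::Seq object for each record in the file.
    while (my $seq = $in->next_seq) {
        add_note_feature($seq, 20, 40, 'this is a note');
        $out->write_seq($seq);    # pass the Bio::Seq, never the stream $in
    }
}
```

Keeping the feature-building step in its own subroutine means the same code can be exercised on an in-memory Bio::Seq object without touching any files.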
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jessica Sun
Sent: Friday, August 13, 2010 10:07 AM
To: Roy Chaudhuri
Cc: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Add sequence feature
Thanks. I somehow get these error messages.
--------------------- WARNING ---------------------
MSG: Bio::SeqIO::genbank=HASH(0xa7ba1c) is not a SeqI compliant module.
Attempting to dump, but may fail!
---------------------------------------------------
Can't locate object method "seq" via package "Bio::SeqIO::genbank" at
/Library/Perl/5.8.8/Bio/SeqIO/genbank.pm line 760, line 447.
by doing this,
my $feat = Bio::SeqFeature::Generic->new(-start       => 20,
                                         -end         => 40,
                                         -primary_tag => 'newfeature');
$feat->add_tag_value("note", "this is notes");
$f->add_SeqFeature($feat); ## f is original feature pointer
$io = Bio::SeqIO->new(-format => "genbank", -file => ">$newoutfile" );
$io->write_seq($seqio_object);
On Fri, Aug 13, 2010 at 11:57 AM, Roy Chaudhuri wrote:
> Please remember to copy replies to the mailing list.
>
> You can loop over the features in your Bio::Seq object:
> for my $feat ($seq->get_SeqFeatures) {
>     # do something
> }
>
> And once you have found the feature you want to modify, you can add a
> tag using something like:
> $feat->add_tag_value('note',"this is a note");
>
> When you're finished you can write out the modified sequence object to
> a new GenBank file.
>
>
> On 13/08/2010 16:40, Jessica Sun wrote:
>
>> No, I want to load the GenBank file with its existing features, add
>> some new feature tags to the existing ones, and then save the result
>> to a new, updated GenBank file for local use. I am just not sure how
>> to merge the two steps you recommended into one in a neat way.
>>
>> thx
>>
>>
>> On Fri, Aug 13, 2010 at 11:37 AM, Roy Chaudhuri wrote:
>>
>> I'm not sure I understand, do you mean that you want to load just
>> the sequence from the GenBank file (ignoring the existing
>> annotation), then add your own features? There are instructions on
>> how to do that here:
>>
>> http://www.bioperl.org/wiki/HOWTO:SeqIO#Speed.2C_Bio::Seq::SeqBuilder
>>
>>
>> On 13/08/2010 16:27, Jessica Sun wrote:
>>
>> Unfortunately not. I want to add the feature to the sequence object I
>> got from the GenBank file. I do not mind saving a new GenBank file, as
>> long as it contains the original GenBank format and info plus the new
>> feature tags I need to add. Is there a quick solution to this?
>>
>> thx
>>
>> Jessica
>>
>>
>>
>> On Fri, Aug 13, 2010 at 11:21 AM, Roy Chaudhuri wrote:
>>
>> Hi Jessica.
>>
>> You need to use Bio::SeqIO to read in the GenBank file to a BioPerl
>> sequence object, and to write your new GenBank file:
>> http://www.bioperl.org/wiki/HOWTO:SeqIO
>>
>> To add a new feature follow the instructions here:
>>
>>
>> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Building_Your_Own_Sequences
>>
>> (except that you are adding the feature to the sequence object you
>> got from the Genbank file, not a new Bio::Seq object).
>>
>> Cheers.
>> Roy.
>>
>>
>> On 13/08/2010 16:06, Jessica Sun wrote:
>>
>> Does anyone know how to open a GenBank file, add a new feature, and
>> then save a new GenBank file with the new feature added, in BioPerl?
>>
>> thx
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>>
>>
>>
>>
>> --
>> Jessica Jingping Sun
>>
>
>
--
Jessica Jingping Sun
From jessica.sun at gmail.com Fri Aug 13 15:16:51 2010
From: jessica.sun at gmail.com (Jessica Sun)
Date: Fri, 13 Aug 2010 15:16:51 -0400
Subject: [Bioperl-l] Fwd: Add sequence feature
In-Reply-To:
References:
<4C6562E0.7090008@gmail.com>
<4C6566B0.60706@gmail.com>
<4C656B67.5020402@gmail.com>
<1A4207F8295607498283FE9E93B775B406E4529F@EX02.asurite.ad.asu.edu>
Message-ID:
---------- Forwarded message ----------
From: Jessica Sun
Date: Fri, Aug 13, 2010 at 3:16 PM
Subject: Re: [Bioperl-l] Add sequence feature
To: Kevin Brown
Yes, I changed that, but somehow it still did not take the added features in.
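
One way to narrow down a problem like this is to check, before writing, whether the change actually landed on the Bio::Seq object. The following is a rough sketch under assumptions: the helper name tag_features, the 'gene' feature type, and the in-memory demo sequence are invented for illustration and are not taken from the script under discussion. It uses only get_SeqFeatures, primary_tag, and add_tag_value as suggested earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Hypothetical helper: add a note to every top-level feature of the given
# type and return how many features were changed.
sub tag_features {
    my ($seq, $primary_tag, $note) = @_;
    my $changed = 0;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq $primary_tag;
        $feat->add_tag_value('note', $note);
        $changed++;
    }
    return $changed;
}

# Small in-memory demonstration; real code would get $seq from next_seq.
my $seq = Bio::Seq->new(-id => 'demo', -seq => 'ACGT' x 15);
$seq->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -start       => 1,
        -end         => 30,
        -primary_tag => 'gene',
    )
);

my $n     = tag_features($seq, 'gene', 'this is a note');
my @feats = $seq->get_SeqFeatures;
print "tagged $n feature(s); sequence now has ", scalar(@feats), " feature(s)\n";
```

If the count printed here is zero in a real script, the tag or feature never reached the sequence object, so the output file cannot contain it either.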
On Fri, Aug 13, 2010 at 1:53 PM, Kevin Brown wrote:
> If I'm reading your sample code correctly, then you are mistakenly
> trying to output the input SeqIO object and not the actual Bio::Seq
> object that was read in by SeqIO.
>
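> use Bio::SeqIO;
> # $infile and $outfile are placeholders for your own file names
> my $seqio = Bio::SeqIO->new(-file => $infile, -format => 'genbank');
> my $seq = $seqio->next_seq;    # the Bio::Seq object read from the stream
>
> # manipulate $seq
>
> my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'genbank');
> $out->write_seq($seq);         # write the sequence object, not $seqio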
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jessica Sun
> Sent: Friday, August 13, 2010 10:07 AM
> To: Roy Chaudhuri
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Add sequence feature
>
> Thanks. I somehow get these error messages.
>
> --------------------- WARNING ---------------------
> MSG: Bio::SeqIO::genbank=HASH(0xa7ba1c) is not a SeqI compliant module.
> Attempting to dump, but may fail!
> ---------------------------------------------------
> Can't locate object method "seq" via package "Bio::SeqIO::genbank" at
> /Library/Perl/5.8.8/Bio/SeqIO/genbank.pm line 760,