[Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Jan 6 21:31:50 UTC 2009


I agree with the need for a faster parser.
Although the current version does a great job, it is slow and memory-intensive because it loads everything into Bio::Assembly::Scaffold objects composed of Bio::Assembly::Contig objects.
I'm not sure exactly what the best solution would be; perhaps a new constructor that takes a contig name would simplify things?

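    # Proposed (hypothetical) interface only; these contig-specific methods are a suggestion
    my $io = Bio::Assembly::IO->new(-file => "454_assy.ace", -format => "ace");

    my $contig = $io->next_assembly_with_contig(-contig => "Contig000100")->select_contig;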

Or do we even need a next_assembly method?
Can there be more than one assembly in an .ace file?

--Russell



From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
Sent: Wednesday, 7 January 2009 10:07 a.m.
To: Chris Fields
Cc: Smithies, Russell; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl

OK, sure. If we do write something (which I will eventually have to :) ), I will forward it.

@Russell:

I feel that getting information for a specific contig is very slow with the current method, because it tries to store the information for all contigs in memory. That can be memory-intensive, especially with the next-gen data coming from 454 sequencers. I think we should grep for the contig(s) of interest and then create a record only for those. Please correct me if I am wrong.
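
A minimal sketch of that grep-style approach, in plain Perl without BioPerl. It assumes each contig record starts at its "CO <name> ..." line and runs until the next "CO" line. The sub name extract_contigs is hypothetical, and the output is not a complete ACE file, because the "AS" header line with the contig and read counts is not regenerated:

```perl
use strict;
use warnings;

# Copy the records of the named contigs from an ACE stream to an output
# handle, one line at a time, without building any objects.
# Returns the number of matching contigs that were copied.
# Assumption: a record begins at "CO <name> ..." and ends just before the
# next "CO" line; any tag blocks that follow a wanted contig are copied too.
sub extract_contigs {
    my ($in, $out, @names) = @_;
    my %wanted = map { $_ => 1 } @names;
    my ($copying, $found) = (0, 0);

    while (my $line = <$in>) {
        # A new contig header switches copying on or off
        if ($line =~ /^CO\s+(\S+)/) {
            $copying = exists $wanted{$1};
            $found++ if $copying;
        }
        print {$out} $line if $copying;
    }
    return $found;
}
```

Called with a read handle on the big ACE file and a write handle on the output, this keeps memory use flat no matter how many contigs the file holds. Trimming to a specific region within a contig would still need the read coordinates to be adjusted, which this sketch does not attempt.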

Thanks,
-Abhi
On Tue, Jan 6, 2009 at 3:52 PM, Chris Fields <cjfields at illinois.edu> wrote:
Not at this time (write_assembly is not implemented).  If you come up with code to do so let us know (patches are always welcome).

chris


On Jan 6, 2009, at 2:43 PM, Abhishek Pratap wrote:
Thanks that helped.

Is there any method to write ACE files?

Thanks,
-Abhi

On Tue, Jan 6, 2009 at 3:36 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:
Here's how I've been doing it:


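use strict;
use warnings;
use Bio::Assembly::IO;

my $infile = "454Contigs.ace";
# BioPerl reports failures by throwing an exception, so "or die $!" is not needed here
my $parser = Bio::Assembly::IO->new(-file => $infile, -format => "ace");
my $assembly = $parser->next_assembly;

# to work with a named contig
my @wanted_id = ("Contig100");
my ($contig) = $assembly->select_contigs(@wanted_id)
    or die "Contig $wanted_id[0] not found in $infile\n";

# get the consensus (a sequence object; call ->seq for the string)
my $consensus = $contig->get_consensus_sequence();

# get the consensus qualities (qual() returns an array reference)
my @quality_values = @{ $contig->get_consensus_quality()->qual() };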

hope this helps,

Russell

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
Sent: Tuesday, 6 January 2009 6:43 p.m.
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl

Hi All

I am looking for some code to parse the ACE file format. I have big ACE
files which I would like to trim based on a user-defined contig name and
a specific region, writing the output to a fresh ACE file.

For now I am trying to tweak Bio::Assembly::IO, but it is kind of slow.
Are there any alternatives or suggestions?

Thanks All,
-Abhi








--
-----------------------------
Abhishek Pratap
Bioinformatics Software Engineer
Institute for Genome Sciences
School of Medicine, Univ of Maryland
801, W. Baltimore Street, Baltimore, MD 21209
Ph: (+1)-410-706-2296
http://www.igs.umaryland.edu/

Chair
RSG-Worldwide
ISCB-Student Council
http://iscbsc.org/rsg

http://www.bioinfosolutions.com
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================










