[Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
Chris Fields
cjfields at illinois.edu
Tue Jan 6 22:13:24 UTC 2009
How about re-implementing Bio::Assembly classes so they simply map to
Bio::DB::SeqFeature::Store (or similar) methods? Scaffold could just
be a wrapper around a Bio::DB::SeqFeature::Store (which can be BDB/
mysql/postgresql/memory) and return Contigs.
Similarly, the IO classes could probably act as specialized
Bio::DB::SeqFeature::Store::Loader classes for the database and just
return the Scaffold instance.
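
Short of a database-backed rewrite, the "only read the contig you want" idea raised in the quoted thread below can be sketched in plain Perl. This is a rough illustration, not BioPerl code: extract_contig_lines is a made-up helper name, the ACE fragment is invented test data, and it assumes a contig record runs from its CO line up to the next CO line (so any trailing tag blocks after the last contig would be swept into that last record).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the raw lines of one
# contig record from an ACE stream. The file is read line by line, so only
# the requested contig is ever held in memory.
sub extract_contig_lines {
    my ($fh, $wanted) = @_;
    my @lines;
    my $in_contig = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^CO\s+(\S+)/) {
            last if $in_contig;              # next contig starts: done
            $in_contig = ($1 eq $wanted);    # is this the contig we want?
        }
        push @lines, $line if $in_contig;
    }
    return @lines;
}

# Tiny invented ACE fragment, for illustration only.
my $ace = <<'END';
AS 2 2

CO Contig1 4 1 1 U
ACGT

AF read1 U 1

CO Contig2 3 1 1 U
TTT

AF read2 U 1
END

# Read from an in-memory filehandle; a real run would open the .ace file.
open my $fh, '<', \$ace or die "cannot open in-memory file: $!";
my @record = extract_contig_lines($fh, 'Contig2');
close $fh;
print @record;
```

The extracted lines could then be handed to a parser (or written to a new, smaller ACE file once the AS header counts are adjusted) instead of building objects for every contig.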
chris
On Jan 6, 2009, at 3:31 PM, Smithies, Russell wrote:
> I agree with the need for a faster parser.
> Although the current version does a great job, it is slow and memory
> intensive as it loads everything into Bio::Assembly::Scaffold
> objects composed of Bio::Assembly::Contig objects.
> I'm not sure exactly what the best solution would be, perhaps a new
> constructor with a named contig would simplify things?
>
> $io = new Bio::Assembly::IO(-file=>"454_assy.ace",-format=>"ace");
>
> $contig = $io->next_assembly_with_contig(-contig=>"Contig000100")->select_contig;
>
> Or do we even need a next_assembly method?
> Can there be more than one assembly in an .ace file?
>
> --Russell
>
>
>
> From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
> Sent: Wednesday, 7 January 2009 10:07 a.m.
> To: Chris Fields
> Cc: Smithies, Russell; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Parser: Ace file (Sequence Assembly) in
> Bioperl
>
> OK, sure. If we do write something (which I will eventually have
> to :) I will forward it.
>
> @Russell:
>
> I feel that the current method is very slow for getting info on
> specific contigs, as it tries to store the info for all contigs in
> memory. That could be memory intensive, especially with the next-gen
> data coming from 454 sequencers. I think we should grep for the
> contig(s) of interest and then create a record for each. Please
> correct me if I am wrong.
>
> Thanks,
> -Abhi
> On Tue, Jan 6, 2009 at 3:52 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Not at this time (write_assembly is not implemented). If you come
> up with code to do so let us know (patches are always welcome).
>
> chris
>
>
> On Jan 6, 2009, at 2:43 PM, Abhishek Pratap wrote:
> Thanks that helped.
>
> Is there any method to write ACE files?
>
> Thanks,
> -Abhi
>
> On Tue, Jan 6, 2009 at 3:36 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:
> Here's how I've been doing it:
>
>
> use strict;
> use warnings;
> use Bio::Assembly::IO;
>
> my $infile = "454Contigs.ace";
> my $parser = Bio::Assembly::IO->new(-file => $infile, -format => "ace")
>     or die "Could not create a parser for $infile\n";
> my $assembly = $parser->next_assembly;
>
> # to work with a named contig
> my @wanted_id = ("Contig100");
> my ($contig) = $assembly->select_contigs(@wanted_id)
>     or die "Contig not found: @wanted_id\n";
>
> # get the consensus
> my $consensus = $contig->get_consensus_sequence();
>
> # get the consensus qualities
> my @quality_values = @{ $contig->get_consensus_quality()->qual() };
>
> hope this helps,
>
> Russell
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
> Sent: Tuesday, 6 January 2009 6:43 p.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
>
> Hi All
>
> I am looking for some code to parse the ACE file format. I have big
> ACE files which I would like to trim based on a user-defined contig
> name and a specific region, and then write the output to a fresh ACE
> file.
>
> For now I am trying to tweak Bio::Assembly::IO, but it is kind of
> slow. Are there any alternatives or suggestions?
>
> Thanks All,
> -Abhi
>
> --
> -----------------------------
> Abhishek Pratap
> Bioinformatics Software Engineer
> Institute for Genome Sciences
> School of Medicine, Univ of Maryland
> 801, W. Baltimore Street, Baltimore, MD 21209
> Ph: (+1)-410-706-2296
> http://www.igs.umaryland.edu/
>
> Chair
> RSG-Worldwide
> ISCB-Student Council
> http://iscbsc.org/rsg
>
> http://www.bioinfosolutions.com
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> ======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> ======================================================================