[Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl

Joshua Udall jaudall at gmail.com
Tue Jan 6 23:13:45 UTC 2009


Chris et al. -

A student and I have written code that does this: it writes ACE files and
parses them one entry at a time.  When we tried to use Bio::Assembly::IO as
it was in 1.5, we ran into problems with large ACE files containing many
entries, because the inherited DB_File-based implementation hit file handle
limits.  Our implementation simply reads one contig at a time instead of
first trying to slurp the whole ACE file into memory.  I'm happy to add it to
BioPerl, but I am not sure how to do it.  If I sent the *.pm files to
someone, could they help me get it into BioPerl?  It may not be perfect, but
it should be a good start.
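
To make the one-contig-at-a-time idea concrete, here is a minimal sketch in
plain Perl.  It is not the code mentioned above and it is not part of
BioPerl: the function name ace_contig_iterator, the hash keys, and the sample
data are all made up for illustration.  It assumes the usual ACE layout, in
which each contig starts with a "CO" header line followed by the consensus
sequence and a blank line, and each read in the contig has an "AF" line.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): returns a closure that yields one
# contig record per call, so only the current contig is held in memory.
sub ace_contig_iterator {
    my ($fh) = @_;
    my $pending;    # a "CO" header that was read ahead but not yet used

    return sub {
        my $header = $pending;
        undef $pending;

        # Skip forward to the next contig header unless one is pending.
        while (!defined $header) {
            my $line = <$fh>;
            return unless defined $line;    # end of file, no more contigs
            $header = $line if $line =~ /^CO \S/;
        }

        # Header layout: CO <name> <bases> <reads> <base segments> <U|C>
        my (undef, $name, $n_bases, $n_reads) = split ' ', $header;
        my %contig = (
            name      => $name,
            n_bases   => $n_bases,
            n_reads   => $n_reads,
            consensus => '',
            reads     => [],
        );

        my $in_consensus = 1;    # consensus lines follow the CO header
        while (defined(my $line = <$fh>)) {
            if ($line =~ /^CO \S/) {    # next contig begins, stop here
                $pending = $line;
                last;
            }
            chomp $line;
            if ($in_consensus) {
                if ($line =~ /^\s*$/) { $in_consensus = 0 }
                else                  { $contig{consensus} .= $line }
            }
            elsif ($line =~ /^AF\s+(\S+)/) {
                push @{ $contig{reads} }, $1;    # read name from AF line
            }
        }
        return \%contig;
    };
}

# Demo on a tiny made-up ACE fragment held in memory.
my $sample = <<'ACE';
AS 2 3

CO Contig1 8 2 1 U
ACGT*ACG

BQ
20 20 20 20 20 20 20

AF read1 U 1
AF read2 C 3
BS 1 8 read1

CO Contig2 4 1 1 U
TTGA

BQ
30 30 30 30

AF read3 U 1
BS 1 4 read3
ACE

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $next_contig = ace_contig_iterator($fh);
while (my $contig = $next_contig->()) {
    printf "%s: %d reads, %d consensus columns\n",
        $contig->{name}, scalar @{ $contig->{reads} },
        length $contig->{consensus};
}
close $fh;
```

For a real file, open it with open my $fh, '<', $path instead of the
in-memory string.  The sketch ignores BQ, RD, QA and tag records; a full
parser would handle those inside the same loop.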

Josh

On Tue, Jan 6, 2009 at 1:52 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Not at this time (write_assembly is not implemented).  If you come up with
> code to do so let us know (patches are always welcome).
>
> chris
>
>
> On Jan 6, 2009, at 2:43 PM, Abhishek Pratap wrote:
>
>> Thanks, that helped.
>>
>> Is there any method to write ACE files?
>>
>> Thanks,
>> -Abhi
>>
>> On Tue, Jan 6, 2009 at 3:36 PM, Smithies, Russell <
>> Russell.Smithies at agresearch.co.nz> wrote:
>>
>>> Here's how I've been doing it:
>>>
>>>
>>> use strict;
>>> use warnings;
>>> use Bio::Assembly::IO;
>>>
>>> my $infile = "454Contigs.ace";
>>> my $parser = Bio::Assembly::IO->new(-file => $infile, -format => "ace")
>>>     or die "Cannot open $infile: $!";
>>> my $assembly = $parser->next_assembly;
>>>
>>> # to work with a named contig
>>> my @wanted_id = ("Contig100");
>>> my ($contig) = $assembly->select_contigs(@wanted_id)
>>>     or die "Contig not found: @wanted_id";
>>>
>>> # get the consensus (a sequence object)
>>> my $consensus = $contig->get_consensus_sequence();
>>>
>>> # get the consensus qualities (qual() returns an array reference)
>>> my @quality_values = @{ $contig->get_consensus_quality()->qual() };
>>>
>>> hope this helps,
>>>
>>> Russell
>>>
>>>
>>>> -----Original Message-----
>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>>> bounces at lists.open-bio.org] On Behalf Of Abhishek Pratap
>>>> Sent: Tuesday, 6 January 2009 6:43 p.m.
>>>> To: bioperl-l at lists.open-bio.org
>>>> Subject: [Bioperl-l] Parser: Ace file (Sequence Assembly) in Bioperl
>>>>
>>>> Hi All
>>>>
>>>> I am looking for some code to parse the ACE file format. I have big ACE
>>>> files which I would like to trim based on a user-defined contig name and
>>>> a specific region, and then write the output to a fresh ACE file.
>>>>
>>>> For now I am trying to tweak Bio::Assembly::IO, but it is kind of slow.
>>>> Are there any alternatives or suggestions?
>>>>
>>>> Thanks All,
>>>> -Abhi
>>>>
>>>> --
>>>> -----------------------------
>>>> Abhishek Pratap
>>>> Bioinformatics Software Engineer
>>>> Institute for Genome Sciences
>>>> School of Medicine, Univ of Maryland
>>>> 801, W. Baltimore Street, Baltimore, MD 21209
>>>> Ph: (+1)-410-706-2296
>>>> www.igs.umaryland.edu/
>>>>
>>>> Chair
>>>> RSG-Worldwide
>>>> ISCB-Student Council
>>>> http://iscbsc.org/rsg
>>>>
>>>> www.bioinfosolutions.com
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>>
>>
>>
>>
>
>



-- 
Joshua Udall
Assistant Professor
295 WIDB
Plant and Wildlife Science Dept.
Brigham Young University
Provo, UT 84602
801-422-9307
Fax: 801-422-0008
USA


