[BioPython] Module for ace assembly format

Peter Cock p.j.a.cock at googlemail.com
Tue Mar 25 18:30:51 UTC 2008


Forwarding this back to the mailing list in case anyone else was
following this thread.

Peter

---------- Forwarded message ----------
From: Bruno Santos <bsantos at biocant.pt>
Date: Tue, Mar 25, 2008 at 6:20 PM
Subject: Re: [BioPython] Module for ace assembly format
To: biopython at biopython.org


 > Did you mean to send this email to me only?
 >
 > Peter

 Not really. I hit reply and forgot to change the address.

 I can get some documentation about that, but most of the files produced can
 already be parsed by the existing code, since the assembler has
 integration with other software packages such as the Phred/Phrap/Consed
 package, and it also outputs its results in FASTA and other standard
 formats. So I don't think it is necessary to create a module to deal with all
 the output files, since many of them can already be parsed with the
 current versions of the SeqIO sequence modules.
 The main problem with the Roche files is that the primary files created by
 the sequencing machine are proprietary binary files called SFF, but Roche
 provides a program to convert this output to standard formats.

-----Original Message-----
 From: Peter
 Sent: Tuesday, 25 March 2008 18:03
 To: Bruno Santos
 Subject: Re: [BioPython] Module for ace assembly format



On Tue, Mar 25, 2008 at 5:50 PM, Bruno Santos wrote:
 >  I'm not sure if it's the same. From the Bugzilla entry I wasn't able
 >  to understand what its objective is: whether it is to create
 >  a parser to read the custom FASTA files produced by the
 >  sequencer with the extension .fna, or to create parsers
 >  for all the files produced by the assembler. But from what I
 >  understand, it's just an improvement to the current FASTA
 >  parsers to deal with this specific format.

 I think Jared wanted a fancy "fasta like" parser to cope with all
 sorts of files, including some produced by Roche.  Michiel and I didn't
 regard these non-sequence files as FASTA files, but agreed a separate
 module to parse the Roche "fasta like" files might be useful.  Without
 first-hand experience of the various output files from Roche, or a
 good set of samples, it's not clear to me how best to proceed.  But I
 don't want to extend an existing fasta sequence parsing module to deal
 with these.

 If you could add example files and links to any Roche documentation,
 that would help - even if it's just what their terminology is for all
 the different file types.

 Did you mean to send this email to me only?

 Peter



