[BioPython] Module for ace assembly format

Bruno Santos bsantos at biocant.pt
Tue Mar 25 18:20:35 UTC 2008


>Did you mean to send this email to me only?
>
>Peter
Not really. I hit reply and forgot to change the mail address.

I can get some documentation about that, but most of the files produced can
already be parsed by the existing code, since the assembler has integration
with other software packages like the Phred/Phrap/Consed package, and it
also outputs the results in FASTA format and other standard formats. So I
don't think it is necessary to create a module to deal with all the output
files, since many of them can already be parsed with the current versions of
the SeqIO sequence modules.
The main problem with the Roche files is that the primary files created by
the sequencing machine are proprietary binary files called SFF, but Roche
provides a program to convert this output to standard formats.
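
As a rough illustration of the difference between the two kinds of file (this
is only a sketch, not Biopython code): the function names sniff_format and
read_fasta below are made up for the example, and the check assumes that an
SFF file starts with the four bytes ".sff", as described in the published SFF
layout. A real .fna file may also have extra fields on the title line that
this simple reader just keeps as part of the title.

```python
import io

# Assumption: SFF files begin with the magic number 0x2E736666 (".sff").
SFF_MAGIC = b".sff"


def sniff_format(handle):
    """Guess the format of a handle opened in binary mode (hypothetical helper)."""
    start = handle.read(4)
    if start == SFF_MAGIC:
        return "sff"      # binary file from the sequencer
    if start[:1] == b">":
        return "fasta"    # text file, first record title starts with ">"
    return "unknown"


def read_fasta(handle):
    """Yield (title, sequence) pairs from a handle opened in text mode."""
    title, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new title line closes the previous record, if there was one.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


# Small in-memory demonstration, no real files needed.
print(sniff_format(io.BytesIO(b">read1\nACGT\n")))
for title, seq in read_fasta(io.StringIO(">read1\nAC\nGT\n")):
    print(title, seq)
```

In practice the existing FASTA parser would be used instead of read_fasta;
the point is only that the text output needs nothing special, while the
binary SFF file has to be converted first.
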
-----Original Message-----
From: Peter Cock [mailto:p.j.a.cock at googlemail.com]
Sent: Tuesday, 25 March 2008 18:03
To: Bruno Santos
Subject: Re: [BioPython] Module for ace assembly format

On Tue, Mar 25, 2008 at 5:50 PM, Bruno Santos <bsantos at biocant.pt> wrote:
>  I'm not sure if it's the same. From the Bugzilla entry I wasn't able to
>  understand what its objective is: whether it is to create a parser to
>  read the custom FASTA files produced by the sequencer with the extension
>  .fna, or to create parsers for all the files produced by the assembler.
>  But from what I understand, it's just an improvement to the current FASTA
>  parsers to deal with this specific format.

I think Jared wanted a fancy "fasta like" parser to cope with all
sorts of files, including some produced by Roche.  Michiel and I didn't
regard these non-sequence files as FASTA files, but agreed a separate
module to parse the Roche "fasta like" files might be useful.  Without
first-hand experience of the various output files from Roche, or a
good set of samples, it's not clear to me how best to proceed.  But I
don't want to extend an existing fasta sequence parsing module to deal
with these.

If you could add example files and links to any Roche documentation
that would help - even if it's just what their terminology is for all
the different file types.

Did you mean to send this email to me only?

Peter
