[Biopython-dev] Fasta parser

Michiel de Hoon mdehoon at c2b2.columbia.edu
Sun Jul 2 00:43:47 EDT 2006


Thanks Iddo!
I tried the parser in Bio.SeqIO.FASTA and it is indeed a lot faster than 
the Martel-based one in Bio.Fasta.

It would be nice to merge these two modules. However, doing so raises a 
number of design questions (such as Fasta.Record versus SeqRecord, and 
Seq versus string), so it's probably better to hold off until after the 
next Biopython release, which, by the way, is coming up soon.
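
To illustrate what a simple-minded, non-Martel parser along these lines 
might look like, here is a minimal sketch. It is not the code in 
Bio.SeqIO.FASTA; the function name read_fasta and the choice to yield 
plain (title, sequence) strings rather than Fasta.Record or SeqRecord 
objects are assumptions made purely for illustration.

```python
import io


def read_fasta(handle):
    """Yield (title, sequence) string tuples from a FASTA-formatted handle.

    Hypothetical helper for illustration; not part of Biopython.
    """
    title = None
    chunks = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header line ends the previous record, if there was one.
            if title is not None:
                yield title, "".join(chunks)
            title = line[1:].strip()
            chunks = []
        elif title is not None:
            # Sequence lines are accumulated and joined once per record.
            chunks.append(line.strip())
        # Anything before the first ">" header is ignored.
    if title is not None:
        yield title, "".join(chunks)


# Small in-memory example instead of a real file such as ls_orchid.fasta.
example = io.StringIO(">seq1 first record\nACGT\nTTGA\n>seq2\nGGCC\n")
records = list(read_fasta(example))
```

Because the generator reads one line at a time, it never holds more than 
one record in memory. Returning plain strings sidesteps the Seq versus 
string question here; a merged module would have to settle it.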

Thanks,

--Michiel.

Iddo Friedberg wrote:
> Michiel,
> 
> There is actually a simple-minded Fasta reader/writer that does not use 
> Martel: Bio.SeqIO.FASTA
> 
> ./I
> 
> --
> Iddo Friedberg, PhD
> Burnham Institute for Medical Research
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037 USA
> T: +1 858 646 3100 x3516
> http://iddo-friedberg.org
> http://BioFunctionPrediction.org
> 
> 
> 
> -----Original Message-----
> From: biopython-dev-bounces at lists.open-bio.org on behalf of Michiel de Hoon
> Sent: Sat 7/1/2006 2:47 PM
> To: biopython-dev at biopython.org
> Subject: [Biopython-dev] Fasta parser
> 
> Hi everybody,
> 
> The Biopython documentation shows the following approach to parsing a Fasta file:
> 
>  >>> from Bio import Fasta
>  >>> parser = Fasta.RecordParser()
>  >>> file = open("ls_orchid.fasta")
>  >>> iterator = Fasta.Iterator(file, parser)
>  >>> cur_record = iterator.next()
> 
> But for large Fasta files, it's very slow, compared to file.read(),
> which may be due to going through Martel (I believe the same was true
> for large GenBank files).
> 
> So I'm thinking about writing a simple-minded Fasta parser for better
> performance with large files. What I'm wondering about:
> 1) Is there some advantage that I overlooked of using Martel for parsing
> Fasta files?
> 2) Why is it necessary to create a parser first and pass it to
> Fasta.Iterator? Are there any cases where Fasta.Iterator uses something
> other than a Fasta.RecordParser?
> 
> --Michiel.
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
> 


