[BioPython] what to use for working with fasta sequences and alignments?
Peter (BioPython List)
biopython at maubp.freeserve.co.uk
Wed Jan 10 15:58:28 UTC 2007
Jan Kosinski wrote:
> Hi,
>
> I am quite new to BioPython and I am a little confused when trying
> to use BioPython for working with fasta sequences and alignments.
>
> For instance, I can read and parse fasta files with Bio.Fasta, get
> records (as Fasta.Record objects), iterate and so on. But then I turn
> to the Bio.Fasta.FastaAlign module, which offers the FastaAlignment
> class (a subclass of Alignment). However, this class has very few
> methods, and get_all_seqs and get_seq_by_num return SeqRecord objects
> instead of Fasta.Record objects (why??), which makes it hard to combine
> Bio.Fasta.FastaAlign (with SeqRecord) for alignments with Bio.Fasta
> (with Fasta.Record) for sequences. Maybe I am wrong, but Biopython
> seems to be full of incompatibilities. Or is one supposed to know
> which modules and classes should not be used?
>
> Could you recommend what I should use for my work with fasta
> sequences and alignments? Which BioPython modules and classes?
You can use Bio.Fasta to read in files either as Fasta.Record objects,
or as SeqRecord objects. I would use SeqRecord objects - they are more
general should you ever want to use a different input file format - plus
as you have noticed, the alignment object also uses SeqRecord objects to
hold each (gapped) sequence.
There are other options if you search the code - but Bio.Fasta is the
best documented and most used.
If you are brave, then you might have a look at the new code in
Bio.SeqIO which you can get from CVS. This is still in a state of flux
however... but the Fasta parsing is much faster. See this page and the
mailing list archives for more:
http://www.biopython.org/wiki/SeqIO
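As a rough sketch of the SeqRecord-based approach: the example below assumes the Bio.SeqIO interface as it was later released, i.e. SeqIO.parse(handle, "fasta"), which may differ from the CVS code at the time of this message. The sequence names and data are made up for illustration, and a StringIO handle stands in for a real file.

```python
from io import StringIO

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

# Hypothetical two-sequence FASTA data, used here instead of a real file.
fasta_text = """>seq1 first example
ACGTACGT
>seq2 second example
ACGTTACG
"""

# SeqIO.parse yields one SeqRecord per FASTA entry.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

for record in records:
    # record.id is the first word of the header line;
    # record.seq holds the sequence itself.
    print(record.id, len(record.seq), isinstance(record, SeqRecord))
```

Because every entry comes back as a SeqRecord, the same objects can be handed to code that works with alignments, which also store each (gapped) sequence as a SeqRecord.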
> Or should I use other packages like CoreBio?
You could do - it has the advantage of having started recently from a
clean slate, and having much less "old code".
> Thank you in advance for any guidelines,
> Janek Kosinski
Peter