[Biopython] Concatenate to aligned sequences
Peter Cock
p.j.a.cock at googlemail.com
Thu Feb 14 17:29:12 UTC 2013
On Thu, Feb 14, 2013 at 5:20 PM, Vincent Davis <vincent at vincentdavis.net> wrote:
> I have 2 FASTA files from a MUSCLE alignment. Both have the same number of
> sequences from the same organism. If I want to concatenate the pairs of
> sequences, what is the best way to do this?
> Right now I am doing this:
>
> from Bio import SeqIO
>
> def concatenate(fa1, fa2):
>     # Load each FASTA file into a dict keyed by sequence id
>     with open(fa1) as fa1open:
>         fa1dict = SeqIO.to_dict(SeqIO.parse(fa1open, "fasta"))
>     with open(fa2) as fa2open:  # the original opened fa1 a second time here
>         fa2dict = SeqIO.to_dict(SeqIO.parse(fa2open, "fasta"))
>     # check that both files have the same sequence ids
>     if set(fa1dict.keys()) != set(fa2dict.keys()):
>         print(fa1dict.keys(), fa2dict.keys())
>         print('The fasta files do not have the same sequences')
>     bothdict = {}
>     bothlist = []
>     for key in fa2dict.keys():
>         # join the two sequences for this id (second file first, as before)
>         bothdict[key] = fa2dict[key]
>         bothdict[key].seq = fa2dict[key].seq + fa1dict[key].seq
>         bothlist.append(bothdict[key])
>     return bothdict, bothlist
>
> Vincent Davis
> 720-301-3003
Have you tried loading the two alignment files via AlignIO,
sorting by name if required, and adding the alignment objects?
http://biopython.org/DIST/docs/api/Bio.Align.MultipleSeqAlignment-class.html#__add__
Peter
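
A minimal sketch of the AlignIO approach suggested above, not code from the
original thread. It assumes both inputs are FASTA alignments with identical
record ids; the helper name concatenate_alignments and the two tiny in-memory
alignments are made up for illustration. Sorting both alignments by id first
makes the rows line up before the alignments are added column-wise.

```python
from io import StringIO

from Bio import AlignIO

# Two tiny made-up alignments standing in for the MUSCLE output files.
# Note that the records appear in a different order in each one.
FASTA_1 = """>orgA
ACGT-A
>orgB
ACGTTA
"""
FASTA_2 = """>orgB
GG-C
>orgA
GGAC
"""


def concatenate_alignments(handle1, handle2, fmt="fasta"):
    """Hypothetical helper: join two alignments side by side, matched by id."""
    aln1 = AlignIO.read(handle1, fmt)
    aln2 = AlignIO.read(handle2, fmt)
    # Sort rows by record id (in place) so the same organism is on the
    # same row in both alignments.
    aln1.sort()
    aln2.sort()
    ids1 = [rec.id for rec in aln1]
    ids2 = [rec.id for rec in aln2]
    if ids1 != ids2:
        raise ValueError("The alignments do not contain the same sequence ids")
    # Adding two MultipleSeqAlignment objects appends the columns of the
    # second alignment to the rows of the first.
    return aln1 + aln2


combined = concatenate_alignments(StringIO(FASTA_1), StringIO(FASTA_2))
for record in combined:
    print(record.id, record.seq)
```

With real files, pass open file handles or filenames to the helper instead of
the StringIO objects, and write the result out with AlignIO.write(combined,
"combined.fasta", "fasta") (the output filename here is only an example).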