[Biopython] Feature request: conservation line for fasta-m10 format

Peter Cock p.j.a.cock at googlemail.com
Thu Apr 21 23:18:35 UTC 2011


On Thu, Apr 21, 2011 at 10:47 PM, Aurelien Mazurie <ajmazurie at oenone.net> wrote:
>
>        Greetings,
>        I used to write my own parser for FASTA alignment outputs, until
> I realized Biopython had a dedicated module, Bio.AlignIO.FastaIO.
> However I can't figure out how to get the conservation line out of the
> FASTA results. Looking at the most recent version of the FastaIO.py file
> (https://github.com/biopython/biopython/blob/master/Bio/AlignIO/FastaIO.py)
> I see that the 'al_cons' tag is read (lines 180 to 204), but the only
> variable in which it is stored, align_consensus, appears not to be used
> anywhere else in the program (it is assigned on lines 198 or 202, then
> nothing else is done with it).

Correct, as the comments imply, we don't store the al_cons
line at the moment. The main reason is that we didn't have
anywhere suitable to put it in the alignment object. It could
be regarded as an example of per-column annotation of the
alignment as a whole (something useful to have, but again,
not currently supported in the alignment object). Other
alignment tools produce a similar line (even for multiple
sequence alignments, like ClustalW).

>        It is easy to reconstruct this conservation string for nucleotide
> sequences, not so for protein sequences. Would it be possible for
> the authors to expose the align_consensus variable in some way?
> E.g., as a property of both the query and match.
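
For the nucleotide case mentioned above, the conservation string can be
rebuilt from the two aligned sequences alone. The following is a minimal
sketch, not part of Biopython: the function name conservation_line is
hypothetical, and the choice of ':' for an identical column and a space
for anything else is an assumption about the marker characters. For
proteins the same approach is not enough, since marking conservative
substitutions would also need the scoring matrix used for the search.

```python
def conservation_line(query, match, identity=":", mismatch=" "):
    """Return one marker per column for two aligned nucleotide sequences.

    Hypothetical helper: the marker characters are assumptions, and
    gaps are expected to be written as '-'.
    """
    if len(query) != len(match):
        raise ValueError("aligned sequences must have the same length")
    markers = []
    for q, m in zip(query.upper(), match.upper()):
        # A column counts as conserved only when both residues are the
        # same letter and neither of them is a gap.
        if q == m and q != "-":
            markers.append(identity)
        else:
            markers.append(mismatch)
    return "".join(markers)


if __name__ == "__main__":
    query = "ACGT-ACGT"
    match = "ACCTTAC-T"
    print(query)
    print(conservation_line(query, match))
    print(match)
```

The result has the same length as the alignment, so it can be printed
between the two gapped sequences or kept alongside them as a per-column
annotation.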

Could you explain what you want to use it for? Part of the reason
I'm asking is to better understand how you see it fitting the current
object model.

Thanks,

Peter



