[BioPython] proposal 3
Jeffrey Chang
jchang@SMI.Stanford.EDU
Tue, 18 Apr 2000 18:51:47 -0700 (PDT)
On Thu, 30 Mar 2000, Andrew Dalke wrote:
> Proposal 3 - The ends of a sequence may correspond to physical
> ends of the real sequence. This data is stored in the attribute
> "endings", which has two elements, "left" and "right". (Left
> is position 0.) The possible values for the elements are
> UNKNOWN, TERMINAL, NONTERMINAL.
This is really complicated by the biology, because it's unclear what the
notion of a sequence really is. For example, is a sequence based on the
data read from a gel? What about alternative splicing, post-translational
modifications, SNPs, fragments, or plasmids? In addition, some common
data structures used in bioinformatics have no equivalent in biology,
such as consensus sequences, alignment hits, and profiles/motifs/blocks.
In many of these cases, it is unclear what TERMINAL might mean.
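
To make the quoted attribute concrete, here is a rough sketch of what
"endings" might look like on a sequence object. Only the names "endings",
"left", "right", UNKNOWN, TERMINAL and NONTERMINAL come from the proposal;
the Ending, Endings and Seq classes, the subseq() method, and the rule for
how endings propagate to a subsequence are all assumptions made up for
illustration, not anything in Biopython.

```python
from enum import Enum


class Ending(Enum):
    # The three values named in the proposal.
    UNKNOWN = "unknown"
    TERMINAL = "terminal"
    NONTERMINAL = "nonterminal"


class Endings:
    """Hypothetical holder for the two elements; left is position 0."""

    def __init__(self, left=Ending.UNKNOWN, right=Ending.UNKNOWN):
        self.left = left
        self.right = right


class Seq:
    """Hypothetical sequence object carrying the proposed attribute."""

    def __init__(self, data, endings=None):
        self.data = data
        self.endings = endings if endings is not None else Endings()

    def subseq(self, start, end):
        # Assumed rule (not in the proposal): an end survives only if the
        # slice reaches it; any interior cut point becomes NONTERMINAL.
        # Assumes 0 <= start <= end <= len(self.data).
        left = self.endings.left if start == 0 else Ending.NONTERMINAL
        right = (self.endings.right if end == len(self.data)
                 else Ending.NONTERMINAL)
        return Seq(self.data[start:end], Endings(left, right))


whole = Seq("ATGCATGC", Endings(Ending.TERMINAL, Ending.UNKNOWN))
head = whole.subseq(0, 4)   # keeps the left end, cuts the right
tail = whole.subseq(4, 8)   # cuts the left, keeps the (unknown) right
```

Even in this toy version, every operation that builds a new sequence has
to decide what to do with the two flags, and it says nothing about what
the flags should be for a consensus sequence or an alignment hit.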
> This information will only be used rarely (as proof, biojava
> and bioperl don't track this data).
Yep. There was some talk earlier on the bioperl lists about whether BLAST
HSPs should use the sequence object. I don't know if they do now, but it
was a plausible use for the object, and one that would not require this
extra attribute.
> Since I don't like the complexity and performance hits,
> I'm against the proposal.
No complaints here.
Jeff