[BioPython] Sequence Annotation: sequence numbering

Iddo Friedberg idoerg@cc.huji.ac.il
Tue, 26 Jun 2001 16:51:10 +0300 (GMT+0300)


Hi,


: At 12:21 26/06/01, Iddo wrote:
: >I would like to start a discussion about the annotation of protein
: >sequence numbering in Biopython. You are probably all aware of the fact

[...]

On Tue, 26 Jun 2001, Leighton Pritchard wrote:

[...]

:
: As for solutions? I've been thinking about the problem intermittently for a
: wee while, and haven't got anything robust. I'd be glad for others' input,
: though.
:
: My own opinion tends to numbering all PDB/FSSP submissions in line with
: their Swiss-Prot sequences, but that doesn't exactly give us a quick fix,
: does it?

Yes, that would be a good solution for the positional numbering problem,
and a SwissProt-to-PDB mapper would be extremely useful to the
sequence-structure community.

However, building such a mapper is not really within Biopython's scope.
(If anyone knows of such a database, please let us know! I think I'll ask
this particular question in some more general forum.) I was thinking more
about how Biopython should represent the positions for a given sequence
once they are already available from two or more databases.

Given the following sequence & numberings:

sequence     A  C  R  L  M  P
PDB          1  2  -  4  5  5A
SwissProt    1  2  3  4  5  6

A possible implementation would be:

from Bio import Alphabet, Seq, SeqRecord

my_seq = Seq.Seq('ACRLMP', Alphabet.ProteinAlphabet())
# PDB positions: (residue number, insertion code); None = no PDB number
pdb_positions = [(1,''), (2,''), (None,''), (4,''), (5,''), (5,'A')]
# SwissProt positions: plain sequential integers
sp_positions = [1, 2, 3, 4, 5, 6]
my_seq_rec = SeqRecord.SeqRecord(my_seq)
# Attach one position list per database to the record's annotations
my_seq_rec.annotations['pdb_pos'] = pdb_positions
my_seq_rec.annotations['sp_pos'] = sp_positions


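As you can see, PDB positions are (residue number, insertion code) tuples,
because of those rare-but-oh-so-annoying insertion codes. A residue with no
PDB number (the '-' in the table above) gets None as its residue number.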
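
Once both position lists sit side by side, translating between numbering
schemes is a simple lookup. Below is a minimal sketch in plain Python (no
Biopython needed), assuming the two lists are index-aligned as in the table
above. The function name build_position_map and the variable pdb_to_sp are
hypothetical, made up for illustration only; nothing like them exists in
Biopython.

```python
def build_position_map(pdb_positions, sp_positions):
    """Map each PDB (residue number, insertion code) tuple to the
    SwissProt position at the same index.

    Hypothetical helper for illustration; assumes both lists are
    index-aligned, one entry per residue of the same sequence.
    """
    if len(pdb_positions) != len(sp_positions):
        raise ValueError("position lists must have the same length")
    mapping = {}
    for (resnum, icode), sp_pos in zip(pdb_positions, sp_positions):
        if resnum is None:
            # Residue has no PDB number, so there is nothing to map from.
            continue
        mapping[(resnum, icode)] = sp_pos
    return mapping


pdb_positions = [(1, ''), (2, ''), (None, ''), (4, ''), (5, ''), (5, 'A')]
sp_positions = [1, 2, 3, 4, 5, 6]

pdb_to_sp = build_position_map(pdb_positions, sp_positions)
print(pdb_to_sp[(5, 'A')])    # PDB residue 5A is SwissProt position 6
print((3, '') in pdb_to_sp)   # False: residue 3 has no PDB number
```

Keying on the whole tuple keeps 5 and 5A distinct, which a bare integer
key could not do.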

Comments on this? General comments? Can this be adapted to the genomic DNA
<--> cDNA problem?

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/