[BioPython] Sequence Annotation: sequence numbering
Iddo Friedberg
idoerg@cc.huji.ac.il
Tue, 26 Jun 2001 16:51:10 +0300 (GMT+0300)
Hi,
: At 12:21 26/06/01, Iddo wrote:
: >I would like to start a discussion about the annotation of protein
: >sequence numbering in Biopython. You are probably all aware of the fact
[...]
On Tue, 26 Jun 2001, Leighton Pritchard wrote:
[...]
:
: As for solutions? I've been thinking about the problem intermittently for a
: wee while, and haven't got anything robust. I'd be glad for others' input,
: though.
:
: My own opinion tends to numbering all PDB/FSSP submissions in line with
: their Swiss-Prot sequences, but that doesn't exactly give us a quick fix,
: does it?
Yes, that would be a good solution for the positional numbering problem,
and a SwissProt-PDB mapper would be extremely useful to the
sequence-structure community.
However, building one is not really within Biopython's scope. (If anyone
knows of such a database, please let us know! I think I'll ask this
particular question in a more general forum.) I was thinking more
about how Biopython should represent the positions of a given sequence
once they are already available from two or more databases.
Given the following sequence & numberings:
sequence   A  C  R  L  M  P
PDB        1  2  -  4  5  5A
SwissProt  1  2  3  4  5  6
A possible implementation would be:
from Bio import Alphabet, Seq, SeqRecord

my_seq = Seq.Seq('ACRLMP', Alphabet.ProteinAlphabet())
# PDB positions: (residue number, insertion code); None = no PDB number
pdb_positions = [(1,''), (2,''), (None,''), (4,''), (5,''), (5,'A')]
sp_positions = [1, 2, 3, 4, 5, 6]
# Attach both numberings to the record, one entry per residue
my_seq_rec = SeqRecord.SeqRecord(my_seq)
my_seq_rec.annotations['pdb_pos'] = pdb_positions
my_seq_rec.annotations['sp_pos'] = sp_positions
As you can see, PDB positions are (number, insertion code) tuples, because
of those rare-but-oh-so-annoying insertion codes. A residue with no PDB
number (the '-' above) gets None as its number.
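
To make the tuple convention concrete, here is a rough sketch in plain
Python (no Biopython needed) of how PDB residue labels such as '5A' could
be turned into those tuples and then used to look positions up in either
direction. The helper parse_pdb_label and the two dictionaries are
hypothetical names for illustration only, not an existing Biopython API:

```python
import re

# Hypothetical helper, not part of Biopython: split a PDB residue label
# such as "5A" into (number, insertion_code). "-" means no PDB residue.
_LABEL = re.compile(r"(-?\d+)([A-Za-z]?)$")

def parse_pdb_label(label):
    if label == "-":
        return (None, "")
    match = _LABEL.match(label)
    if match is None:
        raise ValueError("unrecognised PDB residue label: %r" % label)
    return (int(match.group(1)), match.group(2))

# The two numberings from the example, one entry per residue of ACRLMP.
pdb_positions = [parse_pdb_label(x) for x in "1 2 - 4 5 5A".split()]
sp_positions = [1, 2, 3, 4, 5, 6]

# Lookup tables in both directions. Residues with no PDB number are
# left out of the PDB -> SwissProt table.
pdb_to_sp = {pdb: sp for pdb, sp in zip(pdb_positions, sp_positions)
             if pdb[0] is not None}
sp_to_pdb = {sp: pdb for pdb, sp in zip(pdb_positions, sp_positions)}
```

With these tables, pdb_to_sp[(5, 'A')] is 6, and sp_to_pdb[3] is
(None, ''), i.e. SwissProt residue 3 has no PDB counterpart.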
Comments on this? General comments? Can this be adapted to the genomic DNA
<--> cDNA problem?
Iddo
--
Iddo Friedberg | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120 |
Israel |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/