[Biopython-dev] PIR parsing
Andrew Dalke
dalke at acm.org
Sat Dec 9 02:29:11 EST 2000
Forgot to ask:
What is the point of having both the "ref" and "dat" formats
in PIR?
ref format example:
>P1;I52708
ELAV-like neuronal protein 1, truncated splice form - human
N;Alternate names: Drosophila ELAV(embryonic lethal, abnormal vision)-like 4; Hu antigen D; paraneoplastic encephalomyelitis antigen
C;Species: Homo sapiens (man)
dat format example:
ENTRY I52708 #type complete
TITLE ELAV-like neuronal protein 1, truncated splice form - human
ALTERNATE_NAMES Drosophila ELAV(embryonic lethal, abnormal vision)-like 4;
Hu antigen D; paraneoplastic encephalomyelitis antigen
ORGANISM #formal_name Homo sapiens #common_name man
As far as I can tell, the ref format is easier to machine parse
than the dat one, and is more compact. The dat format is easier
for a human to scan. Also, the dat format contains the sequence
information while the ref one does not.
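To make the comparison concrete, here is a minimal Python sketch of what
parsing each header might involve. The function names (parse_ref, parse_dat,
split_subfields) are hypothetical, and the layout rules are assumptions drawn
only from the two examples above, not from a PIR specification: ref is taken
to hold one "tag;Key: value" field per line, while dat is taken to use a
leading keyword, indented continuation lines, and "#name value" subfields.

```python
import re

def parse_ref(lines):
    """Parse a ref-style record (assumed layout: header, title, tagged fields)."""
    header, title, *rest = lines
    # ">P1;I52708" -> type code "P1", identifier "I52708"
    code, ident = header[1:].split(";", 1)
    record = {"type": code, "id": ident, "title": title.strip()}
    for line in rest:
        # Assumed shape: "C;Species: Homo sapiens (man)"
        _tag, body = line.split(";", 1)
        key, value = body.split(":", 1)
        record[key.strip()] = value.strip()
    return record

def parse_dat(lines):
    """Parse a dat-style record (assumes continuation lines are indented)."""
    fields = {}
    key = None
    for line in lines:
        if line[:1].isspace() and key is not None:
            # Continuation of the previous field
            fields[key] += " " + line.strip()
        else:
            key, _, value = line.partition(" ")
            fields[key] = value.strip()
    return fields

def split_subfields(value):
    """Split '#name value #name value' into a dict of subfields."""
    return {m.group(1): m.group(2).strip()
            for m in re.finditer(r"#(\w+)\s+([^#]*)", value)}

ref_record = parse_ref([
    ">P1;I52708",
    "ELAV-like neuronal protein 1, truncated splice form - human",
    "C;Species: Homo sapiens (man)",
])

dat_record = parse_dat([
    "ENTRY I52708 #type complete",
    "TITLE ELAV-like neuronal protein 1, truncated splice form - human",
    "ALTERNATE_NAMES Drosophila ELAV(embryonic lethal, abnormal vision)-like 4;",
    "  Hu antigen D; paraneoplastic encephalomyelitis antigen",
    "ORGANISM #formal_name Homo sapiens #common_name man",
])
organism = split_subfields(dat_record["ORGANISM"])
```

Under these assumptions the ref parser needs only two splits per line, while
the dat parser has to track continuation lines and then make a second pass to
pull apart the "#" subfields.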
Can anyone here give me some background?
Andrew