>   What is the point of having both the "ref" and "dat" formats
> in PIR?
> As far as I can tell, the ref format is easier to machine parse
> than the dat one, and is more compact.  The dat format is easier
> for a human to scan.  Also, the dat format contains the sequence
> information while the ref one does not.
> Can anyone here give me some background?

The seq file is usually derived from dat so that BLAST databases (or
anything else that requires FASTA-formatted sequences) can be built. As I
understand it, ref is a trimmed-down dat without the sequence data, so you
can save some space by not keeping the partially redundant dat. I don't
know for sure, but the more compact format might be another measure along
those lines.
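
To make the "seq is derived from dat" step concrete, here is a rough Python sketch of that kind of conversion. It is not PIR's own tooling. The record layout is an assumption on my part: each entry has an ENTRY line, a TITLE line, a SEQUENCE line followed by numbered residue lines, and a closing "///". The function name dat_to_fasta is made up for the example, and it ignores multi-line titles and every other field.

```python
def dat_to_fasta(dat_text):
    """Convert flat-file records (assumed layout, see above) to FASTA text."""
    records = []
    entry_id, title, residues, in_sequence = None, "", [], False
    for line in dat_text.splitlines():
        if line.startswith("ENTRY"):
            # Assumption: the identifier is the second whitespace-separated token.
            parts = line.split()
            entry_id = parts[1] if len(parts) > 1 else None
        elif line.startswith("TITLE"):
            title = line[len("TITLE"):].strip()
        elif line.startswith("SEQUENCE"):
            in_sequence = True
        elif line.startswith("///"):
            # Assumption: "///" terminates a record.
            if entry_id is not None:
                header = f">{entry_id} {title}".rstrip()
                records.append(header + "\n" + "".join(residues))
            entry_id, title, residues, in_sequence = None, "", [], False
        elif in_sequence:
            # Keep letters only, which drops position numbers and spacing.
            residues.extend(ch for ch in line if ch.isalpha())
    return "\n".join(records) + "\n" if records else ""


if __name__ == "__main__":
    # Made-up record, purely for illustration.
    sample = (
        "ENTRY            ABC1\n"
        "TITLE            example protein\n"
        "SEQUENCE\n"
        "                5        10\n"
        "      1 M K T A Y I A K Q R\n"
        "///\n"
    )
    print(dat_to_fasta(sample), end="")
```

Everything that is not an identifier, a title, or a residue gets thrown away here, which is why the derived file cannot stand in for dat and why keeping both is only partially redundant.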

Perhaps, though they're competing with the OWL database for the
most obfuscated database format ;)

