[Biopython-dev] GenBank parser -- first go
Andrew Dalke
dalke at acm.org
Mon Dec 11 15:55:12 EST 2000
I was playing around with a different way to handle
the FEATURES section and came across this example
in IRO125195:
FEATURES Location/Qualifiers
source 1..1326
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="21"
/clone="IMAGE cDNA clone 125195"
/clone_lib="Soares fetal liver spleen 1NFLS"
/note="contains Alu repeat; likely to be be derived from
unprocessed nuclear RNA or genomic DNA; encodes putative
exons identical to FTCD; formimino transferase
cyclodeaminase; formimino transferase (EC 2.1.2.5)
/formimino tetrahydro folate cyclodeaminase (EC
4.3.1.4)"
See the "/formimino"? I had thought that any line starting
with a '/' was a new qualifier, but it looks like you really do
have to parse the quotes as you go to tell when you are done.
While the quoted-quote checking (doubling the "s) is doable with
a regular expression, it gets pretty complicated.
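To make the idea concrete, here is a minimal sketch of quote-aware
qualifier splitting without a regular expression. It is not the parser
itself: the names split_qualifiers and unquote are made up for this
illustration, and it assumes the lines have already been cut out of the
FEATURES block and that continuation lines are joined with a single
space. It relies on the observation that a doubled quote adds two "
characters, so a quoted value is still open exactly when the number of
" characters seen so far is odd.

```python
def unquote(value):
    """Strip surrounding quotes and collapse doubled quotes (hypothetical helper)."""
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        return value[1:-1].replace('""', '"')
    return value


def split_qualifiers(lines):
    """Group feature lines into (key, value) pairs (illustrative sketch only).

    A line starting with '/' begins a new qualifier only when no quoted
    value is currently open.
    """
    qualifiers = []
    key = None
    parts = []
    in_quotes = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("/") and not in_quotes:
            # Flush the previous qualifier before starting a new one.
            if key is not None:
                qualifiers.append((key, unquote(" ".join(parts))))
            key, _, value = line[1:].partition("=")
            parts = [value] if value else []
            # An odd number of quotes means the value continues on later lines.
            in_quotes = value.count('"') % 2 == 1
        elif key is not None:
            # Continuation line; assumed to be joined with a single space.
            parts.append(line)
            if line.count('"') % 2 == 1:
                in_quotes = not in_quotes
        # Lines before the first qualifier (e.g. the feature key and
        # location) are skipped in this sketch.
    if key is not None:
        qualifiers.append((key, unquote(" ".join(parts))))
    return qualifiers


example = [
    'source 1..1326',
    '/organism="Homo sapiens"',
    '/chromosome="21"',
    '/note="formimino transferase (EC 2.1.2.5)',
    '/formimino tetrahydro folate cyclodeaminase (EC',
    '4.3.1.4)"',
]
for name, value in split_qualifiers(example):
    print(name, "=", value)
```

With this approach the "/formimino" line stays inside the /note value
because the quote opened on the previous line has not been closed yet.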
Andrew