[Biopython-dev] GenBank parsing problem
Brad Chapman
chapmanb at arches.uga.edu
Wed Jan 30 09:26:47 EST 2002
Hi Chunlei;
Thanks for reporting the problem (and thanks to others who verified it).
> >>> from Bio import GenBank
> >>> gi=GenBank.search_for("NM_007355")[0]
> >>> ncbi_dict=GenBank.NCBIDictionary(parser=GenBank.FeatureParser())
> >>> record=ncbi_dict[gi]
> Traceback (most recent call last):
[...]
> ParserPositionException: error parsing at or beyond character 55
> >>>
>
> Did GenBank change the format?
Yup, it looks like they added a new "linear" keyword to the LOCUS line,
presumably to complement "circular":
LOCUS AC091001 177066 bp DNA linear PRI 06-DEC-2001
Sorry, I'd tried to prepare for the new format changes, but hadn't
realized this change was going to happen. The diff to
Bio/GenBank/genbank_format.py is attached (fixes and tests for this case
are also in CVS). I tested the fix against a PRI download from NCBI, and
it seems to be working for me.
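
To illustrate what the change does, here is a minimal sketch that uses
Python's standard re module rather than Martel. The pattern, the field
names, and the helpers build_locus_re and parse_locus are assumptions made
for illustration, based only on the single LOCUS line quoted above; this is
not the actual expression in genbank_format.py.

```python
import re


def build_locus_re(topologies):
    """Build a LOCUS-line pattern that accepts the given topology words.

    Hypothetical helper: the field layout is inferred from one example
    line and is not the real Martel grammar.
    """
    topology_alt = "|".join(re.escape(word) for word in topologies)
    return re.compile(
        r"^LOCUS\s+(?P<name>\S+)\s+"
        r"(?P<size>\d+)\s+bp\s+"
        r"(?P<residue_type>\S+)"
        # The topology word is optional, as in the original expression.
        r"(?:\s+(?P<topology>" + topology_alt + r"))?"
        r"\s+(?P<division>[A-Z]{3})\s+"
        r"(?P<date>[-\w]+)\s*$"
    )


# Before the fix: only "circular" was recognised after the residue type.
OLD_LOCUS_RE = build_locus_re(["circular"])
# After the fix: either "circular" or "linear" is accepted.
NEW_LOCUS_RE = build_locus_re(["circular", "linear"])


def parse_locus(line, pattern=NEW_LOCUS_RE):
    """Return the LOCUS fields as a dict, or raise ValueError on no match."""
    match = pattern.match(line)
    if match is None:
        raise ValueError("unrecognised LOCUS line: %r" % line)
    return match.groupdict()


if __name__ == "__main__":
    line = "LOCUS AC091001 177066 bp DNA linear PRI 06-DEC-2001"
    print(OLD_LOCUS_RE.match(line))  # no match with the old alternatives
    print(parse_locus(line))
```

With only "circular" in the alternation, the word "linear" is left where
the parser expects the three-letter division code, so the line is rejected;
adding "linear" as a second alternative lets parsing continue.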
Thanks again for the report! I hope this fixes your problem. Please let
me know if you have any questions.
Brad
-------------- next part --------------
Index: genbank_format.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/GenBank/genbank_format.py,v
retrieving revision 1.16
retrieving revision 1.17
diff -c -r1.16 -r1.17
*** genbank_format.py 2002/01/05 22:09:58 1.16
--- genbank_format.py 2002/01/30 13:54:05 1.17
***************
*** 106,112 ****
Martel.Opt(Martel.Alt(*residue_prefixes)) +
Martel.Opt(Martel.Alt(*residue_types)) +
Martel.Opt(Martel.Opt(blank_space) +
! Martel.Str("circular")))
date = Martel.Group("date",
Martel.Re("[-\w]+"))
--- 106,113 ----
Martel.Opt(Martel.Alt(*residue_prefixes)) +
Martel.Opt(Martel.Alt(*residue_types)) +
Martel.Opt(Martel.Opt(blank_space) +
! Martel.Alt(Martel.Str("circular"),
! Martel.Str("linear"))))
date = Martel.Group("date",
Martel.Re("[-\w]+"))