[Biopython-dev] GenBank parsing problem

Brad Chapman chapmanb at arches.uga.edu
Wed Jan 30 09:26:47 EST 2002


Hi Chunlei;
Thanks for reporting the problem (and thanks to others who verified it).

> >>> from Bio import GenBank
> >>> gi=GenBank.search_for("NM_007355")[0]
> >>> ncbi_dict=GenBank.NCBIDictionary(parser=GenBank.FeatureParser())
> >>> record=ncbi_dict[gi]
> Traceback (most recent call last):
[...]
> ParserPositionException: error parsing at or beyond character 55
> >>> 
> 
> Did GenBank change the format?

Yup, it looks like they added a new "linear" keyword to the LOCUS line,
presumably to complement "circular":

LOCUS       AC091001              177066 bp    DNA     linear   PRI 06-DEC-2001

Sorry, I'd tried to prepare for the new format changes, but hadn't
realized this one was coming. The diff to
Bio/GenBank/genbank_format.py is attached (the fix and tests for this
case are also in CVS). I tested it on a PRI download from NCBI, and it
seems to be working for me.
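
For readers without Martel at hand, the effect of the change can be
sketched with Python's standard re module. This is only an illustration,
not the actual genbank_format.py code: the pattern, the group names, and
the parse_locus helper are all hypothetical, and the field layout is
inferred from the single LOCUS line shown above. The one point it is
meant to show is that the optional topology word now accepts either
"circular" or "linear".

```python
import re

# Hypothetical stand-in for the Martel expression in genbank_format.py.
# Assumption: fields are whitespace-separated as in the example LOCUS line.
LOCUS_RE = re.compile(
    r"^LOCUS\s+(?P<name>\S+)\s+"
    r"(?P<size>\d+) bp\s+"
    r"(?P<residue_type>\S+)\s+"
    # The fix: the optional topology may be "circular" OR "linear".
    r"(?:(?P<topology>circular|linear)\s+)?"
    r"(?P<division>[A-Z]{3})\s+"
    r"(?P<date>[-\w]+)\s*$"
)


def parse_locus(line):
    """Return the LOCUS fields as a dict, or None if the line does not match."""
    match = LOCUS_RE.match(line)
    return match.groupdict() if match else None


new_style = ("LOCUS       AC091001              177066 bp    DNA     "
             "linear   PRI 06-DEC-2001")
fields = parse_locus(new_style)
print(fields["topology"], fields["division"], fields["date"])
```

Lines that omit the topology word still match in this sketch, with the
topology group left as None, which mirrors the Martel.Opt wrapper kept in
the diff.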

Thanks again for the report! I hope this fixes your problem. Please let
me know if you have any questions.
Brad
-------------- next part --------------
Index: genbank_format.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/GenBank/genbank_format.py,v
retrieving revision 1.16
retrieving revision 1.17
diff -c -r1.16 -r1.17
*** genbank_format.py	2002/01/05 22:09:58	1.16
--- genbank_format.py	2002/01/30 13:54:05	1.17
***************
*** 106,112 ****
                              Martel.Opt(Martel.Alt(*residue_prefixes)) +
                              Martel.Opt(Martel.Alt(*residue_types)) +
                              Martel.Opt(Martel.Opt(blank_space) + 
!                                        Martel.Str("circular")))
  
  date = Martel.Group("date",
                      Martel.Re("[-\w]+"))
--- 106,113 ----
                              Martel.Opt(Martel.Alt(*residue_prefixes)) +
                              Martel.Opt(Martel.Alt(*residue_types)) +
                              Martel.Opt(Martel.Opt(blank_space) + 
!                                        Martel.Alt(Martel.Str("circular"),
!                                                   Martel.Str("linear"))))
  
  date = Martel.Group("date",
                      Martel.Re("[-\w]+"))

