[BioPython] GenBank.FeatureParser() error in Locus field?
Cymon Cox
cymon@duke.edu
04 Apr 2002 18:50:34 -0500
Dear BioPython Folks,
GenBank.FeatureParser() appears to fail on the topology entry ('linear')
in the LOCUS line:
Code:
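from Bio import GenBank

# Look up the GI numbers matching the query at NCBI.
gi_list = GenBank.search_for("Bryum AND rps4")

# Without a parser, the dictionary returns the raw record text.
ncbi_dict = GenBank.NCBIDictionary()
gb_record = ncbi_dict[gi_list[0]]
print gb_record

# With FeatureParser, fetching the same record raises the error below.
record_parser = GenBank.FeatureParser()
ncbi_dict = GenBank.NCBIDictionary(parser=record_parser)
gb_seqrecord = ncbi_dict[gi_list[0]]
print gb_seqrecord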
Result:
>>>
LOCUS BBI251310 642 bp DNA linear PLN 29-MAR-2001
DEFINITION Bryum billarderi chloroplast partial rps4 gene for ribosomal
protein, subunit 4exit.
ACCESSION AJ251310
VERSION AJ251310.1 GI:11121108
KEYWORDS ribosomal protein; RPS4 gene; subunit 4.
SOURCE Bryum billarderi.
ORGANISM Plastid Bryum billarderi
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Bryophyta;
Bryopsida; Bryidae; Bryales; Bryaceae; Bryum.
REFERENCE 1 (bases 1 to 642)
AUTHORS Cox,C.J., Goffinet,B., Newton,A.N., Shaw,A.J. and Hedderson,T.A.
TITLE Phylogenetic relationships among the diplolepideous-alternate
mosses (Bryidae) inferred from nuclear and chloroplast DNA
sequences
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 642)
AUTHORS Cox,C.J.
TITLE Direct Submission
JOURNAL Submitted (22-OCT-1999) Cox C.J., Botany, The Natural History
Museum, Cromwell Road, London SW7 5BD, United Kingdom
FEATURES Location/Qualifiers
source 1..642
/organism="Bryum billarderi"
/organelle="plastid"
/db_xref="taxon:109056"
gene 1..601
/gene="rps4"
CDS <1..601
/gene="rps4"
/codon_start=2
/transl_table=11
/product="ribosomal protein subunit 4"
/protein_id="CAC14741.1"
/db_xref="GI:11121109"
/translation="YRGPRVRIIRRLGALTGLTNKTPQLKTNSINQSISNKKISQYRI
RLEEKQKLRFHYGITERQLLNYVRIARKAKGSTGEVLLQLLEMRLDNVIFRLGMAPTI
PGARQLVNHRHILVNDRIVNIPSYRCKPEDSITIKDRQKSQAIISKNLNLYQKYKTPN
HLTYNFLKKKGLVNQILDRESIGLKINELLVVEYYSRQA"
misc_feature 602..>642
/note="intergenic spacer"
BASE COUNT 260 a 89 c 92 g 201 t
ORIGIN
1 gtatcgagga cctcgtgtaa gaataatacg ccgtttagga gctttaacag gactaactaa
61 taaaacaccc cagttaaaaa ctaattcgat caatcaatca atatctaata aaaaaatttc
121 tcaatatcgc attcgtttgg aagaaaaaca aaaattacgt tttcattatg gaataacaga
181 gcgacaatta cttaattatg tacgtattgc tagaaaagca aaagggtcaa caggtgaagt
241 gttattacaa ttacttgaaa tgcgcttaga taacgttatt tttcgattag gtatggctcc
301 tacgattcct ggagcaaggc aactagtaaa tcatagacat attttagtta atgatcgtat
361 agtaaatata ccgagttacc gttgtaaacc tgaggattct attactataa aagatcgaca
421 aaaatctcag gctataatta gtaaaaatct aaatttgtat caaaaatata aaacaccaaa
481 tcatttaact tataattttt taaaaaaaaa aggattagtt aatcaaatac tagatcgtga
541 atccattggt ttaaaaataa atgaattatt agttgtagaa tattattctc gtcaagctta
601 attagcaact aagagtattt ttaattatat acataataaa aa
//
Traceback (most recent call last):
  File "<string>", line 1, in ?
  File "/home/cymon/python/moss_db2/genbank_parser2temp.py", line 19, in ?
    gb_seqrecord = ncbi_dict[gi_list[0]]
  File "/usr/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1555, in __getitem__
    return self.parser.parse(handle)
  File "/usr/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 268, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.2/site-packages/Bio/GenBank/__init__.py", line 1250, in feed
    self._parser.parseFile(handle)
  File "/usr/lib/python2.2/site-packages/Martel/Parser.py", line 230, in parseFile
    self.parseString(fileobj.read())
  File "/usr/lib/python2.2/site-packages/Martel/Parser.py", line 258, in parseString
    self._err_handler.fatalError(result)
  File "/var/tmp/python2-2.2-root/usr/lib/python2.2/xml/sax/handler.py", line 38, in fatalError
ParserPositionException: error parsing at or beyond character 55
>>>
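The exception reports only a character offset, so it can help to look at the text around that offset. Below is a minimal sketch of one way to do that. The helper name context_at is hypothetical, and the LOCUS line is rebuilt from an assumed fixed-width column layout because the spacing of the record as printed above may not match what the parser received. The offset is treated as 1-based, which is also an assumption; if the parser counts from 0, the caret would land one character to the right.

```python
def context_at(text, offset, width=12):
    """Return (snippet, caret) for the text around a 1-based offset.

    Hypothetical helper for illustration; not part of Biopython or Martel.
    """
    idx = offset - 1                         # convert to a 0-based index
    start = max(0, idx - width)
    end = min(len(text), idx + width + 1)
    snippet = text[start:end]
    caret = " " * (idx - start) + "^"        # caret sits under the offset
    return snippet, caret


# ASSUMPTION: fixed-width LOCUS layout (79 columns). The widths below are
# a reconstruction for illustration, not copied from the fetched record.
locus_line = (
    "LOCUS".ljust(12)           # columns 1-12: keyword plus padding
    + "BBI251310".ljust(16)     # columns 13-28: locus name
    + " "                       # column 29
    + "642".rjust(11)           # columns 30-40: sequence length
    + " bp "                    # columns 41-44
    + " " * 3                   # columns 45-47: strandedness (blank here)
    + "DNA".ljust(6)            # columns 48-53: molecule type
    + "  "                      # columns 54-55
    + "linear".ljust(8)         # columns 56-63: topology
    + " PLN "                   # columns 64-68: division code
    + "29-MAR-2001"             # columns 69-79: date
)

snippet, caret = context_at(locus_line, 55)
print(snippet)
print(caret)
```

Under these assumptions the caret falls on the space immediately before 'linear'.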
Character 55 is the last space before the 'l' of 'linear'. If 'linear' is
removed from the LOCUS line, the record parses just fine.
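As a stopgap, the topology word could be stripped from the LOCUS line before the text reaches the parser. The following is a minimal sketch of that idea, not a fix to the parser itself. The function name strip_locus_topology is hypothetical, and the sample text uses single spaces rather than the real column layout. Whether the parser also needs the original column widths preserved is not something tested here.

```python
import re

# Topology keywords, matched as a whole word preceded by whitespace.
_TOPOLOGY = re.compile(r"\s(linear|circular)(?=\s)")


def strip_locus_topology(record_text):
    """Return record_text with the topology word removed from the LOCUS line.

    Hypothetical helper, not part of Biopython. Only the first line that
    starts with "LOCUS" is changed; all other lines are returned as-is.
    """
    lines = record_text.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("LOCUS"):
            lines[i] = _TOPOLOGY.sub("", line, count=1)
            break
    return "\n".join(lines)


# Sample text with simplified spacing (an assumption for illustration).
sample = (
    "LOCUS BBI251310 642 bp DNA linear PLN 29-MAR-2001\n"
    "DEFINITION Bryum billarderi chloroplast partial rps4 gene\n"
)
cleaned = strip_locus_topology(sample)
print(cleaned.split("\n")[0])
```

The traceback shows the parser being called as parse(handle), so the cleaned text would presumably need to be wrapped in a file-like object before being passed to record_parser.parse().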
Thanks for all your work,
Cheers, Cymon
--
___________________________________________________
Cymon J. Cox
Research Associate
Department of Biology
Duke University
Durham NC 27708
___________________________________________________