[Biopython-dev] [Bug 2289] New: LOCUS ss-cRNA => ERROR

Wed May 9 13:48:11 UTC 2007

http://bugzilla.open-bio.org/show_bug.cgi?id=2289

           Summary: LOCUS ss-cRNA => ERROR
           Product: Biopython
           Version: 1.24
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: blocker
          Priority: P1
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: Daniel.Nicorici at gmail.com

When I am processing a GenBank file from NCBI I get this error:
=======================================================================
Traceback (most recent call last):
  File "F:\silvermine\tool\populator\ncbigenomic\source\python\do.py", line 26, in <module>
    record = iterator.next()
  File "D:\Python25\lib\site-packages\Bio\GenBank\__init__.py", line 142, in next
    return self._parser.parse(self.handle)
  File "D:\Python25\lib\site-packages\Bio\GenBank\__init__.py", line 208, in parse
    self._scanner.feed(handle, self._consumer)
  File "D:\Python25\lib\site-packages\Bio\GenBank\Scanner.py", line 360, in feed
    self._feed_first_line(consumer, self.line)
  File "D:\Python25\lib\site-packages\Bio\GenBank\Scanner.py", line 782, in _feed_first_line
    'LOCUS line does not contain valid sequence type (DNA, RNA, ...):\n' + line
AssertionError: LOCUS line does not contain valid sequence type (DNA, RNA, ...):

LOCUS       NC_005236               1769 bp ss-cRNA     linear   VRL 26-FEB-2007
================================================================================

The error seems to come from the parser, which cannot handle the ss-cRNA
molecule type. If I replace ss-cRNA with ss-RNA, the error goes away.
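
The substitution described above can also be done on the fly instead of by editing the files by hand. The following is only a minimal sketch of a stopgap, not a fix for the parser: the helper names `patch_locus_lines` and `patched_handle` are hypothetical, and padding the replacement to the original width is an assumption made so that the later columns of the LOCUS line do not shift. It is written for current Python; on Python 2.5 the `StringIO` module would take the place of `io`.

```python
import io


def patch_locus_lines(handle, old="ss-cRNA", new="ss-RNA"):
    """Yield the lines of handle, rewriting the molecule type on LOCUS lines only.

    Hypothetical helper: the replacement is padded to the width of the
    original token (an assumption) so the remaining columns keep their place.
    """
    for line in handle:
        if line.startswith("LOCUS") and old in line:
            line = line.replace(old, new.ljust(len(old)), 1)
        yield line


def patched_handle(handle):
    """Return an in-memory, file-like copy of handle with LOCUS lines patched.

    Note that this reads the whole input into memory.
    """
    return io.StringIO("".join(patch_locus_lines(handle)))
```

In the script below, the patched handle would then be passed to `GenBank.Iterator` in place of `infile`. Because the whole file is copied into memory, this may not be suitable for very large .gbff files.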

Here is my Python program that triggers the error:
===========================================================
import glob
from Bio import GenBank

# the files which will be processed
path="G:\\Data\\NCBI\\genomic\\gbff\\temp\\complete*.genomic.gbff"

print "Starting..."

organism=[]
count_organism=[]

feature=[]
count_feature=[]

qualifier=[]
count_qualifier=[]

files = glob.glob(path)
for file in files:
    print ">>>>>>>>>>>>>>>>>>>>>>>>>> " + file + " <<<<<<<<<<<<<<<<<<<<<<<<<"
    parser = GenBank.RecordParser()
    #infile = open("complete1short.genomic.gbff")
    infile = open(file)
    iterator = GenBank.Iterator(infile, parser)
    record = iterator.next()

    while record is not None:
        print record.locus + " --- " + record.organism + " --- " + record.version
        # organism
        flag=0
        for b in range(len(organism)):
            if organism[b]==record.organism:
                count_organism[b]=count_organism[b]+1
                flag=1
                break
        if flag==0:
            organism.append(record.organism)
            count_organism.append(1)

        # features
        for a in range(len(record.features)):
            flag=0
            for b in range(len(feature)):
                if feature[b]==record.features[a].key:
                    count_feature[b]=count_feature[b]+1
                    flag=1
                    break
            if flag==0:
                feature.append(record.features[a].key)
                count_feature.append(1)
            #print "--" + record.features[i].key

            # qualifiers
            for c in range(len(record.features[a].qualifiers)):
                flag=0
                for b in range(len(qualifier)):
                    if qualifier[b]==record.features[a].qualifiers[c].key:
                        count_qualifier[b]=count_qualifier[b]+1
                        flag=1
                        break
                if flag==0:
                    qualifier.append(record.features[a].qualifiers[c].key)
                    count_qualifier.append(1)            
                    #print "----" + record.features[i].qualifiers[j].key
        record=iterator.next()

print "===================ORGANISM========================"
for i in range(len(organism)):
    print organism[i] + "\t" + str(count_organism[i])
print "===================END_ORGANISM===================="

print "===================FEATURES========================"
for i in range(len(feature)):
    print feature[i] + "\t" + str(count_feature[i])
print "===================END_FEATURES===================="

print "===================QUALIFIERS========================"
for i in range(len(qualifier)):
    print qualifier[i] + "\t" + str(count_qualifier[i])
print "===================END_QUALIFIERS===================="

print "The End!!!"

x=raw_input("Press ENTER to continue...")
============================================================
