[BioPython] Big GenBank files

Peter biopython at maubp.freeserve.co.uk
Mon May 2 18:06:21 EDT 2005


I sent Aurélie a Python file version of my patch (from bug 1747) off 
the mailing list, and it looks like there is a problem using it with 
GenBank.NCBIDictionary (see below), which I had never used.

http://bugzilla.open-bio.org/show_bug.cgi?id=1747

Thanks for letting me know Aurélie!

I will try and look at this as time permits, but I will have to work 
out how the NCBIDictionary code works first... so if someone else 
wants to leap in, please do :)

Peter

-------- Original Message --------
Subject: Re: [BioPython] Big GenBank files
Date: Sun, 1 May 2005 19:00:53 +0200
From: Aurélie Bornot <aurelie.bornot at free.fr>
To: Peter <biopython at maubp.freeserve.co.uk>

Hello Peter and everybody!

Sorry Peter: it took me a long time to get back to you about your patch
(GenBank.__init__.py)...

I have tried it with this code (which works with the "old" 
__init__.py):
fichier = open('AC008625.5.gb', "w")
record_parser = GenBank.FeatureParser()
ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser=record_parser)
gb_record = ncbi_dict['AC008625.5']
fichier.close()

And I got this error:
Traceback (most recent call last):
  File "essais.py", line 112, in ?
    gb_record = ncbi_dict['AC008625.5']
  File "C:\Python24\lib\site-packages\Bio\GenBank\__init__.py", line 1736, in __getitem__
    return self.parser.parse(handle)
  File "C:\Python24\lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\lib\site-packages\Bio\GenBank\__init__.py", line 1261, in feed
    line = handle.readline()
AttributeError: ReseekFile instance has no attribute 'readline'

I don't really know why...
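
The last frame of the traceback suggests that the scanner calls
handle.readline() on an object that does not provide that method. Below is
a minimal sketch of that failure mode and one possible workaround. It is
written for Python 3 and is not Biopython code: ReadOnlyHandle and
ensure_readline are hypothetical names made up for illustration, and the
stand-in class only imitates the symptom, not the real ReseekFile.

```python
import io


class ReadOnlyHandle:
    """Hypothetical stand-in for a wrapper that exposes read() only."""

    def __init__(self, text):
        self._buffer = io.StringIO(text)

    def read(self, size=-1):
        return self._buffer.read(size)


def ensure_readline(handle):
    """Return a handle that supports readline().

    If the handle already has readline(), it is returned unchanged.
    Otherwise its whole content is read into an in-memory StringIO.
    """
    if hasattr(handle, "readline"):
        return handle
    return io.StringIO(handle.read())


raw = ReadOnlyHandle("LOCUS       AC008625\nDEFINITION  example\n")
handle = ensure_readline(raw)
first_line = handle.readline()
```

Buffering the whole record in memory is only reasonable if the record fits
comfortably in RAM; for very large files a line-oriented handle passed
straight to the parser is likely the better choice.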

BUT!!! :)

As you said, with something like:
# connection:
fichierGB = urllib2.urlopen(
    "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?id="
    + ID + "&db=" + database + "&retmod=text&rettype=genbank")
record_parser = GenBank.RecordParser()
gb_iterator = GenBank.Iterator(fichierGB, record_parser)
cur_record = gb_iterator.next()
fichierGB.close()

It works!!!
The big files are parsed without any problem... :)
So I simply modified my code like this...
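
For anyone adapting this workaround, here is a hedged sketch of building the
same efetch URL with proper query-string escaping instead of string
concatenation. It targets Python 3 (urllib.parse) rather than the Python 2.4
used in this thread, and build_efetch_url is a hypothetical helper name.
The sketch uses "retmode", the parameter name given in the NCBI E-utilities
documentation, whereas the URL above spells it "retmod"; the "nucleotide"
default and the "genbank" return type are simply taken from the snippets
in this message.

```python
from urllib.parse import urlencode

EFETCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(record_id, database="nucleotide"):
    """Build an efetch URL requesting one record as GenBank text."""
    params = [
        ("id", record_id),
        ("db", database),
        ("retmode", "text"),
        ("rettype", "genbank"),
    ]
    # urlencode escapes any characters that are not safe in a query string.
    return EFETCH_URL + "?" + urlencode(params)


url = build_efetch_url("AC008625.5")
```

The resulting string can be passed to whatever URL opener is in use, and the
returned handle given to the parser as in the code above.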

To conclude:
Peter, you are my savior!!!
THANK YOU VERY, VERY MUCH!!!

Aurelie

