[BioPython] Big GenBank files

Mon Apr 25 07:16:35 EDT 2005

aurelie.bornot at free.fr wrote:
> Hi !
> 
> I am trying to write a program that automatically BLASTs a set of
> sequences against the GenBank sequences. I would also like to retrieve
> (again automatically) the most interesting GenBank files, to keep
> information about them in my database.
> 
> But I've got a problem (again..sorry ! :'( ) :
> 
> I have 2*512 MB of RAM, but it seems that my computer can't deal with 'big'
> GenBank files like 'BA000028.3' (7 MB) or 'AP008212' (37 MB).

Have a look at bug 1747, which should help with reading large GenBank
files (however, I'm not sure whether it will affect GenBank.NCBIDictionary):

http://bugzilla.open-bio.org/show_bug.cgi?id=1747

The test code used there was based on Section 3.4.2 of the Tutorial,
"Parsing GenBank records":

http://www.biopython.org/docs/tutorial/Tutorial.html#htoc35
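
As a rough illustration of the general idea of reading a GenBank file one
record at a time rather than loading it whole, here is a minimal sketch in
plain Python. It is not the Biopython parser and not the patch from bug
1747; iter_genbank_records and locus_name are hypothetical helper names
made up for this example, and the demo records are invented. The only
format fact it relies on is that each GenBank flat-file record ends with a
line containing just "//".

```python
import io


def iter_genbank_records(handle):
    """Yield the text of one GenBank record at a time (hypothetical helper).

    Only the lines of the current record are held in memory.
    """
    lines = []
    for line in handle:
        lines.append(line)
        # A line consisting solely of "//" terminates a GenBank record.
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []
    # Yield any trailing text that lacks a terminator, unless it is blank.
    if any(l.strip() for l in lines):
        yield "".join(lines)


def locus_name(record_text):
    """Return the name from the LOCUS line, or None if there is none."""
    first = record_text.lstrip().splitlines()[0]
    parts = first.split()
    if len(parts) > 1 and parts[0] == "LOCUS":
        return parts[1]
    return None


# Invented two-record input, standing in for an open file handle.
demo = io.StringIO(
    "LOCUS       DEMO1   8 bp    DNA\nORIGIN\n        1 acgtacgt\n//\n"
    "LOCUS       DEMO2   4 bp    DNA\nORIGIN\n        1 ttga\n//\n"
)
names = [locus_name(record) for record in iter_genbank_records(demo)]
print(names)
```

Note that this only bounds memory use by the largest single record, so it
helps with files holding many records but does nothing for one very large
record such as a whole genome.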

See also the discussion last month:

http://www.biopython.org/pipermail/biopython/2005-March/002568.html

Peter

-- 
PhD Student
MOAC Doctoral Training Centre
University of Warwick, UK