[BioPython] Uniprot Parser

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Mon Feb 25 18:30:54 UTC 2008


Hi Jonathan,
  temporarily drop the indexes on the MySQL tables you are loading into, and
have MySQL rebuild them after the import. Otherwise every index has to be
updated after each inserted row. Learn how to use 'ALTER TABLE'. ;-)
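
A rough sketch of the idea, not a drop-in script. It uses Python's built-in
sqlite3 module only so that it runs without a server, and the table, column
and index names (protein, accession, idx_protein_accession) are made up for
illustration. On MySQL the two index statements would be the ALTER TABLE
forms shown in the comments.

```python
import sqlite3


def bulk_load(conn, rows):
    """Insert all rows with the secondary index dropped, then rebuild it once."""
    cur = conn.cursor()
    # MySQL equivalent: ALTER TABLE protein DROP INDEX idx_protein_accession;
    cur.execute("DROP INDEX IF EXISTS idx_protein_accession")
    # One batched insert instead of one round trip per parsed record.
    cur.executemany(
        "INSERT INTO protein (accession, description) VALUES (?, ?)", rows
    )
    # MySQL equivalent:
    #   ALTER TABLE protein ADD INDEX idx_protein_accession (accession);
    cur.execute("CREATE INDEX idx_protein_accession ON protein (accession)")
    conn.commit()


# Hypothetical schema and records, standing in for the parsed .dat entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein (accession TEXT, description TEXT)")
conn.execute("CREATE INDEX idx_protein_accession ON protein (accession)")
rows = [("P12345", "example entry one"), ("Q67890", "example entry two")]
bulk_load(conn, rows)
```

The index is built once over the finished table instead of being updated
for every inserted row.
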
Martin

Jonathan Boulais wrote:
> I don't think the parser is the problem, Peter, but rather the continuous import requests to the database.
> I've already written a parsing script that parses the entire .dat files (TrEMBL and Swiss-Prot) into text files. After the parsing is done, it imports the files into a database schema that I've built on my own, not the BioSQL one. So instead of importing the data after each iteration, it imports the data in one shot once the entire .dat file is parsed. I've compared the execution times and it's much faster this way (about 1 hour instead of 3 days for parsing and importing the TrEMBL .dat file).
> 
> Again Peter, I'm definitely not as good as you guys at scripting, so I've used the script lines that you proposed for bug 2390 for the comparison.
> But 3 days of parsing and importing is... a little bit too long for me :)


