[Bioperl-l] Interpro parsing problems - and solutions

Juguang Xiao juguang at tll.org.sg
Tue Jan 6 22:11:19 EST 2004


Hi Richard,

I ran into a similar problem before. My solution was to write 
Bio/OntologyIO/Handlers/InterPro_BioSQL_Handler.pm, which loads 
InterPro into a BioSQL database, since a serious user of InterPro will 
load the whole database rather than the small piece used in the test. 
When you load InterPro into memory, you will find that 1) it occupies 
your 1 GB of virtual memory on Mac OS X, so you cannot do any further 
operations, and 2) nearly 10% of the records are missing with the 
current parser.

I am trying to improve BioSQL a bit, with Hilmar's guidance, so that it 
can accommodate InterPro records, and I will complete the 
InterPro-BioSQL parser after that. The work will be finished later, and 
I will announce it on this list.

To take this a bit further, my point is that for huge file-based 
biological databases it is wise to load the complete set into an RDBMS 
rather than parse it in memory, both for the speed of complicated 
queries and for the reliability of your script (no out-of-memory 
failures).
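
To illustrate the general idea only, here is a minimal sketch in Python 
(not the BioPerl handler itself). It streams records out of an XML file 
one at a time and inserts each into a relational table, so memory use 
does not grow with file size. The element and attribute names 
(interprodb, interpro, id, type, name), the one-table schema, and the 
sample records are all assumptions made up for the example; the real 
BioSQL schema is different and much richer.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample data; the tag and attribute names are assumptions.
SAMPLE = b"""<interprodb>
  <interpro id="EX0001" type="Domain"><name>example domain one</name></interpro>
  <interpro id="EX0002" type="Family"><name>example family two</name></interpro>
</interprodb>"""


def load_entries(xml_source, conn):
    """Stream <interpro> records into a hypothetical 'entry' table.

    Returns the number of records inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry ("
        "id TEXT PRIMARY KEY, type TEXT, name TEXT)"
    )
    context = ET.iterparse(xml_source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root
    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == "interpro":
            conn.execute(
                "INSERT OR REPLACE INTO entry VALUES (?, ?, ?)",
                (elem.get("id"), elem.get("type"), elem.findtext("name")),
            )
            count += 1
            # Detach finished records from the root so they can be freed.
            root.clear()
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")
print(load_entries(io.BytesIO(SAMPLE), conn))
```

Once the records are in the table, the complicated queries become plain 
SQL (for example, SELECT name FROM entry WHERE type = 'Domain') and no 
longer depend on how much of the file fits in memory.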

Juguang

On Tuesday, January 6, 2004, at 07:58  am, Holland, Richard wrote:

> Hi all. A long one this but I hope it's worth the read.
>
> I found a possible bug in Bio/OntologyIO/InterProParser.pm. 


