[BioPython] Re: help with retrieving seq and speed.
Peter
pa_wilki@gene.concordia.ca
Fri, 2 Mar 2001 17:11:53 -0500 (EST)
Dinikar,
I think that is a philosophy issue. Traditionally, biologists think in
batch files and flat files, which I do not think is appropriate for what
you are doing. If you are mining for data, you should be using a
relational database in conjunction with Python.
For speed and clarity, your sequences should be in a database, i.e. use
the Fasta parser in Biopython to parse your data into a relational
database like MySQL and index the id column. Then you can perform your
queries manually or use Python with a MySQL module. There are a few
modules built for Python for MySQL, other SQL databases, and Oracle. Have
a look at the links page at python.org for the 'vaults' archive.
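
To make the idea concrete, here is a rough sketch of the load, index,
and query steps. It rests on several assumptions of mine: it uses
Python's built-in sqlite3 module as a stand-in for MySQL so that it runs
without a server, a small hand-written FASTA reader as a stand-in for the
Biopython Fasta parser, and made-up names throughout (the "sequences"
table, read_fasta, load_fasta, fetch_sequence, and the two example
records). With a MySQL module the overall shape should be similar, though
the connection call and SQL placeholder style will differ.

```python
import io
import sqlite3


def read_fasta(handle):
    """Yield (id, description, sequence) tuples from a FASTA text handle.

    Hypothetical stand-in for the Biopython Fasta parser.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


def _make_record(header, chunks):
    # The id is the first whitespace-delimited token of the header line.
    parts = header.split(None, 1)
    seq_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return seq_id, description, "".join(chunks)


def load_fasta(conn, handle):
    """Create the (hypothetical) sequences table, index id, load records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(id TEXT NOT NULL, description TEXT, seq TEXT NOT NULL)"
    )
    # The index on id is what keeps lookups fast as the table grows.
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_sequences_id ON sequences (id)"
    )
    with conn:  # commit once, after all rows are inserted
        conn.executemany(
            "INSERT OR REPLACE INTO sequences (id, description, seq) "
            "VALUES (?, ?, ?)",
            read_fasta(handle),
        )


def fetch_sequence(conn, seq_id):
    """Return the sequence stored under seq_id, or None if it is absent."""
    row = conn.execute(
        "SELECT seq FROM sequences WHERE id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


# Made-up example data; a real run would open the FASTA file instead.
example = io.StringIO(
    ">seq1 first example\nACGT\nACGT\n>seq2 second example\nTTGGCC\n"
)
conn = sqlite3.connect(":memory:")
load_fasta(conn, example)
print(fetch_sequence(conn, "seq1"))  # prints ACGTACGT
```

Once the data is loaded, each retrieval is a single indexed lookup rather
than a scan through the whole flat file.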
Peter