[Bioperl-l] A perl regex query

Roy Chaudhuri rrc22 at cam.ac.uk
Tue Sep 18 13:26:47 UTC 2007


> My actual problem is a bit more complicated.
> It is not just one string, but lakhs of them; they are actually names of 
> chemical compounds.
> 
> The problem is there are 2 different data sources, and I need to match the 
> compound names between them. Although a compound may be the same in the 
> two, they use different naming formats for it.

Unless you can define, in simple and precise terms, exactly which parts of 
the string you need, there is no way that you will be able to code a 
solution in Perl.

Maybe you could look for a database that contains the synonyms for each 
molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), which 
is available to download as flat files.
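
As a rough illustration of the synonym approach, here is a minimal Perl 
sketch. It assumes a made-up two-column table (compound ID, tab, name); 
this is not the actual layout of the ChEBI files, which would need their 
own parsing. The sub names (normalise, build_lookup, match_names) and the 
sample data are purely hypothetical. The idea is to map every synonym to 
an ID, then pair names from the two sources that resolve to the same ID.

```perl
use strict;
use warnings;

# Hypothetical synonym table: "ID<TAB>name", one synonym per row.
# This layout is an assumption, not the format of any real download.
my @synonym_rows = (
    "C1\tacetylsalicylic acid",
    "C1\taspirin",
    "C1\t2-acetoxybenzoic acid",
    "C2\tethanol",
    "C2\tethyl alcohol",
);

# Lower-case the name and tidy whitespace so that trivial differences
# in case and spacing do not prevent a match.
sub normalise {
    my ($name) = @_;
    $name = lc $name;
    $name =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    $name =~ s/\s+/ /g;         # collapse internal runs of whitespace
    return $name;
}

# Build a hash mapping each normalised synonym to its compound ID.
sub build_lookup {
    my %id_for;
    for my $row (@_) {
        my ($id, $name) = split /\t/, $row, 2;
        next unless defined $name;    # skip rows without a name column
        $id_for{ normalise($name) } = $id;
    }
    return \%id_for;
}

# Return [name_a, name_b] pairs whose names resolve to the same ID.
# Names not found in the synonym table are silently skipped.
sub match_names {
    my ($id_for, $names_a, $names_b) = @_;

    # Group the names from source B by compound ID.
    my %b_by_id;
    for my $name (@$names_b) {
        my $id = $id_for->{ normalise($name) };
        push @{ $b_by_id{$id} }, $name if defined($id);
    }

    # Look up each name from source A and pair it with B's names.
    my @pairs;
    for my $name (@$names_a) {
        my $id = $id_for->{ normalise($name) };
        next unless defined($id) && $b_by_id{$id};
        push @pairs, [ $name, $_ ] for @{ $b_by_id{$id} };
    }
    return @pairs;
}

my $id_for = build_lookup(@synonym_rows);
my @pairs  = match_names(
    $id_for,
    [ 'Aspirin', 'Ethanol', 'benzene' ],             # source A (made up)
    [ '2-Acetoxybenzoic acid', 'ethyl  alcohol' ],   # source B (made up)
);
print join("\t", @$_), "\n" for @pairs;
```

With lakhs of names a hash lookup like this should stay fast, but names 
that are missing from the synonym table (benzene in the sample above) 
will simply go unmatched and would need to be checked some other way.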

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.


