[BioPython] UniGene parser

Jeffrey Chang jchang@smi.stanford.edu
Wed, 17 Jul 2002 12:45:03 -0700


On Wed, Jul 17, 2002 at 07:34:25AM -0700, Sagar Damle wrote:
> Hi Peter,
> 
> > For accessing LocusLink and maybe also for UniGene I would
> > recommend to download the whole database in ASCII flatfile
> > format, and then parsing the flat files. In my opinion
> > it is much easier to write parsers for these
> > flatfiles, than for any HTML generated primarily for human
> > readers.
> 
> This seems like a good idea, but my own attempt at parsing the
> unigene/LL flatfiles (like LL_tmpl) makes me worry that these files
> are just too large to parse each time I need information.  Might it
> be an even better idea to store these results in a local searchable
> database?

[...]

> thoughts anyone?  I'm not really a programmer, just a scripter, so I
> may be way off-base here.

Yes, I think you are right.  For more than trivial use of LocusLink,
you really should download the flat files and build a local index
rather than re-parsing them each time you need something.
Fortunately, Biopython provides this capability through Martel and
Mindy.  If you write a parser for the format with Martel, Mindy can
take that parser and the flat file and automatically build indexes
for fast access to the file.
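
To give a rough idea of what indexing a flat file buys you, here is a
minimal sketch using only the Python standard library.  It is not the
Martel/Mindy API.  The function names, the ">>" record marker, and the
sample records are all assumptions made for illustration.  The idea is
to scan the file once, remember the byte offset of each record, and
later seek straight to a record instead of re-parsing everything.

```python
import io


def build_offset_index(handle, record_start=b">>"):
    """Scan the file once and map each record key to its byte offset.

    Assumes (hypothetically) that every record begins with a line such
    as b">>1", where the text after the marker is the record key.
    """
    index = {}
    while True:
        offset = handle.tell()  # position of the line about to be read
        line = handle.readline()
        if not line:
            break
        if line.startswith(record_start):
            key = line[len(record_start):].strip().decode("ascii")
            index[key] = offset
    return index


def fetch_record(handle, index, key, record_start=b">>"):
    """Seek to one record and read only the lines that belong to it."""
    handle.seek(index[key])
    lines = [handle.readline()]  # the marker line itself
    while True:
        line = handle.readline()
        if not line or line.startswith(record_start):
            break
        lines.append(line)
    return b"".join(lines).decode("ascii")


# Tiny in-memory stand-in for a large flat file (made-up contents).
sample = io.BytesIO(
    b">>1\nLOCUSID: 1\nNAME: alpha\n"
    b">>2\nLOCUSID: 2\nNAME: beta\n"
)
index = build_offset_index(sample)
record = fetch_record(sample, index, "2")
```

In practice you would open the real file in binary mode and save the
index to disk (for example with the shelve or sqlite3 modules) so the
full scan happens only once.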

Jeff