[BioPython] UniGene parser
Jeffrey Chang
jchang@smi.stanford.edu
Wed, 17 Jul 2002 12:45:03 -0700
On Wed, Jul 17, 2002 at 07:34:25AM -0700, Sagar Damle wrote:
> Hi peter,
>
> > For accessing LocusLink and maybe also for UniGene I would
> > recommend to download the whole database in ASCII flatfile
> > format, and then parsing the flat files. In my opinion
> > it is much easier to write parsers for these
> > flatfiles, than for any HTML generated primarily for human
> > readers.
>
> This seems like a good idea, but my own attempt at parsing the
> unigene/LL flatfiles (like LL_tmpl) makes me worry that these files
> are just too large to parse each time I need information. Might it
> be an even better idea to store these results in a local searchable
> database?
[...]
> thoughts anyone? I'm not really a programmer, just a scripter, so I
> may be way off-base here.

Yes, I think you are right. For anything beyond trivial use of LocusLink,
you really should download the flatfiles and index them yourself rather
than re-parse them on every lookup. Fortunately, Biopython provides this
capability through Martel and Mindy: if you write a Martel parser for the
format, Mindy can take that parser and the flat file and automatically
build indexes for fast access to the file.
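
To make the indexing idea concrete, here is a rough sketch in plain
Python. It is not Mindy's API, and it does not use Martel. It only
illustrates the general technique of scanning the flat file once, storing
the byte offset of each record in a small SQLite database, and later
seeking straight to a record. The function names (build_index,
fetch_record) are made up for this example, and it assumes each record
starts with a line beginning with ">>" followed by the record ID, which you
should check against the actual file you download.

```python
import sqlite3


def build_index(flatfile_path, db_path, record_start=b">>"):
    """Scan the flat file once and store the byte offset of each record.

    Assumption: a record begins with a line that starts with
    `record_start`, and the rest of that line is the record ID.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS idx (key TEXT PRIMARY KEY, offset INTEGER)"
    )
    with open(flatfile_path, "rb") as handle:
        while True:
            offset = handle.tell()  # position of the line about to be read
            line = handle.readline()
            if not line:
                break
            if line.startswith(record_start):
                key = line[len(record_start):].strip().decode("ascii")
                conn.execute(
                    "INSERT OR REPLACE INTO idx VALUES (?, ?)", (key, offset)
                )
    conn.commit()
    return conn


def fetch_record(flatfile_path, conn, key, record_start=b">>"):
    """Return the text of one record, or None if the key is not indexed."""
    row = conn.execute(
        "SELECT offset FROM idx WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    lines = []
    with open(flatfile_path, "rb") as handle:
        handle.seek(row[0])  # jump directly to the start of the record
        lines.append(handle.readline())
        for line in handle:
            if line.startswith(record_start):  # next record begins here
                break
            lines.append(line)
    return b"".join(lines).decode("ascii")
```

The index only has to be built once per download, for example with
build_index("LL_tmpl", "ll_index.db") (the database file name is
arbitrary). After that, fetch_record reads just the lines of the record
you ask for instead of parsing the whole file.
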

Jeff