[BioPython] UniGene parser

Cayte katel@worldpath.net
Tue, 23 Jul 2002 18:07:02 -0700


> Indeed, a LocusLink record contains only a few data fields, namely
>
>     locusID        (Number)
>     symbol         (alphanumerical code, => genecards )
>     description    (text)
>
> Furthermore, there is a list of related GenBank accessions for each
> LocusLink record.
>
>
> > For this reason I think I should use the same approach as UniGene.
> > Have you checked out Record in UniGene? Is this what you want?
> >
>
> For accessing LocusLink, and maybe also UniGene, I would
> recommend downloading the whole database in ASCII flat-file
> format and then parsing the flat files. In my opinion
> it is much easier to write parsers for these
> flat files than for HTML generated primarily for human
> readers.
>
  The full file is 21 MB and takes over an hour to download to my Win98 machine.
Presumably the size of these databases is exploding, so I wonder whether this is
appropriate for desktop environments.  What do others think?
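
For discussion, here is a rough sketch of what the flat-file approach suggested
above might look like. It is only an illustration under assumptions: the
">>" record separator, the "KEY: value" line layout, and the field names
(LOCUSID, SYMBOL, DESCRIPTION, ACCNUM) are hypothetical stand-ins rather than
the confirmed LocusLink format, and the sample data is made up. The function
name iter_records is also invented here and is not part of Biopython.

```python
import io


def iter_records(handle):
    """Yield one dict per record, reading the flat file line by line.

    Assumed (hypothetical) layout: a line starting with '>>' opens a new
    record, and every other line has the form 'KEY: value'.  Repeated keys
    (e.g. several accessions) are collected into a list.
    """
    record = {}
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">>"):
            # A new record begins; hand back the previous one if any.
            if record:
                yield record
            record = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            record.setdefault(key, []).append(value.strip())
    # Do not lose the final record at end of file.
    if record:
        yield record


# Made-up sample data in the assumed layout.
sample = io.StringIO(
    ">>1\n"
    "LOCUSID: 1\n"
    "SYMBOL: EX1\n"
    "DESCRIPTION: first example gene\n"
    "ACCNUM: AA000001\n"
    "ACCNUM: AA000002\n"
    ">>2\n"
    "LOCUSID: 2\n"
    "SYMBOL: EX2\n"
    "DESCRIPTION: second example gene\n"
)

records = list(iter_records(sample))
for rec in records:
    print(rec["LOCUSID"][0], rec["SYMBOL"][0], rec.get("ACCNUM", []))
```

Because the generator reads one line at a time, only the current record is held
in memory, so the size of the file matters mainly for the download, not for
the parsing step.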

                                          Cayte