[BioPython] UniGene parser
Peter Slickers
piet@clondiag.com
Wed, 17 Jul 2002 10:19:56 +0200
Cayte wrote:
>
> I just did some experiments with LocusLink files and when I strip out the
> html tags very little information is left.
Indeed, a LocusLink record contains only a few data fields, namely
locusID (Number)
symbol (alphanumeric code, => GeneCards)
description (text)
Furthermore, each LocusLink record has a list of related GenBank accessions.
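As a rough sketch, the fields above would fit in a small record class like the one below. The class name, attribute names, and sample values are my own placeholders for illustration; they are not taken from Biopython or from the LocusLink files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LocusLinkRecord:
    """Hypothetical container for the LocusLink fields listed above."""
    locus_id: int                 # locusID (number)
    symbol: str                   # short alphanumeric gene symbol
    description: str              # free-text description
    # Related GenBank accessions; each instance gets its own list.
    accessions: List[str] = field(default_factory=list)


# Placeholder values only, not real LocusLink data.
record = LocusLinkRecord(locus_id=1, symbol="ABC1", description="example gene")
record.accessions.append("X00001")
```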
> For this reason I think I should use the same approach as UniGene. Have you
> checked out Record in
> Unigene? Is this what you want?
>
For accessing LocusLink, and maybe also UniGene, I would
recommend downloading the whole database in ASCII flat-file
format and then parsing the flat files. In my opinion
it is much easier to write parsers for these
flat files than for HTML generated primarily for human
readers.
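To illustrate why a flat file is easy to parse, here is a minimal sketch. It assumes a layout in which every line starts with a tag followed by whitespace and a value, and each record ends with a line containing only "//". That layout, the function name iter_records, and the sample data are assumptions for illustration; the README files on the FTP server describe the real formats, which may differ.

```python
import io
from typing import Dict, Iterator, List, TextIO


def iter_records(handle: TextIO, terminator: str = "//") -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per record, mapping each tag to its list of values.

    Assumed layout (hypothetical): "TAG value" lines, with a line
    containing only the terminator closing each record.
    """
    record: Dict[str, List[str]] = {}
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line == terminator:
            if record:
                yield record
            record = {}
            continue
        parts = line.split(None, 1)  # split tag from the rest of the line
        tag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        record.setdefault(tag, []).append(value)
    if record:  # last record without a closing terminator
        yield record


# Placeholder data in the assumed layout, not a real database excerpt.
sample = io.StringIO(
    "ID      Xx.1\n"
    "TITLE   placeholder title\n"
    "SEQUENCE ACC=X00001\n"
    "SEQUENCE ACC=X00002\n"
    "//\n"
    "ID      Xx.2\n"
    "TITLE   another placeholder\n"
    "//\n"
)
records = list(iter_records(sample))
```

Repeated tags are collected into lists, so a record with several accession lines keeps all of them. No tag stripping or HTML cleanup is needed.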
UniGene by FTP:
ftp://ftp.ncbi.nih.gov/repository/UniGene/
ftp://ftp.ncbi.nih.gov/repository/UniGene/README
LocusLink by FTP:
ftp://ftp.ncbi.nih.gov/refseq/LocusLink/
ftp://ftp.ncbi.nih.gov/refseq/LocusLink/README
Peter
-------------------------------------------------------------------
Peter Slickers piet@clondiag.com
Clondiag Chip Technologies http://www.clondiag.com/
Löbstedter Str. 105
07749 Jena
Germany
Phone: 03641/5947-65 Fax: 03641/5947-20
-------------------------------------------------------------------