[Biopython-dev] Unigene flat file parser

Sean Davis sdavis2 at mail.nih.gov
Thu Oct 26 14:56:25 UTC 2006


I have put together a parser for the UniGene flat file format described here:

ftp://ftp.ncbi.nih.gov/repository/UniGene/README

under the Hs.data section.  The actual .data files are included in the various 
organism-specific directories.  

Is there any interest in including this in biopython?  If so, I would 
appreciate some input on the code and details of contributions, etc.  The 
current code is available here:

http://watson.nci.nih.gov/pressa/~sdavis/Unigene.py

Use it as shown below. Note that the ugrecord holds much more information than 
its __repr__ displays (in fact, all information in the record is captured).  

#!/usr/bin/python
import Unigene

# Hs.data was downloaded beforehand (from the ftp site, or wherever)
fh = open('Hs.data')
ugparser = Unigene.Iterator(fh, Unigene.RecordParser())
for ugrecord in ugparser:
    print(ugrecord)
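
For anyone who has not looked at the flat file format, here is a minimal, 
standalone sketch of the general idea of iterating over records.  This is NOT 
the code in Unigene.py.  It assumes that each line is a field name followed by 
whitespace and a value, that a field name may repeat within a record, and that 
a line starting with "//" ends a record; check the README for the real details.  
The function name iter_unigene_records and the sample data are made up for 
illustration.

```python
import io


def iter_unigene_records(handle):
    """Yield one dict per record, mapping each field name to a list of values.

    Assumption: records are terminated by a line starting with '//'.
    """
    record = {}
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("//"):
            # End of record: hand it back and start a fresh one.
            if record:
                yield record
            record = {}
            continue
        if not line.strip():
            continue  # skip blank lines
        # Split into field name and the rest of the line.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # Fields may repeat, so collect values in a list.
        record.setdefault(key, []).append(value)
    if record:
        # Tolerate a final record with no terminator.
        yield record


# Made-up sample data, not taken from a real .data file.
sample = io.StringIO(
    "ID          Xx.1\n"
    "TITLE       example gene\n"
    "SEQUENCE    ACC=AAA; NID=g1\n"
    "SEQUENCE    ACC=BBB; NID=g2\n"
    "//\n"
    "ID          Xx.2\n"
    "//\n"
)
records = list(iter_unigene_records(sample))
for rec in records:
    print(rec["ID"][0], sorted(rec))
```

Keeping every value in a list, even for fields that appear once, is a 
simplification; a real record class would probably expose single-valued fields 
directly.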


