[Biopython] affy CEL and CDF reader
Sean Davis
sdavis2 at mail.nih.gov
Thu Apr 8 14:56:12 EDT 2010
On Thu, Apr 8, 2010 at 2:33 PM, Vincent Davis <vincent at vincentdavis.net> wrote:
> I ended up writing my own modules for reading both affy Cel and CDF files.
> Long story as to why I did not just use what was available in biopython.
> I plan on making what I have done available to Biopython and will upload
> it as a fork. I outline below how what I have differs.
> My question is: are there any improvements (features) others would like to
> see beyond what is available in the current CelFile.py?
> I saw some posts a month or so ago about checking for consistency in CEL
> files; I think it was about making sure the stated number of probes
> was consistent with the intensity measurements.
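
A minimal sketch of the kind of consistency check mentioned above, assuming the stated count has already been parsed out of the file and the intensity records are held in a numpy structured array. The function name check_intensity_count and the field names are hypothetical; they are not part of CelFile.py or of the reader described in this thread.

```python
import numpy as np


def check_intensity_count(stated_count, intensity):
    """Raise ValueError if the stated number of cells differs from the
    number of intensity records actually read.

    Hypothetical helper for illustration only.
    """
    actual = len(intensity)
    if actual != stated_count:
        raise ValueError(
            "file states %d intensity records but %d were read"
            % (stated_count, actual)
        )
    return actual


# Toy data: four intensity records with assumed field names.
intensity = np.zeros(4, dtype=[("x", "i4"), ("y", "i4"), ("mean", "f8")])
check_intensity_count(4, intensity)  # passes silently
```

Raising an exception rather than returning False makes a truncated or corrupted file hard to overlook when many files are read in a loop.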
>
> What is different:
> when a file is read with Affycel.read('file'), many attributes are set. For
> example:
> a = affcel()
> a.read('testfile')
> a.filename,
> a.version,
> a.header.items() # a dictionary of all header items
> a.num_intensity
> a.intensity
> a.num_masks
> a.masks
> a.num_outliers
> a.outliers
> a.numb_modified
> a.modified
>
> I plan to add the ability to return intensity values with or without
> outliers or masked values.
> All data is currently stored in numpy structured arrays;
> currently a.intensity returns the structured array, but I plan on making it
> an option to easily choose how this is returned.
> I also want to add an optional normalized intensity array so that if the data
> is normalized it can be stored with the affycel instance. My use case was
> that I was opening about 80 CEL files and reading them in was slow. This
> allowed me to read each file as an instance of affycel stored in a list that
> I then pickled. It was then much faster to open them.
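
The workflow described above (structured arrays, dropping outliers, and pickling the parsed results) might look roughly like the sketch below. Everything here is an assumption for illustration: the field names, the drop_outliers helper, and the use of plain dicts in place of affycel instances are hypothetical and not taken from the actual code.

```python
import pickle

import numpy as np

# Assumed record layouts; the real reader may use different fields.
INTENSITY_DTYPE = np.dtype(
    [("x", "i4"), ("y", "i4"), ("mean", "f8"), ("stdv", "f8"), ("npixels", "i4")]
)
COORD_DTYPE = np.dtype([("x", "i4"), ("y", "i4")])


def drop_outliers(intensity, outliers):
    """Return the intensity records whose (x, y) is not flagged as an outlier."""
    flagged = set(zip(outliers["x"].tolist(), outliers["y"].tolist()))
    coords = zip(intensity["x"].tolist(), intensity["y"].tolist())
    keep = np.array([xy not in flagged for xy in coords], dtype=bool)
    return intensity[keep]


intensity = np.array(
    [(0, 0, 120.0, 10.5, 25), (1, 0, 98.0, 8.1, 25), (0, 1, 4000.0, 300.0, 20)],
    dtype=INTENSITY_DTYPE,
)
outliers = np.array([(0, 1)], dtype=COORD_DTYPE)

# One dict per parsed file, standing in for a list of affycel instances.
parsed = [{"filename": "testfile", "intensity": intensity, "outliers": outliers}]

# Serialize once; later runs can load this instead of re-parsing every file.
blob = pickle.dumps(parsed, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

clean = drop_outliers(restored[0]["intensity"], restored[0]["outliers"])
```

For a cache on disk, pickle.dump and pickle.load with a file opened in binary mode do the same job as dumps and loads here.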
>
> Are improvements to CelFile.py of value to Biopython?
>
> I hope to have the code pushed up to my fork on GitHub late tonight. I just
> thought I would ask if there were any suggestions before I did.
>
> I also have a CDF file reader, but have only done some basic testing. I don't
> have a lot of use for this; do other Biopython users?
>
> I am kinda working in a vacuum and am trying to get more involved in
> projects to improve my skills and knowledge. Any suggestions would be
> appreciated.
Just out of curiosity, is your work based on the affy sdk, or are you
parsing stuff yourself?
Sean