[Biopython] comparing micro array data

Tue Mar 16 16:56:12 UTC 2010

On Tue, Mar 16, 2010 at 12:49 PM, Vincent Davis <vincent at vincentdavis.net> wrote:

> @ Sean
>
> I would suggest finding a local collaborator if you are relatively new to
> the microarray field.
>
>
> I actually was brought into this project by a team from a university. They
> know lots, including that this is a difficult problem. They did not have any
> references as to how others have solved this problem with whatever success
> was possible. Since I know Python, Biopython has been my first choice to ask
> other smart people :)
>
>
> I am an economist. I am OK with the stats and data but don't know the
> terminology well. It's been a 3 week crash course in my free time. I wrote
> my own modules for reading in CEL and CDF files as Python objects. I know
> there are existing solutions, but I would not have learned as much that way.
> I used the nexalign program that was recommended on this list to get the
> mismatch data. It's all coming along nicely and I am learning lots. The
> project has been languishing for a list of reasons and now there is a push
> to get it finished.
>

Perfect!  A mathematician working with biologists--this is the way of the
world these days.  Given the issues that you describe, I would definitely
suggest looking at R/Bioconductor.  That said, I'm not sure that there is a
good answer to the problem, as you suggest.

If you don't mind a couple of questions, for curiosity's sake, how big is the
genome of the model organism?  And what size are the arrays, in terms of
probes?

Sean

>
> Vincent Davis
> 720-301-3003
> vincent at vincentdavis.net
> my blog <http://vincentdavis.net> | LinkedIn <http://www.linkedin.com/in/vincentdavis>
>
>
> On Tue, Mar 16, 2010 at 10:38 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>
>> On Tue, Mar 16, 2010 at 11:03 AM, Vincent Davis
>> <vincent at vincentdavis.net> wrote:
>> > So I am very new to this so please accept my ignorance on this subject.
>> >
>> > I have several microarray samples, ~8 for each of 3 known genomes. So I
>> > know which probes/sequences are a match and which have close matches. I
>> > would like to identify which sequences exist in an unknown sample. The
>> > array is custom and there is little to no overlap between probes.
>> > What is the "standard" way of doing this? I don't care to know if a SNP
>> > is present, only if the sequence is present.
>> > Is this standard available in Biopython?
>>
>> Hi, Vincent.  I'm not clear on what the study is here.  Could you
>> explain a bit more what you are doing?  I get the suggestion from your
>> email that you want to do a cross-species comparison using
>> microarrays.  If this is the case, this is notoriously difficult to
>> do, so, in addition to the comments here, I would suggest finding a
>> local collaborator if you are relatively new to the microarray field.
>>
>> Sean
>>
>
>