From aloraine at gmail.com Tue Aug 1 13:02:53 2006 From: aloraine at gmail.com (Ann Loraine) Date: Tue, 1 Aug 2006 12:02:53 -0500 Subject: [DAS2] Fwd: [MOBY-dev] Java Web Services: part 2 In-Reply-To: <44CF83BD.90103@ucalgary.ca> References: <44CF83BD.90103@ucalgary.ca> Message-ID: <83722dde0608011002k4d2f67dfmc65e37e97a2bb851@mail.gmail.com> Greetings, If you are interested in keeping up with BioMoby developments, this may be of interest. Cheers, Ann On 8/1/06, Paul Gordon wrote: > Hi all, > > I have just committed some new code for creating MOBY Java servlets. > It's intended for Extremely Lazy Programmers (such as myself), requiring > that you download just a particular WAR. No CVS, Axis, Ant, etc. > required. Some of my coworkers who had never deployed a servlet, or > known anything about MOBY, were able to have a registered, tested service > within 30 minutes! Hopefully this will be of use to some of you too... > > http://biomoby.open-bio.org/CVS_CONTENT/moby-live/Java/docs/deployingServices.html > > Regards, > > Paul > > _______________________________________________ > MOBY-dev mailing list > MOBY-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/moby-dev > -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From gilmanb at pantherinformatics.com Thu Aug 10 11:09:43 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Thu, 10 Aug 2006 11:09:43 -0400 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: References: Message-ID: <44DB4C37.6040704@pantherinformatics.com> Trying to get a features document? Hello Gregg et al. I'm desperately trying to get a features document out of one of the DAS 2 servers and have not been able to do it yet. Can someone help me out!? Thanks! -B Helt,Gregg wrote: >Affymetrix is hosting a DAS/2 code sprint on August 14-18, to coincide >with the CSB conference at Stanford. 
The sprint will be held at Affy's >Santa Clara location, which is about a 20 minute drive from the Stanford >campus. For those attending CSB, the proximity should make it easy to >join in, even if it's just for a morning or afternoon. We can provide >transportation to and from CSB if needed. If you are interested in >attending please email me, and specify whether you'll need a workstation >or will be bringing your own laptop. > >This is a code sprint, so the focus will be on DAS/2 client and server >implementations. As with previous sprints I'd like to start each day >with a teleconference at 9 AM Pacific time. If you can't be there >physically but still want to participate, please join in! > > Gregg > > >_______________________________________________ >DAS2 mailing list >DAS2 at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/das2 > > > > From Gregg_Helt at affymetrix.com Thu Aug 10 12:39:12 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 10 Aug 2006 09:39:12 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 Message-ID: Apologies, it looks like we're currently having some problem with proxy redirection on the Affy DAS/2 server. Steve, can you check on this? When I request anything but the top level ~/sequence, I'm getting back HTTP error 502 "Bad Gateway" with the message: "The proxy server received an invalid response from an upstream server." However, I just tried the biopackages server and it is working, though response times are slower than usual (unless the response has already been cached). 
Here's a feature query I recently ran, so it will be returned quickly from the server cache: http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26027736:26068042;type=SO:mRNA hope that helps, gregg > -----Original Message----- > From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] > Sent: Thursday, August 10, 2006 8:10 AM > To: Helt,Gregg > Cc: DAS/2 > Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 > > Trying to get a features document? > > Hello Greg et al. I'm desperately trying to get a features document > out of one of the DAS 2 servers and have not been able to do it yet. Can > someone help me out!? > > Thanks! > > -B > > Helt,Gregg wrote: > > >Affymetrix is hosting a DAS/2 code sprint on August 14-18, to coincide > >with the CSB conference at Stanford. The sprint will be held at Affy's > >Santa Clara location, which is about a 20 minute drive from the Stanford > >campus. For those attending CSB, the proximity should make it easy to > >join in, even if it's just for a morning or afternoon. We can provide > >transportation to and from CSB if needed. If you are interested in > >attending please email me, and specify whether you'll need a workstation > >or will be bringing your own laptop. > > > >This is a code sprint, so the focus will be on DAS/2 client and server > >implementations. As with previous sprints I'd like to start each day > >with a teleconference at 9 AM Pacific time. If you can't be there > >physically but still want to participate, please join in! > > > > Gregg > > > > > >_______________________________________________ > >DAS2 mailing list > >DAS2 at lists.open-bio.org > >http://lists.open-bio.org/mailman/listinfo/das2 > > > > > > > > From Steve_Chervitz at affymetrix.com Thu Aug 10 16:14:48 2006 From: Steve_Chervitz at affymetrix.com (Chervitz, Steve) Date: Thu, 10 Aug 2006 13:14:48 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: Message-ID: The netaffxdas das/2 server is back up now. 
It turned out to be a memory problem. The server got some whomping queries thrown at it, such as these: M_musculus_Aug_2005/features?overlaps=chr1/0:194923535;type=mrna;format=bps H_sapiens_Mar_2006/features?overlaps=chr20/0:62435964;type=mrna;format=bps Which it could not complete due to out-of-memory errors. But it could handle this sizeable query even after the above failed: H_sapiens_May_2004/features?overlaps=chr20/0:62435964;type=refseq;format=brs Eventually, Jetty just decided it had had enough and shut down its connection, shouting: WARN!! Stopping Acceptor ServerSocket My fix was to restart the das/2 server, giving the java process another 200M of maximal heap. However, both das/1 and das/2 servers can now potentially claim 89% of physical RAM on that box, which could become unhealthy. Ed notes that we might want to prevent such big queries in the first place. There is an error code in the das spec for this (HTTP error 413 "Request Entity Too Large"). But how do we determine what's a reasonable maximum allowable query result? It will depend on the feature density on a particular assembly. This could be a good action item for the code sprint. Steve > From: "Helt,Gregg" > Date: Thu, 10 Aug 2006 09:39:12 -0700 > To: Brian Gilman > Cc: DAS/2 , "Chervitz, Steve" > > Conversation: [DAS2] DAS/2 Code Sprint, August 14-18 > Subject: RE: [DAS2] DAS/2 Code Sprint, August 14-18 > > Apologies, it looks like we're currently having some problem with proxy > redirection on the Affy DAS/2 server. Steve, can you check on this? > When I request anything but the top level ~/sequence, I'm getting back > HTTP error 502 "Bad Gateway" with the message: > "The proxy server received an invalid response from an upstream server." > > However, I just tried the biopackages server and it is working, though > response times are slower than usual (unless the response has already > been cached). 
Here's a feature query I recently ran, so it will be > returned quickly from the server cache: > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > 027736:26068042;type=SO:mRNA > > hope that helps, > gregg > >> -----Original Message----- >> From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] >> Sent: Thursday, August 10, 2006 8:10 AM >> To: Helt,Gregg >> Cc: DAS/2 >> Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 >> >> Trying to get a features document? >> >> Hello Greg et al. I'm desperately trying to get a features > document >> out of one of the DAS 2 servers and have not been able to do it yet. > Can >> someone help me out!? >> >> Thanks! >> >> -B >> >> Helt,Gregg wrote: >> >>> Affymetrix is hosting a DAS/2 code sprint on August 14-18, to > coincide >>> with the CSB conference at Stanford. The sprint will be held at > Affy's >>> Santa Clara location, which is about a 20 minute drive from the > Stanford >>> campus. For those attending CSB, the proximity should make it easy > to >>> join in, even if it's just for a morning or afternoon. We can > provide >>> transportation to and from CSB if needed. If you are interested in >>> attending please email me, and specify whether you'll need a > workstation >>> or will be bringing your own laptop. >>> >>> This is a code sprint, so the focus will be on DAS/2 client and > server >>> implementations. As with previous sprints I'd like to start each day >>> with a teleconference at 9 AM Pacific time. If you can't be there >>> physically but still want to participate, please join in! 
>>> >>> Gregg >>> >>> >>> _______________________________________________ >>> DAS2 mailing list >>> DAS2 at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/das2 >>> >>> >>> >>> > From gilmanb at pantherinformatics.com Thu Aug 10 16:18:20 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Thu, 10 Aug 2006 16:18:20 -0400 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: References: Message-ID: Eh hem, sorry...I was playing around.... -B -- Brian Gilman President Panther Informatics Inc. E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Aug 10, 2006, at 4:14 PM, Steve Chervitz wrote: > The netaffxdas das/2 server is back up now. Turned out to be a memory > trouble. The server got some whomping queries thrown at it, such as > these: > > M_musculus_Aug_2005/features? > overlaps=chr1/0:194923535;type=mrna;format=bps > > H_sapiens_Mar_2006/features? > overlaps=chr20/0:62435964;type=mrna;format=bps > > Which it could not complete due to out of memory errors. But it > could handle > this sizeable query even after the above failed: > > H_sapiens_May_2004/features? > overlaps=chr20/0:62435964;type=refseq;format=brs > > Eventually, Jetty just decided it had enough and shut down it's > connection, > shouting: WARN!! Stopping Acceptor ServerSocket > > My fix was to restart the das/2 server giving the java process > another 200M > of maximal heap. However, both das/1 and das/2 servers can now > potentially > claim 89% of physical ram on that box, which could become unhealthy. > > Ed notes that we might want to prevent such big queries in the > first place. > There is an error code in the das spec for this (HTTP error 413 > "Request > Entity Too Large"). But how do we determine the what's a reasonable > maximum > allowable query result? 
It will depend on the feature density on a > particular assembly. This could be a good action item for the code > sprint. > > Steve > > > >> From: "Helt,Gregg" >> Date: Thu, 10 Aug 2006 09:39:12 -0700 >> To: Brian Gilman >> Cc: DAS/2 , "Chervitz, Steve" >> >> Conversation: [DAS2] DAS/2 Code Sprint, August 14-18 >> Subject: RE: [DAS2] DAS/2 Code Sprint, August 14-18 >> >> Apologies, it looks like we're currently having some problem with >> proxy >> redirection on the Affy DAS/2 server. Steve, can you check on this? >> When I request anything but the top level ~/sequence, I'm getting >> back >> HTTP error 502 "Bad Gateway" with the message: >> "The proxy server received an invalid response from an upstream >> server." >> >> However, I just tried the biopackages server and it is working, >> though >> response times are slower than usual (unless the response has already >> been cached). Here's a feature query I recently ran, so it will be >> returned quickly from the server cache: >> >> http://das.biopackages.net/das/genome/human/17/feature? >> overlaps=chr21/26 >> 027736:26068042;type=SO:mRNA >> >> hope that helps, >> gregg >> >>> -----Original Message----- >>> From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] >>> Sent: Thursday, August 10, 2006 8:10 AM >>> To: Helt,Gregg >>> Cc: DAS/2 >>> Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 >>> >>> Trying to get a features document? >>> >>> Hello Greg et al. I'm desperately trying to get a features >> document >>> out of one of the DAS 2 servers and have not been able to do it yet. >> Can >>> someone help me out!? >>> >>> Thanks! >>> >>> -B >>> >>> Helt,Gregg wrote: >>> >>>> Affymetrix is hosting a DAS/2 code sprint on August 14-18, to >> coincide >>>> with the CSB conference at Stanford. The sprint will be held at >> Affy's >>>> Santa Clara location, which is about a 20 minute drive from the >> Stanford >>>> campus. 
For those attending CSB, the proximity should make it easy >> to >>>> join in, even if it's just for a morning or afternoon. We can >> provide >>>> transportation to and from CSB if needed. If you are interested in >>>> attending please email me, and specify whether you'll need a >> workstation >>>> or will be bringing your own laptop. >>>> >>>> This is a code sprint, so the focus will be on DAS/2 client and >> server >>>> implementations. As with previous sprints I'd like to start >>>> each day >>>> with a teleconference at 9 AM Pacific time. If you can't be there >>>> physically but still want to participate, please join in! >>>> >>>> Gregg >>>> >>>> >>>> _______________________________________________ >>>> DAS2 mailing list >>>> DAS2 at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/das2 >>>> >>>> >>>> >>>> >> > > From Gregg_Helt at affymetrix.com Thu Aug 10 18:25:24 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 10 Aug 2006 15:25:24 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 Message-ID: Hmm... those queries really shouldn't stretch memory requirements too much -- the mrna objects are already in memory, so for the most part any extra memory is taken up by the output streaming through the server. Steve, can you send me the log file for the server when it was hitting these out-of-memory errors? Thanks, Gregg > -----Original Message----- > From: Chervitz, Steve > Sent: Thursday, August 10, 2006 1:15 PM > To: Helt,Gregg; Brian Gilman > Cc: DAS/2 > Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 > > The netaffxdas das/2 server is back up now. Turned out to be a memory > trouble. The server got some whomping queries thrown at it, such as these: > > M_musculus_Aug_2005/features?overlaps=chr1/0:194923535;type=mrna;format=bps > > H_sapiens_Mar_2006/features?overlaps=chr20/0:62435964;type=mrna;format=bps > > Which it could not complete due to out of memory errors. 
But it could > handle > this sizeable query even after the above failed: > > H_sapiens_May_2004/features?overlaps=chr20/0:62435964;type=refseq;format=brs > > Eventually, Jetty just decided it had enough and shut down its connection, > shouting: WARN!! Stopping Acceptor ServerSocket > > My fix was to restart the das/2 server giving the java process another > 200M > of maximal heap. However, both das/1 and das/2 servers can now potentially > claim 89% of physical ram on that box, which could become unhealthy. > > Ed notes that we might want to prevent such big queries in the first place. > There is an error code in the das spec for this (HTTP error 413 "Request > Entity Too Large"). But how do we determine what's a reasonable > maximum > allowable query result? It will depend on the feature density on a > particular assembly. This could be a good action item for the code sprint. > > Steve > > > > > From: "Helt,Gregg" > > Date: Thu, 10 Aug 2006 09:39:12 -0700 > > To: Brian Gilman > > Cc: DAS/2 , "Chervitz, Steve" > > > > Conversation: [DAS2] DAS/2 Code Sprint, August 14-18 > > Subject: RE: [DAS2] DAS/2 Code Sprint, August 14-18 > > > > Apologies, it looks like we're currently having some problem with proxy > > redirection on the Affy DAS/2 server. Steve, can you check on this? > > When I request anything but the top level ~/sequence, I'm getting back > > HTTP error 502 "Bad Gateway" with the message: > > "The proxy server received an invalid response from an upstream server." > > > > However, I just tried the biopackages server and it is working, though > > response times are slower than usual (unless the response has already > > been cached). 
Here's a feature query I recently ran, so it will be > > returned quickly from the server cache: > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > 027736:26068042;type=SO:mRNA > > > > hope that helps, > > gregg > > > >> -----Original Message----- > >> From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] > >> Sent: Thursday, August 10, 2006 8:10 AM > >> To: Helt,Gregg > >> Cc: DAS/2 > >> Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 > >> > >> Trying to get a features document? > >> > >> Hello Greg et al. I'm desperately trying to get a features > > document > >> out of one of the DAS 2 servers and have not been able to do it yet. > > Can > >> someone help me out!? > >> > >> Thanks! > >> > >> -B > >> > >> Helt,Gregg wrote: > >> > >>> Affymetrix is hosting a DAS/2 code sprint on August 14-18, to > > coincide > >>> with the CSB conference at Stanford. The sprint will be held at > > Affy's > >>> Santa Clara location, which is about a 20 minute drive from the > > Stanford > >>> campus. For those attending CSB, the proximity should make it easy > > to > >>> join in, even if it's just for a morning or afternoon. We can > > provide > >>> transportation to and from CSB if needed. If you are interested in > >>> attending please email me, and specify whether you'll need a > > workstation > >>> or will be bringing your own laptop. > >>> > >>> This is a code sprint, so the focus will be on DAS/2 client and > > server > >>> implementations. As with previous sprints I'd like to start each day > >>> with a teleconference at 9 AM Pacific time. If you can't be there > >>> physically but still want to participate, please join in! 
> >>> > >>> Gregg > >>> > >>> > >>> _______________________________________________ > >>> DAS2 mailing list > >>> DAS2 at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/das2 > >>> > >>> > >>> > >>> > > From aloraine at gmail.com Mon Aug 14 02:13:10 2006 From: aloraine at gmail.com (Ann Loraine) Date: Sun, 13 Aug 2006 23:13:10 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: <6dce9a0b0608132227o12924d90ud8a8cca329b30fb@mail.gmail.com> References: <83722dde0607240911w4d50b9cfo43adff514f6df39c@mail.gmail.com> <6dce9a0b0608132227o12924d90ud8a8cca329b30fb@mail.gmail.com> Message-ID: <83722dde0608132313g4ec0cdf5p990284ad00b0d17@mail.gmail.com> Hi, Last I heard, it's starting Monday (the 14th), beginning with a conference call at 9 am. Directions: http://www.affymetrix.com/site/contact/directions.jsp?loc=sc Best, Ann On 8/13/06, Lincoln Stein wrote: > Hi, > > Is the code sprint starting on the 13th or the 14th? I am here in Palo Alto > and have Monday morning free. > > Can I get driving directions from the Affy web site? > > Lincoln > > On 7/24/06, Ann Loraine wrote: > > Hi Gregg, > > > > I would like to suggest shifting the code spring by a day and have it > > start Monday August 13. > > > > That way it won't perfectly overlap the conference and those us who > > need to be at the conference full-time (such as myself) will be able > > to visit the code spring. > > > > Cheers, > > > > Ann > > > > On 7/24/06, Helt,Gregg wrote: > > > Affymetrix is hosting a DAS/2 code sprint on August 14-18, to coincide > > > with the CSB conference at Stanford. The sprint will be held at Affy's > > > Santa Clara location, which is about a 20 minute drive from the Stanford > > > campus. For those attending CSB, the proximity should make it easy to > > > join in, even if it's just for a morning or afternoon. We can provide > > > transportation to and from CSB if needed. 
If you are interested in > > > attending please email me, and specify whether you'll need a workstation > > > or will be bringing your own laptop. > > > > > > This is a code sprint, so the focus will be on DAS/2 client and server > > > implementations. As with previous sprints I'd like to start each day > > > with a teleconference at 9 AM Pacific time. If you can't be there > > > physically but still want to participate, please join in! > > > > > > Gregg > > > > > > > > > _______________________________________________ > > > DAS2 mailing list > > > DAS2 at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > > > > > > -- > > Ann Loraine > > Assistant Professor > > Section on Statistical Genetics > > University of Alabama at Birmingham > > http://www.ssg.uab.edu > > http://www.transvar.org > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From allenday at ucla.edu Mon Aug 14 03:14:49 2006 From: allenday at ucla.edu (Allen Day) Date: Mon, 14 Aug 2006 00:14:49 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: References: Message-ID: <5c24dcc30608140014u3d9dd1b5w9e487e142d1ca077@mail.gmail.com> Ah, I may implement this. Let's discuss tomorrow morning. Is there an agenda set? Is anyone teleconferencing in? -Allen On 8/10/06, Chervitz, Steve wrote: > > The netaffxdas das/2 server is back up now. Turned out to be a memory > trouble. 
The server got some whomping queries thrown at it, such as these: > > > M_musculus_Aug_2005/features?overlaps=chr1/0:194923535;type=mrna;format=bps > > H_sapiens_Mar_2006/features?overlaps=chr20/0:62435964;type=mrna;format=bps > > Which it could not complete due to out of memory errors. But it could > handle > this sizeable query even after the above failed: > > > H_sapiens_May_2004/features?overlaps=chr20/0:62435964;type=refseq;format=brs > > Eventually, Jetty just decided it had enough and shut down it's > connection, > shouting: WARN!! Stopping Acceptor ServerSocket > > My fix was to restart the das/2 server giving the java process another > 200M > of maximal heap. However, both das/1 and das/2 servers can now potentially > claim 89% of physical ram on that box, which could become unhealthy. > > Ed notes that we might want to prevent such big queries in the first > place. > There is an error code in the das spec for this (HTTP error 413 "Request > Entity Too Large"). But how do we determine the what's a reasonable > maximum > allowable query result? It will depend on the feature density on a > particular assembly. This could be a good action item for the code sprint. > > Steve > > > > > From: "Helt,Gregg" > > Date: Thu, 10 Aug 2006 09:39:12 -0700 > > To: Brian Gilman > > Cc: DAS/2 , "Chervitz, Steve" > > > > Conversation: [DAS2] DAS/2 Code Sprint, August 14-18 > > Subject: RE: [DAS2] DAS/2 Code Sprint, August 14-18 > > > > Apologies, it looks like we're currently having some problem with proxy > > redirection on the Affy DAS/2 server. Steve, can you check on this? > > When I request anything but the top level ~/sequence, I'm getting back > > HTTP error 502 "Bad Gateway" with the message: > > "The proxy server received an invalid response from an upstream server." > > > > However, I just tried the biopackages server and it is working, though > > response times are slower than usual (unless the response has already > > been cached). 
Here's a feature query I recently ran, so it will be > > returned quickly from the server cache: > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > 027736:26068042;type=SO:mRNA > > > > hope that helps, > > gregg > > > >> -----Original Message----- > >> From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] > >> Sent: Thursday, August 10, 2006 8:10 AM > >> To: Helt,Gregg > >> Cc: DAS/2 > >> Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 > >> > >> Trying to get a features document? > >> > >> Hello Greg et al. I'm desperately trying to get a features > > document > >> out of one of the DAS 2 servers and have not been able to do it yet. > > Can > >> someone help me out!? > >> > >> Thanks! > >> > >> -B > >> > >> Helt,Gregg wrote: > >> > >>> Affymetrix is hosting a DAS/2 code sprint on August 14-18, to > > coincide > >>> with the CSB conference at Stanford. The sprint will be held at > > Affy's > >>> Santa Clara location, which is about a 20 minute drive from the > > Stanford > >>> campus. For those attending CSB, the proximity should make it easy > > to > >>> join in, even if it's just for a morning or afternoon. We can > > provide > >>> transportation to and from CSB if needed. If you are interested in > >>> attending please email me, and specify whether you'll need a > > workstation > >>> or will be bringing your own laptop. > >>> > >>> This is a code sprint, so the focus will be on DAS/2 client and > > server > >>> implementations. As with previous sprints I'd like to start each day > >>> with a teleconference at 9 AM Pacific time. If you can't be there > >>> physically but still want to participate, please join in! 
> >>> > >>> Gregg > >>> > >>> > >>> _______________________________________________ > >>> DAS2 mailing list > >>> DAS2 at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/das2 > >>> > >>> > >>> > >>> > > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From steve_chervitz at affymetrix.com Mon Aug 14 04:56:31 2006 From: steve_chervitz at affymetrix.com (Steve Chervitz) Date: Mon, 14 Aug 2006 01:56:31 -0700 (PDT) Subject: [DAS2] DAS/2 Code Sprint, August 14-18 In-Reply-To: <5c24dcc30608140014u3d9dd1b5w9e487e142d1ca077@mail.gmail.com> References: <5c24dcc30608140014u3d9dd1b5w9e487e142d1ca077@mail.gmail.com> Message-ID: On Mon, 14 Aug 2006, Allen Day wrote: > Ah, I may implement this. Let's discuss tomorrow morning. Is there an > agenda set? Is anyone teleconferencing in? I haven't seen a specific agenda, but something like this seems reasonable: * status reports, including what you want to focus on for the sprint * establish a prioritized list of goals and deliverables for the current sprint Teleconferencing will start at 9 AM PST on the usual number: TEL=800-531-3250 (US) or 303-928-2693 (Int'l) ID=2879055 PIN=1365 Steve > > On 8/10/06, Chervitz, Steve wrote: >> >> The netaffxdas das/2 server is back up now. Turned out to be a memory >> trouble. The server got some whomping queries thrown at it, such as these: >> >> >> M_musculus_Aug_2005/features?overlaps=chr1/0:194923535;type=mrna;format=bps >> >> H_sapiens_Mar_2006/features?overlaps=chr20/0:62435964;type=mrna;format=bps >> >> Which it could not complete due to out of memory errors. But it could >> handle >> this sizeable query even after the above failed: >> >> >> H_sapiens_May_2004/features?overlaps=chr20/0:62435964;type=refseq;format=brs >> >> Eventually, Jetty just decided it had enough and shut down its >> connection, >> shouting: WARN!! 
Stopping Acceptor ServerSocket >> >> My fix was to restart the das/2 server giving the java process another >> 200M >> of maximal heap. However, both das/1 and das/2 servers can now potentially >> claim 89% of physical ram on that box, which could become unhealthy. >> >> Ed notes that we might want to prevent such big queries in the first >> place. >> There is an error code in the das spec for this (HTTP error 413 "Request >> Entity Too Large"). But how do we determine the what's a reasonable >> maximum >> allowable query result? It will depend on the feature density on a >> particular assembly. This could be a good action item for the code sprint. >> >> Steve >> >> >> >> > From: "Helt,Gregg" >> > Date: Thu, 10 Aug 2006 09:39:12 -0700 >> > To: Brian Gilman >> > Cc: DAS/2 , "Chervitz, Steve" >> > >> > Conversation: [DAS2] DAS/2 Code Sprint, August 14-18 >> > Subject: RE: [DAS2] DAS/2 Code Sprint, August 14-18 >> > >> > Apologies, it looks like we're currently having some problem with proxy >> > redirection on the Affy DAS/2 server. Steve, can you check on this? >> > When I request anything but the top level ~/sequence, I'm getting back >> > HTTP error 502 "Bad Gateway" with the message: >> > "The proxy server received an invalid response from an upstream server." >> > >> > However, I just tried the biopackages server and it is working, though >> > response times are slower than usual (unless the response has already >> > been cached). 
Here's a feature query I recently ran, so it will be >> > returned quickly from the server cache: >> > >> > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 >> > 027736:26068042;type=SO:mRNA >> > >> > hope that helps, >> > gregg >> > >> >> -----Original Message----- >> >> From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] >> >> Sent: Thursday, August 10, 2006 8:10 AM >> >> To: Helt,Gregg >> >> Cc: DAS/2 >> >> Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 >> >> >> >> Trying to get a features document? >> >> >> >> Hello Greg et al. I'm desperately trying to get a features >> > document >> >> out of one of the DAS 2 servers and have not been able to do it yet. >> > Can >> >> someone help me out!? >> >> >> >> Thanks! >> >> >> >> -B >> >> >> >> Helt,Gregg wrote: >> >> >> >>> Affymetrix is hosting a DAS/2 code sprint on August 14-18, to >> > coincide >> >>> with the CSB conference at Stanford. The sprint will be held at >> > Affy's >> >>> Santa Clara location, which is about a 20 minute drive from the >> > Stanford >> >>> campus. For those attending CSB, the proximity should make it easy >> > to >> >>> join in, even if it's just for a morning or afternoon. We can >> > provide >> >>> transportation to and from CSB if needed. If you are interested in >> >>> attending please email me, and specify whether you'll need a >> > workstation >> >>> or will be bringing your own laptop. >> >>> >> >>> This is a code sprint, so the focus will be on DAS/2 client and >> > server >> >>> implementations. As with previous sprints I'd like to start each day >> >>> with a teleconference at 9 AM Pacific time. If you can't be there >> >>> physically but still want to participate, please join in! 
>> >>> >> >>> Gregg >> >>> >> >>> >> >>> _______________________________________________ >> >>> DAS2 mailing list >> >>> DAS2 at lists.open-bio.org >> >>> http://lists.open-bio.org/mailman/listinfo/das2 >> >>> >> >>> >> >>> >> >>> >> > >> >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > From Gregg_Helt at affymetrix.com Mon Aug 14 08:39:07 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 14 Aug 2006 05:39:07 -0700 Subject: [DAS2] DAS/2 Code Sprint, August 14-18 Message-ID: Apologies for not posting the details sooner! DAS/2 Code Sprint, August 14 (Monday) through August 18 (Friday) Conference Call, 9 AM PST every morning 800-531-3250 Conference ID: 2879055 Passcode: 1365 We're in the Computer Training room at Affymetrix Santa Clara, Building 3450 Directions to Affymetrix Building 3450 (3450 Kifer Road, Santa Clara, CA): http://www.affymetrix.com/site/contact/directions.jsp?loc=sccentral This is about a 20 minute drive from the Stanford campus. If there is no receptionist at 3420, you may need to check in at the reception area in Building 3420. Please call me on my cell phone if there are any problems finding the room: 510-205-9652 See you all soon! Gregg -----Original Message----- From: Lincoln Stein [mailto:lincoln.stein at gmail.com] Sent: Sunday, August 13, 2006 10:28 PM To: Ann Loraine Cc: Helt,Gregg; DAS/2 Subject: Re: [DAS2] DAS/2 Code Sprint, August 14-18 Hi, Is the code sprint starting on the 13th or the 14th? I am here in Palo Alto and have Monday morning free. Can I get driving directions from the Affy web site? Lincoln On 7/24/06, Ann Loraine wrote: Hi Gregg, I would like to suggest shifting the code spring by a day and have it start Monday August 13. That way it won't perfectly overlap the conference and those us who need to be at the conference full-time (such as myself) will be able to visit the code spring. 
Cheers, Ann On 7/24/06, Helt,Gregg wrote: > Affymetrix is hosting a DAS/2 code sprint on August 14-18, to coincide > with the CSB conference at Stanford. The sprint will be held at Affy's > Santa Clara location, which is about a 20 minute drive from the Stanford > campus. For those attending CSB, the proximity should make it easy to > join in, even if it's just for a morning or afternoon. We can provide > transportation to and from CSB if needed. If you are interested in > attending please email me, and specify whether you'll need a workstation > or will be bringing your own laptop. > > This is a code sprint, so the focus will be on DAS/2 client and server > implementations. As with previous sprints I'd like to start each day > with a teleconference at 9 AM Pacific time. If you can't be there > physically but still want to participate, please join in! > > Gregg > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org _______________________________________________ DAS2 mailing list DAS2 at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/das2 -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From Gregg_Helt at affymetrix.com Mon Aug 14 08:48:15 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 14 Aug 2006 05:48:15 -0700 Subject: [DAS2] DAS/2 Code Sprint Details, August 14-18 Message-ID: Apologies for not posting the details sooner! 
DAS/2 Code Sprint, August 14 (Monday) through August 18 (Friday) Conference Call, 9 AM PST every morning US: 800-531-3250, International: 303-928-2693 Conference ID: 2879055 Passcode: 1365 We're in the Computer Training room at Affymetrix Santa Clara, Building 3450 Directions to Affymetrix Building 3450 (3450 Kifer Road, Santa Clara, CA): http://www.affymetrix.com/site/contact/directions.jsp?loc=sccentral This is about a 20 minute drive from the Stanford campus. If there is no receptionist at 3420, you may need to check in at the reception area in Building 3420. Please call me on my cell phone if there are any problems finding the room: 510-205-9652 See you all soon! Gregg From dalke at dalkescientific.com Mon Aug 14 12:20:03 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 14 Aug 2006 18:20:03 +0200 Subject: [DAS2] Fwd: DAS/2 code sprint next week! Message-ID: <4b1a3bf29da8f435273d2b25013d15cf@dalkescientific.com> Begin forwarded message: > From: Andrew Dalke > Date: August 14, 2006 6:00:30 PM GMT+02:00 > To: "Helt,Gregg" > Subject: Re: DAS/2 code sprint next week! > >> We're hosting another DAS/2 code sprint next week at Affy Santa >> Clara, to coincide with the CSB meeting at Stanford. Will you be >> able to join in? If not in person, then we're having a daily 9 AM >> PST conference call you could join. > > I'll be there. It starts in a few minutes. I'm in a cybercafe in > Cape Town. > >> I'm wondering what the status of the DAS/2 writeback spec is. > > It's unchanged. I'll be working on that over the sprint. > > I've spent most of the last, month+ catching up on the latest in > web development systems for Python, and learning various libraries. > Including giving a 2 week course on it. As my test case I've > been working on a DAS2 server. I have a reference server nearly > finished and a few things came up during it: > > - I'm iffy about the current SEGMENTS document.
It lists > "title" and "reference" for each segment but not for the list of > segments as a whole. Does it make sense allowing those to be > specified if they have reasonable names? (I know they don't always.) > > It's part of that separation between the sources document, which > describes these, and the segments document. > > - did we specify that the sequence is in upper-case, lower-case, > etc.? > > - I would like some experience with an agp or other assembly format. > I'm concerned about how a client can piece together segment names > from that document with the URIs we're using. It seems to me that > most places use a local name ("yeast_1" or somesuch) which is not > exposed via the web. If the assembly document, fasta file, etc. > use the local name and not the URL then it's hard to tie them together. > > - There are two different segment titles I've come across. One > is the name you want to see in a pull-down menu, etc. while the > other is the text you want in the FASTA header. These could be the > same but I don't think they are always the same. > Andrew dalke at dalkescientific.com From Steve_Chervitz at affymetrix.com Mon Aug 14 14:30:35 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Mon, 14 Aug 2006 11:30:35 -0700 Subject: [DAS2] Notes from DAS/2 code sprint #3, day one, 14 Aug 2006 Message-ID: Notes from DAS/2 code sprint #3, day one, 14 Aug 2006 $Id: das2-teleconf-2006-08-14.txt,v 1.2 2006/08/14 18:28:47 sac Exp $ Note taker: Steve Chervitz Attendees: Affy: Steve Chervitz, Ed E., Gregg Helt CSHL: Lincoln Stein Dalke Scientific: Andrew Dalke Panther Informatics: Brian Gilman UAB: Ann Loraine UCLA: Allen Day, Brian O'Connor Action items are flagged with '[A]'. These notes are checked into the biodas.org CVS repository at das/das2/notes/2006.
Instructions on how to access this repository are at http://biodas.org DISCLAIMER: The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit. Agenda: * Status reports, including what you want/need to focus on for this sprint, progress from last sprint. Status Reports --------------- gh: have done writeback work. IGB can create curation, post to biopackages writeback server, das/2 client can see curations. no editing yet. client can edit own data models, can't post those edits. to work on ID mapping stuff: client can't accept newly create ids from server. currently just holds onto temporary id's. IGB client has had one or more release since last. priorities - mainly writeback for client. ls: continue working on perl client interface to das/2, not functional at present. need to backout changes since last sprint. das/2 tracks in gbrowse. About 10hrs needed. sc: have been working on keeping data on Affymetrix public das servers up to date, dealing with memory issues cause by increasing amount of array data to support. Gregg has new efficient format for modeling exon array features with lower memory requirements. Will work on getting the das server to use it. Long-term plan is to remove our das/1 server and just have das/2, easier to use and maintain. Complete transition will take time though. Have continued working to automate the pipeline for updating the affy das servers. Have a new page that lists available data on the servers, currently manually created but plan to automate. ad: web dev in python, taught course on that. plan: getting python server up, to experiment with writeback. 
updating spec as per a couple of months ago. gh: andrew will make spec a top priority, grant is funding for that. bg: tasked to take das/2 data and produce set of objects to use within caCORE system at NCI. Have objects for das/2 data and service. can retrieve das/2 data from affy server. present in simple web page. Using java and ruby. gh: good week to ask questions as you flesh out the impl. ee: gregg and I will put out new IGB release this week. can work on style sheets (left over from last time). Or can build a gff3 parser into IGB (lots of excitement!). al: two things: demo applications for self and collaborators and das newbies. retrieve genomic locations for targets of affy probe sets and then retrieve promoter regions upstream. gh: promoter data in das2 server? al: can just say 500bp upstream of gene. not identifying control. Just retrieve seq to pipe into control analysis. Second one: meta analysis, results from diff groups for associated phenotypes. Input: list of markers, output: annotations associated with these. Statistical analysis. Ultimately obtain candidate genes associated with markers. Some preliminary work on obesity that looks promising. [A] Steve will help Ann convert fly probe set ids into genome locations. Goal is to write something that can do random sampling of gene annotations. ideal world: das server gets region, returns gene ids and go ids. Less ideal: just get genes within the peaks (from association studies). bo: doing rpm packaging for the mac (tgen). so people can set up das2 server on a mac. update rpm packages with results of work this week. clean up bug queue on biopackages server impl, bringing it up to spec. can talk about analysis part of server. internal hirax client for retrieval of assay data. communication with server is out of sync. Spec issues: ------------ gh: want to focus on writeback. wants full xml features rather than mapping document. aday: work on writes as well as deletes. 
Impl HTTP 413 (Request Entity Too Large): adding this for requests that exceed some size threshold (10kb, 100kb); if at or below, OK. gh: need to coord with me on writeback, I focus on client writeback, you on server. Editing is ok. Deletes are harder. Other Issues: ------------- gh: Contact peter good about funding. Extending from 2yr to 3yr. talk with lincoln and suzi about plans for next grant. sc: status of bugzilla open bugs on spec? [A] Someone should go through and update bugzilla list for spec bg: version field. gh: not too understandable. at last sprint, two freezes, the version tells which v of spec freeze the server is using. assumption is that now the servers are using the most recent spec. If they're not compliant, please let us know. affy server: won't give back a list of all features. requires an overlaps and types restrictor. biopackages: should be good with latest spec. bg: sources document, source tag has version. if you do a query like types, also has version? No. ad: sources document: worm 161 (data source). capabilities describe things like writeback support for v161, but not v160. bg: that version seems to have different semantics given query. biggest issue was parsing and populating my object model. gh: coordinate subelement in version elem. has a version attr. my client does not deal with coord stuff. meant to make sure that annots from two servers are referring to same coords, so you can overlay annots from different servers. my client is using version URIs for that instead. bg: other issue: in order to know what server you're hitting, you have to know name space of doc, which has base URI. XML base in segments query. xmlns biodas.org/das2. to have traceability in documents you receive, you as implementer must track urls, converting relative to absolute. can be a problem when hitting 5 different servers. gh: my obj model (client) has model of server with root url of the das server, sources objects which has xml base of each source.
bg: you could get back a 404 from xml:base. Perfectly appropriate. server could put whatever it wants in xml:base. currently it's the document. ad: we're using the xml:base spec, so you can put xml:base on any node you want to. construct full url by. gh: in our schema is it clear which attribs are resolved by xml:base? ad: no. bg: would like to see one big document with every element, not several different files. relaxNG isn't best format. would like a w3c XSD that defines the elements. from coder's standpoint, don't have to go and look at 5 different docs. Have to have multiple windows up, figure out how they are connected to each other. semantics within each query, who is calling what. ad: I gave brian one. using trang to spit it out. bg: trang is not best xml schema writer. I could work on this. why do you use relaxNG? ad: I can read it and understand it. there were good examples. bg: I can autogenerate code that is in XSD, soap and other wservices stuff does that for you. Can generate a parser, point it at a uri, get doc, generate a parser and object model. ad: parser would break if server returns extra attributes. In spec there are some extension points. can put any element that is in a separate namespace. I know how to do that in relaxNG, but not in XSD. bg: you just have to add another xmlns. define an extension point with that namespace. ad: should be able to resolve it into one. bg: Three items. 1. will ask w3c people about XSD to relaxNG. 2. semantics confusion. 3. xml:base appropriate to supply a 404 if client was dependent on that attribute. ad: version tag is problem if there are duplicates. should be changed so there are no duplicates. can build parser on rng bg: it's experimental, alpha s'ware. don't want to use for production. bg: when you put a relative url inside a xml:base. ad: resolvable via http, or in absolute url. gh: if you resolve it up to the top level doc, then use the url of the document itself. whether clients actually do this, depends on impl.
say to implementers, we could state that the top level document should resolve to absolute url. we wanted to say, "Das/2 uses xml:base spec. period." bg: put this in the spec, how you want it to be used. ad: don't like saying, "we use xml:base with these additional things" bg: can put off for now. ls: In my library when I see a url and can't resolve, I fall back to a hard coded url. From dalke at dalkescientific.com Mon Aug 14 15:29:54 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 14 Aug 2006 21:29:54 +0200 Subject: [DAS2] duplicate use of VERSION Message-ID: <3f8e5c60e918cd90eabe6597403a5448@dalkescientific.com> Brian G. pointed out that "VERSION" is used twice in the spec, with different meanings. I thought we used it twice as an element but that's not the case. It's used once as "versioned source" element and another time as an attribute in the COORDINATES element # This is the version of the build (if a genomic sequence). # However, protein databases don't do versions this way attribute version { text }?, In looking around I don't see duplicate uses of any tag for elements with different meanings. Brian? Is this the one you were talking about? In thinking about it though, I've found it awkward to talk about "versioned source". First off, the Mac's Mail.app gives squiggles under the "versioned" indicating a misspelling. Second, it's hard to say and annoying to write "versioned_source" in my code and in the documentation. I would like to use "release" instead. That is, change das2:VERSION to das2:RELEASE. That's a shorter word, closer to the intended meaning, and generally nicer. Eg, "there are many data sources and each source may have multiple releases." That's a simple change but it's highly non-backwards compatible. 
Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Aug 14 15:46:23 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 14 Aug 2006 21:46:23 +0200 Subject: [DAS2] duplicate use of VERSION In-Reply-To: <3f8e5c60e918cd90eabe6597403a5448@dalkescientific.com> References: <3f8e5c60e918cd90eabe6597403a5448@dalkescientific.com> Message-ID: <7e92910f143448a82fae138d41a7e195@dalkescientific.com> > In looking around I don't see duplicate uses of any tag for > elements with different meanings. I should have added... Even though they are not duplicate element tags, they should not have the same name as it causes confusion. For example, someone seeing "version" may think it is the name/uri/url for a VERSION element when it is absolutely not. > I would like to use "release" instead. That is, change > das2:VERSION to das2:RELEASE. Still would like it. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Aug 14 17:38:27 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 14 Aug 2006 23:38:27 +0200 Subject: [DAS2] mapping document Message-ID: Been thinking about the response to a writeback. The spec said the server responds with a mapping document saying "uploaded id X is now Y". As per discussion this will now return a features document. Each feature element may contain a new attribute "was" if its URI changed. This happens for one of two reasons: - the client created the feature using the private naming scheme - the server supports versioning and each feature version gets its own identifier Perhaps also "the server's ornery and jest feels like it." I had written the spec so a server could optionally implement type writeback. With this change that is not possible. It's possible to have a new return document which combines features and types (which is very similar to the current writeback spec). 
However, type writeback was not considered a high priority and none of the servers under development will support such thing. (Correct?) If needed we have extension mechanisms by which that can be supported in the future. questions: - I wrote above that the new attribute is named "was", as in The word "was" is wrong. Otherwise the new version should be "is", and not "uri". Other options are "previously", "old_uri", "prev_uri", "previous_uri", "uri_was" I can't find old discussion on this. Anyone one not like "old_uri" and have a better name? - anyone want type writeback in this version of the spec? if not i'll remove all traces of it from the spec. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Aug 14 18:27:49 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 00:27:49 +0200 Subject: [DAS2] relative URLs and xml:base in the writeback document Message-ID: On the topic of relative URLs ... The writeback document contains FEATURE elements. Because we aren't supporting types I want to change the writeback document so it looks like this Reason for the change ... ... Problem #1: if I lift the existing FEATURE element definition then the uri attributes may contain relative URIs and the FEATURE element may contain an xml:base attribute. We can also have that "WRITEBACK" contains an xml:base attribute. What happens if after all of that the writeback URI is still a relative URL? How does the server convert the relative URL into an absolute one? Does it use the writeback URL as the document base? That's the only one which comes close to making sense, but it doesn't make much sense. No client in its right mind will deconvolute the feature uris to be relative urls with respect to the writeback URL (which, after all, may be on an entirely different machine). 
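A minimal sketch of the resolution question raised above, assuming a writeback document whose root is WRITEBACK and whose FEATURE children carry a "uri" attribute. The exact document shape, the absence of a namespace, and the example.org URLs are assumptions for illustration, not taken from the spec. It applies each enclosing xml:base in turn and reports whether the resulting URI is absolute or still relative:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlparse

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_uris(elem, base=""):
    """Yield (tag, resolved_uri, is_absolute) for each element with a 'uri'.

    'base' starts empty because a POSTed document has no retrieval
    context; only xml:base attributes inside the document contribute.
    """
    # An xml:base on this element is itself resolved against the inherited one.
    if XML_BASE in elem.attrib:
        base = urljoin(base, elem.attrib[XML_BASE])
    uri = elem.attrib.get("uri")
    if uri is not None:
        resolved = urljoin(base, uri)
        yield elem.tag, resolved, bool(urlparse(resolved).scheme)
    for child in elem:
        yield from resolve_uris(child, base)


# Hypothetical documents, for illustration only.
WITH_BASE = """<WRITEBACK xml:base="http://example.org/das2/yeast/">
  <FEATURE uri="feature/abc"/>
  <FEATURE xml:base="http://other.example.org/" uri="f/1"/>
</WRITEBACK>"""
NO_BASE = '<WRITEBACK><FEATURE uri="feature/abc"/></WRITEBACK>'

for doc in (WITH_BASE, NO_BASE):
    for tag, uri, absolute in resolve_uris(ET.fromstring(doc)):
        print(tag, uri, "absolute" if absolute else "STILL RELATIVE")
```

A server could reject any entry reported as still relative instead of guessing a base for it.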
I checked the xml:base spec http://www.w3.org/TR/xmlbase/ and it refers to the URI RFC 2396 http://www.ietf.org/rfc/rfc2396.txt These are both defined in terms of document retrieval. Eg, > If no base URI is embedded, the base URI of a document is > defined by the document's retrieval context. This makes no sense in a POST document. I think in this case it's fine to say "URIs in a writeback spec must be absolute URLs". Either they are written as absolute URLs or they are made absolute in the context of some xml:base defined in the writeback delta. What say you all? A. all URIs in writeback must be absolute - don't support xml:base at all B. URIs may be relative but must be absolute once all enclosing xml:base attributes are included C. URIs may be relative and the writeback URL itself is used as the retrieval context My vote is that the server implements B but that clients will all do A. Speaking of which, digging through the xml:base spec and the history of our discussion I see that we are free to define when xml:base is valid. We could use it only on the root element if we so desire. Right now it can be on any element. The reason we have it on every element is from the influence of this blog post: http://norman.walsh.name/2005/04/01/xinclude > Ugh. In the short term, I think there's only one answer: update your > schemas to allow xml:base either (a) everywhere or (b) everywhere you > want XInclude to be allowed. I urge you to put it everywhere as your > users are likely to want to do things you never imagined. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Aug 14 18:32:43 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 00:32:43 +0200 Subject: [DAS2] element identity Message-ID: <63f85e217e45faf272d769ba9f2fd135@dalkescientific.com> again, working on the writeback spec. The writeback spec will look like Reason for the change ... ... The response document will look like this ... ... 
This FEATURE element is very similar but different than the normal FEATURE element in that it has a new "old_uri" attribute. Does anyone see that as a problem? I don't, but it breaks the guideline we talked about earlier where two XML elements with the same tag must refer to the same thing. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Aug 14 19:54:39 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 01:54:39 +0200 Subject: [DAS2] updated writeback spec Message-ID: <1182e48978effe1454d7350cb9634283@dalkescientific.com> I've updated the writeback spec. Here's the log message > Respond with a modified features document instead of a mapping > document. > > Removed references to type writeback. > > Writeback URIs must be fully resolvable in the document. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Tue Aug 15 09:46:14 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 15:46:14 +0200 Subject: [DAS2] Fwd: can't view XML from DAS2 server in IE4 or Safari Message-ID: <1f5d9f7aa32f2cdd91e60a3867e58037@dalkescientific.com> Oops! Hit "reply-to" instead of "reply-all". Begin forwarded message: > From: Andrew Dalke > Date: August 15, 2006 5:26:03 AM GMT+02:00 > To: "Ann Loraine" > Subject: Re: can't view XML from DAS2 server in IE4 or Safari > >> I'm trying to view the XML delivered from the DAS2 server in Firefox >> or IE4 without having to save it and then load it. >> >> I think this is something to do with the fact that the XML is >> delivered as type application versus XML plain text, which is what the >> DAS1 servers seem to do. > > Yes. It's a 4 year old bug in Mozilla. > https://bugzilla.mozilla.org/show_bug.cgi?id=155730 > > >> Is there a way I can tell Firefox to render the XML directly without >> my having to save it first? > > We've run into this before. I want a way to make this be less > of a problem. 
> > I propose that if "text/xml" is in the Accept header then the > server should return the das2xml document but with a "text/xml" > content-type. > > I tested that out on my copy of Firefox and it was a happy camper. > It showed the XML tree, though it did complain about the lack > of a stylesheet. Okay, perhaps it was more feeling okay than happy.. > > Of course another possibility is to see the "text/html" there > and show something more presentable to humans, but that makes things > worse for those like Ann who want to see the XML structure. Andrew dalke at dalkescientific.com From boconnor at ucla.edu Tue Aug 15 03:07:03 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Tue, 15 Aug 2006 00:07:03 -0700 Subject: [DAS2] updated writeback spec In-Reply-To: <1182e48978effe1454d7350cb9634283@dalkescientific.com> References: <1182e48978effe1454d7350cb9634283@dalkescientific.com> Message-ID: <44E17297.7090901@ucla.edu> Hi Andrew, During the last code sprint I used the DAS/2 validation tool you wrote to help debug the das.biopackages.net server. It was very helpful!! Has it been updated to the current spec (v 1.33 2006/04/27 on the website). What is the URL? Thanks --Brian Andrew Dalke wrote: > I've updated the writeback spec. Here's the log message > > >>Respond with a modified features document instead of a mapping >>document. >> >>Removed references to type writeback. >> >>Writeback URIs must be fully resolvable in the document. > > > > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From dalke at dalkescientific.com Tue Aug 15 11:16:35 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 17:16:35 +0200 Subject: [DAS2] xlm:base -- fer it or agin' it? 
Message-ID: <1c606eb000e651a6741eb1b09d30da06@dalkescientific.com> I see three reasonable options (or rather, logically defensible) related to xml:base in DAS2 documents. 1) don't use it at all 2) only have it in the root element of the document 3) have it anywhere in the document (this is the old programming dictum of "the only limits should be 0, 1 and infinity") Pros and cons: #1 is the least confusing. Given a relative URL, use the document's url to make it absolute, etc. as per URI spec. #2 This is similar to the restrictions in the BASE element in the HTML header. (Which I've only used once.) It's used most often in saved documents so relative URLs work without needing to rewrite the rest of the document. Take your DOM, and stick the URL in the root node if "xml:base" is not present, otherwise do root.attrib["xml:base"] <-- urljoin(document_url, root.attrib["xml:base"]) #3 This is the most complicated. The main use case mentioned was support for xinclude, which is not something anyone here has said they need. For all I know it may be useful for XSLT and other languages. I don't know the XML toolchain well enough. Here is another use case. Consider a registration / aggregation service. It could work by fully parsing everything from each client and making absolute URIs for everything. Or it could do ... That is, it reads the sources document and pulls the SOURCE elements out of the XML. It sticks in the right xml:base (perhaps with a set of joins from the parent elements in the document) and serves the result. No need to parse further. Here's another. Consider a meta-feature server which sucked in primary records from multiple other servers (with permission). It might provide better search capabilities, better ranking, whatever. The features are unchanged. The server wants to return the results as it got them from the original server. Without xml:base it needs to convert all relative URLs into absolute ones ... ... ... ...
which requires the server know about all field which are URLs. This precludes support for any extensions which include URL fields because the meta-server won't know about them. OTOH, with xml:base ... ... ... ... and any embedded extensions work w/o problems. Hence I'm fer numb'r 3. Andrew dalke at dalkescientific.com From aloraine at gmail.com Tue Aug 15 11:08:43 2006 From: aloraine at gmail.com (Ann Loraine) Date: Tue, 15 Aug 2006 08:08:43 -0700 Subject: [DAS2] Fwd: can't view XML from DAS2 server in IE4 or Safari In-Reply-To: <1f5d9f7aa32f2cdd91e60a3867e58037@dalkescientific.com> References: <1f5d9f7aa32f2cdd91e60a3867e58037@dalkescientific.com> Message-ID: <83722dde0608150808m15cf3b15g894009ab0ac5fde@mail.gmail.com> Hi Andrew, This sounds great to me! Being able to use my Web browser to show people DAS XML after typing in a URL (teaching) and also to see it myself as I familiarize myself with the URL-building conventions (coding) is a huge plus. It really gets the point across in an accesible and dramatic way. A lot of us started to "get" programming after having friends or colleagues show us the HTML coding underlying Web pages using the "view source" function of Netscape Navigator. I think being able to see the XML beautifully rendered in a browser can have the same sort of function for a lot of people and will help them understand the concept of structured data, the meaning of machine-readable, and good stuff like that. Cheers, Ann On 8/15/06, Andrew Dalke wrote: > Oops! Hit "reply-to" instead of "reply-all". > > Begin forwarded message: > > > From: Andrew Dalke > > Date: August 15, 2006 5:26:03 AM GMT+02:00 > > To: "Ann Loraine" > > Subject: Re: can't view XML from DAS2 server in IE4 or Safari > > > >> I'm trying to view the XML delivered from the DAS2 server in Firefox > >> or IE4 without having to save it and then load it. 
> >> > >> I think this is something to do with the fact that the XML is > >> delivered as type application versus XML plain text, which is what the > >> DAS1 servers seem to do. > > > > Yes. It's a 4 year old bug in Mozilla. > > https://bugzilla.mozilla.org/show_bug.cgi?id=155730 > > > > > >> Is there a way I can tell Firefox to render the XML directly without > >> my having to save it first? > > > > We've run into this before. I want a way to make this be less > > of a problem. > > > > I propose that if "text/xml" is in the Accept header then the > > server should return the das2xml document but with a "text/xml" > > content-type. > > > > I tested that out on my copy of Firefox and it was a happy camper. > > It showed the XML tree, though it did complain about the lack > > of a stylesheet. Okay, perhaps it was more feeling okay than happy.. > > > > Of course another possibility is to see the "text/html" there > > and show something more presentable to humans, but that makes things > > worse for those like Ann who want to see the XML structure. 
> > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From dalke at dalkescientific.com Tue Aug 15 12:49:21 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 18:49:21 +0200 Subject: [DAS2] global reference identifiers Message-ID: <61a3a74f23f83cf89a05055e0bc7e0a7@dalkescientific.com> http://open-bio.org/wiki/DAS:GlobalSeqIDs Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Tue Aug 15 14:04:58 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 15 Aug 2006 20:04:58 +0200 Subject: [DAS2] global reference identifiers In-Reply-To: <61a3a74f23f83cf89a05055e0bc7e0a7@dalkescientific.com> References: <61a3a74f23f83cf89a05055e0bc7e0a7@dalkescientific.com> Message-ID: <409ce859211622e5781c58db5b014da9@dalkescientific.com> > D.melanogaster, C.elegans, and C.briggsae are here, but no > S.cerevisiae, > R.norvegicus, M.musculus, or H.sapiens. > -Allen It's a wiki - feel free to add new ones! :) Andrew dalke at dalkescientific.com From Steve_Chervitz at affymetrix.com Tue Aug 15 14:57:19 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Tue, 15 Aug 2006 11:57:19 -0700 Subject: [DAS2] Fwd: can't view XML from DAS2 server in IE4 or Safari In-Reply-To: <83722dde0608150808m15cf3b15g894009ab0ac5fde@mail.gmail.com> Message-ID: I agree it is very useful to view XML documents, even though das xml is intended for applications. For whatever reason, some humans (myself included) seem to have a fascination with XML and like to view it, so it makes sense to provide for this. As for viewing das2xml data directly by clicking on das2 server links in Firefox, I have no problem. 
When you first click on a link returning das2xml formatted data (mime type=application/x-das-*+xml), Firefox should provide a dialog box asking what you want to do with it. Click "open with" and select Firefox itself. Do this for each of the types of das documents and you'll be set. Btw, there are a bunch of different das2xml links available here for testing: http://netaffxdas.affymetrix.com/das2/ If you have already specified that Firefox should save the das2xml data to disk, you should be able to change your preference by going to Preferences -> Downloads -> View & Edit Actions... (this is on OS X with Firefox 1.5.0.6. I don't see any entries for application/x-das* entries in mine; not sure why not, but it's working now, so I don't worry). According to the following article, Firefox will use its default xml handler for any mime type matching application/*+xml (see 'Types of XML' on this page): http://www-128.ibm.com/developerworks/xml/library/x-ffox2/index.html While we're on the subject, there's another recent article in this series on manipulating XML with javascript in Firefox. Might be interesting to try some of these ideas with das2xml data: http://www-128.ibm.com/developerworks/library/x-ffox3/ Steve > From: Ann Loraine > Date: Tue, 15 Aug 2006 08:08:43 -0700 > To: Andrew Dalke > Cc: DAS/2 > Subject: Re: [DAS2] Fwd: can't view XML from DAS2 server in IE4 or Safari > > Hi Andrew, > > This sounds great to me! > > Being able to use my Web browser to show people DAS XML after typing > in a URL (teaching) and also to see it myself as I familiarize myself with > the URL-building conventions (coding) is a huge plus. It really gets > the point across in an accesible and dramatic way. > > A lot of us started to "get" programming after having friends or > colleagues show us the HTML coding underlying Web pages using the > "view source" function of Netscape Navigator. 
I think being able to > see the XML beautifully rendered in a browser can have the same sort > of function for a lot of people and will help them understand the > concept of structured data, the meaning of machine-readable, and good > stuff like that. > > Cheers, > > Ann > > On 8/15/06, Andrew Dalke wrote: >> Oops! Hit "reply-to" instead of "reply-all". >> >> Begin forwarded message: >> >>> From: Andrew Dalke >>> Date: August 15, 2006 5:26:03 AM GMT+02:00 >>> To: "Ann Loraine" >>> Subject: Re: can't view XML from DAS2 server in IE4 or Safari >>> >>>> I'm trying to view the XML delivered from the DAS2 server in Firefox >>>> or IE4 without having to save it and then load it. >>>> >>>> I think this is something to do with the fact that the XML is >>>> delivered as type application versus XML plain text, which is what the >>>> DAS1 servers seem to do. >>> >>> Yes. It's a 4 year old bug in Mozilla. >>> https://bugzilla.mozilla.org/show_bug.cgi?id=155730 >>> >>> >>>> Is there a way I can tell Firefox to render the XML directly without >>>> my having to save it first? >>> >>> We've run into this before. I want a way to make this be less >>> of a problem. >>> >>> I propose that if "text/xml" is in the Accept header then the >>> server should return the das2xml document but with a "text/xml" >>> content-type. >>> >>> I tested that out on my copy of Firefox and it was a happy camper. >>> It showed the XML tree, though it did complain about the lack >>> of a stylesheet. Okay, perhaps it was more feeling okay than happy.. >>> >>> Of course another possibility is to see the "text/html" there >>> and show something more presentable to humans, but that makes things >>> worse for those like Ann who want to see the XML structure. 
>> >> Andrew >> dalke at dalkescientific.com >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > > > -- > Ann Loraine > Assistant Professor > Section on Statistical Genetics > University of Alabama at Birmingham > http://www.ssg.uab.edu > http://www.transvar.org > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Steve_Chervitz at affymetrix.com Tue Aug 15 15:11:33 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Tue, 15 Aug 2006 12:11:33 -0700 Subject: [DAS2] Notes from DAS/2 code sprint #3, day two, 15 Aug 2006 Message-ID: Notes from DAS/2 code sprint #3, day two, 15 Aug 2006 $Id: das2-teleconf-2006-08-15.txt,v 1.1 2006/08/15 19:10:02 sac Exp $ Note taker: Steve Chervitz Attendees: Affy: Steve Chervitz, Ed E., Gregg Helt CSHL: Lincoln Stein, Scott Cain Dalke Scientific: Andrew Dalke UCLA: Allen Day, Brian O'Connor Action items are flagged with '[A]'. These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org DISCLAIMER: The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit. Topic: Spec updates ------------------- ad: made changes to the writeback spec. nothing serious, stuff we talked about. removed possibility of writeback for types, updated docs. returns back a features document. feature element contains old_uri to refer to previous uri if it changed. 
Not for a response document. gh: can we freeze it at this point? Like the idea of reusing the feature xml. Hoping to call it frozen for rest of code sprint. ad: do we allow relative urls inside the writeback doc. relative to what? gh: xml:base applies ad: if url is still relative once you get to the top of the document, what happens? gh: free to throw an error ad: so 'application defined'. seems ok. gh: can uri's be local when curation is created on client, you're making up your own id. fully resolvable. ad: it is das_private uri, not a relative uri, no resolvability requirements. aday: order of operations issue with insertion and deletions for features with same id. do a delete-insert or insert-delete? does delete get processed before insert? ad: all deletes go first. aday: are all features required to be processed from top to bottom as well? ad: doesn't specify. aday: natural ordering in the document for feature processing. on creation of a new feature. if it has a das_private feature that is declared in the doc which hasn't been seen before. will cause problems. ad: pref aday: require features to be declared in order so that everything declared below refers to things declared above. ad: not possible for new features. aday: where is type writeback going to go? ad: not to be supported. could use a separate document. gh: fine with not dealing with types now. let's get feature writeback going first. aday: would like to make it extensible. to see how you could create a types writeback. gh: separate document. aday: so writeback for types is a element enclosed in a writeback element. gh: any other issues with writeback spec yesterday? many conversations here after the teleconf. the order of operations thing, and the need to freeze ASAP. ad: b gilman's use of VERSION in two diff places. see my email from yesterday. I proposed using 'release' than 'versioned_source'. too late now to change the versioned element. 
gh: change name of att Topic: Versioned source -> release --------------------------------- See andrew's email from yesterday. aday: has a working server. will send out url out today, after incorporating latest developments. returns a mapping document. gh: will clean up curation stuff today. figure out how to swap ids out. this is an igb internal release. Topic: Microdeltas ------------------ ad: microdeltas: take the delta of the document we have now, break it up into lots of parts. no big two-hour curation, but server tracks changes as they occur. this way you can track reasons for each change. gh: so curator should push 'save to server' button each time they make an edit. this is up to client to impose this. you have a comment element in the writeback. ad: there is a distinction between changes that computer made vs. human comment - reason why they did a whole set of changes. not sure the reason the resolution. gh: microdeltas might be getting a little more complicated for what we're trying to do. Topic: Coordinates in read spec -------------------------------- gh: questions regarding read. Is allen serving up coordinate stuff? aday: segment coordinate uri? gh: the thing we're supposed to be using to decide whether annotations from two servers are on the same coord system. if uri's for two different versioned_sources match, assume they're the same coord system. lincoln set up names for genomes. gh: haven't implemented part on client that makes use of it. currently using a hard-wired way. ad: on open-bio.org site. wiki. gh: writable nature of server is supposed to be in capabilities section. OK you've got in right place. my bad. gh: locking, not worked on. aday: exclusive lock on table to be modified. other clients wanting to write cannot get it. so it's under the hood, no special reponse. ad: how do we indicate a server supports writeback? I wanted an extension tag, not attribute. haven't looked at recently. gh: can't remember. can a versioned_source have... 
If a versioned source is writable, can any data on that be editable? yes. ad: why does it make a difference. gh: concerned whether there are certain types of annots that should not be writable, level of distinction (granularity). either you can edit any annotations on that versioned source, or none of them. gh: eg. blast results vs human-made curations. can't edit blast results. ad: I don't thing a single bit flag is good enough. gh: per type? ad: not sure. gh: ok as is. you can have multiple servers, some holding mutable data some holding immutable data. ad: I support writing for some people, some time. user is in charge of figuring out which types on which servers can be changed. gh: client has to be smart -- ie., try to edit then undo it then tell user they can edit. or allow user to edit stuff and find out at commit time if editing is ok (possibly not). ad: ideally would like a way to figure out from server what you can and cannot do on a given versioned source. gh: let's not get into that now. that is the simplest way to go w/r/t to the spec. Topic: Viewing das2xml responses in web browser ----------------------------------------------- See Ann Loraine's email on list about trouble of looking at das2 responses via IE4 and Safari. ee: needs text/xml in order to see it in browsers. ad: viewing xml documents is an extension of das, which was intended for computer communication. aday: some problems with javascript/AJAX making it unusable. must have content-type as text/xml. ad: javascript talking to server can specify what format it wants it back. there's a firefox bug in the '+xml' specification. gh: we are telling it xml, it's aday: there are real clients out there that cannot deal with the advance http headers we are using. ad: format= in query parameter gh: format=xml then content-type in header should be text/xml? ad: not in the spec now. you specify das2xml and get back application/.... 
bo: could have proxy code that sits in between client and server and convers to text/xml ad: default for web browsers. server could decide to support ajax by allowing format=json. aday: gh: need to say that servers have the option to provide content-type=text/xml if format=xml. we are compliant to content-header spec, some ajax implementations don't handle it properly. ad: if client makes request and string text/xml appears in the accepts header, then server should be free to give back regular das2xml response document but as content-type text/xml? by 'free', meaning not required. gh: some libraries are not compliant with http header content type spec. if servers supports that, then they can return different content types. ad: what is recommendation for this case? aday: for firefox and javascript clients. sc: I have had no trouble with firefox on os x. I can try to troubleshoot Ann's set up. Topic: Dasypus online validation tool -------------------------------------- bo: dasypus validation tool is it up to date? ad: server is down since it hasn't been used for a while. should be up to date. [A] andrew will bring dasypus online validator online. Status Reports --------------- bo: bugfixes on das.biopackages.net server. gh: write back curations, id resolution on client side, igb release today. aday: update/edit/delete, changing response type today ad: relaxNG, getting dasypus server back up, my own das server. ee: getting igb release out today. gff3 parser. sc: working with gregg's new Bprobe1Parser to create new versions of exon array data files, more memory efficient. Will send to gregg for testing. Also updating list of available data on the affy das servers. 
From allenday at ucla.edu Tue Aug 15 21:15:07 2006 From: allenday at ucla.edu (Allen Day) Date: Tue, 15 Aug 2006 18:15:07 -0700 Subject: [DAS2] xml:base and XML::DOM::XML_Base Message-ID: <5c24dcc30608151815t17a13144t54ff11407b94f397@mail.gmail.com> Lincoln, I needed an xml:base resolution module for my writeback code, and there wasn't one available for any of the lightweight XML libs on CPAN, so I wrote an XML::DOM extension. Feel free to use it if you have not already finished your implementation, it should be on CPAN within the next day or so, I just uploaded it. -Allen From dalke at dalkescientific.com Tue Aug 15 21:29:40 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 16 Aug 2006 03:29:40 +0200 Subject: [DAS2] Fwd: can't view XML from DAS2 server in IE4 or Safari In-Reply-To: References: Message-ID: Steve: > As for viewing das2xml data directly by clicking on das2 server links > in > Firefox, I have no problem. When you first click on a link returning > das2xml > formatted data (mime type=application/x-das-*+xml), Firefox should > provide a > dialog box asking what you want to do with it. Click "open with" and > select > Firefox itself. I never would have thought of that. A-ha. It works but it works by downloading the file, saving it to a temp.xml file then doing the equivalent of "Firefox tmp.xml", which opens a new window on a Mac. It doesn't open it in the current window as I would like. And the temp file persists in my download directory. I experimented with content negotiation, where the client may send an accept header to the server with the desired content types. My server supports "text/plain" (fasta), "text/xml", and "application/x-das2segments+xml" Examples below. I did this because I want the documentation to say "If the format parameter is not specified in the query string then the server may use HTTP content negotiation to determine the most appropriate representation. 
If multiple representations match then the das2xml version should be returned, if allowed." and leave it at that. This includes the "if 'text/xml' exists in the Accept field ..." solution we talked about earlier. In the "An Annotated Guide to the DAS spec" then include what that means and why it's done. The Apache content negotiation strategy is at http://httpd.apache.org/docs/1.3/content-negotiation.html Using that scheme, the following describes possible variants for a DAS service URI:

features
  format: das2xml   Content-type: application/x-das2features+xml; qs=1.0
  format: xml       Content-type: text/xml; qs=0.95
  format: fasta     Content-type: text/plain; qs=0.95

where "qs" is the source quality the server assigns to each variant. Apache picks the variant for which "q*qs" is largest. Firefox sends

ACCEPT: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

This orders the results as:

  xml     = 0.95*1.0 -> 0.95 (no q given for text/xml, so q=1.0)
  fasta   = 0.95*0.8 -> 0.76
  das2xml = 1.0*0.5  -> 0.5

while Python's url fetcher does not send an Accept and curl sends "*/*". Both of these would cause "das2xml" to be returned over other formats. In other words, the "send 'text/xml' if the client asks for it else send 'application/x-das2*+xml' " is an acceptable way to do conneg. Here's what my test reference server does under different conditions.
## ask for text/plain, which returns FASTA
% curl -H "Accept: text/plain" -i http://localhost:8080/seq/fly_v1
HTTP/1.1 200 OK
Date: Wed, 16 Aug 2006 00:25:40 GMT
Server: CherryPy/2.2.1
Content-Length: 48
Content-Type: text/plain
Connection: close

>Chr1
ABCDEFG
>Chr2
abcdefgh
>Chr3
987654321

## ask for text/xml, which returns the normal XML as "text/xml"
% curl -H "Accept: text/xml" -i http://localhost:8080/seq/fly_v1
HTTP/1.1 200 OK
Date: Wed, 16 Aug 2006 00:26:02 GMT
Server: CherryPy/2.2.1
Content-Length: 435
Content-Type: text/xml
Connection: close

## ask for anything under the "application" namespace, with a needless quality factor
% curl -H "Accept: application/*;q=0.5" -i http://localhost:8080/seq/fly_v1
HTTP/1.1 200 OK
Date: Wed, 16 Aug 2006 00:28:13 GMT
Server: CherryPy/2.2.1
Content-Length: 435
Content-Type: application/x-das2segments+xml
Connection: close

## give an image if it's there, text/plain is next best, then an application
% curl -H "Accept: image/*, application/*;q=0.5, text/plain;q=0.9" -i http://localhost:8080/seq/fly_v1
HTTP/1.1 200 OK
Date: Wed, 16 Aug 2006 00:34:15 GMT
Server: CherryPy/2.2.1
Content-Length: 48
Content-Type: text/plain
Connection: close

>Chr1
ABCDEFG
>Chr2
abcdefgh
>Chr3
987654321

In my case the server has multiple text/plain outputs but FASTA always wins over raw. I can force any format with the "format=" option, which ignores the "Accept" header completely.
% curl -H "Accept: text/xml" -i 'http://localhost:8080/seq/fly_v1/1?format=raw' HTTP/1.1 200 OK Date: Wed, 16 Aug 2006 00:35:54 GMT Server: CherryPy/2.2.1 Content-Length: 7 Content-Type: text/plain Connection: close ABCDEFG Andrew dalke at dalkescientific.com From Steve_Chervitz at affymetrix.com Wed Aug 16 13:16:47 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Wed, 16 Aug 2006 10:16:47 -0700 Subject: [DAS2] Notes from DAS/2 code sprint #3, day three, 16 Aug 2006 Message-ID: Notes from DAS/2 code sprint #3, day three, 16 Aug 2006 $Id: das2-teleconf-2006-08-16.txt,v 1.1 2006/08/16 17:05:24 sac Exp $ Note taker: Steve Chervitz Attendees: Affy: Steve Chervitz, Ed E., Gregg Helt Dalke Scientific: Andrew Dalke UCLA: Allen Day, Brian O'Connor Action items are flagged with '[A]'. These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org DISCLAIMER: The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit. Topic: Spec Q&A --------------- bo: perusing spec, saw mention of XID as a filter. can I get more explanation? ad: can't remember without looking at docs, but think I was not sure what XID was supposed to be, lincoln sent email to clarify. aday: an external db id trying to resolve into local space, eg., for gene das. ad: don't think there was enough info there to be useful. gh: just uri and nothing else? ad: looking at steve's notes from 16 march. looks like we deferred it. gh: input was minimal. I have no particular use for it. 
bo: need to know what support to provide for the biopackages server. in the read spec, says "it's not well thought-out. should have authority, type, id, description." bo: type vs. exact type gh: did we get rid of exact type? ad: see gregg's email from 16 march: http://lists.open-bio.org/pipermail/das2/2006-March/000655.html The assumption was, there's no type inferencing done on the server. it's just done on the client. we were to rename 'exacttype' to 'type' and use exacttype semantics for it. gh: there is no parent-child structure to types. there is to ontology though. ad: type records in das aren't parent-child relations because they combine other info about type, e.g., ways to depict it. bo: looking for places where our server disagreed with spec. segments feature filter is not supported on our end. overlaps segments. but this is just work we need to do, not a spec issue. gh: allen and lincoln were struggling with xml:base resolution yesterday, looking through the xml:base spec, dealing with edges. are you satisfied? aday: yes gh: for implementers that don't already deal with xml:base resolution, it may take a day or so to deal with it. nomi and I struggled as well. I was surprised it is not well supported in xml libraries. ad: just a matter of walking up the xml tree. gh: recursively had to verify that the resolve stuff in the java networking libraries actually worked according to the xml:base spec. but we've moved through this. bo: url example, uses 'segment' and 'sequence'. not so consistent. gh: pros and cons to this. it shows that das/2 links can be built using different uris. ad: used different url structures to show that this was possible. bo: confusing when you only see a snippet and don't see where the uri was coming from. showing variety is useful though. gh: are both specs frozen now? ad: yes. Topic: Status Reports ----------------------- bo: went through spec. updated our bug queue. added bug re: passing in id filters vs. uris. working on this today.
aday: need to resolve type ids, need to deal with relative ids given in the document. now can go back to working on writeback. gh speaking for lincoln: perl stuff for gbrowser to connect to das servers. went through xml:base abyss. updated uris for sequence and genome version ids for human and mouse on the wiki page: http://open-bio.org/wiki/DAS:GlobalSeqIDs sc: should we allow anyone to edit this, of just lincoln? gh: would like to restrict it. worried about wiki graffiti. ad: you have to register. we can always back things out. sc: lincoln will get notification upon any edits. gh: ok. gh: working on igb release. adding parsing abilities. can now focus on das/2, mostly writeback stuff, refining that in igb client. ee: finishing up bugfixes before igb release. will start on gff3 parser today. ad: looked into content negotiation stuff. why validator server on open-bio site isn't working: I updated underlying webserver framework. working on that. sc: worked on creating new data files used by the affy das server for exon arrays using gregg's new parser. gh: this is generating more efficient versions of probe sets for exon arrays. important since the affy das server is in-memory. sc: this will help us support more arrays in the das server and also move away from having to maintain two different das servers, so we can focus on just the das/2 server. sc: also working on final touches on web page describing available data on our das servers. gh: we can modify xml from the server to point at that page as an info url. sources element has info url, and sub elements as well, but we can just put the info page at the top level. sc: also was working on ann's fly data project, where she needs to pull genomic regions relative to probe sets. we need to update our das alignment file (link.psl) to be based on dm2. gh: we don't provide residues. she'll have to do a das/1 query at ucsc to get residues. 
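On the xml:base point from the Spec Q&A above ("just a matter of walking up the xml tree"), a minimal Python sketch of that walk might look like the following. Everything specific here is an assumption made for illustration: the example.org host, the SEGMENTS/SEGMENT document, and the resolve_uri helper are hypothetical and are not taken from the DAS/2 spec or from any of the servers discussed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree stores the predefined xml: prefix under this namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_uri(root, element, relative_uri, document_uri=""):
    """Resolve relative_uri against the xml:base values in scope for element.

    Hypothetical helper: collects xml:base attributes from the element up to
    the root, then applies them outermost-first on top of the document URI.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    bases = []
    node = element
    while node is not None:
        base = node.get(XML_BASE)
        if base is not None:
            bases.append(base)
        node = parents.get(node)

    resolved = document_uri
    for base in reversed(bases):  # outermost xml:base first
        resolved = urljoin(resolved, base)
    return urljoin(resolved, relative_uri)


# Made-up document; the element and attribute layout is illustrative only.
DOC = """<SEGMENTS xml:base="http://example.org/das/genome/human/17/">
  <SEGMENT xml:base="segment/" uri="chr1"/>
</SEGMENTS>"""

root = ET.fromstring(DOC)
seg = root.find("SEGMENT")
print(resolve_uri(root, seg, seg.get("uri")))
```

One thing worth keeping in mind with this approach: urljoin treats a base ending in "/" as a directory, so whether an xml:base value carries a trailing slash changes the resolved URI.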
From dalke at dalkescientific.com Wed Aug 16 15:17:04 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 16 Aug 2006 21:17:04 +0200 Subject: [DAS2] validator working again Message-ID: Silly me, I went and upgraded the TurboGears package used by the validator. From 0.8* to 0.9*. There were differences, and one (a unicode encoding problem) quite subtle. The validator is up and running. http://cgi.biodas.org:8080/ Let me know of any problems. Andrew dalke at dalkescientific.com From allenday at ucla.edu Wed Aug 16 16:35:04 2006 From: allenday at ucla.edu (Allen Day) Date: Wed, 16 Aug 2006 13:35:04 -0700 Subject: [DAS2] new writeback URI Message-ID: <5c24dcc30608161335n267201a7w1ef5221ceb9fcdc5@mail.gmail.com> Hi, You can POST writebacks for the http://das.biopackages.net/das/genome/human/writeback/ vsource here: http://genomics.ctrl.ucla.edu/~allenday/cgi-bin/das2xml-parser/stable2.pl The returned document will either be an element, or a element, depending on what was POSTed. I will update the relevant sections in the main sources/source/vsource docs on the biopackages server. I will send another email when the response document is up-to-date with the latest specification revisions -- I'm under the impression I just have to return das2xml for all updated and created features instead of returning the previously specified element. -Allen From dalke at dalkescientific.com Thu Aug 17 08:14:59 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 17 Aug 2006 14:14:59 +0200 Subject: [DAS2] content-negotiation, conclusion Message-ID: <20680849e0825c09bdd56f215da74f1a@dalkescientific.com> After experimenting with a content-negotiation implementation and trying it out under different circumstances I've come to the conclusion that the errors are too subtle and hard to debug in the generic case.
Quoting from http://norman.walsh.name/2003/07/02/conneg > At this point, we're about eleven levels farther down in the web > architecture than any mortal should have to tread. On the one hand, > content negotiation offers a transparent solution to a tricky problem. > On the other hand, the very transparency of such solutions makes them > devilishly hard to understand when they stop working. Even for the limited case of DAS2 where we want web browsers to see "text/xml" instead of "application/x-das*+xml" it's just not possible. It turns out Safari only uses "*/*" in the Accept header. I do not want a system which gives different results when viewed in different browsers. Ann? How about this solution to your case - we'll have an "xml" format defined as being the same as "das2xml" but returning a "text/xml" header. Or perhaps an "html" format designed for people. When you are showing people how DAS works, and if the browser doesn't understand the */*+xml content type as being in XML, then you can say "oh, add 'format=html' to the URL to see it in HTML". The spec will look like: If the format is not specified in the query string then the server must return the document in das2xml format (or fasta format for segment records) unless the client sends an Accept header with a mime-type starting "application/x-das-". In that case the server may implement HTTP content-negotiation. HTTP content-negotiation is an experimental feature in DAS2 and is required in neither the client nor the server. Structured this way there's no way a generic browser can trigger conneg with a das2 server. Only das-aware clients can do it. This gives room for future experimentation.
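The rule proposed for the spec can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the select_response function, its is_segment flag, and the "negotiate" return value are names invented here for illustration and do not come from the spec or from any DAS/2 server. It follows the proposal text literally in using the "application/x-das-" prefix.

```python
DAS_PREFIX = "application/x-das-"  # prefix as written in the proposed spec text


def select_response(query_format, accept_header, is_segment=False):
    """Decide how a server would answer under the proposed rule.

    Returns the format name to serve, or "negotiate" when the server
    may (optionally) fall back to HTTP content negotiation.
    """
    # An explicit format= in the query string always wins.
    if query_format:
        return query_format

    # Pull the bare media types out of the Accept header, dropping
    # parameters such as ";q=0.9".
    media_types = [
        part.split(";")[0].strip().lower()
        for part in (accept_header or "").split(",")
    ]

    # Only a DAS-aware client mentions a DAS media type, so only such a
    # client can trigger content negotiation.
    if any(mt.startswith(DAS_PREFIX) for mt in media_types):
        return "negotiate"

    # Everything else, including generic browsers, gets the default.
    return "fasta" if is_segment else "das2xml"


firefox = ("text/xml,application/xml,application/xhtml+xml,"
           "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5")
print(select_response(None, firefox))
print(select_response(None, "*/*"))
print(select_response(None, "application/x-das-features+xml"))
print(select_response("xml", firefox))
```

With this structure the Firefox and Safari-style headers both fall through to the default, which is the browser-independent behaviour the proposal is after.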
Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Thu Aug 17 08:29:19 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 17 Aug 2006 14:29:19 +0200
Subject: [DAS2] default format for a single segment
Message-ID: <593e57347c2de14ed36df1ef69fd9f5c@dalkescientific.com>

Two proposals here:

 1) change the default format for a single segment request from FASTA -> das2xml
 2) add optional FORMAT elements to each segment

== Proposal 1 ==

Currently every DAS2 service returns an application/x-das-*+xml document by default except for the segment document. A request on a segment URI returns its FASTA sequence. I would like to change that. I would like the segment document by default to return a das-segment document. For example, if this is the segments document then doing the request for "segment/chrI" should return

== Proposal 2 ==

My server implements a "raw" sequence format which contains only sequence data and does not even contain the FASTA header. The raw format only works for a single segment and not for the list of segments. In the current spec the "FORMAT" entry is somewhat ambiguous. Does it work for the set of segments or for a single given segment? That is,

 segments?format=das2xml --> the segments document for all of the segments
 segment/chrI?format=das2xml --> the segments document for a given segment
 segments?format=fasta --> all sequences, in FASTA format
 segment/chrI?format=fasta --> the FASTA sequence for the given segment

However, segments?format=raw makes no sense. No one will use that one for real. I propose that the SEGMENT elements also get an optional FORMAT element which looks like this The formats for a given segment are the union of its FORMAT elements and those in the top-level. That is, each segment here implements "raw", "fasta" and "das2xml" formats.
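The union rule in Proposal 2 is simple enough to sketch in Python. The example XML did not survive in this archive, so the split used below (das2xml and fasta declared at the top level, raw declared on the individual segment) is an assumption inferred from the surrounding text, and formats_for_segment is a hypothetical helper name.

```python
def formats_for_segment(top_level_formats, segment_formats):
    """Return the formats a segment supports: top-level ones plus its own.

    Order is preserved and duplicates are dropped, so a format declared in
    both places is listed once.
    """
    combined = []
    for name in list(top_level_formats) + list(segment_formats):
        if name not in combined:
            combined.append(name)
    return combined


# Assumed layout: the segments document declares das2xml and fasta,
# and each SEGMENT additionally declares raw.
top_level = ["das2xml", "fasta"]
per_segment = ["raw"]
print(formats_for_segment(top_level, per_segment))
```

Under this reading a client would reject segments?format=raw, since "raw" never appears in the top-level list, while still accepting it on a single segment URI.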
Andrew dalke at dalkescientific.com From lstein at cshl.edu Thu Aug 17 12:01:18 2006 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 17 Aug 2006 12:01:18 -0400 Subject: [DAS2] Notes from DAS/2 code sprint #3, day three, 16 Aug 2006 In-Reply-To: References: Message-ID: <6dce9a0b0608170901t44c6e074q5ca24e5fd2cacc72@mail.gmail.com> What's the conference call number? Lincoln On 8/16/06, Steve Chervitz wrote: > > Notes from DAS/2 code sprint #3, day three, 16 Aug 2006 > > $Id: das2-teleconf-2006-08-16.txt,v 1.1 2006/08/16 17:05:24 sac Exp $ > > Note taker: Steve Chervitz > > Attendees: > Affy: Steve Chervitz, Ed E., Gregg Helt > Dalke Scientific: Andrew Dalke > UCLA: Allen Day, Brian O'Connor > > Action items are flagged with '[A]'. > > These notes are checked into the biodas.org CVS repository at > das/das2/notes/2006. Instructions on how to access this > repository are at http://biodas.org > > DISCLAIMER: > The note taker aims for completeness and accuracy, but these goals are > not always achievable, given the desire to get the notes out with a > rapid turnaround. So don't consider these notes as complete minutes > from the meeting, but rather abbreviated, summarized versions of what > was discussed. There may be errors of commission and omission. > Participants are welcome to post comments and/or corrections to these > as they see fit. > > > Topic: Spec Q&A > --------------- > > bo: perusing spec, saw mention of XID as a filter. can I get more > explanation? > ad: can't remember without looking at docs, but think I was not sure > what XID was supposed to be, lincoln sent email to clarify. > aday: an external db id trying to resolve into local space, eg., for gene > das. > ad: don't think there was enough info there to be useful. > gh: just uri and nothing else? > ad: looking at steve's notes from 16 march. looks like we deferred it. > > gh: input was minimal. I have no particular use for it. > bo: need to know what support to provide for the biopackages server. 
> in the read spec, says "it's not well thought-out. should have
> authority, type, id, description."
>
> bo: type vs. exact type
> gh: did we get rid of exact type?
> ad: see gregg's email from 16 march:
> http://lists.open-bio.org/pipermail/das2/2006-March/000655.html
>
> The assumption was, there's no type inferencing done on the
> server. it's just done on the client. we were to rename 'exacttype' to
> 'type' and use exacttype semantics for it.
> gh: there is no parent-child structure to types. there is to ontology
> though.
> ad: type records in das aren't parent-child relations because they
> combine other info about type, e.g., ways to depict it.
>
> bo: looking for places where our server disagreed with spec. segments
> feature filter is not supported on our end. overlaps segments. but
> this is just work we need to do, not a spec issue.
>
> gh: allen and lincoln were struggling with xml:base resolution yesterday,
> looking through the xml:base spec, dealing with edges. are you satisfied?
> aday: yes
> gh: for implementers that don't already deal with xml:base resolution,
> it may take a day or so to deal with it. nomi and I struggled as
> well. I was surprised it is not so supported in xml libraries.
> ad: just a matter of walking up the xml tree.
> gh: recursively had to verify that the resolve stuff in the java
> networking libraries actually worked according to the xml:base spec.
> but we've moved through this.
>
> bo: url example, uses 'segment' and 'sequence'. not so consistent.
> gh: pros and cons to this. it shows that das/2 links can be built
> using different uris.
> ad: used different url structures to show that this was possible.
> bo: confusing when you only see a snippet and don't see where the uri
> was coming from. showing variety is useful though.
>
> gh: are both specs frozen now?
> ad: yes.
>
>
> Topic: Status Reports
> -----------------------
>
> bo: went through spec. updated our bug queue. added bug re: passing in
> id filters vs. uris. working on this today.
>
> aday: need to resolve type ids, need to deal with relative ids given
> in the document. now can go back to working on writeback.
>
> gh speaking for lincoln: perl stuff for gbrowse to connect to das
> servers. went through xml:base abyss.
> updated uris for sequence and genome version ids for human and mouse
> on the wiki page: http://open-bio.org/wiki/DAS:GlobalSeqIDs
>
> sc: should we allow anyone to edit this, or just lincoln?
> gh: would like to restrict it. worried about wiki graffiti.
> ad: you have to register. we can always back things out.
> sc: lincoln will get notification upon any edits.
> gh: ok.
>
> gh: working on igb release. adding parsing abilities. can now focus on
> das/2, mostly writeback stuff, refining that in igb client.
>
> ee: finishing up bugfixes before igb release. will start on gff3
> parser today.
>
> ad: looked into content negotiation stuff. why validator server on
> open-bio site isn't working: I updated underlying webserver
> framework. working on that.
>
> sc: worked on creating new data files used by the affy das server for
> exon arrays using gregg's new parser.
> gh: this is generating more efficient versions of probe sets for exon
> arrays. important since the affy das server is in-memory.
> sc: this will help us support more arrays in the das server and also
> move away from having to maintain two different das servers, so we can
> focus on just the das/2 server.
>
> sc: also working on final touches on web page describing available data on
> our das servers.
> gh: we can modify xml from the server to point at that page as an info
> url. sources element has info url, and sub elements as well, but we
> can just put the info page at the top level.
>
> sc: also was working on ann's fly data project, where she needs to
> pull genomic regions relative to probe sets. we need to update our
> das alignment file (link.psl) to be based on dm2.
> gh: we don't provide residues. she'll have to do a das/1 query at ucsc
> to get residues.
>
>
>
>
>
> _______________________________________________
> DAS2 mailing list
> DAS2 at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das2
>

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From lstein at cshl.edu Thu Aug 17 11:59:47 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu, 17 Aug 2006 11:59:47 -0400
Subject: [DAS2] xml:base on biopackages still not quite right
In-Reply-To:
References:
Message-ID: <6dce9a0b0608170859o7d22ef3cnc6cacf4579a7e305@mail.gmail.com>

Hi,

I'm getting an incorrect xml:base on the segments request:

% GET http://das.biopackages.net/das/genome/human/17/segment
...

The problem is that the xml:base ends with a slash, so the synthesized URIs
are http://das.biopackages.net/das/genome/human/17/segment/segment/chr1

Lincoln

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From Gregg_Helt at affymetrix.com Thu Aug 17 14:39:40 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Thu, 17 Aug 2006 11:39:40 -0700
Subject: [DAS2] DAS/2 writeback capability vs. writeable attribute
Message-ID:

In the current writeback spec, the ability of a server to support
writeback is indicated by: under the versioned source element.
However, the retrieval spec talks about both the writeback capability
element and a "writeable" attribute for the versioned source element. I
think the "writeable" attribute can be removed, since the capability
provides all the needed information. The current writeback spec doesn't
mention this "writeable" attribute at all.

gregg

From dalke at dalkescientific.com Thu Aug 17 15:26:20 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 17 Aug 2006 21:26:20 +0200
Subject: [DAS2] DAS/2 writeback capability vs. writeable attribute
In-Reply-To:
References:
Message-ID:

gregg:
> However, the retrieval spec talks about both the writeback capability
> element and a "writeable" attribute for the versioned source element.
> I think the "writeable" attribute can be removed, since the capability
> provides all the needed information. The current writeback spec
> doesn't mention this "writeable" attribute at all.

This was up for debate during the last sprint and we decided
to keep things as they were until we got to writeback. Which
is now. :)

I agree with you.

Andrew
dalke at dalkescientific.com

From Steve_Chervitz at affymetrix.com Thu Aug 17 18:18:21 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 17 Aug 2006 15:18:21 -0700
Subject: [DAS2] Notes from DAS/2 code sprint #3, day four, 17 Aug 2006
Message-ID:

Notes from DAS/2 code sprint #3, day four, 17 Aug 2006

$Id: das2-teleconf-2006-08-17.txt,v 1.1 2006/08/17 22:15:30 sac Exp $

Note taker: Steve Chervitz

Attendees:
Affy: Steve Chervitz, Ed E., Gregg Helt
CSHL: Lincoln Stein
Dalke Scientific: Andrew Dalke
UCLA: Allen Day, Brian O'Connor

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround.
So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit.

Topic: Status Reports
----------------------

ls: Perl interface in good shape. reorg'd to get parser based on
content type dynamically. response comes in, figures out what parser
to use, returns the objects, should be extensible for other
formats. main task todo is to implement the feature object so that I
can actually return features. now parser is there, object is not. Not
a Bio::SeqFeatureI object, in order to work with gbrowse and other
parts of bioperl.
some issues with biopackages with xml:base, sometimes slashes there
that shouldn't be and vice versa. segments request has extraneous / at
end, so it has 'segment' repeated twice. didn't try to fetch to see if
would work, but looks like a bug.
gh: regarding parent-child relationships between features: if they
have parent, need to point to it, if they have children need to point
to them.
ls: parsing with sax, I'll know when an object is complete. will
create a feature stream and start returning features as the parse is
coming across. threaded, so you can have multiple streams going
simultaneously.
gh: more issues with parent child hierarchy. will wait for allen to
arrive before discussing.

Topic: Spec issues
------------------

ad: working on content negotiation, but now is not right time to do
it. in sequence doc, default doc should be das2-segments.

sc: xml:base issue -- where do we allow it (0, 1, infinity)?
gh: our policy is that we follow the xml:base spec.
ad: if you use it, use it everywhere.
gh: my parser is looking for it everywhere.
ad: my email explains why you might want to use it on multiple
features. e.g., combining data from different servers.
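
A minimal sketch of the xml:base resolution being discussed, assuming
Python's standard library. The documents, the relative `uri` values and
the function name `resolve_uris` are hypothetical illustrations, not
output from any DAS/2 server. Each element inherits its parent's base,
an xml:base value is itself resolved against the inherited base, and
the first two documents differ only in the trailing slash on xml:base:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes the xml:base attribute under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_uris(elem, base, attr="uri"):
    """Return absolute URIs for `attr` on elem and all of its descendants.

    `base` is the base URI inherited from the parent; for the root it is
    the URL the document was retrieved from.
    """
    # xml:base may itself be relative, so resolve it against the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    found = []
    if attr in elem.attrib:
        found.append(urljoin(base, elem.attrib[attr]))
    for child in elem:
        found.extend(resolve_uris(child, base, attr))
    return found


DOC_URL = "http://das.biopackages.net/das/genome/human/17/segment"

# Hypothetical documents: identical except for the trailing slash on xml:base.
with_slash = ET.fromstring(
    '<SEGMENTS xml:base="http://das.biopackages.net/das/genome/human/17/segment/">'
    '<SEGMENT uri="segment/chr1"/></SEGMENTS>'
)
without_slash = ET.fromstring(
    '<SEGMENTS xml:base="http://das.biopackages.net/das/genome/human/17/segment">'
    '<SEGMENT uri="segment/chr1"/></SEGMENTS>'
)
# xml:base on a child overrides the inherited base, as when one document
# combines features from different servers (hypothetical hosts).
mixed = ET.fromstring(
    '<FEATURES xml:base="http://example.org/a/">'
    '<FEATURE uri="f1"/>'
    '<FEATURE xml:base="http://example.net/b/" uri="f2"/></FEATURES>'
)

print(resolve_uris(with_slash, DOC_URL))
print(resolve_uris(without_slash, DOC_URL))
print(resolve_uris(mixed, DOC_URL))
```

With the trailing slash the relative reference is appended below
"segment/", which gives the doubled "segment/segment/chr1" path; without
it the last path component of the base is replaced instead.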
sc: what about brian gilman's issue, when you get to root what if
xml:base is still relative?
ad: uri spec defines how to resolve relative urls, e.g., get it from
document.
gh: relaxNG says it can be anywhere. I think it should therefore be
allowed anywhere.

ad: right now all services return an xml object file except segment
request -- fasta file. would like to return xml.
sc: this is along the lines of what I proposed a while back. I like
it. See discussion under this thread:
http://lists.open-bio.org/pipermail/das2/2005-December/000395.html
ad: formats per-segment basis. current scheme only defines
per-everything basis. propose that each segment also have its own
format. each segment can have alt formats. (see ad's email from today
on this topic).
gh: like it. it means that a server doesn't have to know about all
residues.
ad: for case of reference server, we guarantee that it supports fasta
sequence. affects other servers, not just reference server.
gh: I like that flexibility. any objections?
[silence]
gh: if you return the segments doc we now have, you are only serving
up xml. if you want to return fasta, you need to return a format
element.
ls: is there a way for client to determine what it will get?
gh: in the segments document, returned back from reference
server. client can specify format defined there.
ls: not impl yet, just a proposal?
gh: yes. another plus is the ability to specify more efficient binary
formats too.

Topic: Ann's issue on content-type
----------------------------------

gh: server has option to specify that you can return things as
text/xml, but still send das2xml format.
ad: content negotiation doesn't work to allow the browser to view
XML. only works for clients that can do content neg, not general
clients (e.g., safari). I tried two different browsers, got two
different results.

[A] Ask Ann Loraine if this solution is sufficient.

Topic: Writeback issues
-------------------------

aday: problem writeback.
creating new feat or update existing feat. if
it's a new feature, das_private uri scheme has no info about source or
versioned source that the feature is intended to be written to. This
is not necessarily a problem, could be a different uri post. But it is
a problem when parsing and it's possible for parents or children to be
attached to the feat and they are not the source/vsource
combination. make sense?
ad: every feat has unique id. could do it by saying when you see this
id, it corresponds to this segment or this versioned source.
ad: feature comes from NCBI but is being posted to affymetrix.
gh: I talked about this as a use case for the grant. Example: snps
being served by an authority (dbSNP) and people are trying to create
their own haplotype blocking structure. you want them to be able to
point to the authority for the leaf features (snps, children). so you
can have one server serving up haplotype blocks, pointing to snps that
reside on another server that is the authority. right now in the spec,
can't do that because of the bidirectional parent-child stuff. you'd
have to point the snps at the authority to the new stuff.
ad: could have parent-child relationships that are incorrect. all
parents connected together are places you can get to. has to be a
single root.
gh: due to that and the bidirectional stuff, we can't support my use
case, also can't build features from multiple servers to construct
curations.
ad: can do it in datamodel. I point to features over there.
gh: in xml it can't be done.
ad: also means that, you have to keep requesting features over and
over again. you have to do at least one request for every feat.
gh: even if we have these restrictions, how can we enforce them with
das-private id?
aday: the document is not enough to tell you if the parent being
associated with a feature is valid. you have to know more.
aday: it's only these das-private ids that are a problem, you cannot
know where it came from or where it's to be written to.
the child-parent pointers are not a problem.
gh: post to a writeable das server with das-private id, it means the
feature is to be written on that server.
aday: new document comes in, you don't know where to write them to.
gh: which writeable server are they to be written to?
ad: there will be a different distinct url.
gh: client is aware of 5 different writeback servers, which one do I
write to? this is a client issue. it should present options to the
user and let them select.
aday: what about creating a hybrid feature?
gh: it's a totally new curation.
ad: what if you want to have one writeback url for several dbs on the
server?
gh: i would say no.
aday: you need to know what is the context of the write.
gh: for server, it knows, for client.
aday: so are we saying that the document does not need to be
validatable when standalone (i.e., outside the context of the server)?
there is not enough information to know whether some features being
grouped together should be. I upload this document to xxx, is it
loadable?
gh: i don't see that as an issue. we have validation issues with read
document as well. the validators don't go into the uris of each
feature and see if they come from same server.
aday: if absolute, yes, but if all relative. as long as all relative,
you can tell if compatible.
gh: if you have document element was retrieved from, it's relative to
that. if not, it's application-specific, which in our case means
punting. validator can't guarantee that certain uri's are
compatible. to do that, it would have to know how to resolve every
uri, and they don't need to be url's. nobody knows how to resolve
every uri. what that means is that the server will have to reject the
post if it sees uris that it doesn't recognize.
aday: or, that it
sc: how does server know if uri's are compatible?
gh: for posts, those features have to be coming from that server
aday: adding new exon to transcript that already exists in db, can I
give you the new exon and pointer to transcript? gets into uri
compatibility issue. I have exon whose parent I don't have access to
(on remote server). could I do an external request on the parent,
figure out its location, close it, send xid to parent on remote
server.
ad: would say it's legal but you have to pass in the complete feature
record.
gh: the legality is in the document that is being posted. you have
parent-child resolvability back up to the root. that's the requirement
now.
gh: is it worth considering relaxing our bidirectional closure
requirement?
ad: makes parsing harder. have to wait to very end. takes lots of
time, memory.
gh: use case you have, you need parent. we could relax it to require
parent-child (as needed for my use case). but for Allen's case you
need child-to-parent pointers.
ad: using xid
gh: xid's are free form. how do you know that it means x was derived
from y? there's no way to represent that in our xml. it's open to
interpretation by client and server.
ad: in the xid have one of them be the type, constrained vocab, so you
know what kind of link it is. keyword 'rel', this means get css,
rss.... also the xml-link stuff steve mentioned a while ago.
gh: would require some significant rejiggering to resolve it.
ad: can we do it by having a new feature type, of its own vocabulary.
gh: if you do this in one client, it does this by cloning, it looks to
user you are doing it from different servers. write to client. another
one reads it, and it has no way of knowing that it was derived from
the two different sources.
gh: for now, you can only point to newly created features or features
coming from the server you are posting to, for feature ids. need to
know more about evidence trails, to know more about what info they
need to preserve.
[A] talk to curator pro (nomi) about what evidence to save when
creating/modifying feats

ad: new type: external-feature-reference, do a new element at end of
record. doesn't require a new format.
gh: it's outside the spec right now, allen doesn't have to support
it. extra xml in the document to describe the relationship. e.g., a
derived-from element. it's doable, but I don't think it should be in
the current spec.
ad: can be done without making backwards incompatible changes to the
current spec.
aday: now I get free rein to validate the way I want to. I will be
liberal in what I reject.
gh: end of the spec issues we were looking at yesterday.

Topic: Status report
--------------------

ee: started working on gff3 parser for IGB.

bo: feature filtering. using full uri's not just 'chr2'. going through
biopackages.net server checking if it is up to spec. coordinates
issues, mapping document, stored in extra file.
gh: reference to each segment.

aday: writeback server able to do delete and update now. fixed bug
reported by andrew. name based query was not returning parents.
gh: lincoln mentioned xml:base problem. segment/segment/
bo/aday: fixed this.
aday: started impl a new server that takes any arbitrary range
request. performs modulus on range request. you know that there are
only certain blocks being requested, so you can use a cache. does it
satisfy requested range, and return that. I always do children before
parent. inserting hints on the thing that does backend parsing.
gh: are you supporting multiple parents of children (e.g., multiple
transcripts that share an exon)?
aday: a good question. I keep track of children and multiple locations
of children and then I give parents after that. after the grooming, I
can have multiple hints, 'this is the end of this 15mb block'. all
parents are presented. then all of my comments would be presented.

gh: got out IGB release, but had to recall it, since it broke things.
verifying I can write back to new and improved writeback server. if
you post to a writeback server, that's also the address you should be
using to get the.... a versioned source with a writeable attribute. I
should be able to use that same source to both write to and retrieve
from.
aday: you can't retrieve
gh: I have to use two different urls to do retrieve and posts. The way
I think it should work: anything you write to you should be able to do
retrieval as well.
aday: writeable=yes attribute, and go over here and write. should be
ok. thinking about using redirection under the covers.
gh: resolving new ids mapping to das-private ids, editing is working
on client side.

sc: worked on info page for affy das servers. Generating new
drosophila alignment data for Ann.
gh: had trouble hooking up exon chp data with new binary formatted
exon data you generated (gregg's new bp2 format for exon data). could
be that I have only control probes and they are not in your data.

[A] steve will check to see if there are any control probes in the
exon array data.

ad: I got the validation server back up and running. will work on
sequence retrieval spec. question: does spec guarantee that seq will
be upper or lowercase?
gh: no, fasta can be either.

gh: spec docs don't have date stamp, e.g., writeback document. this is
useful to see if it has been updated.

[A] andrew will put date stamp back in spec docs that don't have it.

From dalke at dalkescientific.com Thu Aug 17 19:16:49 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Fri, 18 Aug 2006 01:16:49 +0200
Subject: [DAS2] SEGMENTS does not have a "uri".
Message-ID: <6b674ebdfbd129ae2d20686f9ba174e4@dalkescientific.com>

I just noticed that the SEGMENTS element in the segments
document does not have a "uri" attribute. That doesn't
seem right so I added it to the schema.

I committed that and the change for FORMAT elements
under each SEGMENT.

Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Thu Aug 17 19:24:12 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Fri, 18 Aug 2006 01:24:12 +0200
Subject: [DAS2] das2 reference server
Message-ID: <1d186e51a531d835afb67910a28f5f36@dalkescientific.com>

Here's the experimental reference server I've been working on.

http://cgi.biodas.org:8081/seq/?format=html

The entries for

# fly_42
  * 2L (22407834 bases):
  * 2R (20766785 bases):
  * 3L (23771897 bases):
  * 3R (27905053 bases):
  * 4 (1281640 bases):
  * X (22224390 bases):

# fly_43
  * 2L (22407834 bases):
  * 2R (20766785 bases):
  * 3L (23771897 bases):
  * 3R (27905053 bases):
  * 4 (1281640 bases):
  * X (22224390 bases):

# worm_160
  * I (15072418 bases):
  * II (15279314 bases):
  * III (13783677 bases):
  * IV (17493785 bases):
  * V (20919396 bases):
  * X (17718851 bases):
  * Mit (13794 bases):

# worm_161
  * I (15072418 bases):
  * II (15279314 bases):
  * III (13783677 bases):
  * IV (17493785 bases):
  * V (20919396 bases):
  * X (17718851 bases):
  * Mit (13794 bases):

# worm_162
  * I (15072418 bases):
  * II (15279314 bases):
  * III (13783677 bases):
  * IV (17493785 bases):
  * V (20919396 bases):
  * X (17718851 bases):
  * Mit (13794 bases):

should be real. The others are part of my test set.

It even validates. Amazing that.

One thing - I've created a new document type which lists
all of the "segments" documents available from the
reference server.

(nomenclature: I'm using "assembly" to mean "a collection
of segments". I know, it isn't really an assembly. I'm using
it for now because I didn't like using
  "segment", "segments", "segments_list"
instead using
  "segment", "assembly", "assemblies"
Ideas on a better name? )

This should be a sources document instead. Haven't gotten there
yet.

Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Thu Aug 17 19:26:43 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Fri, 18 Aug 2006 01:26:43 +0200
Subject: [DAS2] SEGMENTS does not have a "uri".
In-Reply-To: <6b674ebdfbd129ae2d20686f9ba174e4@dalkescientific.com>
References: <6b674ebdfbd129ae2d20686f9ba174e4@dalkescientific.com>
Message-ID: <6dde1e7303796ce01fa641b98dd254d4@dalkescientific.com>

> I just noticed that the SEGMENTS element in the segments
> document does not have a "uri" attribute. That doesn't
> seem right so I added it to the schema.

Shouldn't the SEGMENTS element also have an optional
"reference" attribute? Take a look at
http://cgi.biodas.org:8081/seq/fly_42/?format=html
to see a real-world record. It feels like there should be a
reference="http://www.flybase.org/genome/D_melanogaster/R4.2"
in there some place.

Andrew
dalke at dalkescientific.com

From Steve_Chervitz at affymetrix.com Thu Aug 17 19:37:00 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 17 Aug 2006 16:37:00 -0700
Subject: [DAS2] Notes from DAS/2 code sprint #3, day four, 17 Aug 2006
In-Reply-To:
Message-ID:

Following up on a side-topic that came up briefly in morning's teleconf,

> aday: now I get free reign to validate the way I want to. I will be
> liberal in what I reject.

here's a post I made to a thread on the bioperl list last regarding
aberrant fasta files (another reason why to not standardize das/2
sequence responses on fasta format):
http://bioperl.org/pipermail/bioperl-l/2005-July/019407.html

Another cited source of this philosophy is from the TCP spec (section
2.10) as the Robustness Principle: Be conservative in what you do, be
liberal in what you accept.
http://www.faqs.org/rfcs/rfc793.html

I actually think it has wider appeal beyond softwar