From jw12 at sanger.ac.uk Wed Nov 2 11:57:15 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Wed, 2 Nov 2011 15:57:15 +0000
Subject: [DAS] Java DAS hackathon 1st December.
Message-ID: <28054914-049C-4F9F-BC19-BD299DA41A3E@sanger.ac.uk>

Hi

There are at least 4 of us meeting up on the Sanger/EBI genome campus (Thursday 1st December) to write some code for the new Java DAS library (JDAS http://code.google.com/p/jdas/).
The main focus will be on making sure the new library has all the "essential" capabilities of the old Dasobert library and some new features.

We would like to extend an open invitation to Java developers in the DAS community. If you would like to attend and contribute then please drop me a line. We would be especially interested in having someone with expertise/interest in DAS structure or Alignment clients.
If you have any suggestions or burning needs for support to be included in the JDAS library you can also write to me or post to the list.

For inclusion (some of which has/will be implemented by 1 December):
Support for concurrency (Threads and queue management).
Support for alignment and structure queries/responses.
JSON support.
Writeback functionality (xml and JSON).
Registry sources filtering support.
Support for reading/writing JSON.

As always, any contributions and suggestions are welcome.
After the hackathon we would also welcome any offers of testing in real-world situations.

Many thanks

The Sanger/EBI DAS team.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
From jrmacias at cnb.csic.es Thu Nov 3 12:32:46 2011
From: jrmacias at cnb.csic.es (Jose-Ramon Macias)
Date: Thu, 3 Nov 2011 17:32:46 +0100
Subject: [DAS] DAS Digest, Vol 59, Issue 2
In-Reply-To:
References:
Message-ID: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es>

Hi Jonathan,

I won't be able to attend physically, but would be very interested in any development regarding JDAS and the replacement for DasObert. Could I contribute from a distance? How? I'll be eager to test the resulting stuff, anyway.

Jose-Ramon Macias

Unidad de Biocomputación
Centro Nacional de Biotecnología (CNB-CSIC)
Darwin, 3. 28049 Madrid.
From jw12 at sanger.ac.uk Thu Nov 3 13:25:59 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 3 Nov 2011 17:25:59 +0000
Subject: [DAS] DAS Digest, Vol 59, Issue 2
In-Reply-To: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es>
References: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es>
Message-ID: <923B0411-5ED2-4696-A23E-ACC772C52EDA@sanger.ac.uk>

Hi Jose

Great Jose. If you don't mind using Skype I'm sure you can contribute remotely. If you get in touch closer to the time we can sort it out.

Cheers

Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

From jw12 at sanger.ac.uk Thu Nov 17 07:15:35 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 17 Nov 2011 12:15:35 +0000
Subject: [DAS] More info Survey on DAS projects
Message-ID:

I forgot to add:
If you are interested in giving a talk please reply by the 1st of December (2 weeks from now) as we will need to start organising the format of this year's workshop.

Many thanks

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314
From jw12 at sanger.ac.uk Thu Nov 17 07:11:14 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 17 Nov 2011 12:11:14 +0000
Subject: [DAS] Survey on DAS projects
Message-ID: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>

Hi

As the 2012 DAS workshop is coming up at the end of February we would like to hear from people using DAS.
We would be really grateful to receive just a short email from anyone using DAS or developing DAS with a brief summary about their project and how DAS fits in, especially if you have not spoken at the DAS workshops at any time.

Please also say if you would be interested in giving a short presentation at the workshop in February even if you are not sure if you could make it.
In previous years the presentations have been 15 minutes with 5 minutes for questions. However, this year we intend to be more flexible, so if you would prefer to give a "lightning talk" of just 5 minutes to update people or give them a brief overview, that will be fine. Links to the previous years' talks can be found here http://www.biodas.org/wiki/DASWorkshop2011#Day_2

I must emphasise: please give us a summary even if you are not interested in giving a talk, as we would like to know what is going on out there, and we promise not to hound you to give a talk :)

Thanks in advance

The Sanger/EBI DAS people.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314
From mc at manuelcorpas.com Thu Nov 17 12:11:08 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Thu, 17 Nov 2011 17:11:08 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID:

Dear Jonathan,

I hope you do not mind me copying the DAS list in this email, as we would be very keen to gather interest in the community regarding DAS applications to whole genomes.

We are interested in exploring DAS in the context of genomic variants (SNPs, indels, CNVs) from personal genomes plus their integration with relevant sources (genes, variation data, phenotypes).

Currently we have done a lot of work with 23andMe (whole-genome) genotypes but now we are expecting to extend our efforts further to exome data. A critical tool we are currently missing is one that allows automatic creation of DAS sources via an API directly from bed format (used by 23andMe) or vcf (1000genomes).

Anyone interested in discussing these topics please let me know.

Kind regards,
Manuel

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From jw12 at sanger.ac.uk Fri Nov 18 05:15:18 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Fri, 18 Nov 2011 10:15:18 +0000
Subject: [DAS] Genotype MyDAS adapter
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID: <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>

Thanks Manuel, that's exactly the sort of reply we are hoping for. It's great if people want to publicise what they are doing on the list.

In addition I should add that if people wish to distribute 23andMe data via DAS I have developed a very simple MyDAS adapter for this purpose that comes with the current MyDAS download (http://code.google.com/p/mydas/).
Comprehensive instructions for setting up the database and uploading the data file can be found here: http://biodasman.wordpress.com/2011/11/04/how-to-create-a-genotype-database-for-use-as-a-das-source/
The configuration for connecting to this database is also in MyDAS and the blog.
Obviously this isn't an automatic set-up via a pretty interface, but it's relatively simple for anyone with a degree of technical ability to do in the meantime.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314
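Editorial sketch of the "set up the database and upload the data file" step described in the message above. This is not the MyDAS adapter (which is Java and is documented in the blog post linked there); it is a minimal Python illustration that assumes the four-column, whitespace-separated rows of a 23andMe export (SNP id, chromosome, position, genotype). The table name, column names and helper functions are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3
from typing import Iterable, Iterator, List, Tuple

# (snp_id, chromosome, position, genotype); this layout is an assumption
Genotype = Tuple[str, str, int, str]


def parse_genotypes(lines: Iterable[str]) -> Iterator[Genotype]:
    """Yield one tuple per data row, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        snp_id, chromosome, position, genotype = line.split()
        yield snp_id, chromosome, int(position), genotype


def load_genotypes(conn: sqlite3.Connection, rows: Iterable[Genotype]) -> int:
    """Create the hypothetical 'genotype' table, insert rows, return row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS genotype ("
        "snp_id TEXT PRIMARY KEY, chromosome TEXT NOT NULL, "
        "position INTEGER NOT NULL, genotype TEXT NOT NULL)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO genotype VALUES (?, ?, ?, ?)", rows
        )
    return conn.execute("SELECT COUNT(*) FROM genotype").fetchone()[0]


def genotypes_in_segment(
    conn: sqlite3.Connection, chromosome: str, start: int, end: int
) -> List[Tuple[str, int, str]]:
    """Return (snp_id, position, genotype) for SNPs inside one segment."""
    cursor = conn.execute(
        "SELECT snp_id, position, genotype FROM genotype "
        "WHERE chromosome = ? AND position BETWEEN ? AND ? "
        "ORDER BY position",
        (chromosome, start, end),
    )
    return cursor.fetchall()
```

A server-side adapter would presumably run a range query like `genotypes_in_segment` whenever a client asks for the features on a segment (chromosome, start, stop) and turn each returned row into one DAS feature; how MyDAS actually does this is described in the blog post, not here.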
From mc at manuelcorpas.com Fri Nov 18 05:28:13 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Fri, 18 Nov 2011 10:28:13 +0000
Subject: [DAS] Genotype MyDAS adapter
In-Reply-To: <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>
Message-ID:

Hi Jonathan,

thanks, this is really helpful. Please find in this link my genotype data* should anyone like to give it a try with real data:

http://mykaryoview.com/data/son.txt.zip

Manuel

*my genotype data has CC BY-SA 3.0 License: you are able to copy, distribute and transmit the work as long as you acknowledge the source.

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From andy.jenkinson at ebi.ac.uk Fri Nov 18 10:14:02 2011
From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson)
Date: Fri, 18 Nov 2011 15:14:02 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID: <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>

Hi Manuel,

Since 2008 ProServer has had a BED format SourceAdaptor (called bed12, as it is intended to work with the 12-field BED format). It also supports Hydras, which are modules that are designed to automatically create DAS sources from a single config without restarting the server. This is how EasyDAS works with ProServer: there is one SourceAdaptor, and a Hydra to scan a relational database for new data.

I don't know what 23andMe's data looks like, but the addition of a Hydra to scan directories for new files and automatically make them available as DAS sources would seem to be a trivial piece of work. I daresay a VCF adaptor would also be fairly easy, especially if there is a Perl API of some sort (BioPerl?).

Cheers,
Andy

From mc at manuelcorpas.com Fri Nov 18 10:24:59 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Fri, 18 Nov 2011 15:24:59 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To: <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Andy,

thanks for the info. Having a bed DAS adaptor is part of the problem, the other is not having to worry about having to deal with the DAS server directly. easyDAS manages to do this but unfortunately it is not obvious for people who do not know DAS how to operate it. Also if the file is very big and the connection slow it can take up to an hour to create a DAS source.

Wouldn't it be nice to create a DAS source just with one click or two?
Please see below a snippet of a few SNPs in my chromosome 16 just as you would get them from 23andMe (NCBI36 assembly; columns mean SNP_id/chr/position/genotype). Cheers, Manuel rs7763 16 544555 TT rs763158 16 546105 GG rs7190878 16 549131 AG rs4984890 16 552699 CT rs710925 16 573355 AG rs2017567 16 577213 CT rs4144003 16 585969 CT rs7190358 16 590789 AG rs7203694 16 592942 AG rs11248940 16 595687 TT rs7204088 16 601143 TT rs4984677 16 611683 AG rs9929621 16 619413 CT rs11642546 16 641657 CC rs3752496 16 650256 TT rs2301426 16 651906 GG rs1044662 16 655061 CC rs9934288 16 656288 AC rs3752493 16 657524 TT rs1139897 16 660987 GG rs1045763 16 664085 CC rs3830140 16 665336 AA rs8056588 16 666190 CC rs6597 16 671726 TT Manuel Corpas, PhD Tel:? ? ? +44.122349.2372 Web: ? ?http://manuelcorpas.com/about/ Twitter: @manuelcorpas On 18 November 2011 15:14, Andy Jenkinson wrote: > Hi Manuel, > > Since 2008 ProServer has had a BED format SourceAdaptor (called bed12, as it is intended to work with the 12-field BED format). It also supports Hydras, which are modules that are designed to automatically create DAS sources from a single config without restarting the server. This is how EasyDAS works with ProServer: there is one SourceAdaptor, and a Hydra to scan a relational database for new data. > > I don't know what 23andme's data looks like, but the addition of a Hydra to scan directories for new files and automatically make them available as DAS sources would seem to be a trivial piece of work. I daresay a VCF adaptor would also be fairly easy, especially if there is a Perl API of some sort (BioPerl?). > > Cheers, > Andy > > On 17 Nov 2011, at 17:11, Manuel Corpas wrote: > >> Dear Jonathan, >> >> I hope you do not mind me copying the DAS list in this email, as we >> would be very keen to gather interest in the community regarding DAS >> applications to whole genomes. 
>>
>> We are interested in exploring DAS in the context of genomic variants
>> (SNPs, indels, CNVs) from personal genomes plus their integration with
>> relevant sources (genes, variation data, phenotypes).
>>
>> Currently we have done a lot of work with 23andMe (whole-genome)
>> genotypes but now we are expecting to extend our efforts further to
>> exome data. A critical tool we are currently missing is one that
>> allows automatic creation of DAS sources via an API directly from bed
>> format (used by 23andMe) or vcf (1000genomes).
>>
>> Anyone interested in discussing these topics please let me know.
>>
>> Kind regards,
>> Manuel
>>
>> Manuel Corpas, PhD
>> Tel: +44.122349.2372
>> Web: http://manuelcorpas.com/about/
>> Twitter: @manuelcorpas
>>
>> On 17 November 2011 12:11, Jonathan Warren wrote:
>>> Hi
>>>
>>> As the 2012 DAS workshop is coming up at the end of February we would like
>>> to hear from people using DAS.
>>> We would be really grateful to receive just a short email from anyone using
>>> DAS or developing DAS with a brief summary about their project and how DAS
>>> fits in, especially if you have not spoken at the DAS workshops at any time.
>>>
>>> Please also say if you would be interested in giving a short presentation at
>>> the workshop in February even if you are not sure if you could make it.
>>> Previous years the presentations have been 15 minutes with 5 minutes for
>>> questions - however this year we intend to be more flexible and so if you
>>> would prefer to give a "lightning talk" of just 5 minutes to update people
>>> or give them a brief overview that will be fine. Links to the previous years
>>> talks can be found here http://www.biodas.org/wiki/DASWorkshop2011#Day_2
>>>
>>> I must emphasise - please give us a summary even if you are not interested
>>> in giving a talk as we would like to know what is going on out there and we
>>> promise not to hound you to give a talk :)
>>>
>>> Thanks in advance
>>>
>>> The Sanger/EBI DAS people.
>>>
>>> Jonathan Warren
>>> Senior Developer and DAS coordinator
>>> blog: http://biodasman.wordpress.com/
>>> jw12 at sanger.ac.uk
>>> Ext: 2314
>>> Telephone: 01223 492314
>>>
>>> --
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a
>>> charity registered in England with number 1021457 and a company registered in
>>> England with number 2742969, whose registered office is 215 Euston Road,
>>> London, NW1 2BE.
>>> _______________________________________________
>>> DAS mailing list
>>> DAS at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/das
>>>
>>
>> _______________________________________________
>> DAS mailing list
>> DAS at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/das
>
>

From andy.jenkinson at ebi.ac.uk Fri Nov 18 18:33:51 2011
From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson)
Date: Fri, 18 Nov 2011 23:33:51 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Manuel,

It would be nice to be able to create a DAS source from any type of data you happen to have with a click or two, but I don't think it is realistic. Even in this email you have just told me what all the columns mean, what the assembly is, what kind of file it is. Any application would need to know the same things (and more).
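
To make that point concrete, here is a minimal sketch in Python of what a loader for the four-column snippet above might look like. The names `FileDescription`, `SnpFeature` and `parse_genotypes` are hypothetical and are not part of ProServer, MyDas or EasyDAS; the only facts taken from the thread are the column meanings (SNP_id/chr/position/genotype) and the NCBI36 assembly. The sketch assumes whitespace-separated rows and shows that the assembly and column meanings have to be supplied from outside the file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileDescription:
    """Facts the file does not state itself; the user has to supply them."""
    assembly: str   # e.g. "NCBI36", as stated in the email
    columns: tuple  # meaning of each whitespace-separated column, in order


@dataclass(frozen=True)
class SnpFeature:
    """One genotyped SNP, annotated with the assembly it refers to."""
    snp_id: str
    chromosome: str
    position: int
    genotype: str
    assembly: str


def parse_genotypes(text, description):
    """Turn whitespace-separated genotype rows into SnpFeature records."""
    features = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) != len(description.columns):
            raise ValueError(
                f"expected {len(description.columns)} columns, got {line!r}"
            )
        row = dict(zip(description.columns, fields))
        features.append(
            SnpFeature(
                snp_id=row["snp_id"],
                chromosome=row["chr"],
                position=int(row["position"]),
                genotype=row["genotype"],
                assembly=description.assembly,
            )
        )
    return features


# First three rows of the snippet posted to the list.
SAMPLE = """\
rs7763 16 544555 TT
rs763158 16 546105 GG
rs7190878 16 549131 AG
"""

description = FileDescription(
    assembly="NCBI36",
    columns=("snp_id", "chr", "position", "genotype"),
)
features = parse_genotypes(SAMPLE, description)
```

Nothing in the rows themselves says "NCBI36" or "column 3 is a position", which is why the description is passed in separately rather than guessed.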

That is not to say that it is difficult to build something to let you do this if it is specifically designed for the exact type of data you are using, just that it does not already exist and so you have to actually create it. Either MyDas or ProServer would seem to offer you a starting point to do that, but only a starting point. EasyDAS is the closest thing to what you want but obviously it has to cater for any type of data so has to ask you a lot more questions. Its web-based architecture obviously limits the size of data files you can process quickly too, but that is the trade off you make by not needing an Internet-visible web server of your own to run a DAS server from. I daresay if you wanted to create something that an individual can use to make a DAS source from their personal BED/VCF file then it would have to be web based, will always be restricted by the speed of the Internet, but the interface could be much simpler than EasyDAS and a database might not be needed (EasyDAS loads file contents into a database to standardise them, which slows things down).

Having said all this, I am a little confused about what you are trying to achieve. In your first mail you said you wanted to create sources via an API, in the second you say you want to do it via a click. Obviously the requirements for both are very different.

Cheers,
Andy

From mc at manuelcorpas.com Sat Nov 19 10:58:16 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Sat, 19 Nov 2011 15:58:16 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

> Having said all this, I am a little confused about what you are trying to achieve. In your first mail you said you wanted to create sources via an API, in the second you say you want to do it via a click. Obviously the requirements for both are very different.

Both API and 1-click DAS source creation would be extremely helpful in my view.
The fact that this functionality is not available is seriously affecting many of our users' ability to create DAS sources with their 23andMe genotypes.

The fact that these facilities do not exist has stopped new potential users from utilizing DAS. If DAS is truly going to survive as a standard, automatic creation of data sources needs to be easier.

Manuel

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From jw12 at sanger.ac.uk Thu Nov 24 12:19:37 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 24 Nov 2011 17:19:37 +0000
Subject: [DAS] DAS projects/Future of DAS
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Manuel

Thanks again for the input ;)

I agree with Andy that a generic DAS upload/server is going to be inherently complicated and will be limited by the upload across the internet. To a large degree the DAS system came about to stop large amounts of data needing to be uploaded or downloaded over the net. However I can see that having DAS "nodes" such as the EBI and Sanger etc with tailor made upload user interfaces and servers for specific data types is a reasonable solution/addition to the DAS system. To this end, I'm already more than half way to a solution for your needs (which doesn't need a database) which we could use, subject to Sanger approval. This solution could also be made more generic for other data sources as well with a specific interface developed for each.
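
As an illustration only of how a source could be served without a database, the sketch below keeps parsed rows in memory, sorted by position, and answers a segment-style range query with a binary search. It is an assumption about one possible design, not a description of the solution mentioned above, and the function names `build_index` and `features_in_segment` are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(rows):
    """Group (snp_id, chromosome, position, genotype) rows by chromosome,
    keeping each chromosome's entries sorted by position."""
    index = defaultdict(list)
    for snp_id, chromosome, position, genotype in rows:
        index[chromosome].append((position, snp_id, genotype))
    for entries in index.values():
        entries.sort()
    return index


def features_in_segment(index, chromosome, start, end):
    """Return entries with start <= position <= end on one chromosome."""
    entries = index.get(chromosome, [])
    # A 1-tuple sorts before any longer tuple sharing the same first element,
    # so (start,) locates the first entry at or after `start`, and (end + 1,)
    # locates the first entry strictly after `end`.
    lo = bisect_left(entries, (start,))
    hi = bisect_left(entries, (end + 1,))
    return entries[lo:hi]


# A few rows from the 23andMe snippet quoted in this thread.
rows = [
    ("rs7763", "16", 544555, "TT"),
    ("rs763158", "16", 546105, "GG"),
    ("rs7190878", "16", 549131, "AG"),
    ("rs4984890", "16", 552699, "CT"),
]
index = build_index(rows)
hits = features_in_segment(index, "16", 540000, 550000)
```

The trade-off in this sketch is memory for simplicity: the whole file has to fit in RAM, but no load step into a relational store is needed before the first query can be answered.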

As to the future of DAS - I believe that over the last 3 years we have improved the DAS protocol and many of the associated implementations so that we now have dispensed with many if not all of the previous criticisms people have had of the DAS system (1.6E spec):

Validation has improved the level of conformity to the DAS spec and new servers (MyDAS and Proserver) now behave in the same way for both requests and responses.
You can now search DAS sources (MyDAS).
Next feature capability (in the spec and Proserver).
You can have alternative content that will require less bandwidth e.g. JSON (the Registry serves this already and soon MyDAS servers and Proserver hopefully).
We have writeback servers (MyDAS) (already implemented at the EBI for proteins and an example server available soon that will accept posts and puts at the Sanger for genomic sources).

Really we NEED the community to come together and put data up using these new servers and for major clients such as Ensembl to start supporting 1.6 spec servers and the newer "Extended" features. The Dalliance browser by Thomas Down proves how fast a DAS client can be.

I believe there is a lot more potential for the DAS system and it's still a good solution for today's data distribution needs.

Cheers
Jonathan.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From jw12 at sanger.ac.uk Wed Nov 2 15:57:15 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Wed, 2 Nov 2011 15:57:15 +0000
Subject: [DAS] Java DAS hackathon 1st December.
Message-ID: <28054914-049C-4F9F-BC19-BD299DA41A3E@sanger.ac.uk>

Hi

There are at least 4 of us meeting up on the Sanger/EBI genome campus (Thursday 1st December) to write some code for the new Java DAS library (JDAS http://code.google.com/p/jdas/). The main focus will be on making sure the new library has all the "essential" capabilities of the old Dasobert library and some new features.

We would like to extend an open invitation to Java developers in the DAS community.
If you would like to attend and contribute then please drop me a line. We would be especially interested in having someone with expertise/interest in DAS structure or Alignment clients. If you have any suggestions or burning needs for support to be included in the JDAS library you can also write to me or post to the list. For inclusion (some of which has/will be implemented by 1 December): Support for concurrency (Threads and queue management). Support for alignment and structure queries/responses. JSON support. Writeback functionality (xml and JSON). Registry sources filtering support. Support for reading/writing JSON. As always any contributions and suggestions welcome. After the hackathon we would also welcome any offers of testing in real world situations? Many thanks The Sanger/EBI DAS team. Jonathan Warren Senior Developer and DAS coordinator blog: http://biodasman.wordpress.com/ jw12 at sanger.ac.uk Ext: 2314 Telephone: 01223 492314 -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From jrmacias at cnb.csic.es Thu Nov 3 16:32:46 2011 From: jrmacias at cnb.csic.es (Jose-Ramon Macias) Date: Thu, 3 Nov 2011 17:32:46 +0100 Subject: [DAS] DAS Digest, Vol 59, Issue 2 In-Reply-To: References: Message-ID: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es> Hi Jonathan, I won't be able to attend physically, but would be very interested in any development regarding JDAS and the replacement for DasObert. Could I contribute from the distance ? How ? I'll be eager to test the resulting stuff, anyway. 

Jose-Ramon Macias
Unidad de Biocomputación
Centro Nacional de Biotecnología (CNB-CSIC)
Darwin, 3. 28049 Madrid.

From jw12 at sanger.ac.uk Thu Nov 3 17:25:59 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 3 Nov 2011 17:25:59 +0000
Subject: [DAS] DAS Digest, Vol 59, Issue 2
In-Reply-To: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es>
References: <8252CA57-DCEC-4CB2-B544-BA60EEDD2732@cnb.csic.es>
Message-ID: <923B0411-5ED2-4696-A23E-ACC772C52EDA@sanger.ac.uk>

Hi Jose

Great Jose. If you don't mind using Skype I'm sure you can contribute remotely. If you get in touch closer to the time we can sort it out.

Cheers
Jonathan.

On 3 Nov 2011, at 16:32, Jose-Ramon Macias wrote:
> Hi Jonathan,
> I won't be able to attend physically, but would be very interested
> in any development regarding JDAS and the replacement for DasObert.
> Could I contribute from the distance ? How ? I'll be eager to test
> the resulting stuff, anyway.
> > > On Nov 3, 2011, at 17:00, das-request at lists.open-bio.org wrote: > >> Send DAS mailing list submissions to >> das at lists.open-bio.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> http://lists.open-bio.org/mailman/listinfo/das >> or, via email, send a message with subject or body 'help' to >> das-request at lists.open-bio.org >> >> You can reach the person managing the list at >> das-owner at lists.open-bio.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of DAS digest..." >> >> >> Today's Topics: >> >> 1. Java DAS hackathon 1st December. (Jonathan Warren) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Wed, 2 Nov 2011 15:57:15 +0000 >> From: Jonathan Warren >> Subject: [DAS] Java DAS hackathon 1st December. >> To: das at biodas.org >> Message-ID: <28054914-049C-4F9F-BC19-BD299DA41A3E at sanger.ac.uk> >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes >> >> Hi >> >> There are at least 4 of us meeting up on the Sanger/EBI genome campus >> (Thursday 1st December) to write some code for the new Java DAS >> library (JDAS http://code.google.com/p/jdas/). >> The main focus will be on making sure the new library has all the >> "essential" capabilities of the old Dasobert library and some new >> features. >> >> We would like to extend an open invitation to Java developers in the >> DAS community. If you would like to attend and contribute then please >> drop me a line. We would be especially interested in having someone >> with expertise/interest in DAS structure or Alignment clients. >> If you have any suggestions or burning needs for support to be >> included in the JDAS library you can also write to me or post to the >> list. >> >> For inclusion (some of which has/will be implemented by 1 December): >> Support for concurrency (Threads and queue management). 
>> Support for alignment and structure queries/responses. >> JSON support. >> Writeback functionality (xml and JSON). >> Registry sources filtering support. >> Support for reading/writing JSON. >> >> As always any contributions and suggestions welcome. >> After the hackathon we would also welcome any offers of testing in >> real world situations? >> >> Many thanks >> >> The Sanger/EBI DAS team. >> >> >> Jonathan Warren >> Senior Developer and DAS coordinator >> blog: http://biodasman.wordpress.com/ >> jw12 at sanger.ac.uk >> Ext: 2314 >> Telephone: 01223 492314 >> >> >> >> >> >> >> >> >> >> >> -- >> The Wellcome Trust Sanger Institute is operated by Genome Research >> Limited, a charity registered in England with number 1021457 and a >> company registered in England with number 2742969, whose registered >> office is 215 Euston Road, London, NW1 2BE. >> >> >> ------------------------------ >> >> _______________________________________________ >> DAS mailing list >> DAS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das >> >> >> End of DAS Digest, Vol 59, Issue 2 >> ********************************** > > Jose-Ramon Macias > > Unidad de Biocomputaci?n > Centro Nacional de Biotecnolog?a (CNB-CSIC) > Darwin, 3. 28049 Madrid. > > > _______________________________________________ > DAS mailing list > DAS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das Jonathan Warren Senior Developer and DAS coordinator blog: http://biodasman.wordpress.com/ jw12 at sanger.ac.uk Ext: 2314 Telephone: 01223 492314 -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. 
From jw12 at sanger.ac.uk Thu Nov 17 12:15:35 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 17 Nov 2011 12:15:35 +0000
Subject: [DAS] More info Survey on DAS projects
Message-ID:

I forgot to add:
If you are interested in giving a talk please reply by the 1st of December (2 weeks from now) as we will need to start organising the format of this year's workshop.

Many thanks

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

From jw12 at sanger.ac.uk Thu Nov 17 12:11:14 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 17 Nov 2011 12:11:14 +0000
Subject: [DAS] Survey on DAS projects
Message-ID: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>

Hi

As the 2012 DAS workshop is coming up at the end of February we would like to hear from people using DAS.
We would be really grateful to receive just a short email from anyone using DAS or developing DAS with a brief summary about their project and how DAS fits in, especially if you have not spoken at the DAS workshops at any time.

Please also say if you would be interested in giving a short presentation at the workshop in February even if you are not sure if you could make it.
In previous years the presentations have been 15 minutes with 5 minutes for questions - however this year we intend to be more flexible and so if you would prefer to give a "lightning talk" of just 5 minutes to update people or give them a brief overview that will be fine.
Links to the previous years' talks can be found here http://www.biodas.org/wiki/DASWorkshop2011#Day_2

I must emphasise - please give us a summary even if you are not interested in giving a talk as we would like to know what is going on out there and we promise not to hound you to give a talk :)

Thanks in advance

The Sanger/EBI DAS people.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

From mc at manuelcorpas.com Thu Nov 17 17:11:08 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Thu, 17 Nov 2011 17:11:08 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID:

Dear Jonathan,

I hope you do not mind me copying the DAS list in this email, as we would be very keen to gather interest in the community regarding DAS applications to whole genomes.

We are interested in exploring DAS in the context of genomic variants (SNPs, indels, CNVs) from personal genomes plus their integration with relevant sources (genes, variation data, phenotypes).

Currently we have done a lot of work with 23andMe (whole-genome) genotypes but now we are expecting to extend our efforts further to exome data. A critical tool we are currently missing is one that allows automatic creation of DAS sources via an API directly from bed format (used by 23andMe) or vcf (1000genomes).

Anyone interested in discussing these topics please let me know.

Kind regards,
Manuel

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From jw12 at sanger.ac.uk Fri Nov 18 10:15:18 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Fri, 18 Nov 2011 10:15:18 +0000
Subject: [DAS] Genotype MyDAS adapter
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID: <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>

Thanks Manuel, that's exactly the sort of reply we are hoping for. It's great if people want to publicise what they are doing on the list.

In addition I should add that if people wish to distribute 23andMe data via DAS I have developed a very simple MyDAS adapter for this purpose that comes with the current MyDAS download (http://code.google.com/p/mydas/).
Comprehensive instructions for setting up the database and uploading the data file can be found here: http://biodasman.wordpress.com/2011/11/04/how-to-create-a-genotype-database-for-use-as-a-das-source/
The configuration for connecting to this database is also in MyDAS and the blog.
Obviously this isn't an automatic set-up via a pretty interface, but it's relatively simple for anyone with a degree of technical ability to do in the meantime.

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314

From mc at manuelcorpas.com Fri Nov 18 10:28:13 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Fri, 18 Nov 2011 10:28:13 +0000
Subject: [DAS] Genotype MyDAS adapter
In-Reply-To: <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <3A134B95-E6BD-486D-B9D0-A8721B345C07@sanger.ac.uk>
Message-ID:

Hi Jonathan,

thanks, this is really helpful. Please find in this link my genotype data* should anyone like to give it a try with real data:
http://mykaryoview.com/data/son.txt.zip

Manuel

*my genotype data has CC BY-SA 3.0 License: you are able to copy, distribute and transmit the work as long as you acknowledge the source.

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From andy.jenkinson at ebi.ac.uk Fri Nov 18 15:14:02 2011
From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson)
Date: Fri, 18 Nov 2011 15:14:02 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk>
Message-ID: <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>

Hi Manuel,

Since 2008 ProServer has had a BED format SourceAdaptor (called bed12, as it is intended to work with the 12-field BED format). It also supports Hydras, which are modules that are designed to automatically create DAS sources from a single config without restarting the server. This is how EasyDAS works with ProServer: there is one SourceAdaptor, and a Hydra to scan a relational database for new data.

I don't know what 23andMe's data looks like, but the addition of a Hydra to scan directories for new files and automatically make them available as DAS sources would seem to be a trivial piece of work.
I daresay a VCF adaptor would also be fairly easy, especially if there is a Perl API of some sort (BioPerl?).

Cheers,
Andy

From mc at manuelcorpas.com Fri Nov 18 15:24:59 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Fri, 18 Nov 2011 15:24:59 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To: <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Andy,

thanks for the info. Having a bed DAS adaptor is part of the problem, the other is not having to worry about dealing with the DAS server directly. easyDAS manages to do this but unfortunately it is not obvious for people who do not know DAS how to operate it. Also if the file is very big and the connection slow it can take up to an hour to create a DAS source.

Wouldn't it be nice to create a DAS source just with one click or two?

Please see below a snippet of a few SNPs in my chromosome 16 just as you would get them from 23andMe (NCBI36 assembly; columns mean SNP_id/chr/position/genotype).

Cheers,
Manuel

rs7763 16 544555 TT
rs763158 16 546105 GG
rs7190878 16 549131 AG
rs4984890 16 552699 CT
rs710925 16 573355 AG
rs2017567 16 577213 CT
rs4144003 16 585969 CT
rs7190358 16 590789 AG
rs7203694 16 592942 AG
rs11248940 16 595687 TT
rs7204088 16 601143 TT
rs4984677 16 611683 AG
rs9929621 16 619413 CT
rs11642546 16 641657 CC
rs3752496 16 650256 TT
rs2301426 16 651906 GG
rs1044662 16 655061 CC
rs9934288 16 656288 AC
rs3752493 16 657524 TT
rs1139897 16 660987 GG
rs1045763 16 664085 CC
rs3830140 16 665336 AA
rs8056588 16 666190 CC
rs6597 16 671726 TT

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

From andy.jenkinson at ebi.ac.uk Fri Nov 18 23:33:51 2011
From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson)
Date: Fri, 18 Nov 2011 23:33:51 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Manuel,

It would be nice to be able to create a DAS source from any type of data you happen to have with a click or two, but I don't think it is realistic. Even in this email you have just told me what all the columns mean, what the assembly is, what kind of file it is. Any application would need to know the same things (and more).
That is not to say that it is difficult to build something to let you do this if it is specifically designed for the exact type of data you are using, just that it does not already exist and so you have to actually create it. Either MyDas or ProServer would seem to offer you a starting point to do that, but only a starting point. EasyDAS is the closest thing to what you want but obviously it has to cater for any type of data so has to ask you a lot more questions. Its web-based architecture obviously limits the size of data files you can process quickly too, but that is the trade off you make by not needing an Internet-visible web server of your own to run a DAS server from. I daresay if you wanted to create something that an individual can use to make a DAS source from their personal BED/VCF file then it would have to be web based, will always be restricted by the speed of the Internet, but the interface could be much simpler than EasyDAS and a database might not be needed (EasyDAS loads file contents into a database to standardise them, which slows things down). Having said all this, I am a little confused about what you are trying to achieve. In your first mail you said you wanted to create sources via an API, in the second you say you want to do it via a click. Obviously the requirements for both are very different. Cheers, Andy On 18 Nov 2011, at 15:24, Manuel Corpas wrote: > Hi Andy, > > thanks for the info. Having a bed DAS adaptor is part of the problem, > the other is not having to worry about having to deal with the DAS > server directly. easyDAS manages to do this but unfortunately it is > not obvious for people who do not know DAS how to operate it. Also if > the file is very big and the connection slow it can take up to an hour > to create a DAS source. > > Wouldn't it be nice to create a DAS source just with one click or two? 
>
> Please see below a snippet of a few SNPs in my chromosome 16 just as
> you would get them from 23andMe (NCBI36 assembly; columns mean
> SNP_id/chr/position/genotype).
>
> Cheers,
> Manuel
>
> rs7763      16  544555  TT
> rs763158    16  546105  GG
> rs7190878   16  549131  AG
> rs4984890   16  552699  CT
> rs710925    16  573355  AG
> rs2017567   16  577213  CT
> rs4144003   16  585969  CT
> rs7190358   16  590789  AG
> rs7203694   16  592942  AG
> rs11248940  16  595687  TT
> rs7204088   16  601143  TT
> rs4984677   16  611683  AG
> rs9929621   16  619413  CT
> rs11642546  16  641657  CC
> rs3752496   16  650256  TT
> rs2301426   16  651906  GG
> rs1044662   16  655061  CC
> rs9934288   16  656288  AC
> rs3752493   16  657524  TT
> rs1139897   16  660987  GG
> rs1045763   16  664085  CC
> rs3830140   16  665336  AA
> rs8056588   16  666190  CC
> rs6597      16  671726  TT
>
>
> Manuel Corpas, PhD
> Tel: +44.122349.2372
> Web: http://manuelcorpas.com/about/
> Twitter: @manuelcorpas
>
>
>
> On 18 November 2011 15:14, Andy Jenkinson wrote:
>> Hi Manuel,
>>
>> Since 2008 ProServer has had a BED format SourceAdaptor (called bed12, as it is intended to work with the 12-field BED format). It also supports Hydras, which are modules that are designed to automatically create DAS sources from a single config without restarting the server. This is how EasyDAS works with ProServer: there is one SourceAdaptor, and a Hydra to scan a relational database for new data.
>>
>> I don't know what 23andme's data looks like, but the addition of a Hydra to scan directories for new files and automatically make them available as DAS sources would seem to be a trivial piece of work. I daresay a VCF adaptor would also be fairly easy, especially if there is a Perl API of some sort (BioPerl?).
>>
>> Cheers,
>> Andy
>>
>> On 17 Nov 2011, at 17:11, Manuel Corpas wrote:
>>
>>> Dear Jonathan,
>>>
>>> I hope you do not mind me copying the DAS list in this email, as we
>>> would be very keen to gather interest in the community regarding DAS
>>> applications to whole genomes.
>>>
>>> We are interested in exploring DAS in the context of genomic variants
>>> (SNPs, indels, CNVs) from personal genomes plus their integration with
>>> relevant sources (genes, variation data, phenotypes).
>>>
>>> Currently we have done a lot of work with 23andMe (whole-genome)
>>> genotypes but now we are expecting to extend our efforts further to
>>> exome data. A critical tool we are currently missing is one that
>>> allows automatic creation of DAS sources via an API directly from bed
>>> format (used by 23andMe) or vcf (1000genomes).
>>>
>>> Anyone interested in discussing these topics please let me know.
>>>
>>> Kind regards,
>>> Manuel
>>>
>>> Manuel Corpas, PhD
>>> Tel: +44.122349.2372
>>> Web: http://manuelcorpas.com/about/
>>> Twitter: @manuelcorpas
>>>
>>>
>>>
>>> On 17 November 2011 12:11, Jonathan Warren wrote:
>>>> Hi
>>>>
>>>> As the 2012 DAS workshop is coming up at the end of February we would like
>>>> to hear from people using DAS.
>>>> We would be really grateful to receive just a short email from anyone using
>>>> DAS or developing DAS with a brief summary about their project and how DAS
>>>> fits in, especially if you have not spoken at the DAS workshops at any time.
>>>>
>>>> Please also say if you would be interested in giving a short presentation at
>>>> the workshop in February even if you are not sure if you could make it.
>>>> Previous years the presentations have been 15 minutes with 5 minutes for
>>>> questions - however this year we intend to be more flexible and so if you
>>>> would prefer to give a "lightning talk" of just 5 minutes to update people
>>>> or give them a brief overview that will be fine.
>>>> Links to the previous years talks can be found here http://www.biodas.org/wiki/DASWorkshop2011#Day_2
>>>>
>>>> I must emphasise - please give us a summary even if you are not interested
>>>> in giving a talk as we would like to know what is going on out there and we
>>>> promise not to hound you to give a talk :)
>>>>
>>>> Thanks in advance
>>>>
>>>> The Sanger/EBI DAS people.
>>>>
>>>>
>>>> Jonathan Warren
>>>> Senior Developer and DAS coordinator
>>>> blog: http://biodasman.wordpress.com/
>>>> jw12 at sanger.ac.uk
>>>> Ext: 2314
>>>> Telephone: 01223 492314
>>>>
>>>> --
>>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a
>>>> charity registered in England with number 1021457 and a company registered in
>>>> England with number 2742969, whose registered office is 215 Euston Road,
>>>> London, NW1 2BE.
>>>> _______________________________________________
>>>> DAS mailing list
>>>> DAS at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/das
>>>>
>>>
>>> _______________________________________________
>>> DAS mailing list
>>> DAS at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/das
>>
>>

From mc at manuelcorpas.com Sat Nov 19 15:58:16 2011
From: mc at manuelcorpas.com (Manuel Corpas)
Date: Sat, 19 Nov 2011 15:58:16 +0000
Subject: [DAS] Survey on DAS projects
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

> Having said all this, I am a little confused about what you are trying to achieve. In your first mail you said you wanted to create sources via an API, in the second you say you want to do it via a click. Obviously the requirements for both are very different.

Both API and 1-click DAS source creation would be extremely helpful in my view.
The fact that this functionality is not available is seriously affecting many of our users' ability to create DAS sources with their 23andMe genotypes.

The fact that these facilities do not exist has stopped new potential users from utilizing DAS. If DAS is truly going to survive as a standard, automatic creation of data sources needs to be easier.

Manuel

Manuel Corpas, PhD
Tel: +44.122349.2372
Web: http://manuelcorpas.com/about/
Twitter: @manuelcorpas

On 18 November 2011 23:33, Andy Jenkinson wrote:
> Hi Manuel,
>
> It would be nice to be able to create a DAS source from any type of data you happen to have with a click or two, but I don't think it is realistic. Even in this email you have just told me what all the columns mean, what the assembly is, what kind of file it is. Any application would need to know the same things (and more).
>
> That is not to say that it is difficult to build something to let you do this if it is specifically designed for the exact type of data you are using, just that it does not already exist and so you have to actually create it. Either MyDas or ProServer would seem to offer you a starting point to do that, but only a starting point. EasyDAS is the closest thing to what you want but obviously it has to cater for any type of data so has to ask you a lot more questions. Its web-based architecture obviously limits the size of data files you can process quickly too, but that is the trade off you make by not needing an Internet-visible web server of your own to run a DAS server from. I daresay if you wanted to create something that an individual can use to make a DAS source from their personal BED/VCF file then it would have to be web based, will always be restricted by the speed of the Internet, but the interface could be much simpler than EasyDAS and a database might not be needed (EasyDAS loads file contents into a database to standardise them, which slows things down).

From jw12 at sanger.ac.uk Thu Nov 24 17:19:37 2011
From: jw12 at sanger.ac.uk (Jonathan Warren)
Date: Thu, 24 Nov 2011 17:19:37 +0000
Subject: [DAS] DAS projects/Future of DAS
In-Reply-To:
References: <4A0F0B2F-D98C-4B88-BC48-104607F9E526@sanger.ac.uk> <1C57BEA4-3B0A-4CBE-8308-97D418D2F243@ebi.ac.uk>
Message-ID:

Hi Manuel

Thanks again for the input ;)

I agree with Andy that a generic DAS upload/server is going to be inherently complicated and will be limited by the upload across the internet. To a large degree the DAS system came about to stop large amounts of data needing to be uploaded or downloaded over the net. However I can see that having DAS "nodes" such as the EBI and Sanger etc with tailor made upload user interfaces and servers for specific data types is a reasonable solution/addition to the DAS system. To this end, I'm already more than half way to a solution for your needs (which doesn't need a database) which we could use, subject to Sanger approval. This solution could also be made more generic for other data sources as well with a specific interface developed for each.

As to the future of DAS - I believe that over the last 3 years we have improved the DAS protocol and many of the associated implementations, so that we have now dispensed with many if not all of the previous criticisms people have had of the DAS system (1.6E spec):

Validation has improved the level of conformity to the DAS spec, and new servers (MyDas and ProServer) now behave in the same way for both requests and responses.
You can now search DAS sources (MyDas).
Next feature capability (in the spec and ProServer).
You can have alternative content that requires less bandwidth, e.g. JSON (the Registry serves this already, and hopefully MyDas servers and ProServer will soon).
We have writeback servers (MyDas): already implemented at the EBI for proteins, with an example server that will accept POSTs and PUTs available soon at the Sanger for genomic sources.

Really we NEED the community to come together and put data up using these new servers, and for major clients such as Ensembl to start supporting 1.6 spec servers and the newer "Extended" features. The Dalliance browser by Thomas Down proves how fast a DAS client can be.

I believe there is a lot more potential for the DAS system and it's still a good solution for today's data distribution needs.

Cheers

Jonathan.

On 19 Nov 2011, at 15:58, Manuel Corpas wrote:

>> Having said all this, I am a little confused about what you are
>> trying to achieve. In your first mail you said you wanted to create
>> sources via an API, in the second you say you want to do it via a
>> click. Obviously the requirements for both are very different.
>
>
> Both API and 1-click DAS source creation would be extremely helpful in
> my view. The fact that this functionality is not available is
> seriously affecting many of our users' ability to create DAS sources
> with their 23andMe genotypes.
>
> The fact that these facilities do not exist has stopped new potential
> users from utilizing DAS.
> If DAS is truly going to survive as a standard, automatic creation of
> data sources needs to be easier.
>
> Manuel

Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
jw12 at sanger.ac.uk
Ext: 2314
Telephone: 01223 492314