From holland at ebi.ac.uk Fri Feb 1 04:43:10 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Fri, 01 Feb 2008 09:43:10 +0000
Subject: [Biojava-l] BioJava 3 design discussion coming to an end
In-Reply-To: <477B7AEC.9000401@ebi.ac.uk>
References: <477B7AEC.9000401@ebi.ac.uk>
Message-ID: <47A2E9AE.9020502@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all.

It's now February and the time has come for me to take the contents of
the BioJava 3 discussion wiki page and start compiling it into a more
formal proposal.

I'll let you know when I'm done by posting the proposal document on the
BioJava website - probably around the end of this month.

The formal proposal document will then itself be open to discussion and
modification for a short period before any development it requires will
commence.

cheers,
Richard

PS. I will be working from a snapshot of the wiki. Therefore any changes
made to the wiki from now on may not get included.

Richard Holland wrote:
> Hi all.
>
> At the end of January I will be taking the contents of our BioJava 3
> discussion wiki page and compiling them into a more formal design proposal.
>
> If you have made any comments elsewhere (e.g. by email) which you would
> like to be considered in the final design proposal, then please add them
> to the wiki page (or its associated Talk page) before the end of the month.
>
> (I won't be trawling through email archives looking for comments so you
> really must copy your comments across to the wiki if you want them to be
> included!).
>
> The wiki address is:
>
> http://www.biojava.org/wiki/BioJava3_Proposal
>
> cheers,
> Richard
>
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHoumt4C5LeMEKA/QRAoWtAJ9KhTqHPx56zocxFgBGZh4MIWpulACeKRyI
EMd3Fzhv7QwnLIFbpv0GJ48=
=JfBC
-----END PGP SIGNATURE-----

From holland at ebi.ac.uk Fri Feb 1 11:58:18 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Fri, 01 Feb 2008 16:58:18 +0000
Subject: [Biojava-l] BioJava 3 design discussion coming to an end
In-Reply-To: <004c01c864ef$652d7b60$7e2410ac@LENOVOB43F04A4>
References: <477B7AEC.9000401@ebi.ac.uk> <47A2E9AE.9020502@ebi.ac.uk>
 <004c01c864ef$652d7b60$7e2410ac@LENOVOB43F04A4>
Message-ID: <47A34FAA.7010000@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Fazel.

I'm going away for 2 weeks in about 10 minutes' time... so I'm posting
your question to the list in the hope that someone else can answer it
whilst I'm away.

cheers,
Richard

Fazel Keshtkar wrote:
> Dear Richard,
>
> I'm new to BioJava and have run into a problem:
> - BioJava is not able to read symbols if there are gaps in a FASTA file
>   or MSA. For example, if I try to count the symbols in a FASTA file
>   that contains gaps, it gives me an error.
>
> If I'm wrong or need to use another technique, please let me know.
>
> cheers,
> fazel

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHo0+q4C5LeMEKA/QRAsY4AJ9LXuvQyBrRgSYHBdlOv/qmPEVqRACfbyZY
DBgy7vD0FJ2ClbMUYY3KIs4=
=nzdb
-----END PGP SIGNATURE-----

From matthew.pocock at ncl.ac.uk Sun Feb 3 13:47:56 2008
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Sun, 3 Feb 2008 18:47:56 +0000
Subject: [Biojava-l] SAX, DOM, XPath and Flat files
In-Reply-To: <93b45ca50712120134w65cfad5dwbaeec2ea19a5a3b4@mail.gmail.com>
References: <93b45ca50711291828u57b1cb5dj31cc7ef7f87eb701@mail.gmail.com>
 <93b45ca50712120134w65cfad5dwbaeec2ea19a5a3b4@mail.gmail.com>
Message-ID: <200802031847.58120.matthew.pocock@ncl.ac.uk>

On Wednesday 12 December 2007, Mark Schreiber wrote:
> Not a bad suggestion. I wasn't aware of how many formats are now in
> OWL. It would give us a pretty rapid route to pathway and microarray
> object models as well.

I expect that the majority of bioinformatics data that is currently
stored in RDBMS or other highly-structured forms will be published in
either raw RDF, or in OWL as an alternative to the current access
methods. How much of this is 'syntactic' and how much of it will require
some reasoning services to make sense of remains to be seen though.

>
> Jena seems to be a nice package to build upon. Unfortunately due to
> the 'striped' rather than nested nature of ontologies we can forget
> the SAX vs DOM argument. The Jena OntModel is all going into memory.
> Looks like informatics is going to be the memory pig of the next
> decade.

We tend to use the OWL-API maintained at Manchester University. Did I
mention someone talking about jar dependency explosions? Maven is the
solution here, my friends.
>
> - Mark

Matthew

From fazelk at gmail.com Sun Feb 10 10:40:24 2008
From: fazelk at gmail.com (Fazel Keshtkar)
Date: Sun, 10 Feb 2008 10:40:24 -0500
Subject: [Biojava-l] Reading MSA, or MSF or multiple sequence alignmne
References: <47A6539E.9000607@gmail.com>
 <48653.202.231.68.11.1202624810.squirrel@webmail.ebi.ac.uk>
Message-ID: <001301c86bfb$41050330$0202a8c0@LENOVOB43F04A4>

Dear Friends,

I would appreciate it if you could let me know how I can read a multiple
sequence alignment in BioJava (if you have sample code, please send it
to me). I could not find any sample in BioJava for reading MSA, MSF or
any other type of multiple alignment.

Thank you,
--
Fazel

From koen.bruynseels at cropdesign.com Sun Feb 10 12:17:55 2008
From: koen.bruynseels at cropdesign.com (koen.bruynseels at cropdesign.com)
Date: Sun, 10 Feb 2008 18:17:55 +0100
Subject: [Biojava-l] Koen Bruynseels is out of the office.
Message-ID:

I will be out of the office starting 01/02/2008 and will not return
until 11/02/2008.

I will respond to your message when I return.

From dreher at molgen.mpg.de Thu Feb 14 12:06:40 2008
From: dreher at molgen.mpg.de (Felix Dreher)
Date: Thu, 14 Feb 2008 18:06:40 +0100
Subject: [Biojava-l] marker name --> UniSTS-ID
Message-ID: <47B47520.9020104@molgen.mpg.de>

Hello,

does anybody know how to fetch NCBI-UniSTS-IDs by marker name?

Example:

Input: G35510 (marker name; alternatively a list of many marker names....)
Output: 44150 (UniSTS-ID(s))

Thanks + best regards,
Felix

From markjschreiber at gmail.com Thu Feb 14 19:37:17 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Fri, 15 Feb 2008 08:37:17 +0800
Subject: [Biojava-l] marker name --> UniSTS-ID
In-Reply-To: <47B47520.9020104@molgen.mpg.de>
References: <47B47520.9020104@molgen.mpg.de>
Message-ID: <93b45ca50802141637u472f4e27re5d651fd7e908d14@mail.gmail.com>

I suggest you ask the NCBI helpdesk. They are usually pretty good.

- Mark

From jolyon.holdstock at ogt.co.uk Fri Feb 15 09:44:57 2008
From: jolyon.holdstock at ogt.co.uk (Jolyon Holdstock)
Date: Fri, 15 Feb 2008 14:44:57 -0000
Subject: [Biojava-l] Editing a RichSequence
Message-ID: <588D0DD225D05746B5D8CAE1BE971F3F01D2E17D@EUCLID.internal.ogtip.com>

Hi,

I am trying to edit a Genbank sequence.
The code I'm using is as follows:

[code]
richSeq = RichSequence.IOTools.readGenbankDNA(
        new BufferedReader(new FileReader(new File("U00096.gbk"))),
        null).nextRichSequence();

SymbolList sl1 = DNATools.createDNA("AAAGGGTTTCCC");
Edit editOne = new Edit(47078, 2690, sl1);
richSeq.edit(editOne);
[/code]

When it runs it gives the following error:

ChangeVetoException: org.biojava.utils.ChangeVetoException:
AbstractSymbolList is immutable

I have used the same code on a smaller sequence (15 kb, compared with
4 Mb) and it works.

Does anyone have an idea why this is not working?

Thanks,

Jolyon

Jolyon Holdstock Ph.D.
Senior Computational Biologist,
Oxford Gene Technology,
Begbroke Science Park,
Sandy Lane, Yarnton
Oxford, OX5 1PF

Tel: +44 (0)1865 856852
Fax: +44 (0)1865 842116

Oxford Gene Technology (Operations) Ltd. Registered in England
No:03845432 Begbroke Science Park, Sandy Lane, Yarnton, Oxford, OX5 1PF.

Confidentiality Notice: The contents of this email from the Oxford Gene
Technology Group of Companies are confidential and intended solely for
the person to whom it is addressed. It may contain privileged and
confidential information. If you are not the intended recipient you must
not read, copy, distribute, discuss or take any action in reliance on
it.

From holland at ebi.ac.uk Fri Feb 15 10:16:40 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Fri, 15 Feb 2008 15:16:40 -0000 (GMT)
Subject: [Biojava-l] Editing a RichSequence
In-Reply-To: <588D0DD225D05746B5D8CAE1BE971F3F01D2E17D@EUCLID.internal.ogtip.com>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E17D@EUCLID.internal.ogtip.com>
Message-ID: <33511.202.231.68.11.1203088600.squirrel@webmail.ebi.ac.uk>

I think it's because sequences are constructed internally in a
ChunkedSymbolListFactory which compresses large sequences whereas small
sequences are stored as normal uncompressed ones. Compressed sequences
extend AbstractSymbolList, which is immutable (and therefore uneditable)
whereas uncompressed ones do not, and hence are editable.

You can disable the use of compressed sequences by using readGenbank()
instead of readGenbankDNA() and passing in the DNA alphabet and the
non-compressed sequence factory (see the static constants in
RichSequenceBuilderFactory).

If this still doesn't work, please could you post the full stacktrace so
that we can see which class is throwing the exception and at what line
etc.

cheers,
Richard

--
Richard Holland
BioMart (http://www.biomart.org/)
EMBL-EBI
Hinxton, Cambridgeshire CB10 1SD, UK

From markjschreiber at gmail.com Fri Feb 15 10:21:40 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Fri, 15 Feb 2008 23:21:40 +0800
Subject: [Biojava-l] Editing a RichSequence
In-Reply-To: <33511.202.231.68.11.1203088600.squirrel@webmail.ebi.ac.uk>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E17D@EUCLID.internal.ogtip.com>
 <33511.202.231.68.11.1203088600.squirrel@webmail.ebi.ac.uk>
Message-ID: <93b45ca50802150721u31135ff7gb9eeabef2031492f@mail.gmail.com>

I wonder if Edits on Chunked or BitPacked lists are possible
(theoretically possible). If they are we should really allow them.

- Mark

From jolyon.holdstock at ogt.co.uk Mon Feb 18 10:49:49 2008
From: jolyon.holdstock at ogt.co.uk (Jolyon Holdstock)
Date: Mon, 18 Feb 2008 15:49:49 -0000
Subject: [Biojava-l] Editing a RichSequence[Scanned]
Message-ID: <588D0DD225D05746B5D8CAE1BE971F3F01D2E2C7@EUCLID.internal.ogtip.com>

Hi,

I tried using the readGenbank method with the following code...

[code]
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

import org.biojava.bio.BioException;
import org.biojava.bio.symbol.Edit;
import org.biojava.bio.symbol.SymbolList;
import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.io.SymbolTokenization;
import org.biojava.utils.ChangeVetoException;

import org.biojavax.RichObjectFactory;
import org.biojavax.bio.seq.RichSequence;
import org.biojavax.bio.seq.io.RichSequenceBuilderFactory;

public class EditBigSequence {
    RichSequence richSeq;
    Edit edit;

    public EditBigSequence() {
        try {
            // Read the first record using an explicit DNA tokenization and
            // the builder factory passed in below.
            SymbolTokenization symbolTokenization =
                    DNATools.getDNA().getTokenization("token");
            richSeq = RichSequence.IOTools.readGenbank(
                    new BufferedReader(new FileReader(new File("AF234172.gbk"))),
                    symbolTokenization,
                    RichSequenceBuilderFactory.FACTORY,
                    RichObjectFactory.getDefaultNamespace()).nextRichSequence();

            // Replace 100 symbols starting at position 1000.
            SymbolList insertSeq = DNATools.createDNA("AAAACCCCGGGGTTTT");
            edit = new Edit(1000, 100, insertSeq);
            richSeq.edit(edit);
        }
        catch (FileNotFoundException FNFE) {
            System.out.println("FileNotFoundException: " + FNFE);
        }
        catch (BioException BIOE) {
            System.out.println("BioException: " + BIOE);
        }
        catch (ChangeVetoException CVE) {
            CVE.printStackTrace();
            System.out.println("ChangeVetoException: " + CVE);
        }
        catch (IOException IOE) {
            System.out.println("IOException: " + IOE);
        }
    }

    public static void main(String args []) {
        EditBigSequence ebs = new EditBigSequence();
    }
}
[/code]

But I still got an error, for which the stack trace is below.

org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
ChangeVetoException: org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
        at org.biojava.bio.symbol.AbstractSymbolList.edit(AbstractSymbolList.java:113)
        at org.biojavax.bio.seq.DummyRichSequenceHandler.edit(DummyRichSequenceHandler.java:30)
        at org.biojavax.bio.seq.ThinRichSequence.edit(ThinRichSequence.java:155)
        at biojavahacks.EditBigSequence.<init>(EditBigSequence.java:47)
        at biojavahacks.EditBigSequence.main(EditBigSequence.java:65)

cheers,

Jolyon

This email has been scanned by Oxford Gene Technology Security Systems.
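
The stack trace above shows the veto being raised inside
AbstractSymbolList.edit(). As a rough, dependency-free illustration of
that kind of failure, the sketch below models a list that refuses edits
and a mutable copy that accepts them. It is not BioJava code: the class
EditSketch and its helpers readOnly, edit and asString are hypothetical
names, java.util lists of characters stand in for symbol lists, and the
1-based (pos, length, replacement) arguments are an assumption chosen to
resemble the Edit constructor call in the code above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only: java.util lists of characters stand in for symbol lists. */
final class EditSketch {

    /** Builds a read-only list, standing in for a list whose edit is always vetoed. */
    static List<Character> readOnly(String symbols) {
        List<Character> list = new ArrayList<>();
        for (char c : symbols.toCharArray()) {
            list.add(c);
        }
        return Collections.unmodifiableList(list);
    }

    /** Replaces `length` symbols starting at 1-based position `pos` with `replacement`. */
    static void edit(List<Character> symbols, int pos, int length, String replacement) {
        int start = pos - 1;
        // On a read-only list this call throws UnsupportedOperationException.
        symbols.subList(start, start + length).clear();
        List<Character> insert = new ArrayList<>();
        for (char c : replacement.toCharArray()) {
            insert.add(c);
        }
        symbols.addAll(start, insert);
    }

    /** Joins the symbols back into a string for display. */
    static String asString(List<Character> symbols) {
        StringBuilder sb = new StringBuilder();
        for (Character c : symbols) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Character> frozen = readOnly("AAAACCCCGGGG");
        try {
            edit(frozen, 5, 4, "TT");
        } catch (UnsupportedOperationException e) {
            System.out.println("edit refused on the read-only list");
        }
        // Copy the symbols into a mutable list, then apply the same edit.
        List<Character> copy = new ArrayList<>(frozen);
        edit(copy, 5, 4, "TT");
        System.out.println(asString(copy));
    }
}
```

In this toy model the copy holds every symbol in memory a second time,
so the same approach on a multi-megabase sequence would cost memory in
proportion to the sequence length.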

From holland at ebi.ac.uk Mon Feb 18 11:12:52 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Mon, 18 Feb 2008 16:12:52 +0000
Subject: [Biojava-l] Editing a RichSequence[Scanned]
In-Reply-To: <588D0DD225D05746B5D8CAE1BE971F3F01D2E2C7@EUCLID.internal.ogtip.com>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E2C7@EUCLID.internal.ogtip.com>
Message-ID: <47B9AE84.90202@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

OK, got it.

It's because ChunkedSymbolListFactory is creating a ChunkedSymbolList
for your sequence, because the sequence is greater than 1<<14 bp long
(that's 16384 symbols). This is a hardcoded limit.

ChunkedSymbolList extends AbstractSymbolList, which is immutable and
therefore not editable.

I'm not sure who wrote ChunkedSymbolList - and I'm not sure how to (or
if I should) fix it. It's quite a deeply embedded piece of the system.

Does anyone out there know?

There is a workaround - create a new symbol list based on the
RichSequence ( SymbolList syms = new SimpleSymbolList(richSeq) ). The
copy will be mutable and edit() will work on it.

cheers,
Richard

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHua6D4C5LeMEKA/QRAn/WAJ9sTII9aMU60LWdQvlgy1Ntp60q0QCdFeYa
w60vXjENWcQLCiBf1ezRgh8=
=M4J7
-----END PGP SIGNATURE-----

From holland at ebi.ac.uk Mon Feb 18 11:20:14 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Mon, 18 Feb 2008 16:20:14 +0000
Subject: [Biojava-l] Editing a RichSequence[Scanned]
In-Reply-To: <47B9AE84.90202@ebi.ac.uk>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E2C7@EUCLID.internal.ogtip.com>
 <47B9AE84.90202@ebi.ac.uk>
Message-ID: <47B9B03E.3020700@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

PS. The other workaround is to modify your local copy of BioJava, find
the ChunkedSymbolList class, and change the 1<<14 CHUNK_SIZE limit to
some higher value.

Richard Holland wrote:
> OK, got it.
>
> It's because ChunkedSymbolListFactory is creating a ChunkedSymbolList
> for your sequence, because the sequence is greater than 1<<14 bp long
> (that's 16384 symbols). This is a hardcoded limit.
>
> ChunkedSymbolList extends AbstractSymbolList, which is immutable and
> therefore not editable.
>
> I'm not sure who wrote ChunkedSymbolList - and I'm not sure how to (or
> if I should) fix it. It's quite a deeply embedded piece of the system.
>
> Does anyone out there know?
>
> There is a workaround - create a new symbol list based on the
> RichSequence ( SymbolList syms = new SimpleSymbolList(richSeq) ). The
> copy will be mutable and edit() will work on it.
>
> cheers,
> Richard
>
> --
> Richard Holland (BioMart)
> EMBL EBI, Wellcome Trust Genome Campus,
> Hinxton, Cambridgeshire CB10 1SD, UK
> Tel.
+44 (0)1223 494416
>
> http://www.biomart.org/
> http://www.biojava.org/

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHubA+4C5LeMEKA/QRAnaXAJ9qec6JaBIAroziiOYOM+NUIsQGHQCghT9P
zOsc+G843TiPRPGw8YaSG3Q=
=O/UX
-----END PGP SIGNATURE-----

From markjschreiber at gmail.com Tue Feb 19 00:03:25 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Tue, 19 Feb 2008 13:03:25 +0800
Subject: [Biojava-l] Editing a RichSequence[Scanned]
In-Reply-To: <47B9B03E.3020700@ebi.ac.uk>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E2C7@EUCLID.internal.ogtip.com> <47B9AE84.90202@ebi.ac.uk> <47B9B03E.3020700@ebi.ac.uk>
Message-ID: <93b45ca50802182103r391d28fmb1d3fa8dac521f9c@mail.gmail.com>

I'm pretty sure it is possible to set the threshold at which chunking
starts? Can't quite remember where, probably in one of the sequence
builder objects.

- Mark

On Feb 19, 2008 12:20 AM, Richard Holland wrote:
> PS. The other workaround is to modify your local copy of BioJava, find
> the ChunkedSymbolList class, and change the 1<<14 CHUNK_SIZE limit to
> some higher value.

From jolyon.holdstock at ogt.co.uk Tue Feb 19 07:12:23 2008
From: jolyon.holdstock at ogt.co.uk (Jolyon Holdstock)
Date: Tue, 19 Feb 2008 12:12:23 -0000
Subject: [Biojava-l] Editing a RichSequence[Scanned]
Message-ID: <588D0DD225D05746B5D8CAE1BE971F3F01D2E329@EUCLID.internal.ogtip.com>

Hi,

Thanks for the workaround, better than me using a StringBuffer to do it.
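
For illustration, the copy-then-edit workaround can be sketched in plain Java with no BioJava dependency. This is only an analogy under stated assumptions: a read-only java.util list stands in for the immutable ChunkedSymbolList, an ArrayList copy stands in for new SimpleSymbolList(richSeq), and the class MutableCopySketch with its helpers toSymbols, asString and applyEdit are hypothetical names. applyEdit merely mimics the (pos, length, replacement) shape of an Edit with 1-based positions; it is not BioJava API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MutableCopySketch {

    // Hypothetical helper: turn a string into a list of one-character symbols.
    public static List<Character> toSymbols(String s) {
        List<Character> symbols = new ArrayList<>();
        for (char c : s.toCharArray()) {
            symbols.add(c);
        }
        return symbols;
    }

    // Hypothetical helper: join the symbols back into a string.
    public static String asString(List<Character> symbols) {
        StringBuilder sb = new StringBuilder();
        for (char c : symbols) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Replace `length` symbols starting at 1-based position `pos` with
    // `replacement`. Only mimics the shape of an Edit; not BioJava API.
    public static void applyEdit(List<Character> symbols, int pos, int length,
                                 String replacement) {
        List<Character> region = symbols.subList(pos - 1, pos - 1 + length);
        region.clear(); // a read-only list throws UnsupportedOperationException here
        region.addAll(toSymbols(replacement));
    }

    public static void main(String[] args) {
        // Stand-in for the immutable symbol list: every edit is rejected.
        List<Character> readOnly =
            Collections.unmodifiableList(toSymbols("AAAACCCCGGGG"));
        try {
            applyEdit(readOnly, 5, 4, "TT");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only list rejected the edit");
        }

        // Stand-in for the workaround: edit a mutable copy instead.
        List<Character> copy = new ArrayList<>(readOnly);
        applyEdit(copy, 5, 4, "TT");
        System.out.println(asString(copy)); // prints AAAATTGGGG
    }
}
```

In this sketch the edited list is a separate object, so the read-only original is left unchanged.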

The problem with either is that I want to load a Genbank file, insert
some sequence, adjust the positions of affected features and then output
the RichSequence in Genbank format.

If I make a copy of the SymbolList I won't be able to output the adjusted
sequence with the features etc. as a Genbank file.

I can do it in 2 steps by copying and pasting from the files produced. I
just wondered if it is possible to do it with BioJava in a single
step.


Cheers,

Jolyon

This email has been scanned by Oxford Gene Technology Security Systems.

From markjschreiber at gmail.com Wed Feb 20 21:33:39 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Thu, 21 Feb 2008 10:33:39 +0800
Subject: [Biojava-l] Editing a RichSequence[Scanned]
In-Reply-To: <588D0DD225D05746B5D8CAE1BE971F3F01D2E329@EUCLID.internal.ogtip.com>
References: <588D0DD225D05746B5D8CAE1BE971F3F01D2E329@EUCLID.internal.ogtip.com>
Message-ID: <93b45ca50802201833l49be4761pd82fd852636a92c9@mail.gmail.com>

Here is the solution (from the JavaDoc)

SimpleRichSequenceBuilderFactory

public SimpleRichSequenceBuilderFactory(SymbolListFactory fact,
                                        int threshold)

Creates a new instance of SimpleRichSequenceBuilderFactory that uses a
specified factory for SymbolLists longer than a specified length. Before
that a SimpleSymbolListFactory is used.

Parameters:
    fact - the factory to use when building the SymbolList.
    threshold - the threshold to exceed before using this factory

On Tue, Feb 19, 2008 at 8:12 PM, Jolyon Holdstock wrote:
> Hi,
>
> Thanks for the workaround, better than me using a StringBuffer to do it.
>
> The problem with either is that I want to load a Genbank file, insert
> some sequence, adjust the positions of affected features and then output
> the RichSequence in Genbank format.
>
> If I make a copy of the SymbolList I won't be able to output the adjusted
> sequence with the features etc. as a Genbank file.
>
> I can do it in 2 steps via copy and pasting from the files produced. I
> just wondered if it is possible to do it with BioJava using a single
> step.
>
> Cheers,
>
> Jolyon

From hlapp at gmx.net Thu Feb 21 20:22:27 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 21 Feb 2008 20:22:27 -0500
Subject: [Biojava-l] biosql-accelerators-pg.sql
Message-ID: <9615F30D-DB56-4418-B195-BCD29FD6A766@gmx.net>

Does the last BioJava release (or anyone out there using an older
release of BioJava) still use the code defined in
sql/biosql-accelerators-pg.sql in BioSQL?

ThomasD created this back in 2002. Even though I updated it now to be
compatible with the current 1.0 core schema (which has been stable
since 2004), I think I'll rather remove it from the release (and
possibly from the repository) as it's extremely unlikely that someone
is using it (it would not have worked with the schema since 2004).
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From markjschreiber at gmail.com Thu Feb 21 20:31:11 2008 From: markjschreiber at gmail.com (Mark Schreiber) Date: Fri, 22 Feb 2008 09:31:11 +0800 Subject: [Biojava-l] [BioSQL-l] biosql-accelerators-pg.sql In-Reply-To: <9615F30D-DB56-4418-B195-BCD29FD6A766@gmx.net> References: <9615F30D-DB56-4418-B195-BCD29FD6A766@gmx.net> Message-ID: <93b45ca50802211731m88fd421w4f6d6f7adede647b@mail.gmail.com> Hi - I don't think so. But it depends on our Hibernate ORM mapping. Best person to answer that is Richard. - Mark On Fri, Feb 22, 2008 at 9:22 AM, Hilmar Lapp wrote: > Does the last BioJava release (or anyone out there using an older > release of BioJava) still use the code defined in sql/biosql- > accelerators-pg.sql in BioSQL? > > ThomasD created this back in 2002. Even though I updated it now to be > compatible with the current 1.0 core schema (which has been stable > since 2004), I think I'll rather remove it from the release (and > possibly from the repository) as it's extremely unlikely that someone > is using it (it would not have worked with the schema since 2004). 
>
> -hilmar
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>

From holland at ebi.ac.uk Fri Feb 22 03:01:57 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Fri, 22 Feb 2008 08:01:57 +0000
Subject: [Biojava-l] [BioSQL-l] biosql-accelerators-pg.sql
In-Reply-To: <93b45ca50802211731m88fd421w4f6d6f7adede647b@mail.gmail.com>
References: <9615F30D-DB56-4418-B195-BCD29FD6A766@gmx.net>
	<93b45ca50802211731m88fd421w4f6d6f7adede647b@mail.gmail.com>
Message-ID: <47BE8175.2060106@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I didn't even know it existed. So the answer is no, we don't use it any
more.

Mark Schreiber wrote:
> Hi -
>
> I don't think so. But it depends on our Hibernate ORM mapping. Best
> person to answer that is Richard.
>
> - Mark
>
> On Fri, Feb 22, 2008 at 9:22 AM, Hilmar Lapp wrote:
>> Does the last BioJava release (or anyone out there using an older
>> release of BioJava) still use the code defined in
>> sql/biosql-accelerators-pg.sql in BioSQL?
>>
>> ThomasD created this back in 2002. Even though I updated it now to be
>> compatible with the current 1.0 core schema (which has been stable
>> since 2004), I think I'll rather remove it from the release (and
>> possibly from the repository) as it's extremely unlikely that someone
>> is using it (it would not have worked with the schema since 2004).
>>
>> -hilmar
>>
>> --
>> ===========================================================
>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>> ===========================================================
>>
>> _______________________________________________
>> BioSQL-l mailing list
>> BioSQL-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHvoF14C5LeMEKA/QRAhFYAJ46to9TgnyRBJDFzps+aVBTMB0UoACeJDpn
mQ8Oc59xGrQYMm9AF4degMk=
=EF/K
-----END PGP SIGNATURE-----

From crackeur at comcast.net Sat Feb 23 13:37:24 2008
From: crackeur at comcast.net (jimmy Zhang)
Date: Sat, 23 Feb 2008 10:37:24 -0800
Subject: [Biojava-l] vtd-xml 2.3
References: <47B47520.9020104@molgen.mpg.de>
	<93b45ca50802141637u472f4e27re5d651fd7e908d14@mail.gmail.com>
Message-ID: <00a701c8764b$225faf20$0402a8c0@your55e5f9e3d2>

VTD-XML 2.3 is now released. To download the latest version please visit
http://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172.

Below is a list of new features and enhancements in this version.

  a.. VTDException is now introduced as the root class for all of
      VTD-XML's other exception classes (per suggestion of Max Rahder).
  b.. Transcoding capability is now added for inter-document cut and
      paste. You can cut a chunk of bytes from a UTF-8 encoded document
      and paste it into a UTF-16 encoded document, and the output
      document is still well-formed.
  c.. ISO-8859-10, ISO-8859-11, ISO-8859-12, ISO-8859-13, ISO-8859-14
      and ISO-8859-15 support has now been added.
  d.. Zero length Text node is now possible.
  e.. Ability to dump in-memory copy of text is added.
  f.. Various code cleanup, enhancement and bug fixes.

Below are some new articles related to VTD-XML.

  a.. Index XML documents with VTD-XML
      http://xml.sys-con.com/read/453082.htm
  b.. Manipulate XML content the Ximple Way
      http://www.devx.com/xml/Article/36379
  c.. VTD-XML: A new vision of XML
      http://www.developer.com/xml/article.php/3714051
  d.. VTD-XML: XML Processing for the future
      http://www.codeproject.com/KB/cs/vtd-xml_examples.aspx

If you (or someone you know) like the concept of VTD-XML, think that it
can help solve enterprises' XML processing related issues (particularly
those related to SOA), and would like to directly influence and
contribute to the development of the future of the Internet, please
email me (crackeur at comcast.net). We are looking for open source
software developers and project management people to take VTD-XML to
the next level.

From holland at ebi.ac.uk Mon Feb 25 06:39:40 2008
From: holland at ebi.ac.uk (Richard Holland)
Date: Mon, 25 Feb 2008 11:39:40 +0000
Subject: [Biojava-l] BioJava3 design proposal
Message-ID: <47C2A8FC.1040907@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The skeleton based on all your comments is here:

http://biojava.org/wiki/BioJava3_Design

Many of you will probably have details to add to it. Please do this by
modifying the page directly.

cheers,
Richard

- --
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416

http://www.biomart.org/
http://www.biojava.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHwqj74C5LeMEKA/QRAuE/AJ9r4O2mInm2AfOewXSGUpSSHnhM/QCfWd5X
MEYvMMp1jEUtfhegKuoPjCs=
=xjTy
-----END PGP SIGNATURE-----

From arnaud at ebi.ac.uk Fri Feb 29 11:58:36 2008
From: arnaud at ebi.ac.uk (Arnaud Kerhornou)
Date: Fri, 29 Feb 2008 16:58:36 +0000
Subject: [Biojava-l] RichLocation.Tools.merge(Collection members) method
Message-ID: <47C839BC.8030506@ebi.ac.uk>

Hi everyone,

I don't think the RichLocation.Tools.merge(Collection members) method is
doing it right.

e.g. Input:
biojavax:join:[1157624..1158025,1158025..1158420,1158420..1158893]
Expected output: 1157624..1158895

But I get: join:[1157624..1158420,1158420..1158894]

I think the code should have the extra line: parent = union;
just after the c=p; statement on line 18 (see source code below),
otherwise it doesn't take into account the newly generated location.

Is that right?

Thanks
Arnaud

Source code:

 1  public static Collection merge(Collection members) {
 2      // flatten them out first so we don't end up recursing
 3      List membersList = new ArrayList(flatten(members));
 4      // all members are now singles so we can use single vs single union operations
 5      if (membersList.size()>1) {
 6          for (int p = 0; p < (membersList.size()-1); p++) {
 7              RichLocation parent = (RichLocation)membersList.get(p);
 8              for (int c = p+1; c < membersList.size(); c++) {
 9                  RichLocation child = (RichLocation)membersList.get(c);
10                  RichLocation union = (RichLocation)parent.union(child);
11                  // if parent can merge with child
12                  if (union.isContiguous()) {
13                      // replace parent with union
14                      membersList.set(p,union);
15                      // remove child
16                      membersList.remove(c);
17                      // check all children again
18                      c=p;
19                  }
20              }
21          }
22      }
23      return membersList;
24  }

From markjschreiber at gmail.com Fri Feb 29 21:30:42 2008
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Sat, 1 Mar 2008 10:30:42 +0800
Subject: [Biojava-l] RichLocation.Tools.merge(Collection members) method
In-Reply-To: <47C839BC.8030506@ebi.ac.uk>
References: <47C839BC.8030506@ebi.ac.uk>
Message-ID: <93b45ca50802291830l58628d0bqbed7a1feb0e85e89@mail.gmail.com>

Hi -

This could be a corner case. Can you provide the code that actually
generates this error?

- Mark

On Sat, Mar 1, 2008 at 12:58 AM, Arnaud Kerhornou wrote:
> Hi everyone,
>
> I don't think the RichLocation.Tools.merge(Collection members) method is
> doing it right.
>
> e.g. Input:
> biojavax:join:[1157624..1158025,1158025..1158420,1158420..1158893]
> Expected output:1157624..1158895
>
> But I get: join:[1157624..1158420,1158420..1158894]
>
> I think the code should have the extra line: parent = union;
> just after c=p; statement line 18 (See source code below),
> otherwise it doesn't take into account the newly generated location.
>
> Is that right ?
>
> Thanks
> Arnaud
>
> Source code:
>
>  1  public static Collection merge(Collection members) {
>  2      // flatten them out first so we don't end up recursing
>  3      List membersList = new ArrayList(flatten(members));
>  4      // all members are now singles so we can use single vs single union operations
>  5      if (membersList.size()>1) {
>  6          for (int p = 0; p < (membersList.size()-1); p++) {
>  7              RichLocation parent = (RichLocation)membersList.get(p);
>  8              for (int c = p+1; c < membersList.size(); c++) {
>  9                  RichLocation child = (RichLocation)membersList.get(c);
> 10                  RichLocation union = (RichLocation)parent.union(child);
> 11                  // if parent can merge with child
> 12                  if (union.isContiguous()) {
> 13                      // replace parent with union
> 14                      membersList.set(p,union);
> 15                      // remove child
> 16                      membersList.remove(c);
> 17                      // check all children again
> 18                      c=p;
> 19                  }
> 20              }
> 21          }
> 22      }
> 23      return membersList;
> 24  }
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
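
For anyone wanting to try the suggested change outside BioJava, here is a minimal sketch of the same loop. It is only an illustration under stated assumptions: MergeSketch and Range are hypothetical stand-ins (plain closed integer intervals), not the RichLocation API, and "can merge" is assumed to mean that two ranges share at least one position. The one line that differs from the posted code is the parent = union; assignment proposed in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; this is NOT the BioJava RichLocation API.
class MergeSketch {

    /** A closed integer interval [start, end], an assumed simplification of a location. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** True when the two ranges share at least one position. */
        boolean overlaps(Range other) {
            return start <= other.end && other.start <= end;
        }

        /** Smallest range covering both; only meaningful when overlaps(other) is true. */
        Range union(Range other) {
            return new Range(Math.min(start, other.start), Math.max(end, other.end));
        }

        @Override
        public String toString() {
            return start + ".." + end;
        }
    }

    /** Same loop shape as the posted method, plus the proposed parent update. */
    static List<Range> merge(List<Range> members) {
        List<Range> list = new ArrayList<>(members);
        for (int p = 0; p < list.size() - 1; p++) {
            Range parent = list.get(p);
            for (int c = p + 1; c < list.size(); c++) {
                Range child = list.get(c);
                if (parent.overlaps(child)) {
                    Range union = parent.union(child);
                    list.set(p, union);   // replace parent with union
                    list.remove(c);       // remove child (index-based removal)
                    parent = union;       // proposed line: compare against the merged range
                    c = p;                // rescan; the loop increment makes this p + 1
                }
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Range> input = List.of(
                new Range(1157624, 1158025),
                new Range(1158025, 1158420),
                new Range(1158420, 1158893));
        System.out.println(merge(input));
    }
}
```

In this sketch, deleting the parent = union; line leaves the third range unmerged, because it is still compared against the original first range. That gives a two-part result, the same shape as the join reported above. Whether the real RichLocation.union() and isContiguous() behave the same way on these inputs would need checking against the BioJava source.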