From mark.schreiber at group.novartis.com Tue Jul 6 21:47:12 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Tue Jul 6 21:49:23 2004
Subject: [Biojava-l] PFAM parser
Message-ID:

Hi All -

Is anyone aware of a java parser for PFAM? Must be open-source (preferably
LGPL or similar) but doesn't have to be biojava.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From mmatilai at hytti.uku.fi Thu Jul 8 04:15:58 2004
From: mmatilai at hytti.uku.fi (mmatilai@hytti.uku.fi)
Date: Thu Jul 8 04:18:07 2004
Subject: [Biojava-l] Problems getting started
In-Reply-To:
References:
Message-ID: <1089274558.40ed02bed53ae@maili.uku.fi>

I followed the instructions on how to install BioJava and the jar files. I tried the code 'How
can I make a motif into a regular expression' and get the error message that the compiler
cannot locate the package java.util.regex. Where should this package be located since I
don't seem to have it?

I also tried one of the demos from the package indexing, ListIDs. It compiled ok but
running was terminated by ArrayIndexOutOfBoundsException. Are the demos expecting
some parameter input since this happened with others as well? The code is not that
explicit without comments for a beginner.

From matthew_pocock at yahoo.co.uk Thu Jul 8 07:00:43 2004
From: matthew_pocock at yahoo.co.uk (Matthew Pocock)
Date: Thu Jul 8 07:03:04 2004
Subject: [Biojava-l] Problems getting started
In-Reply-To: <1089274558.40ed02bed53ae@maili.uku.fi>
References: <1089274558.40ed02bed53ae@maili.uku.fi>
Message-ID: <40ED295B.50400@yahoo.co.uk>

Hi,

mmatilai@hytti.uku.fi wrote:

>I followed the instructions on how to install BioJava and the jar files. I tried the code 'How
>can I make a motif into a regular expression' and get the error message that the compiler
>cannot locate the package java.util.regex. Where should this package be located since I
>don't seem to have it?
>
>
What version of Java are you using? I believe that java.util.regex is
included in Java 1.4 and 1.5 (or is that now Java 5? Can't keep up). I
think BioJava 1.3 was the last release that we coded to Java 1.3.

>I also tried one of the demos from the package indexing, ListIDs. It compiled ok but
>running was terminated by ArrayIndexOutOfBoundsException. Are the demos expecting
>some parameter input since this happened with others as well? The code is not that
>explicit without comments for a beginner.
>
>
Could you post the stack trace? It's difficult to know what is going
wrong without it. Sorry that the code is not well documented. It tends to
be the last thing that gets done.

Matthew

>_______________________________________________
>Biojava-l mailing list - Biojava-l@biojava.org
>http://biojava.org/mailman/listinfo/biojava-l
>
>
>

From Russell.Smithies at agresearch.co.nz Fri Jul 9 00:41:28 2004
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri Jul 9 00:43:38 2004
Subject: [Biojava-l] NCBI SOAP interface
Message-ID:

(I know it's not a BioJava problem but NCBI don't support Java and I know
there's a few SOAP gurus here!!)

Last week, I requested NCBI fix their eutils .wsdl and .xsd so they work
correctly with Axis' wsdl2java utility to convert the WSDL to Java code.
Got an email back on Wednesday saying all done.
(I've been getting great service from NCBI!!!)

Now the src stubs are created correctly and they compile fine but I have
been unable to work out exactly how to get a result set back.
(there is even less documentation for this than BioJava has)

This is what I'm trying:

// Make a service
EUtilsService service = new EUtilsServiceLocator();

// Now use the service to get a stub to the Service Definition Interface (SDI)
EUtilsServiceSoap ncbiSOAP = service.geteUtilsServiceSoap();

//database name
String db = "pubmed";
//Search term
String term = "cancer";
//Requests utility to maintain results in user's environment. Used in conjunction with WebEnv
String usehistory = "";
//Value previously returned in XML results from ESearch or EPost.
//This value may change with each utility call.
//If WebEnv is used, History search numbers can be included in an ESummary URL,
//e.g., term=cancer+AND+%23X (where %23 replaces # and X is the History search number).
String webEnv = "";
//The value used for a history search number or previously returned in
//XML results from ESearch or EPost.
String query_key = "";
//identifies the resource which is using Entrez links (e.g., tool=flybase).
String tool = "";
String email = "";
//Use this command to specify a specific search field.
String field = "";
//Limit items a number of days immediately preceding today's date
int reldate = 60;
//Limit results bounded by two specific dates.
//Both mindate and maxdate are required if date range limits are applied using these variables
String mindate = "";
String maxdate = "";
//Limit dates to a specific date field based on database
String datetype = "edat";
//sequential number of the first record retrieved
//(default=0 which will retrieve the first record)
int retstart = 0;
//number of items retrieved
int retmax = 100;
String rettype = "";
//Use in conjunction with Web Environment to display sorted results in ESummary and EFetch
String sort = "";

// fill in values
ESearchResult esr = ncbiSOAP.run_eSearch(db, term, usehistory, webEnv,
        query_key, tool, email, field, reldate, mindate, maxdate,
        datetype, retstart, retmax, rettype, sort);

where do I go from here?
Does anyone have any example code?

thanx

Russell Smithies
Bioinformatics Software Developer
AgResearch Invermay
Mosgiel
New Zealand

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From avinash238 at rediffmail.com Fri Jul 9 15:03:48 2004
From: avinash238 at rediffmail.com (katikaneni avinash rao)
Date: Sun Jul 11 14:33:23 2004
Subject: [Biojava-l] multiple sequence alignment
Message-ID: <20040709190348.25835.qmail@webmail46.rediffmail.com>

hi,
can anyone help me implement a simple program for multiple
sequence alignment in Java? If anyone already has the source code, could
they please send it to me?
note: it need not be in biojava

From zebracorn at mac.com Sun Jul 11 14:52:41 2004
From: zebracorn at mac.com (Jai-wei Gan)
Date: Sun Jul 11 14:54:33 2004
Subject: [Biojava-l] chromosome ideogram drawing program written in biojava or java
Message-ID: <7CCB9044-D36B-11D8-8983-000A956F02A8@mac.com>

This is a newbie SOS:

Hi, does anyone know of a chromosome ideogram drawing program written
in Java or biojava? Your help will be greatly appreciated.

Thanks

From MCCon012 at mc.duke.edu Sun Jul 11 20:59:51 2004
From: MCCon012 at mc.duke.edu (Patrick McConnell)
Date: Sun Jul 11 21:02:15 2004
Subject: [Biojava-l] multiple sequence alignment
Message-ID:

Check out the Needleman-Wunsch algorithm, part of the NeoBio package:
http://neobio.sourceforge.net/

-Patrick McConnell
Duke Bioinformatics Shared Resource
Duke Comprehensive Cancer Center
patrick.mcconnell@duke.edu

"katikaneni avinash rao"
Sent by: biojava-l-bounces@portal.open-bio.org
07/09/2004 03:03 PM
Please respond to katikaneni avinash rao
cc:
Subject: [Biojava-l] multiple sequence alignment

hi,
can anyone help me implement a simple program for multiple
sequence alignment in Java? If anyone already has the source code, could
they please send it to me?
note: it need not be in biojava
_______________________________________________
Biojava-l mailing list - Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

From lc1 at sanger.ac.uk Mon Jul 12 06:18:58 2004
From: lc1 at sanger.ac.uk (Lachlan James Coin)
Date: Mon Jul 12 06:21:03 2004
Subject: [Biojava-l] PFAM parser
In-Reply-To: <200407120104.i6C13jKu010483@portal.open-bio.org>
References: <200407120104.i6C13jKu010483@portal.open-bio.org>
Message-ID: <40F26592.5040008@sanger.ac.uk>

Hi Mark,

I have written various parsers for pfam - they are a bit hacky though.
I can tidy them up and commit them to biojava if you like. Are you
accessing the pfam relational databases, or flatfiles?

Thanks,

Lachlan

>
>Message: 3
>Date: Wed, 7 Jul 2004 09:47:12 +0800
>From: mark.schreiber@group.novartis.com
>Subject: [Biojava-l] PFAM parser
>To: biojava-l@open-bio.org
>Message-ID:
>
>
>Content-Type: text/plain; charset="us-ascii"
>
>Hi All -
>
>Is anyone aware of a java parser for PFAM? Must be open-source (preferably
>LGPL or similar) but doesn't have to be biojava.
>
>- Mark
>
>Mark Schreiber
>Principal Scientist (Bioinformatics)
>
>Novartis Institute for Tropical Diseases (NITD)
>10 Biopolis Road
>#05-01 Chromos
>Singapore 138670
>www.nitd.novartis.com
>
>phone +65 6722 2973
>fax +65 6722 2910
>
>
>
>

From Russell.Smithies at agresearch.co.nz Wed Jul 14 16:27:58 2004
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Wed Jul 14 16:30:04 2004
Subject: [Biojava-l] FW: [eutilities] Retrieving brief summaries for gene names with SOAP
Message-ID:

NCBI have updated the help info for the new SOAP interface and provided
example code in C# and Java !!

Russell Smithies
Bioinformatics Software Developer
AgResearch Invermay
Private Bag 50034
Puddle Alley
Mosgiel
New Zealand

> -----Original Message-----
> From: RT - Kathi Canese [mailto:rt@ncbi.nlm.nih.gov]
> Sent: Thursday, 15 July 2004 3:22 a.m.
> To: Smithies, Russell
> Subject: [eutilities] Retrieving brief summaries for gene
> names ; SOAP [NCBI tracking system #15045974]
>
> ------ MESSAGE BODY. YOU MAY CHANGE IT OR ADD COMMENTS ABOVE ------
>
> Russell,
>
> I wanted to let you know that we have changed the SOAP
> documentation and released a new version. Please see the
> updated help for additional
> information:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html
>
> Regards,
> Kathi Canese
> NCBI
>
>
> airozo wrote (Tue, Jul 6 2004 12:07:01):
>
> > Russell,
> >
> > Our programmer has made a few changes to eutils' WSDL and
> XSD scripts.
> > You should download eutils.wsdl, egquery.xsd, einfo.xsd, elink.xsd,
> > esearch.xsd, and esummary.xsd files from our site again.
> > WSDL2Java should work fine now.
> >
> > Regards,
> >
> > Diana Airozo
> > NCBI Contractor

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From stiffler at cs.uoregon.edu Thu Jul 15 13:03:54 2004
From: stiffler at cs.uoregon.edu (Nicholas Lee Stiffler)
Date: Thu Jul 15 13:05:49 2004
Subject: [Biojava-l] Regular expressions
Message-ID:

I am using the regex class in biojava 1.4pre, and I want to use it to search
only for sequences that start with a certain motif. For example, I tried
using the regex ^gty but it threw
org.biojava.utils.regex.RegexException: unexpected symbol ^
Is this a bug or do I need to input it differently?

Nicholas Stiffler
Berglund Lab
Department of Molecular Biology
University of Oregon

From smh1008 at cus.cam.ac.uk Thu Jul 15 13:15:58 2004
From: smh1008 at cus.cam.ac.uk (David Huen)
Date: Thu Jul 15 13:17:52 2004
Subject: [Biojava-l] Regular expressions
In-Reply-To:
References:
Message-ID: <200407151815.59111.smh1008@cus.cam.ac.uk>

On Thursday 15 Jul 2004 18:03, Nicholas Lee Stiffler wrote:
> I am using the regex class in biojava 1.4pre, and I want to use it to search
> only for sequences that start with a certain motif. For example, I tried
> using the regex ^gty but it threw
> org.biojava.utils.regex.RegexException: unexpected symbol ^
> Is this a bug or do I need to input it differently?
>
That looks like a bug in the PatternChecker - it checks that the input pattern
is acceptable and also expands some ambiguity symbols etc. It seems to be
too fussy about this symbol. I'll patch it.

Rgds,
David Huen

From mark.schreiber at group.novartis.com Mon Jul 19 23:34:48 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Mon Jul 19 23:36:35 2004
Subject: [Biojava-l] Chou-Fasman
Message-ID:

Hello -

Does anyone have an example of applying Chou-Fasman parameters to
predicting the most probable secondary structure?

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From adionne at warnex.ca Tue Jul 20 11:41:24 2004
From: adionne at warnex.ca (Alexandre Dionne Laporte)
Date: Tue Jul 20 11:42:47 2004
Subject: [Biojava-l] Parsing ClustalW Alignments
Message-ID:

Dear list members,

Does anyone have a *working* example of parsing a Clustal alignment with
the class ClustalWAlignmentSAXParser ?

Thanks,

Alexandre Dionne-Laporte.

From bruno_dev at ebiointel.com Wed Jul 21 02:52:39 2004
From: bruno_dev at ebiointel.com (Bruno Aranda - Dev)
Date: Wed Jul 21 02:52:02 2004
Subject: [Biojava-l] (no subject)
Message-ID: <3695.192.168.53.9.1090392759.squirrel@192.168.53.9>

Hi Alexandre,

To parse the ClustalW results I use a SequenceAlignmentSAXParser and a
custom implementation of DefaultHandler which I call
'SequenceCollectionContentHandler'.

The code for the custom DefaultHandler class is:

import java.util.Map;

import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.ProteinTools;
import org.biojava.bio.seq.RNATools;
import org.biojava.bio.seq.Sequence;
import org.biojava.bio.symbol.Alphabet;
import org.biojava.bio.symbol.IllegalSymbolException;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public final class SequenceCollectionContentHandler extends DefaultHandler {

    private final Map sequenceMap;
    private final Alphabet alphabet;
    private String currentSeqName;
    // Text of the current Sequence element; SAX may deliver it in several chunks
    private final StringBuffer currentSeq = new StringBuffer();

    /**
     * Creates a new SequenceCollectionContentHandler instance.
     *
     * @param map
     *            The map to be filled with sequences
     * @param alphabet
     *            The alphabet to be used
     */
    public SequenceCollectionContentHandler(Map map, Alphabet alphabet) {
        this.sequenceMap = map;
        this.alphabet = alphabet;
    }

    // This method is called when an element is encountered
    public final void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) {
        if (localName.equals("Sequence")) {
            startCurrentSequence(atts);
        }
    }

    /*
     * (non-Javadoc)
     *
     * @see org.xml.sax.ContentHandler#characters(char[], int, int)
     */
    public final void characters(char[] ch, int start, int length)
            throws SAXException {
        // Append rather than overwrite, so text split across calls is not lost
        this.currentSeq.append(ch, start, length);
    }

    /*
     * (non-Javadoc)
     *
     * @see org.xml.sax.ContentHandler#endElement(java.lang.String,
     *      java.lang.String, java.lang.String)
     */
    public final void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (localName.equals("Sequence")) {
            endCurrentSequence();
        }
    }

    private void startCurrentSequence(Attributes atts) {
        // Discard any text collected before this element started
        this.currentSeq.setLength(0);
        String attName = atts.getLocalName(0);
        if ("sequenceName".equals(attName)) {
            this.currentSeqName = atts.getValue(0);
        }
    }

    private void endCurrentSequence() {
        String seqString = this.currentSeq.toString();
        if (this.alphabet.equals(DNATools.getDNA())) {
            try {
                Sequence seq = DNATools.createDNASequence(seqString,
                        currentSeqName);
                this.sequenceMap.put(currentSeqName, seq);
            } catch (IllegalSymbolException e) {
                System.err.println(this.getClass()
                        + " - IllegalSymbolException: " + e.getMessage());
            }
        } else if (this.alphabet.equals(RNATools.getRNA())) {
            try {
                Sequence seq = RNATools.createRNASequence(seqString,
                        currentSeqName);
                this.sequenceMap.put(currentSeqName, seq);
            } catch (IllegalSymbolException e) {
                System.err.println(this.getClass()
                        + " - IllegalSymbolException: " + e.getMessage());
            }
        } else if (this.alphabet.equals(ProteinTools.getAlphabet())) {
            try {
                Sequence seq = ProteinTools.createProteinSequence(seqString,
                        currentSeqName);
                this.sequenceMap.put(currentSeqName, seq);
            } catch (IllegalSymbolException e) {
                System.err.println(this.getClass()
                        + " - IllegalSymbolException: " + e.getMessage());
            }
        }
    }
}

Then, the code to use the SequenceAlignmentSAXParser and the handler could be:

// copy and paste from here

File alnFile = new File("/your/aln/file"); // put here the path to the aln output file from clustal
Alphabet alphabet = ...; // put here the alphabet to be used (e.g. DNATools.getDNA())

Map seqMap = new HashMap(); // this map will be filled with the sequences from the alignment

SequenceAlignmentSAXParser parser = new SequenceAlignmentSAXParser();

ContentHandler handler = new SequenceCollectionContentHandler(
        seqMap, alphabet);

try {
    // open the alignment file declared above
    BufferedReader contents = new BufferedReader(new InputStreamReader(
            new FileInputStream(alnFile)));

    parser.setContentHandler(handler);
    parser.parse(new InputSource(contents));

} catch (FileNotFoundException fnfe) {
    System.out.println(fnfe.getMessage());
    System.out.println("Couldn't open file");
} catch (IOException ioe) {
    ioe.printStackTrace();
} catch (SAXException se) {
    System.err.println(se.getMessage());
    se.printStackTrace();
}

// Finally I create the alignment object using the Map
Alignment alignment = new SimpleAlignment(seqMap);

// end of copy

So you have an Alignment instance which contains all the sequences in the
alignment. I know there are better approaches, but this one works for
me...

If you have any doubts, don't hesitate to ask again!

Cheers,

Bruno

From bruno_dev at ebiointel.com Wed Jul 21 03:13:40 2004
From: bruno_dev at ebiointel.com (Bruno Aranda - Dev)
Date: Wed Jul 21 03:16:11 2004
Subject: [Biojava-l] NCBI SOAP interface
Message-ID: <20040721071649.3602668E68@dna.ebiointel.com>

Hi Russell,

Maybe now it's a bit late to answer but, anyway, I've tested the NCBI web
services and they work ok.
Here is how I run the ESearch service (you will get a list of GIs
for the given query):

// Make a service
EUtilsServiceLocator service = new EUtilsServiceLocator();

// Now use the service to get access to operations
EUtilsServiceSoap utils = service.geteUtilsServiceSoap();

// Make the actual call
_eSearchRequest parameters = new _eSearchRequest();
parameters.setDb("pubmed");
parameters.setTerm("cancer");
parameters.setReldate(new Integer(60));
parameters.setDatetype("edat");
_eSearchResult res = utils.run_eSearch(parameters);

// prints the number of results
System.out.println("RESULTS: "+res.getRetMax());

// print the GIs
IdListType list = res.getIdList();
String[] ids = list.getId();
// The archived message is cut off after "for(int i=0; i"; the rest of this
// loop is an assumed reconstruction based on the comment above.
for (int i = 0; i < ids.length; i++) {
    System.out.println(ids[i]);
}

Hi all -

I seem to remember a thread a while back concerning how to find matches
allowing for a certain number of mismatches. Did anyone have a solution
for this? Would they be willing to have it posted to the biojava in anger
site?

Thanks

Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax +65 6722 2910

From daviddebeule at pandora.be Mon Jul 26 05:47:06 2004
From: daviddebeule at pandora.be (daviddebeule@pandora.be)
Date: Mon Jul 26 05:48:39 2004
Subject: [Biojava-l] removeFeature() in SimpleGappedSequence
Message-ID:

Hi all,

Is it possible to remove a feature in a SimpleGappedSequence
in 1.3 ? in 1.4 ?
And if not, why not ?

TIA,
David De Beule

From td2 at sanger.ac.uk Mon Jul 26 10:15:09 2004
From: td2 at sanger.ac.uk (Thomas Down)
Date: Mon Jul 26 10:16:45 2004
Subject: [Biojava-l] removeFeature() in SimpleGappedSequence
In-Reply-To:
References:
Message-ID: <33A5B57E-DF0E-11D8-BD64-000A95C8B056@sanger.ac.uk>

On 26 Jul 2004, at 10:47, daviddebeule@pandora.be wrote:

> Hi all,
>
> Is it possible to remove a feature in a SimpleGappedSequence
> in 1.3 ? in 1.4 ?
> And if not, why not ?

Hi David,

I'm not sure about 1.3.
In 1.4pre1 you certainly can in principle, but
there seems to be a bug which means that it's only possible to remove
features from the view, not from the underlying sequence. I'll try to
get a fix in this afternoon.

Thomas.

From td2 at sanger.ac.uk Mon Jul 26 11:54:47 2004
From: td2 at sanger.ac.uk (Thomas Down)
Date: Mon Jul 26 11:56:21 2004
Subject: [Biojava-l] removeFeature() in SimpleGappedSequence
In-Reply-To:
References:
Message-ID: <1E97B274-DF1C-11D8-BD64-000A95C8B056@sanger.ac.uk>

On 26 Jul 2004, at 10:47, daviddebeule@pandora.be wrote:

> Hi all,
>
> Is it possible to remove a feature in a SimpleGappedSequence
> in 1.3 ? in 1.4 ?
> And if not, why not ?

Hi again David,

Yes, there was definitely a bug here. I've just checked in a fix (and
tests). Thanks for spotting this.

If you're not following the CVS source, this fix should be included in
tomorrow's automatic build, which you'll be able to pick up from:

http://www.derkholm.net/autobuild/

Hopefully there'll also be a new pre-1.4 release out in the next few
days,

Thomas.

From mark.schreiber at group.novartis.com Mon Jul 26 21:41:28 2004
From: mark.schreiber at group.novartis.com (mark.schreiber@group.novartis.com)
Date: Mon Jul 26 21:43:08 2004
Subject: [Biojava-l] GenBankXML
Message-ID:

Hi -

I have checked in GenbankXmlFormat.java which was kindly developed by Alan
Li. This is a useful new functionality, take it out for some stress
testing!

Thanks Alan!

- Mark

From thomas at derkholm.net Tue Jul 27 03:42:28 2004
From: thomas at derkholm.net (Thomas Down)
Date: Tue Jul 27 03:23:20 2004
Subject: [Biojava-l] GenBankXML
In-Reply-To:
References:
Message-ID: <20040727074228.GA16006@calette.derkholm.net>

On Tue, Jul 27, 2004 at 09:41:28AM +0800, mark.schreiber@group.novartis.com wrote:
> Hi -
>
> I have checked in GenbankXmlFormat.java which was kindly developed by Alan
> Li. This is a useful new functionality, take it out for some stress
> testing!

Thanks for checking this in Mark.
Was there also a corresponding
patch for SeqIOTools, to add a readGenbankXml method? The new test
expects this to exist but it hasn't been checked in yet -- this is
causing a test suite failure.

Thanks,

Thomas.

From forward at hongyu.org Fri Jul 30 01:41:02 2004
From: forward at hongyu.org (Hongyu Zhang)
Date: Fri Jul 30 01:40:50 2004
Subject: [Biojava-l] failed to read genbank peptide file
Message-ID: <1091166062.4109df6e8b1d4@hongyu.org>

I am new to biojava. After being very happy with bioperl, I am glad to
start my biojava adventure. But it looks like something is still not
quite mature in biojava right now.

I've used the example in "biojava in anger" to successfully read
Genbank nucleotide files, but when reading peptide files I got constant
errors. I did change the read function from readGenbank() to
readGenpept(), but it didn't help :( .

Is anyone aware of the problems when reading Genbank formatted peptide
files?

I put three sections below
1) the Java code
2) the command line and input file to this code
3) error message

1) First is my code, which is copied and revised from "biojava in
anger":

import org.biojava.bio.seq.*;
import org.biojava.bio.seq.io.*;
import java.io.*;
import org.biojava.bio.*;
import java.util.*;

public class ReadGB {
  public static void main(String[] args) {
    BufferedReader br = null;

    try {
      //create a buffered reader to read the sequence file specified by args[0]
      br = new BufferedReader(new FileReader(args[0]));
    }
    catch (FileNotFoundException ex) {
      //can't find the file specified by args[0]
      ex.printStackTrace();
      System.exit(-1);
    }

    //read the GenBank File
    SequenceIterator sequences = SeqIOTools.readGenpept(br);

    //iterate through the sequences
    while(sequences.hasNext()){
      try {
        Sequence seq = sequences.nextSequence();
        //do stuff with the sequence
      }
      catch (BioException ex) {
        //not in GenBank format
        ex.printStackTrace();
      }catch (NoSuchElementException ex) {
        //request for more sequence when there isn't any
        ex.printStackTrace();
      }
    }
  }
}

2) After
I successfully compiled the java code, I ran it in unix shell
window like this:

$ java ReadGB input_file

where the input_file is copied from the following NCBI link:
http://www.ncbi.nih.gov/entrez/batchseq.cgi?txt=on&list_uids=7288210

3) And then I got this error message

org.biojava.bio.BioException: Couldn't realize feature
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:147)
        at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:97)
        at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:217)
        at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:223)
        at org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase.java:175)
        at org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilder.java:103)
        at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:99)
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:102)
        at ReadGB.main(ReadGB.java:30)
Caused by: org.biojava.bio.BioError: Could not initialize OntoTools
        at org.biojava.ontology.OntoTools.<clinit>(OntoTools.java:106)
        at org.biojava.bio.seq.impl.SimpleFeature.<init>(SimpleFeature.java:392)
        at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:97)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:141)
        ... 8 more
Caused by: java.lang.NullPointerException
        at java.io.Reader.<init>(Reader.java:61)
        at java.io.InputStreamReader.<init>(InputStreamReader.java:55)
        at org.biojava.ontology.OntoTools.<clinit>(OntoTools.java:61)
        ... 15 more
org.biojava.bio.BioException: Couldn't realize feature
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:147)
        at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:97)
        at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:217)
        at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:223)
        at org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase.java:175)
        at org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilder.java:103)
        at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:99)
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:102)
        at ReadGB.main(ReadGB.java:30)
Caused by: java.lang.NoClassDefFoundError
        at org.biojava.bio.seq.impl.SimpleFeature.<init>(SimpleFeature.java:392)
        at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:97)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:141)
        ... 8 more
org.biojava.bio.BioException: Couldn't realize feature
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:147)
        at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:97)
        at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:217)
        at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:223)
        at org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase.java:175)
        at org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilder.java:103)
        at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:99)
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:102)
        at ReadGB.main(ReadGB.java:30)
Caused by: java.lang.NoClassDefFoundError
        at org.biojava.bio.seq.impl.SimpleFeature.<init>(SimpleFeature.java:392)
        at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:97)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:141)
        ... 8 more

Thanks for your help.

--
Hongyu Zhang
Computational biologist
Ceres Inc.

From bruno_dev at ebiointel.com Fri Jul 30 04:09:36 2004
From: bruno_dev at ebiointel.com (Bruno Aranda - Dev)
Date: Fri Jul 30 04:08:36 2004
Subject: [Biojava-l] failed to read genbank peptide file
Message-ID: <1596.192.168.53.9.1091174976.squirrel@192.168.53.9>

Welcome then to the Biojava community!

I think you have a corrupt Biojava library (biojava.jar) or maybe there
are resources missing in the classpath.
What you should do is:

- Download the latest binary snapshot of Biojava from
http://www.derkholm.net/autobuild/binaries/

- Check that you put all the libraries needed in the classpath:

java -cp /path/to/one/jar:/path/to/another/jar ReadGB input_file

(also you can have those jars in the CLASSPATH environment variable.)

You should put the jars explained at http://biojava.org/download14.html

After reading the file successfully you will get a Sequence object... you
can try to just print the sequence name (in this case the accession
number) just using:

String seqName = seq.getName();
System.out.println(seqName);

and there are many other (and sophisticated) possibilities, of course!

Good luck,

Bruno

From forward at hongyu.org Fri Jul 30 04:40:26 2004
From: forward at hongyu.org (Hongyu Zhang)
Date: Fri Jul 30 04:40:11 2004
Subject: [Biojava-l] failed to read genbank peptide file
In-Reply-To: <1596.192.168.53.9.1091174976.squirrel@192.168.53.9>
References: <1596.192.168.53.9.1091174976.squirrel@192.168.53.9>
Message-ID: <1091176826.410a097ab9b60@hongyu.org>

Thanks for the help, Bruno.

But I am able to compile and run the version of the code that reads
nucleotide sequences, plus I can run all sorts of other methods of
biojava without problems, therefore, the possibility that you suggested
is slim.

--
Hongyu Zhang
Computational biologist
Ceres Inc.

Quoting Bruno Aranda - Dev :

> Welcome then to the Biojava community!
>
> I think you have a corrupt Biojava library (biojava.jar) or maybe
> there
> are resources missing in the classpath. What you should do is:
>
> - Download the latest binary snapshot of Biojava from
> http://www.derkholm.net/autobuild/binaries/
>
> - Check that you put all the libraries needed in the classpath:
>
> java -cp /path/to/one/jar:/path/to/another/jar ReadGB input_file
>
> (also you can have those jars in the CLASSPATH environment
> variable.)
>
> You should put the jars explained at
> http://biojava.org/download14.html
>
> After reading the file successfully you will get a Sequence
> object... you
> can try to just print the sequence name (in this case the accession
> number) just using:
>
> String seqName = seq.getName();
> System.out.println(seqName);
>
> and there are many other (and sophisticated) possibilities, of
> course!
>
> Good luck,
>
> Bruno
>
>
>
>

From forward at hongyu.org Fri Jul 30 06:35:30 2004
From: forward at hongyu.org (Hongyu Zhang)
Date: Fri Jul 30 06:35:17 2004
Subject: [Biojava-l] failed to read genbank peptide file
In-Reply-To: <1596.192.168.53.9.1091174976.squirrel@192.168.53.9>
References: <1596.192.168.53.9.1091174976.squirrel@192.168.53.9>
Message-ID: <1091183730.410a247269767@hongyu.org>

Bruno,

You are right. I just came back home and installed a biojava package on
my home machine, and the test code indeed works just fine. I do need to
re-install the package on my office computer tomorrow.

Thanks a lot.

--
Hongyu Zhang
Computational biologist
Ceres Inc.

Quoting Bruno Aranda - Dev :

> Welcome then to the Biojava community!
>
> I think you have a corrupt Biojava library (biojava.jar) or maybe
> there
> are resources missing in the classpath. What you should do is:
>
> - Download the latest binary snapshot of Biojava from
> http://www.derkholm.net/autobuild/binaries/
>
> - Check that you put all the libraries needed in the classpath:
>
> java -cp /path/to/one/jar:/path/to/another/jar ReadGB input_file
>
> (also you can have those jars in the CLASSPATH environment
> variable.)
>
> You should put the jars explained at
> http://biojava.org/download14.html
>
> After reading the file successfully you will get a Sequence
> object... you
> can try to just print the sequence name (in this case the accession
> number) just using:
>
> String seqName = seq.getName();
> System.out.println(seqName);
>
> and there are many other (and sophisticated) possibilities, of
> course!
>
> Good luck,
>
> Bruno
>
>
>
>

From stiffler at cs.uoregon.edu Fri Jul 30 15:13:58 2004
From: stiffler at cs.uoregon.edu (Nicholas Stiffler)
Date: Fri Jul 30 15:15:26 2004
Subject: [Biojava-l] Blast XML
Message-ID: <36349.128.223.22.250.1091214838.squirrel@systems.cs.uoregon.edu>

Is it possible to access the line using the xml parser?

Nicholas Stiffler
Institute of Molecular Biology
University of Oregon

From forward at hongyu.org Sat Jul 31 13:13:45 2004
From: forward at hongyu.org (Hongyu Zhang)
Date: Sat Jul 31 13:13:24 2004
Subject: [Biojava-l] got identity?
In-Reply-To: <1091166062.4109df6e8b1d4@hongyu.org>
References: <1091166062.4109df6e8b1d4@hongyu.org>
Message-ID: <1091294025.410bd34922227@hongyu.org>

Is there any method to get the percentage of identities of BLAST hsps
in Biojava?

Once again, I followed the guide in the "Biojava in anger". Its code
successfully parsed a WU-BLAST result, and printed out the hit E-value
and ID nicely, but when I tried to look for the method to return HSP
identities, I had no luck. It's such a common feature of BLAST result,
so it caught me quite off-guard.

Thanks!

--Hongyu
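
One possible workaround for the identity question above, as a minimal sketch only: if the two aligned HSP strings (query and subject, with gaps written as '-') can be pulled out of the parsed result, the percentage can be computed directly from them. The class and method names below are hypothetical and are not part of the BioJava API; how the aligned strings are obtained from the parser is not shown and is an assumption.

```java
/**
 * Hypothetical helper (not part of BioJava): percent identity of one HSP,
 * computed from its two aligned strings.
 */
final class HspIdentity {

    private HspIdentity() {
        // static utility, not meant to be instantiated
    }

    /**
     * Returns the percentage of alignment columns in which both strings hold
     * the same residue (case-insensitive). A column with a gap ('-') is never
     * counted as identical, but it still counts towards the alignment length,
     * following the usual BLAST "Identities = n/len" convention.
     */
    static double percentIdentity(String alignedQuery, String alignedSubject) {
        if (alignedQuery.length() != alignedSubject.length()) {
            throw new IllegalArgumentException(
                    "aligned strings must have the same length");
        }
        if (alignedQuery.isEmpty()) {
            return 0.0;
        }
        int identical = 0;
        for (int i = 0; i < alignedQuery.length(); i++) {
            char q = Character.toUpperCase(alignedQuery.charAt(i));
            char s = Character.toUpperCase(alignedSubject.charAt(i));
            if (q != '-' && q == s) {
                identical++;
            }
        }
        return 100.0 * identical / alignedQuery.length();
    }

    public static void main(String[] args) {
        // Toy aligned pair with one gap column and one mismatch
        System.out.println(percentIdentity("ACGT-ACGT", "ACGTTACCT"));
    }
}
```

The alignment length, gaps included, is used as the denominator; dividing by the number of ungapped columns instead would give a different, usually higher, figure, so it is worth checking which convention the downstream analysis expects.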